WO2023155924A1 - Guide rna and uses thereof - Google Patents

Info

Publication number
WO2023155924A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
seq
nos
optionally
guide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/077462
Other languages
French (fr)
Inventor
Yingsi ZHOU
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huidagene Therapeutics Co Ltd
Original Assignee
Huidagene Therapeutics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huidagene Therapeutics Co Ltd
Publication of WO2023155924A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • A61P35/02 Antineoplastic agents specific for leukemia
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N15/1138 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing against receptors or cell surface proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/88 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation using microencapsulation, e.g. using amphiphile liposome vesicle
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/31 Chemical structure of the backbone
    • C12N2310/315 Phosphorothioates
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/34 Spatial arrangement of the modifications
    • C12N2310/344 Position-specific modifications, e.g. on every purine, at the 3'-end
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]

Definitions

  • the disclosure contains an electronic sequence listing (“xxx.xml”; size is xxx bytes and it was created on xxx), the contents of which are hereby incorporated by reference in their entirety. Wherever a sequence is an RNA sequence, the T in the sequence shall be deemed as U.
  • CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
  • Cas CRISPR-associated genes
  • the disclosure provides certain advantages over the prior art.
  • the disclosure herein is not limited to specific advantages, in an aspect, the disclosure provides a guide RNA comprising (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and (2) a guide sequence capable of hybridizing to a target sequence on a target strand of a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC in a target cell, thereby guiding the complex to the target gene; wherein the target gene comprises a protospacer sequence on the nontarget strand of the target gene and a protospacer adjacent motif (PAM) adjacent (e.g.,
  • PAM proto
  • the disclosure provides a system comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) or a polynucleotide encoding the napDNAbp; and (2) the guide RNA of the disclosure, or a polynucleotide encoding the guide RNA.
  • napDNAbp nucleic acid programmable DNA binding protein
  • the disclosure provides a complex comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) complexed with (2) the guide RNA of the disclosure.
  • the disclosure provides a polynucleotide encoding the guide RNA of the disclosure.
  • the disclosure provides a ribonucleoprotein (RNP) comprising the system or complex of the disclosure comprising the napDNAbp and the guide RNA.
  • RNP ribonucleoprotein
  • the disclosure provides a lipid nanoparticle (LNP) comprising the system of the disclosure comprising a mRNA encoding the napDNAbp and the guide RNA.
  • LNP lipid nanoparticle
  • the disclosure provides a method for modifying a target gene in a target cell, comprising contacting the target cell with the system, the RNP, or the LNP of the disclosure, wherein the guide sequence is capable of hybridizing to a target sequence on a target strand of the target gene, wherein the target gene is modified (e.g., cleaved) by the complex.
  • the disclosure provides a method of producing a modified target cell, comprising (1) optionally harvesting a target cell from a subject; (2) optionally sorting and/or optionally amplifying the harvested target cell; (3) modifying a target gene in the (optionally sorted and/or optionally amplified) target cell by the method of the disclosure; (4) optionally inserting a donor sequence (e.g., a chimeric antigen receptor (CAR)-encoding donor sequence) into the genome of the target cell; and (5) optionally purifying the modified target cell.
  • the disclosure provides a cell or a progeny thereof, wherein the cell is modified by the method of the disclosure.
  • the disclosure provides a method for preventing or treating a disease or disorder in a subject, comprising administering to the subject (e.g., an effective amount of) the system, the complex, the RNP, the LNP, or the cell or progeny thereof of the disclosure.
  • activity refers to a biological activity.
  • activity includes enzymatic activity, e.g., catalytic ability of an effector.
  • activity can include nuclease activity.
  • “nucleic acid programmable nucleotide binding protein” may be used interchangeably with “polynucleotide programmable nucleotide binding domain” to refer to a protein that associates with a nucleic acid (e.g., DNA or RNA), such as a guide nucleic acid or guide polynucleotide (e.g., gRNA), that guides the protein to a specific nucleic acid sequence.
  • a nucleic acid programmable nucleotide binding protein is a nucleic acid programmable DNA binding protein (napDNAbp).
  • the nucleic acid programmable nucleotide binding protein is a nucleic acid programmable RNA binding protein. In some embodiments, the nucleic acid programmable nucleotide binding protein is a Cas9 protein.
  • a Cas9 protein can associate with a guide RNA that guides the Cas9 protein to a specific DNA sequence that is complementary to the guide RNA.
  • the napDNAbp is a Cas9 domain, for example a nuclease active Cas9, a Cas9 nickase (nCas9), or a nuclease inactive Cas9 (dCas9).
  • Non-limiting examples of the napDNAbp include Cas9 (e.g., dCas9 and nCas9), Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12f/Cas14, Cas12g, Cas12h, Cas12i, and Cas12k.
  • Cas enzymes include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas5d, Cas5t, Cas5h, Cas5a, Cas6, Cas7, Cas8, Cas8a, Cas8b, Cas8c, Cas9 (also known as Csn1 or Csx12), Cas10, Cas10d, Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12f/Cas14, Cas12g, Cas12h, Cas12i, Cas12k, Csy1, Csy2, Csy3, Csy4, Cse1, Cse2, Cse3, Cse4, Cse5e, Csc1, Csc2, Csa5, Csn1, Csn2, Csm1, Csm2, Csm3, Csm4, C
  • napDNAbp are also within the scope of this disclosure, e.g., IscB, IsrB, although they may not be specifically listed in this disclosure. See, e.g., Makarova et al., “Classification and Nomenclature of CRISPR-Cas Systems: Where from Here?” CRISPR J. 2018 October; 1: 325-336. doi: 10.1089/crispr.2018.0033; Yan et al., “Functionally diverse type V CRISPR-Cas systems” Science. 2019 Jan. 4; 363(6422): 88-91. doi: 10.1126/science.aav7271, the entire contents of each of which are hereby incorporated by reference.
  • the term “complex” refers to a grouping of two or more molecules.
  • the complex comprises a polypeptide and a nucleic acid molecule interacting with (e.g., binding to, coming into contact with, adhering to) one another.
  • the term “complex” can refer to a grouping of a guide RNA and a polypeptide (e.g., a napDNAbp, such as a Cas12i polypeptide).
  • the term “complex” can refer to a grouping of a guide RNA, a polypeptide, and a target sequence.
  • the term “complex” can refer to a grouping of a target gene-targeting guide RNA and a napDNAbp.
  • PAM protospacer adjacent motif
  • a target sequence e.g., a target sequence of a target gene
  • a complex comprising a guide RNA (e.g., a target gene-targeting guide RNA) and a napDNAbp binds.
  • the guide RNA binds to a first strand of the target (e.g., the target strand or the spacer-complementary strand), and a PAM sequence as described herein is present in the second, complementary strand (e.g., the nontarget strand or the non-spacer-complementary strand) and adjacent to the protospacer sequence complementary to the target sequence to which the guide RNA binds.
  • “adjacent” includes instances in which the guide RNA of a complex comprising a guide RNA and a napDNAbp specifically binds, interacts, or associates with a target sequence that is immediately adjacent to a PAM, or in the case of a double-stranded target where the PAM is present in the non-target strand (e.g., the non-spacer-complementary strand), with a target sequence that is complementary to a protospacer sequence immediately adjacent to a PAM. In such instances, there are no nucleotides between the target sequence or protospacer sequence and the PAM.
  • adjacent also includes instances in which there are a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides between the target sequence, to which the guide RNA binds, or the protospacer sequence and the PAM.
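By way of non-limiting illustration, the two senses of “adjacent” described above can be expressed as a count of nucleotides lying between a PAM and a protospacer on the same strand. The following Python sketch is hypothetical and is not part of the disclosure: the function names, the 0-based end-exclusive coordinate convention, and the default `max_gap=5` (taken from the "e.g., 1, 2, 3, 4, or 5" example) are assumptions made only for this example.

```python
def gap_between(pam, protospacer):
    """Count the nucleotides between two non-overlapping intervals on one strand.

    Each interval is (start, end), with a 0-based start and an exclusive end
    (an assumed convention for this sketch).
    """
    (p_start, p_end), (s_start, s_end) = pam, protospacer
    if p_end <= s_start:   # PAM lies 5' of the protospacer
        return s_start - p_end
    if s_end <= p_start:   # PAM lies 3' of the protospacer
        return p_start - s_end
    raise ValueError("PAM and protospacer intervals overlap")


def is_immediately_adjacent(pam, protospacer):
    """True when no nucleotides separate the PAM and the protospacer."""
    return gap_between(pam, protospacer) == 0


def is_adjacent(pam, protospacer, max_gap=5):
    """True when at most max_gap nucleotides separate the two (assumed cutoff)."""
    return gap_between(pam, protospacer) <= max_gap
```

For example, a 4-nucleotide PAM at `(2, 6)` and a protospacer starting at position 6 are immediately adjacent, while a protospacer starting at position 9 is adjacent with three intervening nucleotides.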
  • “guide RNA” refers to any RNA molecule that facilitates the targeting of a napDNAbp (e.g., a Cas12i polypeptide) described herein to a target sequence (e.g., a sequence of a target gene).
  • a guide RNA may be designed to include sequences that are complementary to a specific nucleic acid sequence (e.g., a sequence of a target gene).
  • a guide RNA may comprise a DNA targeting sequence (i.e., a guide sequence) and a scaffold sequence.
  • crRNA or “RNA guide” is also used herein to refer to a guide RNA.
  • a guide sequence is complementary to a target sequence.
  • the term “complementary” refers to the ability of nucleobases of a first nucleic acid molecule, such as a guide RNA, to base pair with nucleobases of a second nucleic acid molecule, such as a target sequence. Two complementary nucleic acid molecules are able to non-covalently bind under appropriate temperature and solution ionic strength conditions.
  • a first nucleic acid molecule (e.g., a guide sequence of a guide RNA) comprises 100% complementarity to a second nucleic acid (e.g., a target sequence).
  • the first nucleic acid molecule (e.g., a guide sequence of a guide RNA) comprises at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementarity to the second nucleic acid molecule (e.g., a target sequence).
  • the term “substantially complementary” refers to a polynucleotide (e.g., a guide sequence of a guide RNA) that has a certain level of complementarity to a target sequence.
  • the level of complementarity is such that the polynucleotide (e.g., a guide sequence of a guide RNA) can hybridize to the target sequence (e.g., a sequence of a target gene) with sufficient affinity to permit an effector polypeptide (e.g., a napDNAbp) that is complexed with the polynucleotide, or a functional domain associated (e.g., fused) with the effector polypeptide, to act (e.g., cleave, deaminate) on the target sequence or its complement (e.g., a sequence of a target gene or its complement).
  • a guide sequence that is substantially complementary to a target sequence has less than 100% complementarity to the target sequence. In some embodiments, a guide sequence that is substantially complementary to a target sequence has at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementarity to the target sequence.
  • “target sequence” refers to a nucleic acid sequence to which a guide RNA specifically binds.
  • the DNA targeting sequence (e.g., spacer) of a guide RNA binds to a target sequence.
  • the guide RNA binds to a first strand of the target (i.e., the target strand or the spacer-complementary strand), and a PAM sequence as described herein is present in the second, complementary strand (i.e., the non-target strand or the non-spacer-complementary strand) and adjacent to the protospacer sequence complementary to the target sequence to which the guide RNA binds.
  • the target strand (i.e., the spacer-complementary strand)
  • the non-target strand (i.e., the non-spacer-complementary strand)
  • the target sequence or its complement (i.e., a protospacer sequence)
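By way of non-limiting illustration, the strand relationships above can be worked through in code: the protospacer lies on the non-target strand, the target sequence on the target strand is its reverse complement, and a fully complementary guide sequence therefore reads the same as the protospacer with T deemed as U. The Python sketch below is hypothetical; the function names and the 5-nucleotide example sequence are assumptions for this example only, and 100% complementarity is assumed.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def target_sequence(protospacer):
    """Target sequence on the target strand (5'->3') for a given protospacer."""
    return reverse_complement(protospacer)


def guide_sequence(protospacer):
    """Guide sequence (RNA, 5'->3') with 100% complementarity to the target sequence."""
    # Complementing the target sequence gives back the protospacer;
    # T is then read as U because the guide is RNA.
    return reverse_complement(target_sequence(protospacer)).replace("T", "U")
```

With the made-up protospacer `AACGT`, `target_sequence` returns `ACGTT` and `guide_sequence` returns `AACGU`.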
  • upstream and downstream refer to relative positions within a single nucleic acid (e.g., DNA) sequence in a nucleic acid molecule. “Upstream” and “downstream” relate to the 5’ to 3’ direction, respectively, in which RNA transcription occurs.
  • a first sequence is upstream of a second sequence when the 3’ end of the first sequence occurs before the 5’ end of the second sequence.
  • a first sequence is downstream of a second sequence when the 5’ end of the first sequence occurs after the 3’ end of the second sequence.
  • the 5’-NTTN-3’ sequence is upstream of an indel described herein, and a napDNAbp-induced indel is downstream of the 5’-NTTN-3’ sequence.
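By way of non-limiting illustration, the upstream and downstream definitions above reduce to a comparison of coordinates along a single strand read 5' to 3'. The Python sketch below is hypothetical; the function names, the 0-based end-exclusive coordinates, and the example positions are assumptions for this example only.

```python
def is_upstream(first, second):
    """True when the 3' end of `first` occurs before the 5' end of `second`.

    Each argument is (start, end) on the same strand, read 5'->3',
    with a 0-based start and an exclusive end (assumed convention).
    """
    return first[1] <= second[0]


def is_downstream(first, second):
    """True when the 5' end of `first` occurs after the 3' end of `second`."""
    return first[0] >= second[1]
```

For instance, with a 5'-NTTN-3' sequence at `(10, 14)` and an indel at `(30, 33)`, `is_upstream((10, 14), (30, 33))` and `is_downstream((30, 33), (10, 14))` are both true; two overlapping intervals are neither upstream nor downstream of one another.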
  • “nucleic acid,” “polynucleotide,” and “nucleotide sequence” are used interchangeably to refer to a polymeric form of nucleotides of any length, including deoxyribonucleotides, ribonucleotides, combinations thereof, and analogs thereof.
  • Oligo are used interchangeably to refer to a short polynucleotide, having no more than about 50 nucleotides.
  • “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid by traditional Watson-Crick base-pairing.
  • a percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (i.e., Watson-Crick base pairing) with a second nucleic acid (e.g., about 5, 6, 7, 8, 9, 10 out of 10, being about 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively).
  • “Perfectly complementary” or “completely complementary” means that all the contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence.
  • “Substantially complementary” as used herein refers to a degree of complementarity that is at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
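By way of non-limiting illustration, percent complementarity as defined above (e.g., 8 out of 10 residues being about 80%) can be computed by pairing two strands antiparallel and counting Watson-Crick pairs. The Python sketch below is hypothetical; the function name, the equal-length requirement, and the example sequences are assumptions for this example only, and only A-T, A-U, and G-C pairs are counted.

```python
# Watson-Crick pairs, accepting T (DNA) or U (RNA) opposite A.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(first, second):
    """Percentage of residues in `first` that pair with `second`.

    Both sequences are written 5'->3' and must have equal length; they are
    paired antiparallel, so `second` is read in reverse.
    """
    if len(first) != len(second):
        raise ValueError("sequences must have equal length")
    if not first:
        raise ValueError("sequences must not be empty")
    paired = sum((a, b) in PAIRS for a, b in zip(first, reversed(second)))
    return 100.0 * paired / len(first)
```

A made-up RNA `ACGUACGUAC` against the DNA `GTACGTACGT` gives 100.0, and against `CAACGTACGT` (two positions changed) gives 80.0.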
  • Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
  • the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • a sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.
  • Percentage (%) sequence identity with respect to a nucleic acid sequence is defined as the percentage of nucleotides in a candidate sequence that are identical with the nucleotides in the specific nucleic acid sequence, after aligning the sequences by allowing gaps, if necessary, to achieve the maximum percent sequence identity. “Percentage (%) sequence identity” with respect to a peptide, polypeptide or protein sequence is the percentage of amino acid residues in a candidate sequence that are identical substitutions to amino acid residues in the specific peptide or amino acid sequence, after aligning the sequences by allowing gaps, if necessary, to achieve the maximum percent sequence homology.
  • Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or MEGALIGN™ (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
  • polypeptide and “peptide” are used interchangeably herein to refer to polymers of amino acids of any length.
  • the polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids.
  • a protein may have one or more polypeptides.
  • the terms also encompass an amino acid polymer that has been modified; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component.
  • “regulatory element” is intended to include promoters, enhancers, internal ribosome entry sites (IRES) and other expression control elements (e.g., transcription termination signals such as polyadenylation signals and poly-U sequences).
  • IRES internal ribosome entry sites
  • Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells and those that direct expression of a nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).
  • Regulatory elements may also direct expression in a time-dependent manner, e.g., in a cell cycle-dependent or developmental stage-dependent manner, which may or may not be tissue or cell type specific.
  • a “variant” is interpreted to mean a polynucleotide or polypeptide that differs from a reference polynucleotide or polypeptide, respectively, but retains essential properties.
  • a typical variant of a polynucleotide differs in nucleic acid sequence from another, reference polynucleotide. Changes in the nucleic acid sequence of the variant may or may not alter the amino acid sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as discussed below.
  • a typical variant of a polypeptide differs in amino acid sequence from another, reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical.
  • a variant and reference polypeptide may differ in amino acid sequence by one or more substitutions, additions, deletions in any combination.
  • a substituted or inserted amino acid residue may or may not be one encoded by the genetic code.
  • a variant of a polynucleotide or polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is not known to occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides may be made by mutagenesis techniques, by direct synthesis, and by other recombinant methods known to skilled artisans.
  • wild type has the meaning commonly understood by those skilled in the art to mean a typical form of an organism, a strain, a gene, or a feature that distinguishes it from a mutant or variant when it exists in nature. It can be isolated from sources in nature and not intentionally modified.
  • As used herein, the terms “non-naturally occurring” or “engineered” are used interchangeably and refer to artificial participation. When these terms are used to describe a nucleic acid molecule or polypeptide, it is meant that the nucleic acid molecule or polypeptide is at least substantially freed from at least one other component of its association in nature or as found in nature.
  • the term "identity" is used to mean the matching of sequences between two polypeptides or between two nucleic acids.
  • when a position in the two sequences being compared is occupied by the same base or amino acid monomer subunit (for example, a position in each of the two DNA molecules is occupied by adenine, or a position in each of the two polypeptides is occupied by lysine), the molecules are identical at that position.
  • the "percent identity" between the two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions to be compared × 100. For example, if 6 of the 10 positions of the two sequences match, then the two sequences have 60% identity.
  • the DNA sequences CTGACT and CAGGTT share 50% identity (3 out of a total of 6 positions match).
  • the comparison is made when the two sequences are aligned to produce maximum identity.
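By way of non-limiting illustration, the percent identity calculation described above (matching positions divided by positions compared, times 100) can be sketched for two sequences that are already aligned and of equal length. The Python function below is hypothetical; its name and its refusal of unequal-length or empty input are assumptions for this example only.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Counts positions occupied by the same base or amino acid and divides
    by the number of positions compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Worked example from the text: CTGACT vs CAGGTT match at 3 of 6 positions.
print(percent_identity("CTGACT", "CAGGTT"))  # 50.0
```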
  • Such alignment can be achieved by, for example, the method of Needleman et al. (1970) J. Mol. Biol. 48: 443-453, which can be conveniently performed by a computer program such as the Align program (DNAstar, Inc.). It is also possible to use the algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci., 4: 11-17 (1988)) integrated into the ALIGN program (version 2.0), using the PAM 120 weight residue table.
  • the gap length penalty of 12 and the gap penalty of 4 were used to determine the percent identity between the two amino acid sequences.
  • the Needleman and Wunsch (J Mol Biol. 48: 443-453 (1970)) algorithm in the GAP program integrated into the GCG software package can be used, using the BLOSUM62 matrix or the PAM250 matrix and a gap weight of 16, 14, 12, 10, 8, 6 or 4 and a length weight of 1, 2, 3, 4, 5 or 6 to determine the percent identity between two amino acid sequences.
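By way of non-limiting illustration, a global alignment of the Needleman and Wunsch type can be sketched with dynamic programming, after which percent identity is taken over the aligned columns. The Python sketch below is a simplification and is not the ALIGN or GAP program: it assumes a flat match/mismatch score and a linear gap penalty (`match=1`, `mismatch=-1`, `gap=-1`, all hypothetical defaults) instead of the PAM or BLOSUM matrices and the gap and length weights named above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # a prefix aligned against gaps only
        score[i][0] = i * gap
    for j in range(1, cols):          # b prefix aligned against gaps only
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


def alignment_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue (gaps never match)."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

Under these assumed scores, aligning `ACGT` with `AGT` yields a score of 2 with the alignment `ACGT` over `A-GT`, for 75% identity over the four aligned columns. Whether gapped columns count toward the denominator is a convention choice; this sketch counts them.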
  • “Stem-loop structure” refers to a nucleic acid that has a secondary structure that includes regions of nucleotides known or predicted to form a double-strand (stem portion) that is linked by regions of single-stranded nucleotides (loop portions) .
  • the terms “hairpin” and “turnback” structures are also used herein to refer to stem-loop structures. Such structures are well known in the art, and these terms are used in accordance with their commonly known meanings in the art.
  • stem-loop structures do not require precise base pairing.
  • the stem may include one or more base mismatches.
  • base pairing may be exact, i.e., not including any mismatches.
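By way of non-limiting illustration, the point that a stem may be exact or may include mismatches can be checked by pairing the 5' arm of a candidate stem-loop against its 3' arm. The Python sketch below is hypothetical: the function name and the caller-supplied `stem_len` are assumptions for this example only, only Watson-Crick pairs are counted (G-U wobble pairs are treated as mismatches), and no structure prediction is attempted.

```python
# Watson-Crick pairs, accepting T (DNA) or U (RNA) opposite A.
WC_PAIRS = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def stem_pairs(seq, stem_len):
    """Return (paired, mismatched) counts for a stem of `stem_len` positions.

    The first `stem_len` nucleotides form the 5' arm and the last `stem_len`
    nucleotides form the 3' arm; everything in between is treated as the loop.
    """
    if stem_len <= 0 or 2 * stem_len > len(seq):
        raise ValueError("stem_len does not fit the sequence")
    five_arm = seq[:stem_len]
    three_arm = seq[-stem_len:]
    # Arms pair antiparallel, so the 3' arm is read in reverse.
    paired = sum((x, y) in WC_PAIRS for x, y in zip(five_arm, reversed(three_arm)))
    return paired, stem_len - paired
```

The made-up sequence `GGGAAAACCC` with `stem_len=3` gives `(3, 0)`, an exact stem, while `GGGAAAACAC` gives `(2, 1)`, a stem with one mismatch.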
  • “Cas12i,” “Cas12i protein,” or “Cas12i polypeptide” as used herein, include any Cas12i protein described in the disclosure and its variants, such as mutants, and derivatives, such as Cas12i fusion proteins, as well as dCas12i proteins substantially lacking catalytic activity, nCas12i nickases with nickase single-strand cleavage activity, and their derivatives, such as dCas12i fusion proteins (such as dCas12i-TadA).
  • the disclosure also provides nucleotide sequences encoding Cas12i proteins and variants and derivatives thereof.
  • crRNA is used herein interchangeably with guide molecule, gRNA, or guide RNA, comprising a portion capable of recruiting and forming a protein-RNA complex with a CRISPR-Cas protein (such as any of the Cas12i proteins and variants and derivatives thereof as described herein) (e.g., direct repeats/DRs) and a portion that is sufficiently complementary to a target sequence to hybridize to the target sequence and direct the specific binding of the aforementioned protein-RNA complex to the target sequence (e.g. spacer/Spacer) .
  • a “cell” as used herein, is understood to refer not only to the particular individual cell, but to the progeny or potential progeny of the cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
  • “transduction” and “transfection” as used herein include all methods known in the art using an infectious agent (such as a virus) or other means to introduce DNA into cells for expression of a protein or molecule of interest.
  • infectious agent such as a virus
  • virus or virus like agent there are chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine) ; non-chemical methods, such as electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, delivery of plasmids, or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet assisted transfection, particle bombardment; and hybrid methods, such as nucleofection.
  • “transfected” or “transformed” or “transduced” as used herein refers to a process by which exogenous nucleic acid is transferred or introduced into a target cell.
  • a “transfected” or “transformed” or “transduced” cell is one that has been transfected, transformed, or transduced with exogenous nucleic acid.
  • “in vivo” refers to inside the body of the organism from which the cell is obtained. “Ex vivo” or “in vitro” means outside the body of the organism from which the cell is obtained.
  • treatment is an approach for obtaining beneficial or desired results including clinical results.
  • beneficial or desired clinical results include, but are not limited to, one or more of the following: alleviating one or more symptoms resulting from the disease, diminishing the extent of the disease, stabilizing the disease (e.g., preventing or delaying the worsening of the disease), preventing or delaying the spread (e.g., metastasis) of the disease, preventing or delaying the recurrence of the disease, reducing the recurrence rate of the disease, delaying or slowing the progression of the disease, ameliorating the disease state, providing a remission (partial or total) of the disease, decreasing the dose of one or more other medications required to treat the disease, increasing the quality of life, and/or prolonging survival.
  • treatment is a reduction of pathological consequence of a disease (such as cancer) .
  • the methods of the disclosure contemplate any one or more of these
  • “Chimeric antigen receptor” or “CAR” as used herein refers to engineered receptors, which can be used to graft one or more antigen specificities onto immune effector cells, such as T cells.
  • Some CARs are also known as “artificial T-cell receptors,” “chimeric T cell receptors,” or “chimeric immune receptors.”
  • the CAR comprises an extracellular antigen binding domain specific for one or more antigens (such as tumor antigens) , a transmembrane domain, and an intracellular signaling domain of a T cell and/or other receptors.
  • CAR-T refers to a T cell that expresses a CAR.
  • “T cell receptor” or “TCR” as used herein refers to an endogenous or recombinant T cell receptor comprising an extracellular antigen-binding domain that binds to a specific antigenic peptide bound in an MHC molecule.
  • the TCR comprises a TCRα polypeptide chain and a TCRβ polypeptide chain.
  • the TCR specifically binds a tumor antigen.
  • TCR-T refers to a T cell that expresses a recombinant TCR.
  • “T-cell antigen coupler receptor” or “TAC receptor” as used herein refers to an engineered receptor comprising an extracellular antigen-binding domain that binds to a specific antigen and a T-cell receptor (TCR) binding domain, a transmembrane domain, and an intracellular domain of a co-receptor molecule.
  • the TAC receptor co-opts the endogenous TCR of a T cell that expresses the TAC receptor to elicit an antigen-specific T-cell response against a target cell.
  • “TCR fusion protein” or “TFP” as used herein refers to an engineered receptor comprising an extracellular antigen-binding domain that binds to a specific antigen fused to a subunit of the TCR complex or a portion thereof, including TCRα chain, TCRβ chain, TCRγ chain, TCRδ chain, CD3ε, CD3δ, or CD3γ.
  • the subunit of the TCR complex or portion thereof comprises a transmembrane domain and at least a portion of the intracellular domain of the naturally occurring TCR subunit.
  • the TFP comprises the extracellular domain of the TCR subunit or a portion thereof. In some embodiments, the TFP does not comprise the extracellular domain of the TCR subunit.
  • embodiments of the disclosure described herein include “consisting of” and/or “consisting essentially of” embodiments.
  • reference to “not” a value or parameter generally means and describes “other than” a value or parameter.
  • for example, “the method is not used to treat cancer of type X” means the method is used to treat cancer of types other than X.
  • “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone).
  • likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
  • FIG. 1 illustrates an exemplary target gene and an exemplary guide RNA of the disclosure.
  • the disclosure relates to a guide RNA capable of binding to a target gene and uses thereof.
  • a system comprising a guide RNA having one or more characteristics is described herein.
  • a method of producing the guide RNA is described.
  • a method of delivering a system comprising the guide RNA is described.
  • the disclosure provides a guide RNA.
  • the guide RNA is capable of directing a nucleic acid programmable DNA binding protein (napDNAbp) to a target DNA (e.g., gene).
  • the guide RNA may comprise (1) a scaffold sequence capable of forming a complex with a napDNAbp and (2) a guide sequence capable of hybridizing to a target sequence of a target DNA, thereby guiding the complex to the target DNA.
  • the guide RNA is “programmable” (i.e., programmable guide RNA) because the guide sequence can be tailored to target a specific target sequence /target site.
  • Cas12i is a programmable RNA-guided dsDNA endonuclease that may generate a double-strand break (DSB) on a target dsDNA as guided by a programmable guide RNA referred to as CRISPR RNA (crRNA) or guide RNA (gRNA) comprising a spacer sequence (or a guide sequence) and a direct repeat (DR) sequence (or a scaffold sequence).
  • the direct repeat sequence is responsible for forming a complex with a Cas12i polypeptide and the spacer sequence is responsible for hybridizing to a target sequence of a target dsDNA, thereby guiding the complex comprising the gRNA and the Cas12i polypeptide to the target dsDNA.
  • a target gene (partially) as an example of a target dsDNA is depicted to comprise a 5’ to 3’ upside strand and a 3’ to 5’ downside strand.
  • a gRNA is depicted to comprise a spacer sequence in green and a direct repeat sequence in orange.
  • the spacer sequence is designed to hybridize to a part of the downside strand, and so the guide sequence “targets” the part of the downside strand.
  • the downside strand is referred to as a “target DNA strand” or a “target strand (TS) ” of the target dsDNA
  • the upside strand is referred to as a “non-target DNA strand” or a “non-target strand (NTS) ” of the target dsDNA.
  • the part of the target strand based on which the guide sequence is designed and to which the guide sequence may hybridize is referred to as a “target sequence”.
  • the sequence on the non-target strand corresponding to and base pairing with the target sequence is referred to as the “reverse complementary sequence of the target sequence”, “reverse complementary sequence”, “complement”, or “protospacer sequence”.
  • as used herein, “programmable guide RNA”, “guide RNA”, “gRNA”, “CRISPR RNA”, and “crRNA” are interchangeable. As used herein, “spacer sequence” and “guide sequence” are interchangeable. As used herein, “scaffold sequence” and “direct repeat (DR) sequence” are interchangeable.
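  • The strand terminology above can be illustrated with a minimal Python sketch. The function names and the 20-nt example sequence are hypothetical and are not taken from the sequence listing; the sketch only assumes that the protospacer on the non-target strand is fully complementary to the target sequence on the target strand.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def target_sequence_from_protospacer(protospacer: str) -> str:
    """Return the target sequence (target strand, written 5'->3') that base
    pairs with a protospacer on the non-target strand."""
    return reverse_complement(protospacer)


# Made-up protospacer on the non-target strand (not a SEQ ID NO).
protospacer = "ATGCCGTTAGCTAGGCTAAC"
target = target_sequence_from_protospacer(protospacer)
print("protospacer (NTS, 5'->3'):", protospacer)
print("target seq. (TS,  5'->3'):", target)
```

  • Applying the function twice returns the original protospacer, which reflects that each strand is the reverse complement of the other.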
  • the disclosure provides a guide RNA comprising (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and (2) a guide sequence capable of hybridizing to a target sequence of a target DNA, thereby guiding the complex to the target DNA.
  • the term “guiding” as used herein is interchangeable with “directing” or “targeting”.
  • the guide RNA may guide, direct, or target the complex formed with and comprising the napDNAbp as described herein and the guide RNA, or in other words, guide, direct, or target the napDNAbp as described herein, to a target sequence of a target DNA.
  • Two or more guide RNAs may target two or more separate napDNAbp (e.g., napDNAbp having the same or different sequence) as described herein to two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) target sequences (same or different) of one, two, or more target DNA.
  • a guide RNA is target sequence-specific. That is, in some embodiments, a guide RNA binds specifically to one or more target sequences (e.g., within a cell) and not to non-targeted sequences (e.g., non-specific DNA or random sequences within the same cell) .
  • the target DNA is a target dsDNA. In some embodiments, the target dsDNA is a target gene. In some embodiments, the target DNA is a target ssDNA.
  • the structure (or configuration) of the guide RNA may vary depending on various purposes, e.g., for improved activity, such as, DNA cleavage activity.
  • the guide RNA comprises one or more scaffold sequences and one or more guide sequences, wherein the scaffold sequence (s) and the guide sequence (s) may have various structures (or configurations) .
  • the guide sequence (s) and the scaffold sequence (s) of the guide RNA are present within the same RNA molecule.
  • the guide sequence (s) and the scaffold sequence (s) are linked directly to one another with or without a linker (e.g., a short polynucleotide sequence) .
  • the linker is a short linker, e.g., an RNA linker of 1, 2, 3, or more nucleotides in length.
  • the guide sequence (s) and the scaffold sequence (s) of the guide RNA are present in separate RNA molecules, which are joined to one another by base pairing interactions.
  • the guide RNA comprises a guide sequence followed by a scaffold sequence, referring to the sequences in the 5’ to 3’ direction (i.e., 5’-guide sequence -scaffold sequence -3’ ) .
  • the guide RNA comprises a scaffold sequence followed by a guide sequence, referring to the sequences in the 5’ to 3’ direction (i.e., 5’ -scaffold sequence -guide sequence -3’ ) .
  • the guide RNA comprises one scaffold sequence and one guide sequence in the structure (or configuration) of 5’ -scaffold sequence -guide sequence -3’ or 5’ -guide sequence -scaffold sequence -3’, and wherein the “-” between the scaffold sequence and the guide sequence represents an optional linker.
  • the guide RNA comprises two scaffold sequences and one guide sequence in the structure (or configuration) of 5’ -scaffold sequence -guide sequence -scaffold sequence -3’, wherein the two scaffold sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
  • the guide RNA comprises two scaffold sequences and two guide sequences in the structure (or configuration) of 5’ -scaffold sequence -guide sequence -scaffold sequence -guide sequence -3’ or 5’ -guide sequence -scaffold sequence -guide sequence -scaffold sequence -3’, wherein the two scaffold sequences are the same or different, wherein the two guide sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
  • the guide RNA comprises three scaffold sequences and two guide sequences in the structure (or configuration) of 5’ -scaffold sequence -guide sequence -scaffold sequence -guide sequence -scaffold sequence -3’, wherein the three scaffold sequences are the same or different, wherein the two guide sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
  • the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of guide sequences. In some embodiments, one or more or each of the guide sequences is capable of hybridizing to a target sequence. In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of guide sequences capable of hybridizing to a plurality of target sequences, respectively. In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of scaffold sequences. In some embodiments, one or more or each of the scaffold sequences is capable of forming a complex with a napDNAbp.
  • the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of scaffold sequences capable of forming a complex with a plurality of napDNAbp, respectively.
  • the plurality of guide sequences comprised in a guide RNA are identical or different.
  • the plurality of scaffold sequences comprised in a guide RNA are identical or different.
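  • The 5’ to 3’ configurations listed above can be sketched as a simple concatenation of scaffold and guide sequences with an optional linker. This is an illustration only: the function name is hypothetical, and the scaffold and guide strings are placeholders rather than sequences from the disclosure.

```python
from typing import Sequence


def assemble_guide_rna(elements: Sequence[str], linker: str = "") -> str:
    """Join scaffold and guide sequences in 5'->3' order.

    `elements` lists the RNA sequences in the desired order; `linker` is an
    optional short RNA linker placed between each pair of adjacent elements.
    """
    allowed = set("ACGU")
    for seq in list(elements) + [linker]:
        if not set(seq.upper()) <= allowed:
            raise ValueError(f"not an RNA sequence: {seq!r}")
    return linker.upper().join(seq.upper() for seq in elements)


# Placeholder sequences (hypothetical, for illustration only).
scaffold = "GUUGCAAAC"
guide = "AUGCCGUUAGCUAGGCUAAC"

# 5'-scaffold sequence-guide sequence-3', no linker.
single = assemble_guide_rna([scaffold, guide])

# 5'-scaffold-guide-scaffold-guide-scaffold-3', with a 1-nt linker.
array = assemble_guide_rna([scaffold, guide, scaffold, guide, scaffold], linker="A")
print(single)
print(array)
```

  • Passing the elements as an ordered list keeps the sketch agnostic to which configuration is chosen, and the scaffold or guide entries may be the same or different.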
  • the target dsDNA in the disclosure can be a target gene or a part of a target gene such that the guide RNA or the system of the disclosure is applied to a gene or a part of a target gene.
  • the target gene is in a target cell.
  • the system of the disclosure introduces a mutation (e.g., an indel) to the target gene in the target cell.
  • one or more endogenous DNA repair pathways such as Non-homologous end joining (NHEJ) or Homology directed recombination (HDR) , are induced in the target cell to repair a double-strand break induced and thus introduce a mutation (e.g., an indel) in the target gene as a result of guide sequence-specific cleavage by the system.
  • exemplary mutations include, but are not limited to, insertions, deletions, and substitutions.
  • the target gene is chromosomal DNA. In some embodiments, the target gene is a gene encoding a functional RNA or a functional polypeptide. In some embodiments, the target gene includes regulatory elements, e.g., a promoter, enhancer, silencer, or insulator. In some embodiments, the target gene is a donor site for splicing. In some embodiments, the target gene is an acceptor site for splicing. In some embodiments, the target gene comprises a plurality of nucleic acids.
  • the target gene is a mammalian gene. In some embodiments, the target gene is a human gene.
  • the target gene is selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC, and any homolog, mutant, or variant thereof, and any intron, exon, complement, or fragment thereof.
  • the disclosure provides a guide RNA comprising: (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and (2) a guide sequence capable of hybridizing to a target sequence on a target strand of a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC in a target cell, thereby guiding the complex to the target gene; wherein the target gene comprises a protospacer sequence on the nontarget strand of the target gene and a protospacer adjacent motif (PAM) adjacent (e.g., 5’ ) to the protospacer sequence, wherein the protospacer sequence is fully complementary to the target sequence on the
  • the target gene comprises A2AR gene.
  • the A2AR gene encodes adenosine A2a receptor, which is a G protein-coupled receptor.
  • A2AR is also known as ADORA2A.
  • Exemplary sequences of A2AR can be found, for example, at NCBI Gene ID 135, NCBI Reference Sequence NG_052804.1, and Ensembl ID ENSG00000128271.
  • SEQ ID NOs: 1-73 comprise exemplary protospacer sequences of the A2AR gene.
  • the target gene comprises AAVS1 gene, also known as adeno-associated virus integration site 1.
  • AAVS1 comprises a region of human chromosome 19. In some embodiments, the region of human chromosome 19 is 19q13.
  • AAVS1 is described in Kotin et al., EMBO J. 11(13): 5071-5078 and Ward and Walsh, Virology 433(2): 356-366.
  • SEQ ID NOs: 74-79 comprise exemplary protospacer sequences of the AAVS1 gene.
  • the region comprising AAVS1 comprises the PPP1R12C gene, also known as protein phosphatase 1 regulatory subunit 12C.
  • Exemplary sequences of PPP1R12C can be found, for example, at NCBI Gene ID 54776, NCBI Reference Sequence NC_000019.10, and Ensembl ID ENSG00000125503.
  • the target gene comprises B2M gene.
  • the B2M gene encodes beta-2-microglobulin, which is a component of MHC class I molecules.
  • Exemplary sequences of B2M can be found, for example, at NCBI Gene ID 567, NCBI Reference Sequence NG_012920.2, and Ensembl ID ENSG00000166710.
  • SEQ ID NOs: 80-129 comprise exemplary protospacer sequences of the B2M gene.
  • the target gene comprises BCL11A gene.
  • the BCL11A gene encodes a regulatory C2H2 type zinc-finger protein that can bind DNA and is involved in suppression of fetal hemoglobin production.
  • BCL11A is also known as BAF chromatin remodeling complex subunit BCL11A.
  • Exemplary sequences of BCL11A can be found, for example, at NCBI Gene ID 53335, NCBI Reference Sequence NG_011968.1, and Ensembl ID ENSG00000119866.
  • SEQ ID NOs: 130-368 comprise exemplary protospacer sequences of the BCL11A gene.
  • the target gene comprises CCR5 gene.
  • the CCR5 gene encodes C-C chemokine receptor type 5, which is a chemokine receptor on the surface of white blood cells.
  • Exemplary sequences of CCR5 can be found, for example, at NCBI Gene ID 1234, NCBI Reference Sequence NG_012637.1, and Ensembl ID ENSG00000160791.
  • SEQ ID NOs: 369-520 comprise exemplary protospacer sequences of the CCR5 gene.
  • the target gene comprises CD16a gene.
  • the CD16a gene encodes cluster of differentiation 16.
  • CD16a is found on the surface of NK cells and activates antibody-dependent cell-mediated cytotoxicity (ADCC).
  • CD16a is also known as FCGR3A or FcγRIIIa (Fc gamma receptor IIIa).
  • Exemplary sequences of CD16a can be found, for example, at NCBI Gene ID 2214, NCBI Reference Sequence NG_009066.1, and Ensembl ID ENSG00000203747.
  • SEQ ID NOs: 521-573 comprise exemplary protospacer sequences of the CD16a gene.
  • the target gene comprises a CD3 nucleic acid.
  • CD3 is a protein complex comprising a CD3γ chain, a CD3δ chain, and two CD3ε chains. CD3 is involved in activating cytotoxic T cells and T helper cells.
  • the target gene can be any one or more nucleic acids comprising the CD3γ chain, the CD3δ chain, and the CD3ε chain. Exemplary sequences of CD3γ can be found, for example, at NCBI Gene ID 917, NCBI Reference Sequence NG_007566.1, and Ensembl ID ENSG00000160654.
  • Exemplary sequences of CD3δ can be found, for example, at NCBI Gene ID 915, NCBI Reference Sequence NG_009891.1, and Ensembl ID ENSG00000167286.
  • Exemplary sequences of CD3ε can be found, for example, at NCBI Gene ID 916, NCBI Reference Sequence NG_007383.1, and Ensembl ID ENSG00000198851.
  • the target gene comprises CD52 gene.
  • the CD52 gene encodes CD52 molecule.
  • CD52 is present on the surface of mature lymphocytes.
  • Exemplary sequences of CD52 can be found, for example, at NCBI Gene ID 1043, NCBI Reference Sequence NC_000001.11, and Ensembl ID ENSG00000169442.
  • SEQ ID NOs: 574-591 comprise exemplary protospacer sequences of the CD52 gene.
  • the target gene comprises CD7 gene.
  • the CD7 gene encodes the CD7 protein.
  • CD7 is a member of the immunoglobulin superfamily and found on thymocytes and mature T cells. Exemplary sequences of CD7 can be found, for example, at NCBI Gene ID 924, NCBI Reference Sequence NC_000017.11, and Ensembl ID ENSG00000173762.
  • SEQ ID NOs: 592-608 comprise exemplary protospacer sequences of the CD7 gene.
  • the target gene comprises CIITA gene.
  • the CIITA gene encodes class II major histocompatibility complex transactivator.
  • CIITA controls expression of human leukocyte antigen class II genes.
  • Exemplary sequences of CIITA can be found, for example, at NCBI Gene ID 4261, NCBI Reference Sequence NG_009628.1, and Ensembl ID ENSG00000179583.
  • SEQ ID NOs: 609-792 comprise exemplary protospacer sequences of the CIITA gene.
  • the target gene comprises CISH gene.
  • the CISH gene encodes cytokine-inducible SH2-containing protein.
  • Exemplary sequences of CISH can be found, for example, at NCBI Gene ID 1154, NCBI Reference Sequence NG_023194.1, and Ensembl ID ENSG00000114737.
  • SEQ ID NOs: 793-833 comprise exemplary protospacer sequences of the CISH gene.
  • the target gene comprises CTLA4 gene.
  • the CTLA4 gene encodes cytotoxic T-lymphocyte-associated protein 4, which is a protein receptor that functions as an immune checkpoint and downregulates immune responses.
  • CTLA4 is also known as CD152.
  • Exemplary sequences of CTLA4 can be found, for example, at NCBI Gene ID 1493, NCBI Reference Sequence NG_011502.1, and Ensembl ID ENSG00000163599.
  • SEQ ID NOs: 834-885 comprise exemplary protospacer sequences of the CTLA4 gene.
  • the target gene comprises CXCR4 gene.
  • the CXCR4 gene encodes C-X-C chemokine receptor type 4.
  • CXCR4 is a chemokine receptor expressed on lymphocytes.
  • CXCR4 is also known as fusin or CD184.
  • Exemplary sequences of CXCR4 can be found, for example, at NCBI Gene ID 7852, NCBI Reference Sequence NG_011587.1, and Ensembl ID ENSG00000121966.
  • SEQ ID NOs: 886-1002 comprise exemplary protospacer sequences of the CXCR4 gene.
  • the target gene comprises GAPDH gene.
  • the GAPDH gene encodes glyceraldehyde 3-phosphate dehydrogenase.
  • Exemplary sequences of GAPDH can be found, for example, at NCBI Gene ID 2597, NCBI Reference Sequence NG_007073.2, and Ensembl ID ENSG00000111640.
  • the target gene comprises HBB gene.
  • the HBB gene encodes beta globin, which together with alpha globin makes up the most common form of hemoglobin in adult humans.
  • the HBB variant HbS causes sickle cell disease. Mutations in the HBB gene also cause the group of blood disorders known as beta thalassemias. HBB is also known as hemoglobin subunit beta.
  • Exemplary sequences of HBB can be found, for example, at NCBI Gene ID 3043, NCBI Reference Sequence NG_059281.1, and Ensembl ID ENSG00000244734.
  • SEQ ID NOs: 1003-1048 comprise exemplary protospacer sequences of the HBB gene.
  • the target gene comprises HEXB gene.
  • the HEXB gene encodes beta-hexosaminidase subunit beta, which forms the beta subunit of β-hexosaminidase.
  • Exemplary sequences of HEXB can be found, for example, at NCBI Gene ID 3074, NCBI Reference Sequence NG_009770.2, and Ensembl ID ENSG00000049860.
  • SEQ ID NOs: 1049-1246 comprise exemplary protospacer sequences of the HEXB gene.
  • the target gene comprises IL2RG gene.
  • the IL2RG gene encodes the common gamma chain, which is a cytokine receptor subunit common to several interleukin receptors, including IL-2R, IL-4R, IL-7R, IL-9R, and IL-15R.
  • IL2RG is also known as interleukin 2 receptor subunit gamma.
  • Exemplary sequences of IL2RG can be found, for example, at NCBI Gene ID 3561, NCBI Reference Sequence NG_009088.1, and Ensembl ID ENSG00000147168.
  • SEQ ID NOs: 1247-1353 comprise exemplary target sequences of the IL2RG gene.
  • the target gene comprises KLRG1 gene.
  • the KLRG1 gene encodes killer cell lectin-like receptor G1.
  • KLRG1 is preferentially expressed in NK cells.
  • Exemplary sequences of KLRG1 can be found, for example, at NCBI Gene ID 10219, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000139187.
  • SEQ ID NOs: 1354-1414 comprise exemplary protospacer sequences of the KLRG1 gene.
  • the target gene comprises KLRK1 gene.
  • the KLRK1 gene encodes killer cell lectin like receptor K1, which is expressed by NK cells.
  • KLRK1 is also known as NKG2D, KLR, and CD314. Exemplary sequences of KLRK1 can be found, for example, at NCBI Gene ID 22914, NCBI Reference Sequence NG_027762.1, and Ensembl ID ENSG00000213809.
  • the target gene comprises LAG3 gene.
  • the LAG3 gene encodes lymphocyte-activation gene 3.
  • LAG3 is a cell surface molecule with diverse effects on T cell function, including as an immune checkpoint receptor.
  • LAG3 is also known as CD223.
  • Exemplary sequences of LAG3 can be found, for example, at NCBI Gene ID 3902, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000089692.
  • SEQ ID NOs: 1415-1492 comprise exemplary protospacer sequences of the LAG3 gene.
  • the target gene comprises NKG2A gene.
  • the NKG2A gene encodes killer cell lectin like receptor C1, which is an inhibitory receptor expressed on NK cells.
  • NKG2A is also known as KLRC1 and CD159a.
  • Exemplary sequences of NKG2A can be found, for example, at NCBI Gene ID 3821, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000134545.
  • SEQ ID NOs: 1493-1635 comprise exemplary protospacer sequences of the NKG2A gene.
  • the target gene comprises PD-1 gene.
  • the PD-1 gene encodes programmed cell death 1, which is an immune-inhibitory receptor expressed in activated T cells.
  • PD-1 is also known as PD1, PDCD1, and CD279.
  • Exemplary sequences of PD1 can be found, for example, at NCBI Gene ID 5133, NCBI Reference Sequence NG_012110.1, and Ensembl ID ENSG00000188389.
  • SEQ ID NOs: 1636-1670 comprise exemplary protospacer sequences of the PD-1 gene.
  • the target gene comprises PD-L1 gene.
  • the PD-L1 gene encodes programmed death ligand 1, which is an immune inhibitory receptor ligand expressed by hematopoietic and non-hematopoietic cells, including T cells, B cells, and various types of tumor cells.
  • PD-L1 is also known as CD274, PDL1, or B7H1.
  • Exemplary sequences of PD-L1 can be found, for example, at NCBI Gene ID 29126, NCBI Reference Sequence NC_000009.12, and Ensembl ID ENSG00000120217.
  • the target gene comprises TGFBR2 gene.
  • the TGFBR2 gene encodes transforming growth factor beta receptor 2. Mutations in TGFBR2 have been associated with several diseases and conditions, including cancer. Exemplary sequences of TGFBR2 can be found, for example, at NCBI Gene ID 7048, NCBI Reference Sequence NG_007490.1, and Ensembl ID ENSG00000163513. SEQ ID NOs: 1671-1801 comprise exemplary protospacer sequences of the TGFBR2 gene.
  • the target gene comprises TIGIT gene.
  • the TIGIT gene encodes T cell immunoreceptor with Ig and ITIM domains.
  • TIGIT is an immune receptor found on T cells and natural killer cells.
  • Exemplary sequences of TIGIT can be found, for example, at NCBI Gene ID 201633, NCBI Reference Sequence NC_000003.12, and Ensembl ID ENSG00000181847.
  • SEQ ID NOs: 1802-1845 comprise exemplary protospacer sequences of the TIGIT gene.
  • the target gene comprises TIM3 gene.
  • the TIM3 gene encodes T-cell immunoglobulin and mucin-domain containing-3.
  • TIM3 is a cell surface protein expressed on T cells.
  • TIM3 is also known as HAVCR2 (hepatitis A virus receptor 2) .
  • Exemplary sequences of TIM3 can be found, for example, at NCBI Gene ID 84868, NCBI Reference Sequence NG_030444.1, and Ensembl ID ENSG00000135077.
  • SEQ ID NOs: 1846-1932 comprise exemplary protospacer sequences of the TIM3 gene.
  • the target gene comprises TRAC gene.
  • the TRAC gene encodes T cell receptor alpha constant, which is a component of the T cell receptor protein complex.
  • Exemplary sequences of TRAC can be found, for example, at NCBI Gene ID 28755, NCBI Reference Sequence NG_001332.3, and Ensembl ID ENSG00000277734.
  • SEQ ID NOs: 1933-2029 comprise exemplary protospacer sequences of the TRAC gene.
  • the target gene comprises TRBC2 gene.
  • TRBC2 encodes T cell receptor beta constant 2.
  • Exemplary sequences of TRBC2 can be found, for example, at NCBI Gene ID 28638, NCBI Reference Sequence NG_001333.2, and Ensembl ID ENSG00000211751.
  • the target gene comprises TRBC1.
  • TRBC1 encodes T cell receptor beta constant 1.
  • Exemplary sequences of TRBC1 can be found, for example, at NCBI Gene ID 28639, NCBI Reference Sequence NG_001333.2, and Ensembl ID ENSG00000211772.
  • the target gene comprises TRG gene.
  • TRG is the T cell receptor gamma locus. Exemplary sequences of TRG can be found, for example, at NCBI Gene ID 6965 and NCBI Reference Sequence NG_001336.2.
  • the target gene comprises TRD.
  • TRD is the T cell receptor delta locus. Exemplary sequences of TRD can be found, for example, at NCBI Gene ID 6964 and NCBI Reference Sequence NG_001332.3.
  • the guide sequence of the guide RNA of the disclosure is designed to hybridize to a target sequence on the target strand of a target dsDNA.
  • the guide sequence may be designed to be fully (100%) complementary to the target sequence but, in some embodiments, one or more mismatches (i.e., less than 100% complementary) may be tolerated for hybridization.
  • the sequence on the other side (the nontarget strand) of the target dsDNA corresponding to the target sequence is a protospacer sequence.
  • the protospacer sequence on the nontarget strand is fully complementary to the target sequence on the target strand unless a mutation is present in the protospacer sequence and/or the target sequence.
  • the guide sequence is identical to the protospacer sequence except for the U in the guide sequence due to its RNA nature and correspondingly the T in the protospacer sequence due to its DNA nature.
  • the symbol “t” is used to denote both T in DNA and U in RNA (see “Table 1: List of nucleotides symbols”, in which the definition of symbol “t” is “thymine in DNA/uracil in RNA (t/u)”).
  • a single SEQ ID NO in the sequence listing is used to denote both such guide sequence and protospacer sequence, although such a single SEQ ID NO may be marked as either DNA or RNA in the sequence listing.
  • when a reference is made to a SEQ ID NO that recites a protospacer/guide sequence, it refers to a protospacer sequence that is a DNA sequence or a guide sequence that is an RNA sequence, depending on the context.
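  • The relationship between a protospacer (DNA) and its guide sequence (RNA), and the notion of tolerated mismatches, can be sketched as follows. The function names and example sequences are hypothetical. The mismatch count assumes that the protospacer is fully complementary to the target sequence, so that a position where the guide differs from the protospacer is a position where the guide mismatches the target sequence.

```python
def protospacer_to_guide(protospacer_dna: str) -> str:
    """Write a DNA protospacer as the corresponding RNA guide sequence (T -> U)."""
    return protospacer_dna.upper().replace("T", "U")


def guide_to_protospacer(guide_rna: str) -> str:
    """Write an RNA guide sequence as the corresponding DNA protospacer (U -> T)."""
    return guide_rna.upper().replace("U", "T")


def count_mismatches(guide_rna: str, protospacer_dna: str) -> int:
    """Count positions at which the guide differs from the protospacer,
    treating U in the guide as equivalent to T in the protospacer."""
    guide_as_dna = guide_to_protospacer(guide_rna)
    protospacer = protospacer_dna.upper()
    if len(guide_as_dna) != len(protospacer):
        raise ValueError("guide and protospacer must have the same length")
    return sum(g != p for g, p in zip(guide_as_dna, protospacer))


# Made-up 20-nt protospacer (not a SEQ ID NO).
protospacer = "ATGCCGTTAGCTAGGCTAAC"
guide = protospacer_to_guide(protospacer)
print(guide)
print(count_mismatches(guide, protospacer))
```

  • A guide designed directly from the protospacer has zero mismatches; a guide with one altered position would report one.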
  • the protospacer sequence on the nontarget strand of the target dsDNA may be associated with a PAM (protospacer adjacent motif) adjacent to the protospacer sequence.
  • PAM is a short motif (short DNA sequence) that can be identified or recognized by a napDNAbp, e.g., Cas9, Cas12.
  • the PAM may be upstream (5’ to) or downstream (3’ to) of a protospacer sequence.
  • the PAM is immediately adjacent to the protospacer sequence or, for example, within a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides from the protospacer sequence.
  • the PAM is 5’ or 3’ to the protospacer sequence.
  • the PAM is immediately 5’ or 3’ to the protospacer sequence.
  • the PAM is immediately 3’ to the protospacer sequence for CRISPR-Cas9 systems
  • the PAM is immediately 5’ to the protospacer sequence for CRISPR-Cas12 systems.
  • Any sequence on the nontarget strand of a target dsDNA adjacent to a PAM may be a potential protospacer sequence for a system of the disclosure comprising a napDNAbp (e.g., Cas9, Cas12) capable of identifying and recognizing the PAM.
  • a protospacer sequence can be identified by querying a PAM on a target dsDNA (e.g., a gene) by a tool or algorithm in the art, and then the efficacy (e.g., on-target DNA editing activity, off-target DNA editing activity) of the system of the disclosure for the protospacer sequence can be evaluated by a method in the art or in the disclosure.
  • the PAM comprises, consists essentially of, or consists of sequence 5’-TTN-3’, wherein N is A, T, G, or C. In some embodiments, the PAM comprises, consists essentially of, or consists of sequence 5’-NGG-3’, wherein N is A, T, G, or C.
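  • Querying a PAM to enumerate candidate protospacers, as described above, can be sketched in Python. This is a minimal illustration and not a tool of the disclosure: it assumes a 5’-TTN-3’ PAM located immediately 5’ to the protospacer (the arrangement stated above for CRISPR-Cas12 systems), a default protospacer length of 20 nt, and hypothetical function names. Either strand of a dsDNA may serve as the non-target strand, so both are scanned.

```python
import re
from typing import Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_protospacers(dna: str, length: int = 20) -> Iterator[Tuple[str, int, str, str]]:
    """Yield (strand, pam_start, pam, protospacer) for each 5'-TTN-3' PAM that
    is immediately followed (3') by `length` nucleotides.

    `pam_start` is a 0-based index on the strand being scanned: '+' is the
    sequence as given, '-' is its reverse complement.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=(TT[ACGT])([ACGT]{{{length}}}))")
    forward = dna.upper()
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for match in pattern.finditer(seq):
            yield strand, match.start(), match.group(1), match.group(2)


# Made-up example: one TTA PAM followed by a 20-nt candidate protospacer.
example = "GG" + "TTA" + "ACGTACGTACGTACGTACGT" + "CC"
for hit in find_protospacers(example):
    print(hit)
```

  • Candidates found this way would still need to be evaluated for on-target and off-target editing activity, as noted above.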
  • the optimal length for a protospacer sequence may vary depending on the selection of a napDNAbp. For example, a length of at least 16 nt, and preferably 20 nt, would be favorable to a CRISPR-Cas12i system comprising xCas12i or its variant (e.g., Cas12Max, hfCas12Max).
  • Such an optimal length can be determined by a person skilled in the art based on the selection of a napDNAbp and a series of conventional experiments to evaluate the change of the intended effect (e.g., DNA cleavage activity) with the length.
  • the protospacer sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides. In some embodiments, the protospacer sequence is about 20 nucleotides in length.
  • the protospacer sequence comprises at least about 14 contiguous nucleotides of the nontarget strand of the target dsDNA (e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, or more contiguous nucleotides of the nontarget strand of the target gene, or in a numerical range between any of two preceding values, e.g., from about 14 to about 50 contiguous nucleotides of the target dsDNA) .
  • the protospacer sequence comprises, consists essentially of, or consists of 20 contiguous nucleotides of the nontarget strand of the target dsDNA (e.g
  • the protospacer sequence is immediately 5’ or 3’ to a PAM that comprises, consists essentially of, or consists of sequence 5’-TTN-3’, wherein N is A, T, G, or C.
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-2029 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1-2029.
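The text above does not state how sequence identity is computed. As a rough illustration only, the sketch below assumes an ungapped, position-by-position comparison of two equal-length sequences; alignment-based identity, as produced by standard alignment tools, may give different values. The function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences (an assumption)."""
    if len(a) != len(b):
        raise ValueError("this simplified sketch only compares equal-length sequences")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)
```

Under this simplification, a 20-nt sequence that differs from a reference at 5 positions is 75% identical to it.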
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence
  • the protospacer sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
  • the sequence of any one of SEQ ID NOs: 1-2029 is identified from the nontarget strand of the target gene by identifying a PAM as described herein on the nontarget strand of the target gene and selecting the sequence immediately 3’ to the PAM.
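The PAM-based selection described above can be sketched as follows. This is a minimal illustration, not the tool used in the disclosure: it assumes a 5’-TTN-3’ PAM, a fixed protospacer length of 20 nt, and a single input strand. The function name and parameters are hypothetical.

```python
import re


def find_protospacers(nontarget_strand: str, pam_regex: str = "TT[ACGT]", length: int = 20):
    """Return (pam_start, pam, protospacer) for every PAM with a full-length 3' flank.

    pam_start is a 0-based index into the input strand.
    """
    seq = nontarget_strand.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs (e.g., within "TTTT") are all reported.
    for match in re.finditer(f"(?=({pam_regex}))", seq):
        pam = match.group(1)
        start = match.start() + len(pam)  # first nucleotide immediately 3' to the PAM
        protospacer = seq[start:start + length]
        if len(protospacer) == length:  # skip PAMs too close to the end of the sequence
            hits.append((match.start(), pam, protospacer))
    return hits
```

Only the strand passed in is scanned; to cover both strands of a dsDNA target, the reverse complement would be scanned in a second call. Candidates found this way would still need to be evaluated for on-target and off-target activity.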
  • the target gene is A2AR
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-73 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-73.
  • the target gene is AAVS1
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79.
  • the target gene is B2M
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126
  • the target gene is BCL11A
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137.
  • the target gene is CCR5, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375.
  • the target gene is CD16a
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 521-573; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 521-573.
  • the target gene is CD52
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 574-591; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 574-591.
  • the target gene is CD7
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 592-608; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 592-608.
  • the target gene is CIITA
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610.
  • the target gene is CISH
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795.
  • the target gene is CTLA4, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842.
  • the target gene is CXCR4, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897.
  • the target gene is HBB
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011.
  • the target gene is HEXB
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1049-1246; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1049-1246.
  • the target gene is IL2RG
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253.
  • the target gene is KLRG1
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1354-1414; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1354-1414.
  • the target gene is LAG3, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418.
  • the target gene is NKG2A
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1493-1635; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1493-1635.
  • the target gene is PD1
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639.
  • the target gene is TGFBR2
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681.
  • the target gene is TIGIT
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809.
  • the target gene is TIM3, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847.
  • the target gene is TRAC
  • the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937.
  • the contiguous nucleotides comprised in the protospacer sequence are immediately 3’ to a PAM as described herein.
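The per-gene SEQ ID NO ranges listed above can be collected into a lookup table. The ranges below are copied from the text; the dictionary and function names are hypothetical and serve only as an illustration.

```python
# Target gene -> (first SEQ ID NO, last SEQ ID NO), as listed in the text above.
SEQ_ID_RANGES = {
    "A2AR": (1, 73),
    "AAVS1": (74, 79),
    "B2M": (80, 129),
    "BCL11A": (130, 368),
    "CCR5": (369, 520),
    "CD16a": (521, 573),
    "CD52": (574, 591),
    "CD7": (592, 608),
    "CIITA": (609, 792),
    "CISH": (793, 833),
    "CTLA4": (834, 885),
    "CXCR4": (886, 1002),
    "HBB": (1003, 1048),
    "HEXB": (1049, 1246),
    "IL2RG": (1247, 1353),
    "KLRG1": (1354, 1414),
    "LAG3": (1415, 1492),
    "NKG2A": (1493, 1635),
    "PD1": (1636, 1670),
    "TGFBR2": (1671, 1801),
    "TIGIT": (1802, 1845),
    "TIM3": (1846, 1932),
    "TRAC": (1933, 2029),
}


def target_gene_for(seq_id_no: int):
    """Return the target gene whose range contains seq_id_no, or None if there is none."""
    for gene, (first, last) in SEQ_ID_RANGES.items():
        if first <= seq_id_no <= last:
            return gene
    return None
```

The 23 ranges tile SEQ ID NOs 1 through 2029 without gaps or overlaps, so every protospacer/guide entry maps to exactly one target gene.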
  • the guide sequence of the guide RNA of the disclosure is designed to hybridize to a target sequence on the target strand of a target dsDNA.
  • the term “guide sequence” as used herein is interchangeable with “spacer sequence”.
  • the guide sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides. In some embodiments, the guide sequence is about 20 nucleotides in length.
  • the guide sequence is about 50%to about 100%, e.g., at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 100%, optionally about 100% (fully) , complementary to the target sequence. In some embodiments, the guide sequence is fully (100%) complementary to the target sequence.
  • the guide sequence contains no more than 1, 2, 3, 4, or 5 mismatches to the target sequence.
  • the guide sequence comprises no mismatch with the target sequence in the first 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides from the 5’ end of the guide sequence.
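The complementarity and mismatch criteria above can be checked with a short sketch. It assumes Watson-Crick pairing only (G-U wobble pairs count as mismatches), an RNA guide, a DNA target sequence of the same length written 5’ to 3’, and antiparallel hybridization. All names are hypothetical.

```python
# DNA base on the target strand -> RNA base that pairs with it (Watson-Crick only).
_PAIRING_RNA_BASE = {"A": "U", "C": "G", "G": "C", "T": "A"}


def mismatch_positions(guide_rna: str, target_dna: str):
    """Return 1-based positions, counted from the guide's 5' end, that do not pair."""
    guide = guide_rna.upper()
    target = target_dna.upper()
    if len(guide) != len(target):
        raise ValueError("this sketch compares a guide and target of equal length")
    # Hybridization is antiparallel: guide position 1 pairs with the target's 3'-most base.
    return [
        i
        for i, (g, t) in enumerate(zip(guide, reversed(target)), start=1)
        if _PAIRING_RNA_BASE[t] != g
    ]


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percentage of guide positions that pair with the target sequence."""
    mismatches = len(mismatch_positions(guide_rna, target_dna))
    return 100.0 * (len(guide_rna) - mismatches) / len(guide_rna)


def mismatch_free_prefix(guide_rna: str, target_dna: str, n: int) -> bool:
    """True if the first n nucleotides from the guide's 5' end contain no mismatch."""
    return all(pos > n for pos in mismatch_positions(guide_rna, target_dna))
```

For a 20-nt guide with a single mismatch at its 3’-most position, this gives 95% complementarity and a mismatch-free 5’ prefix of 19 nucleotides.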
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-2029 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1-2029.
  • the guide sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence, the guide sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
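The "nucleotide 1 through nucleotide 14 ... and so on" enumeration above covers every contiguous stretch of at least 14 nucleotides within an indicated sequence. A minimal sketch of that enumeration follows, using 1-based inclusive coordinates to match the text; the function name is hypothetical.

```python
def contiguous_windows(sequence: str, min_length: int = 14):
    """Yield (first_nt, last_nt, subsequence) for every window of at least min_length nt."""
    n = len(sequence)
    for length in range(min_length, n + 1):
        for start in range(0, n - length + 1):
            # Convert the 0-based slice to 1-based inclusive nucleotide coordinates.
            yield start + 1, start + length, sequence[start:start + length]
```

For a 20-nt sequence and a 14-nt minimum this yields 28 windows, from nucleotides 1 through 14 up to the full-length 1 through 20.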
  • the target gene is A2AR
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-73 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-73.
  • the target gene is AAVS1
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79.
  • the target gene is B2M
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and
  • the target gene is BCL11A
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137.
  • the target gene is CCR5
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375.
  • the target gene is CD16a
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 521-573; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 521-573.
  • the target gene is CD52
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 574-591; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 574-591.
  • the target gene is CD7
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 592-608; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 592-608.
  • the target gene is CIITA
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610.
  • the target gene is CISH
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795.
  • the target gene is CTLA4, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842.
  • the target gene is CXCR4, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897.
  • the target gene is HBB
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011.
  • the target gene is HEXB
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1049-1246; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1049-1246.
  • the target gene is IL2RG
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253.
  • the target gene is KLRG1
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1354-1414; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1354-1414.
  • the target gene is LAG3, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418.
  • the target gene is NKG2A
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1493-1635; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1493-1635.
  • the target gene is PD1
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639.
  • the target gene is TGFBR2
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681.
  • the target gene is TIGIT
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809.
  • the target gene is TIM3, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847.
  • the target gene is TRAC
  • the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937.
  • the scaffold sequence is a direct repeat (DR) sequence, e.g., a DR sequence for CRISPR-Cas12 systems.
  • the scaffold sequence is a scaffold sequence comprising a tracr mate sequence fused to a tracr sequence with or without a linker, e.g., such a scaffold sequence for CRISPR-Cas9 system.
  • the scaffold sequence of the guide RNA of the disclosure serves as a binding site to which a napDNAbp of the disclosure can be bound to complex with the guide RNA to form an RNA-protein complex, which is guided by the guide RNA to a target sequence of a target DNA through the hybridization of the guide sequence of the guide RNA to the target sequence.
  • any scaffold sequence that can mediate the binding or complexing of the napDNAbp to the guide RNA can be used in the disclosure. If a napDNAbp is selected, the scaffold sequence can be determined accordingly. For example, if a Cas12i polypeptide is selected as the napDNAbp, a scaffold sequence that can mediate the binding or complexing of the Cas12i polypeptide (a scaffold sequence corresponding to the Cas12i polypeptide) to the guide RNA comprising the scaffold sequence and a guide sequence can be selected accordingly for use in combination with the Cas12i polypeptide.
  • the scaffold sequence is a direct repeat sequence corresponding to xCas12i or its mutants (e.g., Cas12Max, hfCas12Max) .
  • the scaffold sequence (1) is as set forth in SEQ ID NO: 2032 or 2033; (2) comprises the sequence of SEQ ID NO: 2032 or 2033; (3) comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of SEQ ID NO: 2032 or 2033; or (4) comprises at least about 14 (e.g., at least about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30) contiguous nucleot
  • the scaffold sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence, the scaffold sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
  • the secondary structure of the scaffold sequence plays a role in its binding with a napDNAbp, and a change of one or more nucleotides (e.g., addition, deletion, substitution) in the scaffold sequence may be tolerated and may not significantly affect the functionality of the scaffold sequence as long as the secondary structure of the scaffold sequence is retained.
  • the scaffold sequence has substantially the same secondary structure as the secondary structure of SEQ ID NO: 2032 or 2033.
  • the DR sequence 2 as set forth in SEQ ID NO: 2033 is an N-terminal truncation of DR sequence 1 as set forth in SEQ ID NO: 2032 (30 nt), and both were demonstrated to work with xCas12i and its mutants (e.g., Cas12Max, hfCas12Max).
  • the scaffold sequence comprises the stem-loop structure of the secondary structure of SEQ ID NO: 2032 or 2033.
  • the scaffold sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides.
  • the guide RNA of the disclosure comprises (1) a scaffold sequence of the disclosure, (2) a guide sequence of the disclosure, and (3) a scaffold sequence of the disclosure, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
  • the guide RNA of the disclosure comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to a sequence comprising (1) a scaffold sequence of the disclosure, (2) a guide sequence of the disclosure, and (3) a scaffold sequence of the disclosure, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
  • the guide RNA of the disclosure comprises (1) a scaffold sequence of SEQ ID NO: 2032 or 2033, (2) a guide sequence of any one of SEQ ID NOs: 1-2029, and (3) a scaffold sequence of SEQ ID NO: 2032 or 2033, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
  • the guide RNA of the disclosure comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to a sequence comprising (1) a scaffold sequence of SEQ ID NO: 2032 or 2033, (2) a guide sequence of any one of SEQ ID NOs: 1-2029, and (3) a scaffold sequence of SEQ ID NO: 2032 or 2033, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
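The 5’ to 3’ arrangement of scaffold, guide, and scaffold described above can be sketched as a simple concatenation. The actual sequences of SEQ ID NO: 2032 and 2033 are not reproduced in this passage, so the example uses a run of N as an explicit placeholder; the placeholder, its length, and the function name are assumptions for illustration only.

```python
# Placeholder only: NOT the sequence of SEQ ID NO: 2032 or 2033; its length is arbitrary.
PLACEHOLDER_SCAFFOLD = "N" * 10


def assemble_guide_rna(scaffold_5p: str, guide: str, scaffold_3p: str) -> str:
    """Concatenate (1) scaffold, (2) guide, and (3) scaffold in 5' to 3' direction."""
    parts = [scaffold_5p.upper(), guide.upper(), scaffold_3p.upper()]
    for part in parts:
        # N is accepted only so that placeholder scaffolds can be used in this sketch.
        if set(part) - set("ACGUN"):
            raise ValueError("expected RNA sequences (A, C, G, U; N for placeholders)")
    if len(parts[1]) < 14:
        raise ValueError("the guide sequence is at least about 14 nucleotides in length")
    return "".join(parts)
```

With a real scaffold substituted for the placeholder, the same call would produce the scaffold-guide-scaffold arrangement described in the text.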
  • the guide RNA (1) is as set forth in any one of SEQ ID NOs: 2045-2051; (2) comprises the sequence of any one of SEQ ID NOs: 2045-2051; or (3) comprises a sequence having a sequence identity of at least about 60%(e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of any one of SEQ ID NOs: 2045-2051.
  • the guide RNA of the disclosure comprises a modification as described herein.
  • the sequence (e.g., a guide RNA, an mRNA encoding a napDNAbp) of the disclosure may include one or more covalent modifications with respect to a reference sequence, in particular the parent polyribonucleotide, which are included within the disclosure.
  • the sequence is a modified guide RNA.
  • the sequence is a modified mRNA encoding a napDNAbp of the disclosure.
  • Exemplary modifications can include any modification to the sugar, the nucleobase, the internucleoside linkage (e.g., to a linking phosphate/to a phosphodiester linkage/to the phosphodiester backbone) , and any combination thereof.
  • Some of the exemplary modifications provided herein are described in detail below.
  • the sequence may include any useful modification, such as to the sugar, the nucleobase, or the internucleoside linkage (e.g., to a linking phosphate/to a phosphodiester linkage/to the phosphodiester backbone) .
  • One or more atoms of a pyrimidine nucleobase may be replaced or substituted with optionally substituted amino, optionally substituted thiol, optionally substituted alkyl (e.g., methyl or ethyl) , or halo (e.g., chloro or fluoro) .
  • RNAs ribonucleic acids
  • DNAs deoxyribonucleic acids
  • TNAs threose nucleic acids
  • GNAs glycol nucleic acids
  • PNAs peptide nucleic acids
  • LNAs locked nucleic acids
  • the modification may include a chemical or cellular induced modification.
  • RNA modifications are described by Lewis and Pan in “RNA modifications and structures cooperate to guide RNA-protein interactions” from Nat Reviews Mol Cell Biol, 2017, 18: 202-210.
  • Different sugar modifications, nucleotide modifications, and/or internucleoside linkages may exist at various positions in the sequence.
  • nucleotide analogs or other modification may be located at any position (s) of the sequence, such that the function of the sequence is not substantially decreased.
  • the sequence may include from about 1% to about 100% modified nucleotides (either in relation to overall nucleotide content, or in relation to one or more types of nucleotide, i.e., any one or more of A, G, U/T, or C) or any intervening percentage (e.g., from 1% to 20%, from 1% to 25%, from 1% to 50%, from 1% to 60%, from 1% to 70%, from 1% to 80%, from 1% to 90%, from 1% to 95%, from 10% to 20%, from 10% to 25%, from 10% to 50%, from 10% to 60%, from 10% to 70%, from 10% to 80%, from 10% to 90%, from 10% to 95%, from 10% to 100%, from 20% to 25%, from 20% to 50%, from 20% to 60%, from 20% to 70%, from 20% to 80%, from 20% to 90%, from 20% to 95%, from 20% to 100%, from 50% to 60%, from 50% to 70%, from 50% to 80%, from 50% to 90%, from 20% to 95%, from 20% to 100%, from 50% to
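  • As a minimal sketch of how the percentage of modified nucleotides might be computed, either over the whole sequence or for one nucleotide type, consider the following. The representation of a sequence as `(base, is_modified)` pairs and the function name are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def percent_modified(nucleotides: List[Tuple[str, bool]],
                     base: Optional[str] = None) -> float:
    """Percent of modified nucleotides, overall or restricted to one base type."""
    # Keep every position when no base is given, otherwise only that base type.
    selected = [(b, m) for b, m in nucleotides if base is None or b == base]
    if not selected:
        raise ValueError("no nucleotides of the requested type")
    return 100.0 * sum(m for _, m in selected) / len(selected)
```

  • For a four-nucleotide toy sequence in which one A and one of two U residues are modified, the overall value and the U-specific value are both 50%, while the A-specific value is 100%.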
  • sugar modifications e.g., at the 2’ position or 4’ position
  • replacement of the sugar at one or more ribonucleotides of the sequence may, as well as backbone modifications, include modification or replacement of the phosphodiester linkages.
  • Specific examples of a sequence include, but are not limited to, sequences including modified backbones or no natural internucleoside linkages such as internucleoside modifications, including modification or replacement of the phosphodiester linkages.
  • Sequences having modified backbones include, among others, those that do not have a phosphorus atom in the backbone.
  • modified RNAs that do not have a phosphorus atom in their internucleoside backbone can also be considered to be oligonucleosides.
  • a sequence will include ribonucleotides with a phosphorus atom in its internucleoside backbone.
  • Modified sequence backbones may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3’-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates such as 3’-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3’-5’ linkages, 2’-5’ linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3’-5’ to 5’-3’ or 2’-5’ to 5’-2’.
  • Various salts, mixed salts and free acid forms are also included.
  • the sequence may be negatively or positively charged.
  • the modified nucleotides which may be incorporated into the sequence, can be modified on the internucleoside linkage (e.g., phosphate backbone) .
  • the phrases “phosphate” and “phosphodiester” are used interchangeably.
  • Backbone phosphate groups can be modified by replacing one or more of the oxygen atoms with a different substituent.
  • the modified nucleosides and nucleotides can include the wholesale replacement of an unmodified phosphate moiety with another internucleoside linkage as described herein.
  • modified phosphate groups include, but are not limited to, phosphorothioate, phosphoroselenates, boranophosphates, boranophosphate esters, hydrogen phosphonates, phosphoramidates, phosphorodiamidates, alkyl or aryl phosphonates, and phosphotriesters.
  • Phosphorodithioates have both non-linking oxygens replaced by sulfur.
  • the phosphate linker can also be modified by the replacement of a linking oxygen with nitrogen (bridged phosphoramidates) , sulfur (bridged phosphorothioates) , and carbon (bridged methylene-phosphonates) .
  • the α-thio substituted phosphate moiety is provided to confer stability to RNA and DNA polymers through the unnatural phosphorothioate backbone linkages. Phosphorothioate DNA and RNA have increased nuclease resistance and subsequently a longer half-life in a cellular environment.
  • a modified nucleoside includes an alpha-thio-nucleoside (e.g., 5’-O-(1-thiophosphate)-adenosine, 5’-O-(1-thiophosphate)-cytidine (α-thio-cytidine), 5’-O-(1-thiophosphate)-guanosine, 5’-O-(1-thiophosphate)-uridine, or 5’-O-(1-thiophosphate)-pseudouridine).
  • internucleoside linkages that may be employed according to the disclosure, including internucleoside linkages which do not contain a phosphorus atom, are described herein.
  • the sequence may include one or more cytotoxic nucleosides.
  • cytotoxic nucleosides may be incorporated into the sequence, such as a bifunctional modification.
  • Cytotoxic nucleosides may include, but are not limited to, adenosine arabinoside, 5-azacytidine, 4’-thio-aracytidine, cyclopentenylcytosine, cladribine, clofarabine, cytarabine, cytosine arabinoside, 1-(2-C-cyano-2-deoxy-beta-D-arabino-pentofuranosyl)-cytosine, decitabine, 5-fluorouracil, fludarabine, floxuridine, gemcitabine, a combination of tegafur and uracil, tegafur ((RS)-5-fluoro-1-(tetrahydrofuran-2-yl)pyrimidine-2,4(1H,3H)-dione),
  • Additional examples include fludarabine phosphate, N4-behenoyl-1-beta-D-arabinofuranosylcytosine, N4-octadecyl-1-beta-D-arabinofuranosylcytosine, N4-palmitoyl-1- (2-C-cyano-2-deoxy-beta-D-arabino-pentofuranosyl) cytosine, and P-4055 (cytarabine 5’-elaidic acid ester) .
  • the sequence includes one or more post-transcriptional modifications (e.g., capping, cleavage, polyadenylation, splicing, poly-A sequence, methylation, acylation, phosphorylation, methylation of lysine and arginine residues, acetylation, and nitrosylation of thiol groups and tyrosine residues, etc.).
  • the one or more post-transcriptional modifications can be any post-transcriptional modification, such as any of the more than one hundred different nucleoside modifications that have been identified in RNA (Rozenski, J, Crain, P, and McCloskey, J. (1999) .
  • the sequence comprises at least one nucleoside selected from the group consisting of pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseud
  • the sequence comprises at least one nucleoside selected from the group consisting of 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebul
  • the sequence comprises at least one nucleoside selected from the group consisting of 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine
  • the sequence comprises at least one nucleoside selected from the group consisting of inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methylinosine, 6-methoxy-guanosine, 1-methylguanosine, N2-methylguanosine, N2,N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, and N2,N2-dimethyl-6-thio-guanosine.
  • the sequence may or may not be uniformly modified along the entire length of the sequence.
  • nucleotides e.g., naturally-occurring nucleotides, purine or pyrimidine, or any one or more or all of A, G, U, C, I, and pU
  • the sequence includes a pseudouridine.
  • the sequence includes an inosine, which may aid in the immune system characterizing the sequence as endogenous versus viral RNAs. The incorporation of inosine may also mediate improved RNA stability and/or reduced degradation. See, for example, Yu, Z. et al. (2015) RNA editing by ADAR1 marks dsRNA as “self”. Cell Res. 25, 1283-1284, which is incorporated herein by reference in its entirety.
  • the guide RNA comprises at its 3’ end a polyU tail.
  • the polyU tail comprises four to seven uracils, and optionally four uracils.
  • the guide RNA comprises a 2’-O-methyl and phosphorothioate modification.
  • the 2’-O-methyl and phosphorothioate modification is located at one or more of the nucleotides of the guide RNA, e.g., at the first three nucleotides at the 5’-end of the guide RNA, and/or at each U of the polyU tail.
  • the guide RNA comprises at its 3’ end a 3’ modified poly U tail containing a 2’-O-methyl and phosphorothioate modification.
  • the 3’ modified poly U tail comprises three, four, five, six, seven, or more uracils.
  • the 3’ modified poly U tail comprises a 2’-O-methyl and phosphorothioate modification on its first two, three, four, five, six, seven, or more 5’ end uracils.
  • the 3’ modified poly U tail comprises a 2’-O-methyl and phosphorothioate modification on each of its first two, three, four, five, six, seven, or more 5’ end uracils.
  • the 3’ modified poly U tail consists of four uracils with 2’-O-methyl and phosphorothioate modifications on each of its first three 5’ end uracils.
  • the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification. In some embodiments, the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on one, two, or three nucleotides of the first three nucleotides. In some embodiments, the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on each of the first three nucleotides.
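  • The end-modification pattern described above (the first three 5’ nucleotides, plus the first three uracils of a four-uracil 3’ tail) can be sketched as a per-nucleotide annotation. The token notation used here, an `m` prefix for 2’-O-methyl and a `*` suffix for a phosphorothioate linkage following the nucleotide, is a shorthand assumed for this sketch, and the function name and defaults are hypothetical.

```python
from typing import List


def annotate_guide(core: str, tail_len: int = 4,
                   n_5p: int = 3, n_tail: int = 3) -> List[str]:
    """Return one token per nucleotide for core + polyU tail with end modifications."""
    if len(core) < n_5p:
        raise ValueError("core is shorter than the number of 5' modified positions")
    if not 0 <= n_tail <= tail_len:
        raise ValueError("cannot modify more uracils than the tail contains")
    seq = core.upper() + "U" * tail_len
    # Modified positions: the first n_5p nucleotides and the first n_tail tail uracils.
    modified = set(range(n_5p)) | set(range(len(core), len(core) + n_tail))
    return [f"m{nt}*" if i in modified else nt for i, nt in enumerate(seq)]
```

  • With the defaults, an eight-nucleotide toy core `ACGUACGU` yields twelve tokens, of which the first three and the ninth through eleventh carry both modifications, and the final uracil is unmodified.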
  • the disclosure provides a system or composition comprising: (1) a nucleic acid programmable DNA binding protein (napDNAbp) or a polynucleotide encoding the napDNAbp; and (2) the guide RNA of the disclosure, or a polynucleotide encoding the guide RNA.
  • the polynucleotide encoding the napDNAbp and/or the polynucleotide encoding the guide RNA is a DNA or an RNA.
  • the system comprises two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) guide RNAs of the disclosure.
  • the system or composition of the disclosure comprises a first guide RNA and a second guide RNA, wherein the first guide RNA comprises a guide sequence as set forth in SEQ ID NO: 1933, 1934, or 1935, and the second guide RNA comprises a guide sequence as set forth in SEQ ID NO: 80, 90, or 117.
  • the system or composition of the disclosure comprises a first guide RNA and a second guide RNA, wherein the first guide RNA comprises a guide sequence as set forth in SEQ ID NO: 1934, and the second guide RNA comprises a guide sequence as set forth in SEQ ID NO: 117.
  • the disclosure provides a complex comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) complexed or bound with (2) the guide RNA of the disclosure.
  • the guide RNA and the napDNAbp complex or bind to each other in a molar ratio of about 1: 1.
  • the complex comprising the napDNAbp and the guide RNA binds to a target sequence.
  • the complex further comprises a target sequence on a target strand of the target gene; optionally hybridized to the guide sequence of the guide RNA.
  • the complex comprising the napDNAbp and the guide RNA binds to a target sequence at a molar ratio of about 1: 1.
  • the complex comprises enzymatic activity, such as nuclease activity, that can cleave the target sequence.
  • the guide RNA, the napDNAbp, and the target sequence either alone or together, do not naturally occur.
  • Cas12i polypeptides as exemplary napDNAbp herein are smaller than other nucleases.
  • Cas12f polypeptides are even smaller.
  • xCas12i is 1,080 amino acids in length
  • Cas12i2 is 1,054 amino acids in length
  • S. pyogenes Cas9 is 1,368 amino acids in length, S. thermophilus Cas9 (StCas9) is 1,128 amino acids in length, FnCpf1 is 1,300 amino acids in length, AsCpf1 is 1,307 amino acids in length, and LbCpf1 is 1,246 amino acids in length.
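  • To make the size comparison concrete, the stated polypeptide lengths can be converted to approximate coding-sequence lengths and ranked. This is an illustrative sketch only: the amino acid counts are those given above, while the helper names and the choice to count one stop codon are assumptions, and regulatory elements, tags, and localization signals are ignored.

```python
# Lengths in amino acids, as stated in the text above.
PROTEIN_LENGTHS_AA = {
    "xCas12i": 1080,
    "Cas12i2": 1054,
    "SpCas9": 1368,
    "StCas9": 1128,
    "FnCpf1": 1300,
    "AsCpf1": 1307,
    "LbCpf1": 1246,
}


def coding_length_nt(length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a polypeptide (3 per codon, optional stop)."""
    codons = length_aa + (1 if include_stop else 0)
    return 3 * codons


def rank_by_size(lengths: dict) -> list:
    """Return (name, amino acids, coding nucleotides) tuples, smallest first."""
    rows = [(name, aa, coding_length_nt(aa)) for name, aa in lengths.items()]
    return sorted(rows, key=lambda row: row[1])
```

  • Ranking the table places Cas12i2 and xCas12i ahead of the Cas9 and Cpf1 entries, consistent with the statement that Cas12i polypeptides are smaller.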
  • Cas12i guide RNAs which do not require a trans-activating CRISPR RNA (tracrRNA) , are also smaller than Cas9 guide RNAs. The smaller Cas12i polypeptide and guide RNA sizes are beneficial for delivery.
  • Systems comprising a Cas12i polypeptide also demonstrate decreased off-target activity compared to systems comprising an SpCas9 polypeptide. See PCT/US2021/025257, which is incorporated by reference in its entirety.
  • indels induced by systems comprising a Cas12i polypeptide differ from indels induced by systems comprising an SpCas9 polypeptide.
  • SpCas9 polypeptides primarily induce insertions and deletions of 1 nucleotide in length.
  • Cas12i polypeptides induce larger deletions, which can be beneficial in disrupting a larger portion of a target gene.
  • the system or composition further comprises a donor sequence (e.g., encoding a chimeric antigen receptor (CAR) ) for insertion into the genome of the target cell, e.g., by homologous recombination.
  • the donor sequence is a donor DNA.
  • the donor sequence may be inserted at the site where the genome is modified by the system or composition of the disclosure, or at a site irrelevant to where the genome is modified by the system or composition of the disclosure.
  • the donor sequence encodes a chimeric antigen receptor (CAR) .
  • the chimeric antigen receptor (CAR) comprises an scFv targeting an antigen on a B cell, a T cell, or a tumor cell, such as CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA), ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII), HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, or EpCAM.
  • the donor sequence encodes or comprises a sequence of a wild type gene or a non-disease-causing version of a gene.
  • the insertion of the donor sequence results in introduction of a wildtype or non-disease-causing version of a gene to the target cell.
  • the insertion of the donor sequence results in introduction of a selection marker or a reporter protein to the target cell.
  • the insertion of the donor sequence results in knock-in of a gene in the target cell.
  • the insertion of the donor sequence results in a knockout mutation in the target cell.
  • the insertion of the donor sequence results in a substitution mutation, such as a single nucleotide substitution, in the target cell.
  • the insertion induces a phenotypic change to the target cell.
  • the guide RNA of the disclosure can work with any proper nucleic acid programmable DNA binding protein (napDNAbp) that can identify or recognize the PAM adjacent to the protospacer sequence and can be guided by the guide RNA to the target DNA.
  • the napDNAbp can be a Cas12i polypeptide that can identify or recognize the PAM 5’-TTN-3’ immediately 5’ to the protospacer sequence and be guided by the guide RNA to the target DNA.
  • the napDNAbp is capable of identifying or recognizing a PAM comprising, consisting essentially of, or consisting of sequence 5’-TTN-3’, and N is A, T, G, or C, e.g., Cas12i polypeptides, Cas12f polypeptides.
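  • The PAM requirement described above, a 5’-TTN-3’ motif immediately 5’ to the protospacer, can be illustrated with a simple scan of a DNA string. This is a sketch under stated assumptions: the function name is hypothetical, the default protospacer length of 20 nucleotides is a placeholder rather than a value taken from the disclosure, and only the strand supplied is scanned (the reverse complement would need to be scanned separately).

```python
from typing import List, Tuple


def find_protospacers(dna: str, protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """List (PAM start index, PAM, protospacer) for each 5'-TTN-3' PAM found."""
    dna = dna.upper()
    hits = []
    # Stop early enough that a full protospacer fits 3' of the PAM.
    for i in range(len(dna) - 3 - protospacer_len + 1):
        pam = dna[i:i + 3]
        if pam[:2] == "TT" and pam[2] in "ACGT":
            hits.append((i, pam, dna[i + 3:i + 3 + protospacer_len]))
    return hits
```

  • For example, scanning the toy string `GGTTAACGTACC` with `protospacer_len=5` reports one site: the PAM `TTA` at index 2 followed by the protospacer `ACGTA`.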
  • the napDNAbp is a CRISPR-associated protein (Cas) .
  • the napDNAbp is an IscB polypeptide, which is not a Cas.
  • the napDNAbp is an IsrB polypeptide, which is not a Cas.
  • the napDNAbp is a Class 2, Type II CRISPR-associated protein (Cas9) , e.g., spCas9, saCas9.
  • the napDNAbp is a Class 2, Type V CRISPR-associated protein (Cas12) .
  • the Cas12 is a Cas12i, Cas12a (Cpf1) , Cas12b (C2c1) , Cas12c (C2c3) , Cas12d (CasY) , Cas12e (CasX) , Cas12f (Cas14) , or Cas12k (C2c10, C2C7) polypeptide.
  • the Cas is a Cas12i polypeptide.
  • the Cas12i is any Cas12i polypeptide in any of the patent applications CN202111290670.8, US17/819,795, CN202111289092.6, CN202210081981.1, PCT/CN2022/089074, PCT/CN2022/129376, PCT/CN2023/073420, the disclosures of which are incorporated herein by reference in their entirety.
  • the Cas12i is xCas12i or any mutant or variant thereof, e.g., Cas12Max, hfCas12Max.
  • the Cas12i is a mutant or variant of xCas12i with increased on-target dsDNA cleavage activity, such as, Cas12Max (SEQ ID NO: 2031) .
  • the Cas12i is a mutant or variant of xCas12i with decreased off-target dsDNA cleavage activity.
  • the Cas is a mutant or variant of xCas12i with both increased on-target dsDNA cleavage activity and decreased off-target dsDNA cleavage activity, such as, hfCas12Max (SEQ ID NO: 2044) .
  • the Cas12i is any Cas12i polypeptide in any of the patent applications PCT/US2019/022375, US16/680104, US17/020414, US17/020215, US17/139678, US17/497725, US17/055719, US16/862261, US17/260791, US17/506627, US17/829692, US17/435563, US17/619165, US17/626072, US17/638065, US17/634461, US17/641523, US17/505578, US17/782254, US17/830212, US17/831852, US17/832114, US17/832038, US17/814318, the disclosures of which are incorporated herein by reference in their entirety.
  • the Cas12i is Cas12i1, Cas12i2, Cas12i3, Cas12i4, or any mutant or variant thereof, e.g., a mutant or variant thereof with increased on-target dsDNA cleavage activity and/or decreased off-target dsDNA cleavage activity.
  • the Cas12i (1) is as set forth in SEQ ID NO: 2030 (xCas12i, also known as SiCas12i) , 2031 (Cas12Max) , or 2044 (hfCas12Max) ; (2) comprises the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044; or (3) comprises an amino acid sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044.
  • the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) nuclear localization signal (NLS) . In some embodiments, the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) nuclear export signal (NES) . In some embodiments, the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) NLS and at least one (e.g., two, three, four, five, six, or more) NES.
  • the NLS comprises or is SV40 NLS (SEQ ID NO: 2038) , bpSV40 NLS (BP NLS, bpNLS) , or NP NLS (Xenopus laevis Nucleoplasmin NLS, nucleoplasmin NLS, SEQ ID NO: 2039) .
  • the napDNAbp can be self-inactivating. See, for example, Epstein et al., “Engineering a Self-Inactivating CRISPR System for AAV Vectors, ” Mol. Ther., 24 (2016) : S50, which is incorporated by reference in its entirety.
  • changes to the napDNAbp may be one or more amino acid changes
  • changes to the napDNAbp may also be of a substantive nature, such as fusion of polypeptides as amino- and/or carboxyl-terminal extensions.
  • the napDNAbp may contain additional peptides, e.g., one or more peptides.
  • additional peptides may include epitope peptides for labelling, such as a polyhistidine tag (His-tag) , Myc, and FLAG.
  • the napDNAbp can be fused to a detectable moiety such as a fluorescent protein (e.g., green fluorescent protein (GFP) or yellow fluorescent protein (YFP) ) .
  • the disclosure provides a polynucleotide encoding the guide RNA of the disclosure.
  • the polynucleotide further comprises a polynucleotide encoding the napDNAbp of the disclosure.
  • the polynucleotide is a DNA or an RNA. In some embodiments, one or more of the nucleotides of the polynucleotide is modified.
  • the polynucleotide can be codon-optimized for use in a particular host cell or organism.
  • the nucleic acid can be codon-optimized for any non-human eukaryote including mice, rats, rabbits, dogs, livestock, or non-human primates. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura et al. Nucl. Acids Res. 28: 292 (2000), which is incorporated herein by reference in its entirety.
  • the polynucleotide is codon optimized for expression in eukaryotic (e.g., mammalian, such as, human) cells.
  • the systems or complexes of the disclosure may be formulated, for example, including a carrier, such as a carrier and/or a polymeric carrier, e.g., a liposome, and delivered by known methods to a cell (e.g., a prokaryotic, eukaryotic, plant, mammalian, etc. ) .
  • transfection e.g., lipid-mediated, cationic polymers, calcium phosphate, dendrimers
  • electroporation or other methods of membrane disruption e.g., nucleofection
  • viral delivery e.g., lentivirus, retrovirus, adenovirus, AAV
  • microinjection
  • microprojectile bombardment “gene gun”
  • FuGENE, direct sonic loading, cell squeezing, optical transfection, protoplast fusion, impalefection, magnetofection, exosome-mediated transfer, lipid nanoparticle-mediated transfer, and any combination thereof.
  • the method comprises delivering one or more nucleic acids (e.g., nucleic acids encoding the napDNAbp, guide RNA, donor DNA, etc. ) , one or more transcripts thereof, and/or a pre-formed guide RNA/napDNAbp complex to a cell, where a ternary complex is formed.
  • Exemplary intracellular delivery methods include, but are not limited to: viruses or virus-like agents; chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine) ; non-chemical methods, such as microinjection, electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, bacterial conjugation, delivery of plasmids or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet assisted transfection, particle bombardment; and hybrid methods, such as nucleofection.
  • the disclosure further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells.
  • the napDNAbp and the guide RNA are delivered together.
  • the napDNAbp and the guide RNA are packaged together in a single AAV particle.
  • the napDNAbp and the guide RNA are delivered together via lipid nanoparticles (LNPs) .
  • the napDNAbp and the guide RNA are delivered separately.
  • the napDNAbp and the guide RNA are packaged into separate AAV particles.
  • the napDNAbp is delivered by a first delivery mechanism and the guide RNA is delivered by a second delivery mechanism.
  • the disclosure provides a vector comprising the polynucleotide of the disclosure.
  • the polynucleotide encoding the guide RNA is operably linked to and under the regulation of a promoter. In some embodiments, the polynucleotide encoding the napDNAbp is operably linked to and under the regulation of a promoter. In some embodiments, the polynucleotide encoding the guide RNA and the polynucleotide encoding the napDNAbp are operably linked to and under the regulation of the same promoter. In some embodiments, the polynucleotide encoding the guide RNA and the polynucleotide encoding the napDNAbp are each operably linked to and under the regulation of a promoter.
  • the promoter is selected from the group consisting of a ubiquitous promoter, a tissue-specific promoter, a cell-type specific promoter, a constitutive promoter, and an inducible promoter.
  • the promoter comprises or is a promoter selected from the group consisting of: a (human) U6 promoter (e.g., SEQ ID NO: 2034), a CBh promoter (e.g., SEQ ID NO: 2035), an elongation factor 1α short (EFS) promoter, a (human) Cbh promoter, a MHCK7 promoter, a Cba promoter, a pol I promoter, a pol II promoter, a pol III promoter, a T7 promoter, a H1 promoter, a retroviral Rous sarcoma virus LTR promoter, a (human) cytomegalovirus (CMV) promoter (e.g., SEQ ID NO: 2042
  • the polynucleotide comprises a Kozak sequence (e.g., SEQ ID NO: 2036) .
  • the polynucleotide comprises a bGH polyA coding sequence (e.g., SEQ ID NO: 2040) .
  • the polynucleotide comprises a CMV enhancer (e.g., SEQ ID NO: 2041) .
  • the polynucleotide encoding the napDNAbp is 5' or 3' to the polynucleotide encoding the guide RNA.
  • the vector is a plasmid. In some embodiments, the vector is a mammalian plasmid for expression in eukaryotic cells.
  • the vector is a viral vector.
  • the vector is a retroviral vector, a phage vector, an adenoviral vector, a herpes simplex viral (HSV) vector, an AAV vector, or a lentiviral vector.
  • the AAV vector is an AAV vector capable of encapsidating a DNA or an AAV vector capable of encapsidating an RNA.
  • the AAV vector comprises a capsid with a serotype of AAV1, AAV2, AAV3, AAV3A, AAV3B, AAV4, AAV5, AAV6, AAV7, AAVrh74, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV-DJ, AAV.PHP.eB, a member of the Clade to which any of the AAV1-AAV13 belong, a functional truncated variant thereof, or a functional mutant thereof.
  • the AAV vector comprises 5’ ITR and/or 3’ ITR from wild type AAV2.
  • the system of the disclosure may be delivered via a single AAV due to the suitable sizes of the components of the system.
  • the system of the disclosure can be delivered via ribonucleoprotein (RNP) delivery.
  • an Addgene blog post, “CRISPR 101: Ribonucleoprotein (RNP) Delivery” by Andrew Hempstead (https://blog.addgene.org/crispr-101-ribonucleoprotein-rnp-delivery), explains the RNP delivery of a CRISPR-Cas9 system comprising a Cas9 protein and a guide RNA targeting a genomic site of interest.
  • This delivery method can be similarly applied to the system of the disclosure, mutatis mutandis.
  • the disclosure provides a ribonucleoprotein (RNP) comprising the system or complex of the disclosure comprising the napDNAbp and the guide RNA.
  • the RNP comprises an excess or supersaturation amount of the guide RNA over the napDNAbp.
  • the RNP comprises the napDNAbp and the guide RNA in a ratio in a range of about 1: 1 to about 1: 2, e.g., about 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5, 1: 1.6, 1: 1.7, 1: 1.8, 1: 1.9, 1: 2, e.g., a ratio of 1: 1.875, or in a range between any of two preceding ratios, e.g., a ratio of about 1: 1.7 to about 1: 1.9.
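  • As a worked illustration of the ratio range recited above, the amount of guide RNA to combine with a given amount of napDNAbp can be computed as follows. This sketch assumes the ratio is read as a molar ratio of napDNAbp to guide RNA; the function name, the picomole units, and the range check are hypothetical choices made for illustration.

```python
def guide_amount_pmol(protein_pmol: float, guide_per_protein: float = 1.875) -> float:
    """Guide RNA (pmol) for a napDNAbp:guide ratio of 1:guide_per_protein."""
    if protein_pmol < 0:
        raise ValueError("protein amount must be non-negative")
    # The recited range runs from about 1:1 to about 1:2.
    if not 1.0 <= guide_per_protein <= 2.0:
        raise ValueError("ratio outside the 1:1 to 1:2 range")
    return protein_pmol * guide_per_protein
```

  • For instance, at the exemplary ratio of 1:1.875, 100 pmol of napDNAbp would be combined with 187.5 pmol of guide RNA.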
  • the RNP further comprises the donor sequence as described herein. Methods and materials for production and delivery of such an RNP are known in the art.
  • the disclosure provides a lipid nanoparticle (LNP) comprising the system of the disclosure comprising an mRNA encoding the napDNAbp and the guide RNA.
  • the mRNA comprises a 5’ UTR.
  • the mRNA comprises a 3’ polyA tail.
  • the LNP comprises the mRNA and the guide RNA in a ratio in a range of about 1: 1 to about 1: 2, e.g., about 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5, 1: 1.6, 1: 1.7, 1: 1.8, 1: 1.9, 1: 2, e.g., a ratio of 1: 1.875, or in a range between any of two preceding ratios, e.g., a ratio of about 1: 1.7 to about 1: 1.9.
  • the LNP further comprises the donor sequence as described herein. Methods and materials for production and delivery of such an LNP are known in the art.
  • the system or complex of the disclosure can be delivered to a variety of cells.
  • the cell is an isolated cell.
  • the cell is in cell culture or a co-culture of two or more cell types.
  • the cell is ex vivo.
  • the cell is obtained from a living organism and maintained in a cell culture.
  • the cell is a single-celled organism.
  • the cell is a prokaryotic cell. In some embodiments, the cell is a bacterial cell or derived from a bacterial cell. In some embodiments, the cell is an archaeal cell or derived from an archaeal cell.
  • the cell is a eukaryotic cell. In some embodiments, the cell is a plant cell or derived from a plant cell. In some embodiments, the cell is a fungal cell or derived from a fungal cell. In some embodiments, the cell is an animal cell or derived from an animal cell. In some embodiments, the cell is an invertebrate cell or derived from an invertebrate cell. In some embodiments, the cell is a vertebrate cell or derived from a vertebrate cell. In some embodiments, the cell is a mammalian cell or derived from a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is a zebrafish cell.
  • the cell is a rodent cell. In some embodiments, the cell is synthetically made, sometimes termed an artificial cell. In some embodiments, the cell is engineered, i.e., an engineered cell, to comprise a modification relative to a corresponding non-engineered cell. In some embodiments, the modification comprises an insertion of a sequence into the genome of the cell. In some embodiments, the modification comprises a deletion of a sequence from the genome of the cell. In some embodiments, the modification comprises both an insertion of a sequence into the genome of the cell and a deletion of a sequence from the genome of the cell.
  • the cell is derived from a cell line.
  • a wide variety of cell lines for tissue culture are known in the art. Examples of cell lines include, but are not limited to, HEK293T (293T) , MF7, K562, HeLa, CHO, and transgenic varieties thereof. Cell lines are available from a variety of sources known to those with skill in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Virginia) ) .
  • the cell is an immortal or immortalized cell.
  • the cell is a primary cell.
  • the cell is a stem cell such as a totipotent stem cell (e.g., omnipotent), a pluripotent stem cell, a multipotent stem cell, an oligopotent stem cell, or a unipotent stem cell.
  • the cell is an induced pluripotent stem cell (iPSC) or derived from an iPSC.
  • the cell is a differentiated cell.
  • the differentiated cell is a muscle cell (e.g., a myocyte), a fat cell (e.g., an adipocyte), a bone cell (e.g., an osteoblast, osteocyte, osteoclast), a blood cell (e.g., a monocyte, a lymphocyte, a neutrophil, an eosinophil, a basophil, a macrophage, an erythrocyte, or a platelet), a nerve cell (e.g., a neuron), an epithelial cell, an immune cell (e.g., a lymphocyte, a T cell, a B cell, an NK cell, a neutrophil, a monocyte, or a macrophage), a liver cell (e.g., a hepatocyte), a fibroblast, or a sex cell (e.g., an egg, a sperm cell).
  • the cell is a terminally differentiated cell.
  • the terminally differentiated cell is a neuronal cell, an adipocyte, a cardiomyocyte, a skeletal muscle cell, an epidermal cell, or a gut cell.
  • the cell is an immune cell.
  • the immune cell is a T cell.
  • the immune cell is a B cell.
  • the immune cell is a Natural Killer (NK) cell.
  • the immune cell is a Tumor Infiltrating Lymphocyte (TIL) .
  • the cell is a mammalian cell, e.g., a human cell, a monkey cell, or a murine cell.
  • the murine cell is derived from a wild-type mouse, an immunosuppressed mouse, or a disease-specific mouse model.
  • the cell is a cell within a living tissue, organ, or organism.
  • the disclosure provides a cell or a progeny thereof comprising the guide RNA, the system, the polynucleotide, the vector, the LNP, and/or the RNP of the disclosure.
  • the disclosure provides a cell or a progeny thereof, wherein the cell is modified by the method of the disclosure, and also termed as a modified cell.
  • the gRNA, the system, the complex, and the method of the disclosure are applicable for any suitable cell type with respect to the target cell of the disclosure in which a target DNA (e.g., gene) may be located, the cell of the disclosure, or the modified cell of the disclosure (collectively, “the cell” ) .
  • the cell is in vivo, ex vivo, or in vitro.
  • the cell is a eukaryotic cell (e.g., an animal cell, a vertebrate cell, a mammalian cell, a non-human mammalian cell, a non-human primate cell, a rodent (e.g., mouse or rat) cell, a human cell, a plant cell, or a yeast cell) or a prokaryotic cell (e.g., a bacterial cell).
  • the cell is a cell isolated from natural sources, such as a tissue biopsy. In some embodiments, the cell is a cell isolated from an in vitro cultured cell line. In some embodiments, the cell is from a primary cell line. In some embodiments, the cell is from an immortalized cell line. In some embodiments, the cell is an engineered cell. In some embodiments, the cell is a genetically engineered cell.
  • the cell is a cultured cell, an isolated primary cell, or a cell within a living organism.
  • the cell is an immune cell.
  • the cell is a T cell (such as a CAR-T cell, a cytotoxic T cell, a helper T cell, a regulatory T cell, a natural killer (NK) T cell, an iNK-T cell, an NK-T like cell, a γδ T cell, a tumor-infiltrating T cell, or a dendritic cell (DC)-activated T cell).
  • the cell is a B cell.
  • the cell is an NK cell (such as a CAR-NK cell).
  • the cell is a universal CAR-T cell or a universal CAR-NK cell.
  • the cell comprises a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC, and any homology, mutant, or variant thereof, and any intron, exon, complement, or fragment thereof.
  • the cell expresses one or more engineered receptors.
  • engineered receptors include, but are not limited to, CAR, TCR, TAC receptor, and TFPs.
  • the engineered receptor comprises an extracellular domain that specifically binds to an antigen (e.g., a tumor antigen) , a transmembrane domain, and an intracellular signaling domain (e.g., CD3zeta domain) .
  • the intracellular signaling domain comprises a primary intracellular signaling domain and/or a co-stimulatory domain, e.g., CD28, 4-1BB.
  • the engineered receptor comprises one or more specific binding domains that target at least one (e.g., tumor) antigen, for example, CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA) , ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII) , HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, EpCAM.
  • the binding domain is scFv.
  • the engineered receptor is a chimeric antigen receptor (CAR) .
  • Many chimeric antigen receptors are known in the art and may be suitable for the modified therapeutic cells of the disclosure.
  • CARs can also be constructed with a specificity for any cell surface marker by utilizing antigen binding fragments or antibody variable domains of, for example, antibody molecules. Any method for producing a CAR may be used herein. See, for example, US6,410,319, US7,446,191, US7,514,537, US9765342B2, WO 2002/077029, WO2015/142675, US2010/065818, US 2010/025177, US 2007/059298, WO2017025038A1, and Berger C. et al., J. Clinical Investigation 118(1): 294-308 (2008), which are hereby incorporated by reference.
  • the cell is a T cell expressing an engineered receptor. In some embodiments, the T cell expresses a CAR (CAR-T cell) .
  • the cell is an NK cell expressing an engineered receptor. In some embodiments, the NK cell expresses a CAR (CAR-NK cell).
  • a nucleic acid encoding the CAR is introduced into the cell before the cell is modified according to the methods described herein. In some embodiments, a nucleic acid encoding the CAR is introduced into the cell after the cell is modified according to the methods described herein.
  • the cell is an iPSC.
  • the iPSC is first modified by the methods described herein, and the modified iPSC is then differentiated into an immune cell, such as a T cell or an NK cell, which is optionally modified as a universal T cell, a universal CAR-T cell, a universal NK cell, or a universal CAR-NK cell.
  • the cell is modified in vivo. In some embodiments, the cell is modified ex vivo. In some embodiments, the cell is derived from a healthy individual. In some embodiments, the cell is derived from an individual having a disease.
  • the cell is derived from an individual, modified according to the methods described herein, and subsequently used to treat the individual from which the cell was derived.
  • the cell is derived from a first individual, modified according to the methods described herein, and used for treatment of a second individual different from the first individual.
  • the cell is modified according to the methods described herein to produce an allogeneic cell with reduced or no potential for graft-versus-host-disease or other immune-mediated rejection of the cells in a recipient individual.
  • Allogeneic cells are also referred to as “off the shelf” cells in the art.
  • the cell is modified to produce an allogeneic cell.
  • the allogeneic cell is an allogeneic CAR-T cell.
  • the allogeneic cell is an allogeneic CAR-NK cell. Allogeneic CAR-T and allogeneic CAR-NK cells are also referred to as “off the shelf” or “universal” CAR-T and CAR-NK cells.
  • the cell is a stem cell (such as, iPS cell, hematopoietic stem cell (HSC) ) .
  • the HSC is CD34+ hematopoietic stem cell.
  • the cell is derived from or heterogenous to a subject.
  • the disclosure provides a host comprising the cell or progeny thereof of the disclosure.
  • the host is a non-human animal or a plant.
  • the non-human animal is an animal (e.g., rodent or non-human primate) model for a human genetic disorder.
  • the disclosure provides a (e.g., pharmaceutical) composition comprising the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, and/or the cell or progeny thereof of the disclosure.
  • the composition comprises a pharmaceutically acceptable excipient.
  • the composition is formulated for delivery by a nanoparticle, e.g., a lipid nanoparticle (LNP) , a ribonucleoprotein (RNP) , a liposome, an exosome, a microvesicle, a nucleic acid (e.g., DNA) nanoassembly, a gene gun, or an implantable device.
  • the disclosure provides a delivery system comprising: (1) a delivery vehicle, and (2) the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, the cell or progeny thereof, and/or the composition of the disclosure.
  • the delivery vehicle is a nanoparticle, e.g., a lipid nanoparticle (LNP) , a ribonucleoprotein (RNP) , a liposome, an exosome, a microvesicle, a nucleic acid (e.g., DNA) nanoassembly, a gene-gun, or an implantable device.
  • kits that can be used, for example, to carry out a method described herein.
  • the kits include a system of the disclosure comprising a guide RNA and a napDNAbp herein.
  • the systems include a polynucleotide of the disclosure that encodes such a napDNAbp, and optionally the polynucleotide is comprised within a vector, e.g., as described herein.
  • the kits include a polynucleotide of the disclosure that encodes a guide RNA disclosed herein.
  • the napDNAbp and the guide RNA can be packaged within the same or other vessel within a kit or can be packaged in separate vials or other vessels, the contents of which can be mixed prior to use.
  • the kits can additionally include, optionally, a buffer and/or instructions for use of the system of the disclosure comprising the guide RNA and napDNAbp.
  • the disclosure provides a kit comprising the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, and/or the delivery system of the disclosure.
  • the kit further comprises an instruction for modifying a target DNA.
  • the disclosure includes methods for production of the guide RNA of the disclosure, methods for production of the napDNAbp of the disclosure, and methods for complexing the guide RNA of the disclosure and the napDNAbp of the disclosure.
  • the guide RNA of the disclosure is made by in vitro transcription of a DNA template.
  • the guide RNA is generated by in vitro transcription of a DNA template encoding the guide RNA using an upstream promoter sequence (e.g., a T7 polymerase promoter sequence) .
  • the DNA template encodes multiple guide RNAs or the in vitro transcription reaction includes multiple different DNA templates, each encoding a different guide RNA.
  • the guide RNA is made using chemical synthetic methods.
  • the guide RNA is made by expressing the guide RNA sequence in cells transfected with a plasmid including sequences that encode the guide RNA.
  • the plasmid encodes multiple different guide RNAs. In some embodiments, multiple different plasmids, each encoding a different guide RNA, are transfected into the cells.
  • the guide RNA is expressed from a plasmid that encodes the guide RNA and also encodes a napDNAbp. In some embodiments, the guide RNA is expressed from a plasmid that expresses the guide RNA but not a napDNAbp. In some embodiments, the guide RNA is purchased from a commercial vendor. In some embodiments, the guide RNA is synthesized using one or more modified nucleotides, e.g., as described above.
  • the napDNAbp of the disclosure can be prepared by (a) culturing bacteria which produce the napDNAbp, isolating the napDNAbp, optionally, purifying the napDNAbp, and optionally, complexing the napDNAbp with a guide RNA.
  • the napDNAbp can be also prepared by (b) a known genetic engineering technique, specifically, by isolating a gene encoding the napDNAbp from bacteria, constructing an expression vector based on the gene, and then transferring the vector into an appropriate host cell that expresses a guide RNA for expression of the napDNAbp that complexes with the guide RNA in the host cell.
  • the napDNAbp can be prepared by (c) an in vitro coupled transcription-translation system and then complexing with a guide RNA.
  • a host cell is used to express the napDNAbp.
  • the host cell is not particularly limited, and various known cells can be used. Specific examples of the host cell include bacteria such as E. coli, yeasts (including budding yeast, e.g., Saccharomyces cerevisiae, and fission yeast, e.g., Schizosaccharomyces pombe) , nematodes (Caenorhabditis elegans) , Xenopus laevis oocytes, and animal cells (for example, CHO cells, COS cells, and HEK293 cells) .
  • the method for transferring the expression vector described above into host cells, i.e., the transformation method is not particularly limited, and known methods such as electroporation, the calcium phosphate method, the liposome method and the DEAE dextran method can be used.
  • After a host cell is transformed with the expression vector, the host cell may be cultured, cultivated, or bred for production of the napDNAbp. After expression of the napDNAbp, the host cell can be collected and the napDNAbp purified from the cultures according to conventional methods (for example, filtration, centrifugation, cell disruption, gel filtration chromatography, ion exchange chromatography, etc.).
  • a variety of methods can be used to determine the level of production of a napDNAbp in a host cell. Such methods include, but are not limited to, for example, methods that utilize either polyclonal or monoclonal antibodies specific for the napDNAbp or a labeling tag as described elsewhere herein. Exemplary methods include, but are not limited to, enzyme-linked immunosorbent assays (ELISA), radioimmunoassays (RIA), fluorescent immunoassays (FIA), and fluorescent activated cell sorting (FACS). These and other assays are well known in the art (see, e.g., Maddox et al., J. Exp. Med. 158: 1211 [1983]).
  • the disclosure provides methods of in vivo expression of the napDNAbp in a cell, comprising providing a polynucleotide (e.g., a DNA or an RNA) encoding the napDNAbp to a cell wherein the polynucleotide encodes the napDNAbp, expressing the napDNAbp in the cell, and obtaining the napDNAbp from the cell.
  • the guide RNA of the disclosure is complexed with the napDNAbp of the disclosure to form a ribonucleoprotein.
  • the complexation of the guide RNA and the napDNAbp occurs at a temperature lower than about any one of 20 °C, 21 °C, 22 °C, 23 °C, 24 °C, 25 °C, 26 °C, 27 °C, 28 °C, 29 °C, 30 °C, 31 °C, 32 °C, 33 °C, 34 °C, 35 °C, 36 °C, 37 °C, 38 °C, 39 °C, 40 °C, 41 °C, 42 °C, 43 °C, 44 °C, 45 °C, 50 °C, or 55 °C.
  • the guide RNA does not dissociate from the napDNAbp at about 37 °C over an incubation period of at least about any one of 10 mins, 15 mins, 20 mins, 25 mins, 30 mins, 35 mins, 40 mins, 45 mins, 50 mins, 55 mins, 1 hour, 2 hours, 3 hours, 4 hours, or more hours.
  • the guide RNA and napDNAbp are complexed in a complexation buffer.
  • the napDNAbp is stored in a buffer that is replaced with a complexation buffer to form a complex with the guide RNA.
  • the napDNAbp is stored in a complexation buffer.
  • the guide RNA is stored in a buffer that is replaced with a complexation buffer to form a complex with the napDNAbp.
  • the guide RNA is stored in a complexation buffer.
  • the complexation buffer has a pH in a range of about 7.3 to 8.6. In one embodiment, the pH of the complexation buffer is about 7.3. In one embodiment, the pH of the complexation buffer is about 7.4. In one embodiment, the pH of the complexation buffer is about 7.5. In one embodiment, the pH of the complexation buffer is about 7.6. In one embodiment, the pH of the complexation buffer is about 7.7. In one embodiment, the pH of the complexation buffer is about 7.8. In one embodiment, the pH of the complexation buffer is about 7.9. In one embodiment, the pH of the complexation buffer is about 8.0. In one embodiment, the pH of the complexation buffer is about 8.1. In one embodiment, the pH of the complexation buffer is about 8.2. In one embodiment, the pH of the complexation buffer is about 8.3. In one embodiment, the pH of the complexation buffer is about 8.4. In one embodiment, the pH of the complexation buffer is about 8.5. In one embodiment, the pH of the complexation buffer is about 8.6.
  • the napDNAbp is overexpressed and complexed with the guide RNA in a host cell prior to purification as described herein.
  • RNA (e.g., mRNA) or DNA encoding the napDNAbp is introduced into a cell so that the napDNAbp is expressed in the cell.
  • the guide RNA is also introduced into the cell, whether simultaneously, separately, or sequentially from a single RNA (e.g., mRNA) or DNA construct, such that the ribonucleoprotein complex is formed in the cell.
  • the ribonucleoprotein complex is formed with the guide RNA and the napDNAbp in a molar ratio of about 0.5: 1, 0.6: 1, 0.7: 1, 0.8: 1, 0.9: 1, 1: 1, 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5 or a molar ratio in a range composed of any two of the preceding molar ratios.
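The molar ratios listed above translate directly into mixing amounts. The following is a minimal sketch, assuming the ratio is written as guide RNA : napDNAbp as in the preceding text; the function name `guide_amount_pmol` and the example amounts are illustrative and are not taken from the disclosure.

```python
from fractions import Fraction


def guide_amount_pmol(protein_pmol: float, guide_to_protein: str) -> float:
    """Return the pmol of guide RNA needed for a given pmol of napDNAbp.

    `guide_to_protein` is a ratio string such as "1: 1.2" (guide RNA : napDNAbp).
    """
    guide_part, protein_part = (
        Fraction(part.strip()) for part in guide_to_protein.split(":")
    )
    if guide_part <= 0 or protein_part <= 0:
        raise ValueError("both sides of the ratio must be positive")
    return float(protein_pmol * guide_part / protein_part)


if __name__ == "__main__":
    # Hypothetical amount: 100 pmol of napDNAbp at several of the listed ratios.
    for ratio in ("0.5: 1", "1: 1", "1: 1.5"):
        print(ratio, "->", round(guide_amount_pmol(100, ratio), 2), "pmol guide RNA")
```

Parsing the ratio with `Fraction` avoids rounding surprises for values such as 1.1 or 1.3 before the final conversion to a float.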
  • the disclosure provides a method for modifying a target DNA, comprising contacting the target DNA with the system, the complex, the vector, the RNP, or the LNP of the disclosure, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
  • the disclosure provides use of the system, the complex, the vector, the RNP, or the LNP of the disclosure in the manufacture of an agent for modifying a target DNA, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
  • the disclosure provides the system, the complex, the vector, the RNP, or the LNP of the disclosure, for use in modifying a target DNA, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
  • the napDNAbp has enzymatic activity (e.g., nuclease activity) .
  • the napDNAbp induces one or more DNA double-stranded breaks in the target DNA.
  • the napDNAbp induces one or more DNA single-stranded breaks in the target DNA.
  • the napDNAbp induces one or more DNA nicks in the target DNA.
  • DNA breaks and/or nicks result in formation of one or more indels (e.g., one or more deletions) in the target DNA.
  • a guide RNA disclosed herein forms a complex with a napDNAbp and directs the napDNAbp to a protospacer sequence adjacent to a 5’-TTN-3’ sequence.
  • the complex induces a deletion (e.g., a nucleotide deletion or DNA deletion) adjacent to the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
  • the complex induces a deletion adjacent to a T/C-rich sequence.
  • the deletion is downstream of a 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion is downstream of a T/C-rich sequence.
  • the deletion alters expression of the target gene. In some embodiments, the deletion alters function of the target gene. In some embodiments, the deletion inactivates the target gene. In some embodiments, the deletion is a frameshifting deletion. In some embodiments, the deletion is a non-frameshifting deletion. In some embodiments, the deletion leads to cell toxicity or cell death (e.g., apoptosis) .
  • the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) from the 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
  • the deletion ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) from the 5’-NTTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5’-NTTN-3’ sequence, wherein N is A, T, G, or C.
  • the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) from the 5’-TTN-3’ sequence and ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) from the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
  • the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5’-TTN-3’ sequence and ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
  • the deletion is up to about 50 nucleotides in length (e.g., about or up to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides) .
  • the deletion is between about 4 nucleotides and about 50 nucleotides in length (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides) .
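The deletion geometry described above can be illustrated by locating 5’-TTN-3’ motifs on a strand and mapping the stated offsets onto sequence coordinates. This is a sketch only: it assumes that "k nucleotides downstream" means the k-th nucleotide after the last base of the motif, it uses the nominal ranges of 5 to 15 (start) and 20 to 30 (end), and the function names and example sequence are hypothetical.

```python
import re
from typing import Iterator, Tuple

PAM_LENGTH = 3  # length of the 5'-TTN-3' motif


def find_ttn_pams(nontarget_strand: str) -> Iterator[int]:
    """Yield 0-based start indices of every 5'-TTN-3' motif (overlaps included)."""
    seq = nontarget_strand.upper()
    # A lookahead matches at every position, so overlapping motifs are reported.
    for match in re.finditer(r"(?=TT[ACGT])", seq):
        yield match.start()


def deletion_window(
    pam_start: int,
    start_offsets: Tuple[int, int] = (5, 15),
    end_offsets: Tuple[int, int] = (20, 30),
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Map downstream offsets to 0-based indices.

    Assumption: offset k refers to the k-th nucleotide after the motif.
    Returns ((earliest_start, latest_start), (earliest_end, latest_end)).
    """
    first_downstream = pam_start + PAM_LENGTH

    def to_index(offset: int) -> int:
        return first_downstream + offset - 1

    return (
        (to_index(start_offsets[0]), to_index(start_offsets[1])),
        (to_index(end_offsets[0]), to_index(end_offsets[1])),
    )


if __name__ == "__main__":
    example = "ATTTACGTTG"  # made-up sequence, not from the disclosure
    for pam in find_ttn_pams(example):
        print(pam, deletion_window(pam))
```

The indices are positions on the same strand as the motif; whether a given window lies inside the available sequence is left to the caller.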
  • the disclosure provides a method of producing a modified target cell, comprising: (1) optionally harvesting a target cell from a subject; (2) optionally sorting and/or optionally amplifying the harvested target cell; (3) modifying a target gene in the (optionally sorted and/or optionally amplified) target cell by the method of any preceding claim;
  • a donor sequence e.g., a chimeric antigen receptor (CAR) -encoding donor sequence
  • the modified cell is used for CAR-T cell therapy or CAR-NK cell therapy.
  • the disclosure provides a method for diagnosing, preventing, or treating a disease or disorder in a subject, comprising administering to the subject (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure.
  • the disclosure provides use of (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure in the manufacture of an agent, a medicament, or a kit for diagnosing, preventing, or treating a disease or disorder in a subject.
  • the disclosure provides (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure, for use in diagnosing, preventing, or treating a disease or disorder in a subject.
  • Any suitable delivery or administration method known in the art may be used to deliver the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure.
  • the disease or disorder is associated with a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC.
  • the disease or disorder is associated with a target or an antigen selected from the group consisting of CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA) , ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII) , HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, EpCAM.
  • the disease or disorder is a cancer, e.g., a hematologic malignancy or solid cancer, such as chronic lymphocytic leukemia (CLL) or acute lymphoblastic leukemia (ALL).
  • the disease or disorder is a hematologic disease or disorder, e.g., thalassemia, sickle cell disease, β-hemoglobinopathy, β-thalassemia.
  • the target cell or the cell is derived from the same subject as the subject to whom the modified cell is administered. In some embodiments, the target cell or the cell is derived from a subject different from the subject to whom the modified cell is administered.
  • EXAMPLE 1 Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in mammalian cells
  • This example demonstrates the cleavage activity of CRISPR-Cas12i system at a target gene (e.g., AAVS1, B2M, BCL11A, CCR5, CIITA, CISH, CTLA4, CXCR4, HBB, IL2RG, LAG3, PD1, TGFBR2, TIGIT, TIM3, TRAC) in mammalian cells.
  • Cas12i polypeptide (SEQ ID NO: 2031, “Cas12Max”, which is the N243R mutant of xCas12i (SiCas12i) of SEQ ID NO: 2030) and direct repeat (DR) scaffold sequence (SEQ ID NO: 2032)
  • Cas12Max (SiCas12i-N243R) , 1080 aa, SEQ ID NO: 2031
  • the target gene comprises a protospacer sequence on the nontarget DNA strand (NTS) of the target gene, which is completely complementary to the target sequence on the target DNA strand (TS) of the target gene and a protospacer adjacent motif (PAM) 5’ to the protospacer sequence.
  • a representative set of 99 protospacer sequences were selected from the protospacer sequences (Table 1) identified from the 16 genes listed above, by using 5’-TTN-3’ PAM sequence to verify the cleavage activity of the CRISPR-Cas12i system.
  • each of the spacer sequences of the guide RNAs as tested in this Example was designed to be fully complementary to, and capable of hybridizing to, the target sequence and thus was identical to each of the representative protospacer sequences except for the replacement of thymine (T) with uracil (U) due to the nature of DNA and RNA.
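The relationship between protospacer, target sequence, and spacer described above can be sketched in a few lines. This is an illustration only; the function names and the example sequence are hypothetical and are not entries from Table 1.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def protospacer_to_spacer(protospacer: str) -> str:
    """Return the RNA spacer: the protospacer (nontarget strand) with T replaced by U."""
    dna = protospacer.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"non-DNA characters in protospacer: {sorted(unexpected)}")
    return dna.replace("T", "U")


def target_sequence(protospacer: str) -> str:
    """Return the target-strand sequence, written 5'->3'.

    The target sequence is fully complementary to the protospacer, so it is the
    reverse complement of the protospacer.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(protospacer.upper()))


if __name__ == "__main__":
    example = "TTGACCTAGGCATTCAGTAC"  # made-up 20-nt protospacer
    print("spacer (RNA):", protospacer_to_spacer(example))
    print("target (DNA):", target_sequence(example))
```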
  • the plasmid comprised, from 5’ to 3’, a U6 promoter (SEQ ID NO: 2034) operably linked to a sequence encoding the guide RNA as described above, a CBh promoter (SEQ ID NO: 2035) , a Kozak sequence (SEQ ID NO: 2036) , a sequence encoding 3xFLAG (SEQ ID NO: 2037) , a sequence encoding SV40 NLS (SEQ ID NO: 2038) , a sequence encoding Cas12Max (SEQ ID NO: 2031) , a sequence encoding NP NLS (SEQ ID NO: 2039) , a sequence encoding a bGH polyA signal (SEQ ID NO: 2040) , a CMV enhancer (SEQ ID NO: 2041)
  • U6 promoter (RNA polymerase III promoter for human U6 snRNA) , 241 nt, SEQ ID NO: 2034
  • SV40 NLS nuclear localization signal of SV40 large T antigen
  • NP NLS nucleoplasmin NLS
  • nucleoplasmin NLS nuclear localization signal from nucleoplasmin
  • bGH polyA bovine growth hormone polyadenylation signal
  • CMV enhancer human cytomegalovirus immediate early enhancer
  • CMV promoter (human cytomegalovirus (CMV) immediate early promoter), 204 nt, SEQ ID NO: 2042
  • HEK293T cells were cultured in 24-well tissue culture plates according to standard methods for 12 hours, before the expression plasmid was transfected into the cells using standard polyethyleneimine (PEI) transfection. The transfected cells were then cultured at 37 °C under 5% CO2 for 48 hours. The cultured cells expressing mCherry were then sorted by flow cytometry, and the cleavage activity of the CRISPR-Cas12i system was measured as Indel rate (%) by sequencing with a pair of primers upstream and downstream of the protospacer sequence, followed by TIDE analysis.
  • the cleavage activity of the CRISPR-Cas12i system for each tested protospacer/spacer sequence is shown in Table 2.
  • the results show significant cleavage activity of the CRISPR-Cas12i system for the listed genes, for example, up to 91.8% for the TRAC gene, indicating the promising application of the CRISPR-Cas12i system in targeted gene editing.
  • EXAMPLE 2 Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in T cells
  • This example demonstrates the cleavage activity of CRISPR-Cas12i system at a target gene (e.g., TRAC, B2M, PD1, IL2RG, CCR5, KLRG1, CD52, CD7, LAG3, CTLA4, TIM3, CIITA, TIGIT, CD16a, CXCR4) in human primary T cells.
  • PCR cycle conditions: 95 °C for 3 min; 35 cycles of 95 °C for 20 s, 60 °C for 20 s, and 72 °C for 30 s (500 bp); 72 °C for 5 min.
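As a quick arithmetic check on the cycling program above, the total hold time can be summed as follows. This is a sketch that counts only the programmed hold times; ramp times between temperatures depend on the thermocycler and are not included. The function name is illustrative.

```python
from typing import Sequence


def pcr_hold_time_seconds(
    initial_denaturation_s: int,
    cycles: int,
    step_durations_s: Sequence[int],
    final_extension_s: int,
) -> int:
    """Sum the programmed hold times of a PCR program, excluding ramping."""
    return initial_denaturation_s + cycles * sum(step_durations_s) + final_extension_s


if __name__ == "__main__":
    # 95 C for 3 min; 35 x (20 s, 20 s, 30 s); 72 C for 5 min
    total = pcr_hold_time_seconds(3 * 60, 35, (20, 20, 30), 5 * 60)
    minutes, seconds = divmod(total, 60)
    print(f"{total} s = {minutes} min {seconds} s")  # 2930 s = 48 min 50 s
```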
  • TRAC gRNA-3-long was prepared in a long form of DR-spacer-DR-spacer. All gRNAs in Example 2 contain a polyU tail of four uracils (UUUU) (shown as a polyT tail, TTTT) at the 3’-end of the gRNA.
  • EXAMPLE 3 Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in NK cells
  • This example provides a strategy for using a CRISPR-Cas12i system to knock out a gene (e.g., PD-1, LAG-3, TIM-3, TIGIT, CISH, TGFBR2, KLRC1 (NKG2A) , A2AR, KLRK1, KLRG1) in NK cells, e.g., to produce universal NK cells.
  • NK cells from a donor are collected and counted using an automated cell counter. A sample from each donor is collected and analyzed for viability. NK cells are expanded in NK-cell media. Following expansion, cells are collected, counted, and cell density is adjusted in P3 primary cell buffer (Lonza) .
  • a Cas12i-gRNA mixture is prepared by mixing purified xCas12i or its variant and a gRNA targeting a gene (e.g., PD-1, LAG-3, TIM-3, TIGIT, CISH, TGFBR2, KLRC1 (NKG2A) , A2AR, KLRK1, KLRG1) in NK cells in a ratio.
  • Exemplary protospacer/spacer sequences for such a gRNA may be found in SEQ ID NOs: 1-2029.
  • NK cells are dispensed into an electroporation plate and the Cas12i-gRNA mixture is added to the cells.
  • Several different final concentrations of the Cas12i-gRNA mixture are used (e.g., 2 µM, 5 µM, 10 µM, or 16 µM).
  • the plate is electroporated using an electroporation device. Following electroporation, replacement media is added to quench the reaction and cells are transferred to a new plate with pre-warmed media.
  • the NK cells are then incubated until further analysis.
  • the NK cells successfully modified according to the method are identified, with an insertion or deletion created in the target gene.
  • a nucleic acid encoding a chimeric antigen receptor (CAR) is optionally introduced into the NK cells before or after the introduction of the CRISPR-Cas12i system.
  • EXAMPLE 4 Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in hematopoietic stem cells (HSC)
  • This example provides a strategy for using a CRISPR-Cas12i system to knock out a gene (e.g., BCL11A, HBB, HEXB) in HSC, e.g., to treat a hemoglobinopathy.
  • Bone marrow CD34+ hematopoietic stem cells are assessed for cell number and viability by acridine orange/propidium iodide staining on a cell counter.
  • CD34+ cells are cultured in serum-free expansion media with the appropriate supplement for approximately 48 hours.
  • a Cas12i-gRNA mixture is prepared by mixing purified xCas12i or its variant and a gRNA targeting a gene (e.g., BCL11A, HBB, HEXB) in HSC in a ratio.
  • Exemplary protospacer/spacer sequences for such a gRNA may be found in SEQ ID NOs: 1-2029.
  • a donor DNA template corresponding to the wildtype sequence of the target gene may also be added for insertion at the cleavage site of the target gene.
  • the sequence of the donor DNA template is adjusted based on the gRNA used so that the sequence of the target gene (e.g., BCL11A, HBB, HEXB) in the modified CD34+ cells reflects the corresponding wildtype sequence.
  • Cells are washed with PBS and resuspended in buffer and supplement (Lonza #VXP-3032) with transfection enhancer oligo.
  • Cells are dispensed into an electroporation plate at and the Cas12i-gRNA mixture with a donor DNA template encoding a wild-type gene (e.g., BCL11A, HBB, HEXB) is added to the cells.
  • Several different final concentrations of the Cas12i-gRNA mixture are used (e.g., 2 µM, 5 µM, 10 µM, or 16 µM).
  • the plate is electroporated using an electroporation device.
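
The final-concentration series named in Examples 3 and 4 (2, 5, 10, and 16 µM) can be planned with the standard dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentration (100 µM) and the reaction volume (20 µL) are assumed values that do not come from the examples, and the function name is hypothetical.

```python
def stock_volume_ul(final_um: float, total_ul: float, stock_um: float) -> float:
    """Volume of stock (uL) giving `final_um` in a `total_ul` reaction (C1*V1 = C2*V2)."""
    if final_um > stock_um:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_um * total_ul / stock_um


# Final concentrations named in the examples (uM).
FINAL_CONCENTRATIONS_UM = [2, 5, 10, 16]
# Assumed values for illustration only; not taken from the examples.
ASSUMED_STOCK_UM = 100.0
ASSUMED_REACTION_UL = 20.0

for final_um in FINAL_CONCENTRATIONS_UM:
    volume = stock_volume_ul(final_um, ASSUMED_REACTION_UL, ASSUMED_STOCK_UM)
    print(f"{final_um} uM final: {volume:.1f} uL of stock")
```

With other stock concentrations or reaction volumes, only the two assumed constants need to change.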

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Plant Pathology (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Medicinal Chemistry (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Oncology (AREA)
  • Hematology (AREA)
  • Mycology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Provided is a guide RNA for DNA (e.g., gene) targeting and uses thereof.

Description

GUIDE RNA AND USES THEREOF
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of and priority to PCT Patent Application No. PCT/CN2022/077116, filed on February 21, 2022, entitled “CELLS MODIFIED BY CRISPR-CAS12I SYSTEM AND USES THEREOF”, and PCT Patent Application No. PCT/CN2022/142073, filed on December 26, 2022, entitled “PROGRAMMABLE GUIDE RNA FOR DNA TARGETING”, the entire contents of which, including any sequence listing and drawings, are incorporated herein by reference in their entirety.
REFERENCE TO AN ELECTRONIC SEQUENCE LISTING
The disclosure contains an electronic sequence listing (“xxx.xml”; size is xxx bytes and it was created on xxx), the contents of which are incorporated herein by reference in their entirety. Wherever a sequence is an RNA sequence, the T in the sequence shall be deemed as U.
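The T-to-U reading rule for RNA sequences can be expressed as a one-line conversion. This is a minimal sketch; the function name is hypothetical, and the example input is simply the four-nucleotide polyT notation rather than any sequence from the listing.

```python
def read_as_rna(listed_sequence: str) -> str:
    """Return the listed sequence with every T read as U (case is preserved)."""
    return listed_sequence.translate(str.maketrans({"T": "U", "t": "u"}))


print(read_as_rna("TTTT"))  # UUUU
```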
BACKGROUND
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) genes, collectively known as CRISPR-Cas or CRISPR/Cas systems, are adaptive immune systems in archaea and bacteria that defend particular species against foreign genetic elements.
Citation or identification of any document in this application is not an admission that such a document is available as prior art to the disclosure.
SUMMARY
It is against the above background that the disclosure provides certain advantages over the prior art. Although the disclosure herein is not limited to specific advantages, in an aspect, the disclosure provides a guide RNA comprising (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and (2) a guide sequence capable of hybridizing to a target sequence on a target strand of a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC in a target cell, thereby guiding the complex to the target gene; wherein the target gene comprises a protospacer sequence on the nontarget strand of the target gene and a protospacer adjacent motif (PAM) adjacent (e.g., 5’ ) to the protospacer sequence, wherein the protospacer sequence is fully complementary to the target sequence on the target strand of the target gene.
In another aspect, the disclosure provides a system comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) or a polynucleotide encoding the napDNAbp; and (2) the guide RNA of the disclosure, or a polynucleotide encoding the guide RNA.
In yet another aspect, the disclosure provides a complex comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) complexed with (2) the guide RNA of the disclosure.
In yet another aspect, the disclosure provides a polynucleotide encoding the guide RNA of the disclosure.
In yet another aspect, the disclosure provides a ribonucleoprotein (RNP) comprising the system or complex of the disclosure comprising the napDNAbp and the guide RNA.
In yet another aspect, the disclosure provides a lipid nanoparticle (LNP) comprising the system of the disclosure comprising a mRNA encoding the napDNAbp and the guide RNA.
In yet another aspect, the disclosure provides a method for modifying a target gene in a target cell, comprising contacting the target cell with the system, the RNP, or the LNP of the disclosure, wherein the guide sequence is capable of hybridizing to a target sequence on a target strand of the target gene, wherein the target gene is modified (e.g., cleaved) by the complex.
In yet another aspect, the disclosure provides a method of producing a modified target cell, comprising (1) optionally harvesting a target cell from a subject; (2) optionally sorting and/or optionally amplifying the harvested target cell; (3) modifying a target gene in the (optionally sorted and/or optionally amplified) target cell by the method of the disclosure; (4) optionally inserting a donor sequence (e.g., a chimeric antigen receptor (CAR) -encoding donor sequence) into the genome of the target cell; and (5) optionally purifying the modified target cell.
In yet another aspect, the disclosure provides a cell or a progeny thereof, wherein the cell is modified by the method of the disclosure.
In yet another aspect, the disclosure provides a method for preventing or treating a disease or disorder in a subject, comprising administering to the subject (e.g., an effective amount of) the system, the complex, the RNP, the LNP, or the cell or progeny thereof of the disclosure.
The details of one or more embodiments of the disclosure are set forth in the description below. Other features or advantages of the disclosure will be apparent from the following drawings and detailed description of several embodiments, and also from the appended claims. It is understood that although each and every aspect of the disclosure is described in one or more embodiments, the embodiments can be combined according to the principle and spirit of the disclosure mutatis mutandis.
Definitions
The disclosure will be described with respect to particular embodiments, but the disclosure is not limited thereto but only by the claims. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which this disclosure belongs. Terms as set forth hereinafter are generally to be understood in their common sense unless indicated otherwise.
As used herein, the term “activity” refers to a biological activity. In some embodiments, activity includes enzymatic activity, e.g., catalytic ability of an effector. For example, activity can include nuclease activity.
As used herein, the term “nucleic acid programmable nucleotide binding protein” may be used interchangeably with “polynucleotide programmable nucleotide binding domain” to refer to a protein that associates with a nucleic acid (e.g., DNA or RNA) , such as a guide nucleic acid or guide polynucleotide (e.g., gRNA) , that guides the protein to a specific nucleic acid sequence. In some embodiments, the nucleic acid programmable nucleotide binding protein is a nucleic acid programmable DNA binding protein (napDNAbp) . In some embodiments, the nucleic acid programmable nucleotide binding protein is a nucleic acid programmable RNA binding protein. In some embodiments, the nucleic acid programmable nucleotide binding protein is a Cas9 protein. A Cas9 protein can associate with a guide RNA that guides the Cas9 protein to a specific DNA sequence that is complementary to the guide RNA. In some embodiments, the napDNAbp is a Cas9 domain, for example a nuclease active Cas9, a Cas9 nickase (nCas9) , or a nuclease inactive Cas9 (dCas9) . Non-limiting examples of the napDNAbp include, Cas9 (e.g., dCas9 and nCas9) , Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12f/Cas14, Cas12g, Cas12h, Cas12i, and Cas12k. 
Non-limiting examples of Cas enzymes include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas5d, Cas5t, Cas5h, Cas5a, Cas6, Cas7, Cas8, Cas8a, Cas8b, Cas8c, Cas9 (also known as Csn1 or Csx12), Cas10, Cas10d, Cas12a/Cpf1, Cas12b/C2c1, Cas12c/C2c3, Cas12d/CasY, Cas12e/CasX, Cas12f/Cas14, Cas12g, Cas12h, Cas12i, Cas12k, Csy1, Csy2, Csy3, Csy4, Cse1, Cse2, Cse3, Cse4, Cse5e, Csc1, Csc2, Csa5, Csn1, Csn2, Csm1, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx1S, Csx11, Csf1, Csf2, CsO, Csf4, Csd1, Csd2, Cst1, Cst2, Csh1, Csh2, Csa1, Csa2, Csa3, Csa4, Csa5, Type II Cas effector proteins, Type V Cas effector proteins, Type VI Cas effector proteins, CARF, DinG, homologues thereof, or modified or engineered versions thereof. Other napDNAbp are also within the scope of this disclosure, e.g., IscB, IsrB, although they may not be specifically listed in this disclosure. See, e.g., Makarova et al., “Classification and Nomenclature of CRISPR-Cas Systems: Where from Here?” CRISPR J. 2018 October; 1: 325-336. doi: 10.1089/crispr.2018.0033; Yan et al., “Functionally diverse type V CRISPR-Cas systems” Science. 2019 Jan. 4; 363(6422): 88-91. doi: 10.1126/science.aav7271, the entire contents of each of which are hereby incorporated by reference.
As used herein, the term “complex” refers to a grouping of two or more molecules. In some embodiments, the complex comprises a polypeptide and a nucleic acid molecule interacting with (e.g., binding to, coming into contact with, adhering to) one another. As used herein, the term “complex” can refer to a grouping of a guide RNA and a polypeptide (e.g., a napDNAbp, such as, a Cas12i polypeptide) . As used herein, the term “complex” can refer to a grouping of a guide RNA, a polypeptide, and a target sequence. As used herein, the term “complex” can refer to a grouping of a target gene-targeting guide RNA and a napDNAbp.
As used herein, the term “protospacer adjacent motif” or “PAM” refers to a DNA sequence adjacent to a target sequence (e.g., a target sequence of a target gene) to which a complex comprising a guide RNA (e.g., a target gene-targeting guide RNA) and a napDNAbp binds. In the case of a double-stranded target, the guide RNA binds to a first strand of the target (e.g., the target strand or the spacer-complementary strand), and a PAM sequence as described herein is present in the second, complementary strand (e.g., the nontarget strand or the non-spacer-complementary strand) and adjacent to the protospacer sequence complementary to the target sequence to which the guide RNA binds.
As used herein, the term “adjacent” includes instances in which the guide RNA of a complex comprising a guide RNA and a napDNAbp specifically binds, interacts, or associates with a target sequence that is immediately adjacent to a PAM, or in the case of a double-stranded target where the PAM is present in the non-target strand (e.g., the non-spacer-complementary strand) , with a target sequence that is complementary to a protospacer sequence immediately adjacent to a PAM. In such instances, there are no nucleotides between the target sequence or protospacer sequence and the PAM. The term “adjacent” also includes instances in which there are a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides between the target sequence, to which the guide RNA binds, or the protospacer sequence and the PAM.
As used herein, the term “guide RNA” refers to any RNA molecule that facilitates the targeting of a napDNAbp (e.g., a Cas12i polypeptide) described herein to a target sequence (e.g., a sequence of a target gene) . A guide RNA may be designed to include sequences that are complementary to a specific nucleic acid sequence (e.g., a sequence of a target gene) . A guide RNA may comprise a DNA targeting sequence (i.e., a guide sequence) and a scaffold sequence. The term “crRNA” or “RNA guide” is also used herein to refer to a guide RNA.
In some embodiments, a guide sequence is complementary to a target sequence. As used herein, the term “complementary” refers to the ability of nucleobases of a first nucleic acid molecule, such as a guide RNA, to base pair with nucleobases of a second nucleic acid molecule, such as a target sequence. Two complementary nucleic acid molecules are able to non-covalently bind under appropriate temperature and solution ionic strength conditions. In some embodiments, a first nucleic acid molecule (e.g., a guide sequence of a guide RNA) has 100% complementarity to a second nucleic acid (e.g., a target sequence). In some embodiments, a first nucleic acid molecule (e.g., a guide sequence of a guide RNA) is complementary to a second nucleic acid molecule (e.g., a target sequence) if the first nucleic acid molecule has at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementarity to the second nucleic acid. As used herein, the term “substantially complementary” refers to a polynucleotide (e.g., a guide sequence of a guide RNA) that has a certain level of complementarity to a target sequence. In some embodiments, the level of complementarity is such that the polynucleotide (e.g., a guide sequence of a guide RNA) can hybridize to the target sequence (e.g., a sequence of a target gene) with sufficient affinity to permit an effector polypeptide (e.g., a napDNAbp) that is complexed with the polynucleotide, or a functional domain associated (e.g., fused) with the effector polypeptide, to act (e.g., cleave, deaminate) on the target sequence or its complement (e.g., a sequence of a target gene or its complement). In some embodiments, a guide sequence that is substantially complementary to a target sequence has less than 100% complementarity to the target sequence.
In some embodiments, a guide sequence that is substantially complementary to a target sequence has at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementarity to the target sequence.
As used herein, the term “target sequence” refers to a nucleic acid sequence to which a guide RNA specifically binds. In some embodiments, the DNA targeting sequence (e.g., spacer) of a guide RNA binds to a target sequence. In the case of a double-stranded target, the guide RNA binds to a first strand of the target (i.e., the target strand or the spacer-complementary strand), and a PAM sequence as described herein is present in the second, complementary strand (i.e., the non-target strand or the non-spacer-complementary strand) and adjacent to the protospacer sequence complementary to the target sequence to which the guide RNA binds. In some embodiments, the target strand (i.e., the spacer-complementary strand) comprises a 5’-NAAN-3’ sequence. In some embodiments, the non-target strand (i.e., the non-spacer-complementary strand) comprises a 5’-NTTN-3’ sequence. In some embodiments, the target sequence or its complement (i.e., a protospacer sequence) is a sequence within a target gene sequence, including, but not limited to, the sequence set forth in SEQ ID NO: 339 or the reverse complement thereof.
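The relationship among the PAM, the protospacer on the non-target strand, and the target sequence on the target strand can be sketched as a simple scan. This is an illustration under stated assumptions: it looks only at the strand supplied, requires the protospacer to sit immediately 3’ of a 5’-NTTN-3’ PAM with no intervening nucleotides, and uses a default protospacer length of 20 nt that is an assumed value. The function names and the example sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_protospacers(nontarget_strand: str, protospacer_len: int = 20):
    """List (pam_start, pam, protospacer) for each 5'-NTTN-3' PAM on the strand.

    The protospacer is the `protospacer_len` nucleotides immediately 3' of the PAM.
    PAMs too close to the 3' end to leave a full-length protospacer are skipped.
    """
    seq = nontarget_strand.upper()
    hits = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]TT[ACGT]))", seq):
        proto_start = match.start() + 4
        protospacer = seq[proto_start:proto_start + protospacer_len]
        if len(protospacer) == protospacer_len:
            hits.append((match.start(), match.group(1), protospacer))
    return hits


def target_sequence(protospacer: str) -> str:
    """Reverse complement of the protospacer, i.e. the target-strand sequence 5'->3'."""
    return protospacer.upper().translate(COMPLEMENT)[::-1]


# Hypothetical 10-nt strand and a shortened protospacer length, for illustration.
for pam_start, pam, protospacer in find_protospacers("CTTAGGCATG", protospacer_len=5):
    print(pam_start, pam, protospacer, target_sequence(protospacer))
```

A scan of a real locus would also need to examine the opposite strand, since either strand of a double-stranded target can serve as the non-target strand.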
As used herein, the terms “upstream” and “downstream” refer to relative positions within a single nucleic acid (e.g., DNA) sequence in a nucleic acid molecule. “Upstream” and “downstream” relate to the 5’ to 3’ direction, respectively, in which RNA transcription occurs. A first sequence is upstream of a second sequence when the 3’ end of the first sequence occurs before the 5’ end of the second sequence. A first sequence is downstream of a second sequence when the 5’ end of the first sequence occurs after the 3’ end of the second sequence. In some embodiments, the 5’-NTTN-3’ sequence is upstream of an indel described herein, and a napDNAbp-induced indel is downstream of the 5’-NTTN-3’ sequence.
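The upstream and downstream relations defined above reduce to comparisons of end coordinates along the 5’ to 3’ direction. The sketch below assumes 0-based, half-open coordinates on a single strand; the `Feature` type and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    start: int  # 0-based index of the 5'-most nucleotide
    end: int    # index one past the 3'-most nucleotide (half-open interval)


def is_upstream(first: Feature, second: Feature) -> bool:
    """True when the 3' end of `first` occurs before the 5' end of `second`."""
    return first.end <= second.start


def is_downstream(first: Feature, second: Feature) -> bool:
    """True when the 5' end of `first` occurs after the 3' end of `second`."""
    return first.start >= second.end


# Hypothetical coordinates: a 4-nt PAM and an indel further along the strand.
pam = Feature(10, 14)
indel = Feature(30, 33)
print(is_upstream(pam, indel), is_downstream(indel, pam))  # True True
```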
The terms “nucleic acid, ” “polynucleotide, ” and "nucleotide sequence" are used interchangeably to refer to a polymeric form of nucleotides of any length, including deoxyribonucleotides, ribonucleotides, combinations thereof, and analogs thereof. “Oligonucleotide” and “oligo” are used interchangeably to refer to a short polynucleotide, having no more than about 50 nucleotides.
As used herein, “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid by traditional Watson-Crick base-pairing. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (i.e., Watson-Crick base pairing) with a second nucleic acid (e.g., about 5, 6, 7, 8, 9, 10 out of 10, being about 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively). “Perfectly complementary” or “completely complementary” means that all the contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least about any one of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
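The percent complementarity described above (for example, 8 pairing residues out of 10 being 80%) can be computed as follows. This is a minimal sketch assuming two equal-length, ungapped sequences written 5’ to 3’ and paired in antiparallel orientation, with only Watson-Crick pairs counted; the function name and the example sequences are hypothetical.

```python
WATSON_CRICK = {
    ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
}


def percent_complementarity(first: str, second: str) -> float:
    """Percent of residues in `first` that Watson-Crick pair with `second`.

    Both sequences are written 5'->3'; `second` is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    if not first or len(first) != len(second):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(first.upper(), second.upper()[::-1])
    paired = sum(pair in WATSON_CRICK for pair in pairs)
    return 100.0 * paired / len(first)


guide = "ACGUACGUAC"  # hypothetical 10-nt RNA guide sequence
print(percent_complementarity(guide, "GTACGTACGT"))  # 100.0
print(percent_complementarity(guide, "GTACGTACAA"))  # 80.0
```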
“Hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. A sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.
“Percentage (%) sequence identity” with respect to a nucleic acid sequence is defined as the percentage of nucleotides in a candidate sequence that are identical with the nucleotides in the specific nucleic acid sequence, after aligning the sequences by allowing gaps, if necessary, to achieve the maximum percent sequence identity. “Percentage (%) sequence identity” with respect to a peptide, polypeptide or protein sequence is the percentage of amino acid residues in a candidate sequence that are identical to amino acid residues in the specific peptide or amino acid sequence, after aligning the sequences by allowing gaps, if necessary, to achieve the maximum percent sequence homology. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or MEGALIGN™ (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
The terms "polypeptide" and "peptide" are used interchangeably herein to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. A protein may have one or more polypeptides. The terms also encompass an amino acid polymer that has been modified; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component.
The term "regulatory element" is intended to include promoters, enhancers, internal ribosome entry sites (IRES) and other expression control elements (e.g., transcription termination signals such as polyadenylation signals and poly-U sequences) . Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990) . Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells and those that direct expression of a nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences) . Regulatory elements may also direct expression in a time-dependent manner, e.g., in a cell cycle-dependent or developmental stage-dependent manner, which may or may not be tissue or cell type specific.
As used herein, a “variant” is interpreted to mean a polynucleotide or polypeptide that differs from a reference polynucleotide or polypeptide, respectively, but retains essential properties. A typical variant of a polynucleotide differs in nucleic acid sequence from another, reference polynucleotide. Changes in the nucleic acid sequence of the variant may or may not alter the amino acid sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as discussed below. A typical variant of a polypeptide differs in amino acid sequence from another, reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more substitutions, additions, and/or deletions in any combination. A substituted or inserted amino acid residue may or may not be one encoded by the genetic code. A variant of a polynucleotide or polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is not known to occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides may be made by mutagenesis techniques, by direct synthesis, and by other recombinant methods known to skilled artisans.
As used herein, the term "wild type" has the meaning commonly understood by those skilled in the art to mean a typical form of an organism, a strain, a gene, or a feature that distinguishes it from a mutant or variant when it exists in nature. It can be isolated from sources in nature and not intentionally modified.
As used herein, the terms "non-naturally occurring" or "engineered" are used interchangeably and refer to artificial participation. When these terms are used to describe a nucleic acid molecule or polypeptide, it is meant that the nucleic acid molecule or polypeptide is at least substantially freed from at least one other component of its association in nature or as found in nature.
As used herein, the term "identity" is used to mean the matching of sequences between two polypeptides or between two nucleic acids. When a position in the two sequences being compared is occupied by the same base or amino acid monomer subunit (for example, a position in each of the two DNA molecules is occupied by adenine, or a position in each of the two polypeptides is occupied by lysine), the molecules are identical at that position. The "percent identity" between the two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions compared, multiplied by 100. For example, if 6 of the 10 positions of the two sequences match, then the two sequences have 60% identity. For example, the DNA sequences CTGACT and CAGGTT share 50% identity (3 out of a total of 6 positions match). Typically, the comparison is made when the two sequences are aligned to produce maximum identity. Such alignment can be achieved by, for example, the method of Needleman et al. (1970) J. Mol. Biol. 48: 443-453, which can be conveniently performed by a computer program such as the Align program (DNAstar, Inc.). It is also possible to use the algorithm of E. Meyers and W. Miller (Comput. Appl. Biosci., 4: 11-17 (1988)) integrated into the ALIGN program (version 2.0), using the PAM 120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 to determine the percent identity between two amino acid sequences. In addition, the Needleman and Wunsch (J. Mol. Biol. 48: 444-453 (1970)) algorithm in the GAP program integrated into the GCG software package (available at www.gcg.com) can be used, using the Blosum 62 matrix or the PAM250 matrix, a gap weight of 16, 14, 12, 10, 8, 6 or 4, and a length weight of 1, 2, 3, 4, 5 or 6 to determine the percent identity between two amino acid sequences.
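The position-by-position calculation in the CTGACT/CAGGTT example can be reproduced with a short function. This is a sketch that assumes the two sequences are already aligned and of equal length; it performs no gapped alignment of its own, so the alignment programs named above would still be needed for unaligned input. The function name is hypothetical.

```python
def percent_identity(first: str, second: str) -> float:
    """Matching positions divided by positions compared, multiplied by 100."""
    if not first or len(first) != len(second):
        raise ValueError("sequences must be non-empty, aligned, and of equal length")
    matches = sum(a == b for a, b in zip(first.upper(), second.upper()))
    return 100.0 * matches / len(first)


print(percent_identity("CTGACT", "CAGGTT"))  # 50.0
```

Positions 1, 3, and 6 match (C, G, and T), giving 3 of 6 positions, or 50%.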
"Stem-loop structure" refers to a nucleic acid that has a secondary structure that includes regions of nucleotides known or predicted to form a double-strand (stem portion) that is linked by regions of single-stranded nucleotides (loop portions). The terms "hairpin" and "turnback" structures are also used herein to refer to stem-loop structures. Such structures are well known in the art, and these terms are used in accordance with their commonly known meanings in the art. As is known in the art, stem-loop structures do not require precise base pairing. Thus, the stem may include one or more base mismatches. Alternatively, base pairing may be exact, i.e., not including any mismatches.
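Whether a stem is exactly paired or contains mismatches can be checked by pairing the two arms from the outside in. The sketch below is a simplification under stated assumptions: the stem is formed by the two ends of the sequence, the stem length is supplied by the caller, and only Watson-Crick pairs count as matches (a G-U wobble pair is treated as a mismatch here). The function name and the example hairpins are hypothetical and are not taken from the sequence listing.

```python
WATSON_CRICK = {
    ("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
}


def stem_mismatches(seq: str, stem_len: int) -> int:
    """Count non-Watson-Crick positions in a stem formed by the two ends of `seq`.

    Position i from the 5' end is paired with position i from the 3' end;
    whatever lies between the two arms is treated as the loop.
    """
    seq = seq.upper()
    if stem_len < 1 or 2 * stem_len > len(seq):
        raise ValueError("stem arms must be non-empty and must not overlap")
    return sum(
        (seq[i], seq[-1 - i]) not in WATSON_CRICK
        for i in range(stem_len)
    )


# Hypothetical hairpins: 5-nt arms around a 4-nt loop.
print(stem_mismatches("GGCACUUCGGUGCC", 5))  # 0, exact pairing
print(stem_mismatches("GGCACUUCGGCGCC", 5))  # 1, one A-C mismatch in the stem
```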
Unless otherwise specified, “Cas12i, ” “Cas12i protein, ” or “Cas12i polypeptide” as used herein, include any Cas12i protein described in the disclosure and its variants, such as mutants, and derivatives, such as Cas12i fusion proteins, as well as dCas12i proteins substantially lacking catalytic activity, nCas12i nickases with nickase single-strand cleavage activity, and their derivatives, such as dCas12i fusion proteins (such as dCas12i-TadA) . The disclosure also provides nucleotide sequences encoding Cas12i proteins and variants and derivatives thereof.
The term “crRNA” is used herein interchangeably with guide molecule, gRNA, or guide RNA, comprising a portion capable of recruiting and forming a protein-RNA complex with a CRISPR-Cas protein (such as any of the Cas12i proteins and variants and derivatives thereof as described herein) (e.g., direct repeats/DRs) and a portion that is sufficiently complementary to a target sequence to hybridize to the target sequence and direct the specific binding of the aforementioned protein-RNA complex to the target sequence (e.g. spacer/Spacer) .
A “cell” as used herein, is understood to refer not only to the particular individual cell, but to the progeny or potential progeny of the cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
The term “transduction” and “transfection” as used herein include all methods known in the art using an infectious agent (such as a virus) or other means to introduce DNA into cells for expression of a protein or molecule of interest. Besides a virus or virus like agent, there are chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine) ; non-chemical methods, such as electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, delivery of plasmids, or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet assisted transfection, particle bombardment; and hybrid methods, such as nucleofection.
The term “transfected” or “transformed” or “transduced” as used herein refers to a process by which exogenous nucleic acid is transferred or introduced into a target cell. A “transfected” or “transformed” or “transduced” cell is one, which has been transfected, transformed or transduced with exogenous nucleic acid.
The term “in vivo” refers to inside the body of the organism from which the cell is obtained. “Ex vivo” or “in vitro” means outside the body of the organism from which the cell is obtained.
As used herein, “treatment” or “treating” is an approach for obtaining beneficial or desired results including clinical results. For purposes of the disclosure, beneficial or desired clinical results include, but are not limited to, one or more of the following: alleviating one or more symptoms resulting from the disease, diminishing the extent of the disease, stabilizing the disease (e.g., preventing or delaying the worsening of the disease), preventing or delaying the spread (e.g., metastasis) of the disease, preventing or delaying the recurrence of the disease, reducing the recurrence rate of the disease, delaying or slowing the progression of the disease, ameliorating the disease state, providing a remission (partial or total) of the disease, decreasing the dose of one or more other medications required to treat the disease, increasing the quality of life, and/or prolonging survival. Also encompassed by “treatment” is a reduction of a pathological consequence of a disease (such as cancer). The methods of the disclosure contemplate any one or more of these aspects of treatment.
“Chimeric antigen receptor” or "CAR" as used herein refers to engineered receptors, which can be used to graft one or more antigen specificity onto immune effector cells, such as T cells. Some CARs are also known as “artificial T-cell receptors, ” “chimeric T cell receptors, ” or “chimeric immune receptors. ” In some embodiments, the CAR comprises an extracellular antigen binding domain specific for one or more antigens (such as tumor antigens) , a transmembrane domain, and an intracellular signaling domain of a T cell and/or other receptors. “CAR-T” refers to a T cell that expresses a CAR.
“T cell receptor” or “TCR” as used herein refers to an endogenous or recombinant T cell receptor comprising an extracellular antigen-binding domain that binds to a specific antigenic peptide bound in an MHC molecule. In some embodiments, the TCR comprises a TCRα polypeptide chain and a TCRβ polypeptide chain. In some embodiments, the TCR specifically binds a tumor antigen. “TCR-T” refers to a T cell that expresses a recombinant TCR.
“T-cell antigen coupler receptor” or “TAC receptor” as used herein refers to an engineered receptor comprising an extracellular antigen-binding domain that binds to a specific antigen and a T-cell receptor (TCR) binding domain, a transmembrane domain, and an intracellular domain of a co-receptor molecule. The TAC receptor co-opts the endogenous TCR of a T cell that expresses the TAC receptor to elicit an antigen-specific T-cell response against a target cell.
“TCR fusion protein” or “TFP” as used herein refers to an engineered receptor comprising an extracellular antigen-binding domain that binds to a specific antigen fused to a subunit of the TCR complex or a portion thereof, including TCRα chain, TCRβ chain, TCRγ chain, TCRδ chain, CD3ε, CD3δ, or CD3γ. The subunit of the TCR complex or portion thereof comprises a transmembrane domain and at least a portion of the intracellular domain of the naturally occurring TCR subunit. In some embodiments, the TFP comprises the extracellular domain of the TCR subunit or a portion thereof. In some embodiments, the TFP does not comprise the extracellular domain of the TCR subunit.
It is understood that embodiments of the disclosure described herein include “consisting” and/or “consisting essentially of” embodiments. As used herein, reference to “not” a value or parameter generally means and describes “other than” a value or parameter. For example, the method is not used to treat cancer of type X means the method is used to treat cancer of types other than X.
The term “about X-Y” used herein has the same meaning as “about X to about Y. ”
As used herein and in the appended claims, the singular forms “a, ” “an, ” and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely, ” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.
The term “and/or” as used herein in a phrase such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone). Likewise, the term “and/or” as used herein in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
As used herein, when the term “about” precedes a series of numbers (for example, about 1, 2, 3), it is understood that each of the series of numbers is modified by the term “about” (that is, about 1, about 2, about 3).
BRIEF DESCRIPTION OF THE DRAWINGS
An understanding of the features and advantages of the disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure may be utilized, and the accompanying drawings of which:
FIG. 1 illustrates an exemplary target gene and an exemplary guide RNA of the disclosure.
The figures herein are for illustrative purposes only and are not necessarily drawn to scale.
DETAILED DESCRIPTION
The disclosure relates to a guide RNA capable of binding to a target gene and uses thereof. In some aspects, a system comprising a guide RNA having one or more characteristics is described herein. In some aspects, a method of producing the guide RNA is described. In some aspects, a method of delivering a system comprising the guide RNA is described.
OVERVIEW
In an aspect, the disclosure provides a guide RNA. The guide RNA is capable of directing a nucleic acid programmable DNA binding protein (napDNAbp) to a target DNA (e.g., gene).
Typically, the guide RNA may comprise (1) a scaffold sequence capable of forming a complex with a napDNAbp and (2) a guide sequence capable of hybridizing to a target sequence of a target DNA, thereby guiding the complex to the target DNA. The guide RNA is “programmable” (i.e., programmable guide RNA) because the guide sequence can be tailored to target a specific target sequence /target site.
To more specifically illustrate the structure-function relationship of the guide RNA, the CRISPR-Cas12i system is taken as an example. Cas12i is a programmable RNA-guided dsDNA endonuclease that may generate a double-strand break (DSB) on a target dsDNA as guided by a programmable guide RNA referred to as CRISPR RNA (crRNA) or guide RNA (gRNA) comprising a spacer sequence (or a guide sequence) and a direct repeat (DR) sequence (or a scaffold sequence). Without wishing to be bound by theory, it is believed that the direct repeat sequence is responsible for forming a complex with a Cas12i polypeptide and the spacer sequence is responsible for hybridizing to a target sequence of a target dsDNA, thereby guiding the complex comprising the gRNA and the Cas12i polypeptide to the target dsDNA. Referring to FIG. 1, a target gene (partially) as an example of a target dsDNA is depicted to comprise a 5’ to 3’ upside strand and a 3’ to 5’ downside strand. A gRNA is depicted to comprise a spacer sequence in green and a direct repeat sequence in orange. The spacer sequence is designed to hybridize to a part of the downside strand, and so the guide sequence “targets” the part of the downside strand. Thus, the downside strand is referred to as a “target DNA strand” or a “target strand (TS)” of the target dsDNA, while the upside strand is referred to as a “non-target DNA strand” or a “non-target strand (NTS)” of the target dsDNA. The part of the target strand based on which the guide sequence is designed and to which the guide sequence may hybridize is referred to as a “target sequence”, while the sequence on the non-target strand corresponding to and base pairing with the target sequence is referred to as the “reverse complementary sequence of the target sequence”, “reverse complementary sequence”, “complement”, or “protospacer sequence”.
As used herein, “programmable guide RNA” , “guide RNA” , “gRNA” , “CRISPR RNA” , and “crRNA” are exchangeable. As used herein, “spacer sequence” and “guide sequence” are exchangeable. As used herein, “scaffold sequence” and “direct repeat (DR) sequence” are exchangeable.
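The strand terminology above can be illustrated with a short sketch. The DNA fragment below is a made-up sequence chosen for illustration (it is not one of the SEQ ID NOs of the disclosure), and the function and variable names are assumptions of this example. It shows that, when both strands are written 5’ to 3’, the target sequence on the target strand is the reverse complement of the protospacer sequence on the non-target strand.

```python
# Illustrative sketch only: the fragment is hypothetical, not a sequence of the disclosure.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


# Non-target strand (NTS, the "upside" strand), written 5'->3'.
non_target_strand = "CCTTAGATTACAGGCTTACCGATCAGG"

# Hypothetical protospacer: 20 nt starting at 0-based position 5 of the NTS.
start, length = 5, 20
protospacer = non_target_strand[start:start + length]

# Target strand (TS, the "downside" strand), also written 5'->3'.
target_strand = reverse_complement(non_target_strand)

# The target sequence is the part of the TS that base pairs with the protospacer.
target_sequence = reverse_complement(protospacer)

print(protospacer)
print(target_sequence)
print(target_sequence in target_strand)
```

Writing both strands 5’ to 3’ is a convention of this sketch; FIG. 1 instead draws the downside strand 3’ to 5’ so that paired bases line up vertically.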
GUIDE RNA
In an aspect, the disclosure provides a guide RNA comprising (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp); and (2) a guide sequence capable of hybridizing to a target sequence of a target DNA, thereby guiding the complex to the target DNA. The term “guiding” as used herein is exchangeable with “directing” or “targeting”. The guide RNA may guide, direct, or target the complex formed with and comprising the napDNAbp as described herein and the guide RNA, or in other words, guide, direct, or target the napDNAbp as described herein, to a target sequence of a target DNA. Two or more guide RNAs (same or different) may target two or more separate napDNAbp (e.g., napDNAbp having the same or different sequence) as described herein to two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) target sequences (same or different) of one, two, or more target DNA.
Those skilled in the art reading the below examples of particular kinds of guide RNAs will understand that, in some embodiments, a guide RNA is target sequence-specific. That is, in some embodiments, a guide RNA binds specifically to one or more target sequences (e.g., within a cell) and not to non-targeted sequences (e.g., non-specific DNA or random sequences within the same cell) .
In some embodiments, the target DNA is a target dsDNA. In some embodiments, the target dsDNA is a target gene. In some embodiments, the target DNA is a target ssDNA.
The structure (or configuration) of the guide RNA may vary depending on various purposes, e.g., for improved activity, such as, DNA cleavage activity. In some embodiments, the guide RNA comprises one or more scaffold sequences and one or more guide sequences, wherein the scaffold sequence (s) and the guide sequence (s) may have various structures (or configurations) .
In some embodiments, the guide sequence (s) and the scaffold sequence (s) of the guide RNA are present within the same RNA molecule. In some embodiments, the guide sequence (s) and the scaffold sequence (s) are linked directly to one another with or without a linker (e.g., a short polynucleotide sequence) . In some embodiments, the linker is a short linker, e.g., an RNA linker of 1, 2, 3, or more nucleotides in length. In some embodiments, the guide sequence (s) and the scaffold sequence (s) of the guide RNA are present in separate RNA molecules, which are joined to one another by base pairing interactions.
For example, in some embodiments, the guide RNA comprises a guide sequence followed by a scaffold sequence, referring to the sequences in the 5’ to 3’ direction (i.e., 5’-guide sequence-scaffold sequence-3’). In some embodiments, the guide RNA comprises a scaffold sequence followed by a guide sequence, referring to the sequences in the 5’ to 3’ direction (i.e., 5’-scaffold sequence-guide sequence-3’). In some embodiments, the guide RNA comprises one scaffold sequence and one guide sequence in the structure (or configuration) of 5’-scaffold sequence-guide sequence-3’ or 5’-guide sequence-scaffold sequence-3’, wherein the “-” between the scaffold sequence and the guide sequence represents an optional linker. In some embodiments, the guide RNA comprises two scaffold sequences and one guide sequence in the structure (or configuration) of 5’-scaffold sequence-guide sequence-scaffold sequence-3’, wherein the two scaffold sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker. In some embodiments, the guide RNA comprises two scaffold sequences and two guide sequences in the structure (or configuration) of 5’-scaffold sequence-guide sequence-scaffold sequence-guide sequence-3’ or 5’-guide sequence-scaffold sequence-guide sequence-scaffold sequence-3’, wherein the two scaffold sequences are the same or different, wherein the two guide sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
In some embodiments, the guide RNA comprises three scaffold sequences and two guide sequences in the structure (or configuration) of 5’-scaffold sequence-guide sequence-scaffold sequence-guide sequence-scaffold sequence-3’, wherein the three scaffold sequences are the same or different, wherein the two guide sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
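As a minimal sketch of the configurations listed above, the following code concatenates scaffold and guide elements in the 5’ to 3’ direction with an optional linker between adjacent elements. The function name `assemble_guide_rna` and all three sequences are hypothetical placeholders introduced for illustration; in particular, `SCAFFOLD` is not an actual direct repeat sequence of any Cas12i protein.

```python
from typing import Sequence

RNA_BASES = set("ACGU")


def assemble_guide_rna(elements: Sequence[str], linker: str = "") -> str:
    """Join scaffold and guide elements 5'->3', with an optional linker between them."""
    if set(linker.upper()) - RNA_BASES:
        raise ValueError(f"linker is not an RNA sequence: {linker!r}")
    for element in elements:
        if not element or set(element.upper()) - RNA_BASES:
            raise ValueError(f"not an RNA sequence: {element!r}")
    return linker.upper().join(e.upper() for e in elements)


# Placeholder sequences (hypothetical, for illustration only).
SCAFFOLD = "GGAUCCAAGCUUGGAUCC"
GUIDE_1 = "GAUUACAGGCUUACCGAUCA"
GUIDE_2 = "CCAUGGUUAACGCUAGCAUG"

# 5'-scaffold sequence-guide sequence-3', no linker.
single = assemble_guide_rna([SCAFFOLD, GUIDE_1])

# 5'-scaffold-guide-scaffold-guide-scaffold-3', with a 1-nt linker at each "-".
array = assemble_guide_rna(
    [SCAFFOLD, GUIDE_1, SCAFFOLD, GUIDE_2, SCAFFOLD], linker="A"
)

print(single)
print(len(array))
```

Passing the elements as an ordered list keeps the sketch agnostic to which configuration is used; the caller decides the order and number of scaffold and guide sequences.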
In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of guide sequences. In some embodiments, one or more or each of the guide sequences is capable of hybridizing to a target sequence. In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of guide sequences capable of hybridizing to a plurality of target sequences, respectively. In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of scaffold sequences. In some embodiments, one or more or each of the scaffold sequences is capable of forming a complex with a napDNAbp. In some embodiments, the guide RNA comprises a plurality (e.g., 2, 3, 4, 5 or more) of scaffold sequences capable of forming a complex with a plurality of napDNAbp, respectively. In some embodiments, the plurality of guide sequences comprised in a guide RNA are identical or different. In some embodiments, the plurality of scaffold sequences comprised in a guide RNA are identical or different.
Target Gene
The target dsDNA in the disclosure can be a target gene or a part of a target gene such that the guide RNA or the system of the disclosure is applied to a gene or a part of a target gene. In some embodiments, the target gene is in a target cell. In some embodiments, the system of the disclosure introduces a mutation (e.g., an indel) to the target gene in the target cell. In some embodiments, one or more endogenous DNA repair pathways, such as Non-homologous end joining (NHEJ) or Homology directed recombination (HDR), are induced in the target cell to repair a double-strand break resulting from guide sequence-specific cleavage by the system, and thus introduce a mutation (e.g., an indel) in the target gene. Exemplary mutations include, but are not limited to, insertions, deletions, and substitutions.
In some embodiments, the target gene is chromosomal DNA. In some embodiments, the target gene is a gene encoding a functional RNA or a functional polypeptide. In some embodiments, the target gene includes regulatory elements, e.g., a promoter, enhancer, silencer, or insulator. In some embodiments, the target gene is a donor site for splicing. In some embodiments, the target gene is an acceptor site for splicing. In some embodiments, the target gene comprises a plurality of nucleic acids.
In some embodiments, the target gene is a mammalian gene. In some embodiments, the target gene is a human gene.
In some embodiments, the target gene is selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC, and any homolog, mutant, or variant thereof, and any intron, exon, complement, or fragment thereof.
In some embodiments, the disclosure provides a guide RNA comprising: (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and (2) a guide sequence capable of hybridizing to a target sequence on a target strand of a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC in a target cell, thereby guiding the complex to the target gene; wherein the target gene comprises a protospacer sequence on the nontarget strand of the target gene and a protospacer adjacent motif (PAM) adjacent (e.g., 5’ ) to the protospacer sequence, wherein the protospacer sequence is fully complementary to the target sequence on the target strand of the target gene.
In some embodiments, the target gene comprises A2AR gene. The A2AR gene encodes adenosine A2a receptor, which is a G protein-coupled receptor. A2AR is also known as ADORA2A. Exemplary sequences of A2AR can be found, for example, at NCBI Gene ID 135, NCBI Reference Sequence NG_052804.1, and Ensembl ID ENSG00000128271. SEQ ID NOs: 1-73 comprise exemplary protospacer sequences of the A2AR gene.
In some embodiments, the target gene comprises AAVS1 gene, also known as adeno-associated virus integration site 1. AAVS1 comprises a region of human chromosome 19. In some embodiments, the region of human chromosome 19 is 19q13. AAVS1 is described in Kotin et al., EMBO J. 11(13): 5071-5078 and Ward and Walsh, Virology 433(2): 356-366. SEQ ID NOs: 74-79 comprise exemplary protospacer sequences of the AAVS1 gene.
In some embodiments, the region comprising AAVS1 comprises the PPP1R12C gene, also known as protein phosphatase 1 regulatory subunit 12C. Exemplary sequences of PPP1R12C can be found, for example, at NCBI Gene ID 54776, NCBI Reference Sequence NC_000019.10, and Ensembl ID ENSG00000125503.
In some embodiments, the target gene comprises B2M gene. The B2M gene encodes beta-2-microglobulin, which is a component of MHC class I molecules. Exemplary sequences of B2M can be found, for example, at NCBI Gene ID 567, NCBI Reference Sequence NG_012920.2, and Ensembl ID ENSG00000166710. SEQ ID NOs: 80-129 comprise exemplary protospacer sequences of the B2M gene.
In some embodiments, the target gene comprises BCL11A gene. The BCL11A gene encodes a regulatory C2H2 type zinc-finger protein that can bind DNA and is involved in suppression of fetal hemoglobin production. BCL11A is also known as BAF chromatin remodeling complex subunit BCL11A. Exemplary sequences of BCL11A can be found, for example, at NCBI Gene ID 53335, NCBI Reference Sequence NG_011968.1, and Ensembl ID ENSG00000119866. SEQ ID NOs: 130-368 comprise exemplary protospacer sequences of the BCL11A gene.
In some embodiments, the target gene comprises CCR5 gene. The CCR5 gene encodes C-C chemokine receptor type 5, which is a chemokine receptor on the surface of white blood cells. Exemplary sequences of CCR5 can be found, for example, at NCBI Gene ID 1234, NCBI Reference Sequence NG_012637.1, and Ensembl ID ENSG00000160791. SEQ ID NOs: 369-520 comprise exemplary protospacer sequences of the CCR5 gene.
In some embodiments, the target gene comprises CD16a gene. The CD16a gene encodes cluster of differentiation 16. CD16a is found on the surface of NK cells and activates antibody-dependent cell-mediated cytotoxicity (ADCC). CD16a is also known as FCGR3A or FcγRIIIa (Fc gamma receptor IIIa). Exemplary sequences of CD16a can be found, for example, at NCBI Gene ID 2214, NCBI Reference Sequence NG_009066.1, and Ensembl ID ENSG00000203747. SEQ ID NOs: 521-573 comprise exemplary protospacer sequences of the CD16a gene.
In some embodiments, the target gene comprises a CD3 nucleic acid. CD3 is a protein complex comprising a CD3γ chain, a CD3δ chain, and two CD3ε chains. CD3 is involved in activating cytotoxic T cells and T helper cells. In some embodiments, the target gene can be any one or more nucleic acids comprising the CD3γ chain, the CD3δ chain, and the CD3ε chain. Exemplary sequences of CD3γ can be found, for example, at NCBI Gene ID 917, NCBI Reference Sequence NG_007566.1, and Ensembl ID ENSG00000160654. Exemplary sequences of CD3δ can be found, for example, at NCBI Gene ID 915, NCBI Reference Sequence NG_009891.1, and Ensembl ID ENSG00000167286. Exemplary sequences of CD3ε can be found, for example, at NCBI Gene ID 916, NCBI Reference Sequence NG_007383.1, and Ensembl ID ENSG00000198851.
In some embodiments, the target gene comprises CD52 gene. The CD52 gene encodes CD52 molecule. CD52 is present on the surface of mature lymphocytes. Exemplary sequences of CD52 can be found, for example, at NCBI Gene ID 1043, NCBI Reference Sequence NC_000001.11, and Ensembl ID ENSG00000169442. SEQ ID NOs: 574-591 comprise exemplary protospacer sequences of the CD52 gene.
In some embodiments, the target gene comprises CD7 gene. The CD7 gene encodes the CD7 protein. CD7 is a member of the immunoglobulin superfamily and found on thymocytes and mature T cells. Exemplary sequences of CD7 can be found, for example, at NCBI Gene ID 924, NCBI Reference Sequence NC_000017.11, and Ensembl ID ENSG00000173762. SEQ ID NOs: 592-608 comprise exemplary protospacer sequences of the CD7 gene.
In some embodiments, the target gene comprises CIITA gene. The CIITA gene encodes class II major histocompatibility complex transactivator. CIITA controls expression of human leukocyte antigen class II genes. Exemplary sequences of CIITA can be found, for example, at NCBI Gene ID 4261, NCBI Reference Sequence NG_009628.1, and Ensembl ID ENSG00000179583. SEQ ID NOs: 609-792 comprise exemplary protospacer sequences of the CIITA gene.
In some embodiments, the target gene comprises CISH gene. The CISH gene encodes cytokine-inducible SH2-containing protein. Exemplary sequences of CISH can be found, for example, at NCBI Gene ID 1154, NCBI Reference Sequence NG_023194.1, and Ensembl ID ENSG00000114737. SEQ ID NOs: 793-833 comprise exemplary protospacer sequences of the CISH gene.
In some embodiments, the target gene comprises CTLA4 gene. The CTLA4 gene encodes cytotoxic T-lymphocyte-associated protein 4, which is a protein receptor that functions as an immune checkpoint and downregulates immune responses. CTLA4 is also known as CD152. Exemplary sequences of CTLA4 can be found, for example, at NCBI Gene ID 1493, NCBI Reference Sequence NG_011502.1, and Ensembl ID ENSG00000163599. SEQ ID NOs: 834-885 comprise exemplary protospacer sequences of the CTLA4 gene.
In some embodiments, the target gene comprises CXCR4 gene. The CXCR4 gene encodes C-X-C chemokine receptor type 4. CXCR4 is a chemokine receptor expressed on lymphocytes. CXCR4 is also known as fusin or CD184. Exemplary sequences of CXCR4 can be found, for example, at NCBI Gene ID 7852, NCBI Reference Sequence NG_011587.1, and Ensembl ID ENSG00000121966. SEQ ID NOs: 886-1002 comprise exemplary protospacer sequences of the CXCR4 gene.
In some embodiments, the target gene comprises GAPDH gene. The GAPDH gene encodes glyceraldehyde 3-phosphate dehydrogenase. Exemplary sequences of GAPDH can be found, for example, at NCBI Gene ID 2597, NCBI Reference Sequence NG_007073.2, and Ensembl ID ENSG00000111640.
In some embodiments, the target gene comprises HBB gene. The HBB gene encodes beta globin, which together with alpha globin makes up the most common form of hemoglobin in adult humans. The HBB variant HbS causes sickle cell disease. Mutations in the HBB gene also cause the group of blood disorders known as beta thalassemias. HBB is also known as hemoglobin subunit beta. Exemplary sequences of HBB can be found, for example, at NCBI Gene ID 3043, NCBI Reference Sequence NG_059281.1, and Ensembl ID ENSG00000244734. SEQ ID NOs: 1003-1048 comprise exemplary protospacer sequences of the HBB gene.
In some embodiments, the target gene comprises HEXB gene. The HEXB gene encodes beta-hexosaminidase subunit beta, which forms the beta subunit of β-hexosaminidase. Exemplary sequences of HEXB can be found, for example, at NCBI Gene ID 3074, NCBI Reference Sequence NG_009770.2, and Ensembl ID ENSG00000049860. SEQ ID NOs: 1049-1246 comprise exemplary protospacer sequences of the HEXB gene.
In some embodiments, the target gene comprises IL2RG gene. The IL2RG gene encodes the common gamma chain, which is a cytokine receptor subunit common to several interleukin receptors, including IL-2R, IL-4R, IL-7R, IL-9R, and IL-15R. IL2RG is also known as interleukin 2 receptor subunit gamma. Exemplary sequences of IL2RG can be found, for example, at NCBI Gene ID 3561, NCBI Reference Sequence NG_009088.1, and Ensembl ID ENSG00000147168. SEQ ID NOs: 1247-1353 comprise exemplary target sequences of the IL2RG gene.
In some embodiments, the target gene comprises KLRG1 gene. The KLRG1 gene encodes killer cell lectin-like receptor G1. KLRG1 is preferentially expressed in NK cells. Exemplary sequences of KLRG1 can be found, for example, at NCBI Gene ID 10219, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000139187. SEQ ID NOs: 1354-1414 comprise exemplary protospacer sequences of the KLRG1 gene.
In some embodiments, the target gene comprises KLRK1 gene. The KLRK1 gene encodes killer cell lectin like receptor K1, which is expressed by NK cells. KLRK1 is also known as NKG2D, KLR, and CD314. Exemplary sequences of KLRK1 can be found, for example, at NCBI Gene ID 22914, NCBI Reference Sequence NG_027762.1, and Ensembl ID ENSG00000213809.
In some embodiments, the target gene comprises LAG3 gene. The LAG3 gene encodes lymphocyte-activation gene 3. LAG3 is a cell surface molecule with diverse effects on T cell function, including as an immune checkpoint receptor. LAG3 is also known as CD223. Exemplary sequences of LAG3 can be found, for example, at NCBI Gene ID 3902, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000089692. SEQ ID NOs: 1415-1492 comprise exemplary protospacer sequences of the LAG3 gene.
In some embodiments, the target gene comprises NKG2A gene. The NKG2A gene encodes killer cell lectin like receptor C1, which is an inhibitory receptor expressed on NK cells. NKG2A is also known as KLRC1 and CD159a. Exemplary sequences of NKG2A can be found, for example, at NCBI Gene ID 3821, NCBI Reference Sequence NC_000012.12, and Ensembl ID ENSG00000134545. SEQ ID NOs: 1493-1635 comprise exemplary protospacer sequences of the NKG2A gene.
In some embodiments, the target gene comprises PD-1 gene. The PD-1 gene encodes programmed cell death 1, which is an immune-inhibitory receptor expressed in activated T cells. PD-1 is also known as PD1, PDCD1, and CD279. Exemplary sequences of PD1 can be found, for example, at NCBI Gene ID 5133, NCBI Reference Sequence NG_012110.1, and Ensembl ID ENSG00000188389. SEQ ID NOs: 1636-1670 comprise exemplary protospacer sequences of the PD-1 gene.
In some embodiments, the target gene comprises PD-L1 gene. The PD-L1 gene encodes programmed death ligand 1, which is an immune inhibitory receptor ligand expressed by hematopoietic and non-hematopoietic cells, including T cells, B cells, and various types of tumor cells. PD-L1 is also known as CD274, PDL1, or B7H1. Exemplary sequences of PD-L1 can be found, for example, at NCBI Gene ID 29126, NCBI Reference Sequence NC_000009.12, and Ensembl ID ENSG00000120217.
In some embodiments, the target gene comprises TGFBR2 gene. The TGFBR2 gene encodes transforming growth factor beta receptor 2. Mutations in TGFBR2 have been associated with several diseases and conditions, including cancer. Exemplary sequences of TGFBR2 can be found, for example, at NCBI Gene ID 7048, NCBI Reference Sequence NG_007490.1, and Ensembl ID ENSG00000163513. SEQ ID NOs: 1671-1801 comprise exemplary protospacer sequences of the TGFBR2 gene.
In some embodiments, the target gene comprises TIGIT gene. The TIGIT gene encodes T cell immunoreceptor with Ig and ITIM domains. TIGIT is an immune receptor found on T cells and natural killer cells. Exemplary sequences of TIGIT can be found, for example, at NCBI Gene ID 201633, NCBI Reference Sequence NC_000003.12, and Ensembl ID ENSG00000181847. SEQ ID NOs: 1802-1845 comprise exemplary protospacer sequences of the TIGIT gene.
In some embodiments, the target gene comprises TIM3 gene. The TIM3 gene encodes T-cell immunoglobulin and mucin-domain containing-3. TIM3 is a cell surface protein expressed on T cells. TIM3 is also known as HAVCR2 (hepatitis A virus receptor 2) . Exemplary sequences of TIM3 can be found, for example, at NCBI Gene ID 84868, NCBI Reference Sequence NG_030444.1, and Ensembl ID ENSG00000135077. SEQ ID NOs: 1846-1932 comprise exemplary protospacer sequences of the TIM3 gene.
In some embodiments, the target gene comprises TRAC gene. The TRAC gene encodes T cell receptor alpha constant, which is a component of the T cell receptor protein complex. Exemplary sequences of TRAC can be found, for example, at NCBI Gene ID 28755, NCBI Reference Sequence NG_001332.3, and Ensembl ID ENSG00000277734. SEQ ID NOs: 1933-2029 comprise exemplary protospacer sequences of the TRAC gene.
In some embodiments, the target gene comprises TRBC2 gene. TRBC2 encodes T cell receptor beta constant 2. Exemplary sequences of TRBC2 can be found, for example, at NCBI Gene ID 28638, NCBI Reference Sequence NG_001333.2, and Ensembl ID ENSG00000211751.
In some embodiments, the target gene comprises TRBC1 gene. TRBC1 encodes T cell receptor beta constant 1. Exemplary sequences of TRBC1 can be found, for example, at NCBI Gene ID 28639, NCBI Reference Sequence NG_001333.2, and Ensembl ID ENSG00000211772.
In some embodiments, the target gene comprises TRG gene. TRG is the T cell receptor gamma locus. Exemplary sequences of TRG can be found, for example, at NCBI Gene ID 6965 and NCBI Reference Sequence NG_001336.2.
In some embodiments, the target gene comprises TRD gene. TRD is the T cell receptor delta locus. Exemplary sequences of TRD can be found, for example, at NCBI Gene ID 6964 and NCBI Reference Sequence NG_001332.3.
Protospacer Sequence, Target Sequence, and PAM
The guide sequence of the guide RNA of the disclosure is designed to hybridize to a target sequence on the target strand of a target dsDNA. The guide sequence may be designed to be fully (100%) complementary to the target sequence but, in some embodiments, one or more mismatches (i.e., less than 100% complementary) may be tolerated for hybridization. The sequence on the other side (the nontarget strand) of the target dsDNA corresponding to the target sequence is a protospacer sequence. Generally, the protospacer sequence on the nontarget strand is fully complementary to the target sequence on the target strand unless a mutation is present in the protospacer sequence and/or the target sequence.
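The notion of full versus partial complementarity can be made concrete with a small calculation. The sketch below is an assumption of this example rather than a method of the disclosure: it counts only Watson-Crick RNA-DNA pairs (G-U wobble pairs are treated as mismatches), requires equal lengths, and uses made-up sequences. It does not predict whether a given number of mismatches is actually tolerated by any napDNAbp.

```python
RNA_DNA_PAIRS = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percentage of guide positions that base pair with the target sequence.

    Both sequences are written 5'->3'. Pairing is antiparallel, so guide
    position i is compared with target position len - 1 - i.
    """
    guide_rna, target_dna = guide_rna.upper(), target_dna.upper()
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        RNA_DNA_PAIRS.get(r) == d
        for r, d in zip(guide_rna, reversed(target_dna))
    )
    return 100.0 * matches / len(guide_rna)


# Hypothetical 20-nt target sequence on the target strand (5'->3').
target_sequence = "TGATCGGTAAGCCTGTAATC"
perfect_guide = "GAUUACAGGCUUACCGAUCA"   # fully complementary
one_mismatch = "GAUUACAGGCUUACCGAUCG"    # last guide base changed from A to G

print(percent_complementarity(perfect_guide, target_sequence))
print(percent_complementarity(one_mismatch, target_sequence))
```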
In the case that the guide sequence is fully complementary to the target sequence and the target sequence is fully complementary to the protospacer sequence, the guide sequence is identical to the protospacer sequence except for the U in the guide sequence due to its RNA nature and correspondingly the T in the protospacer sequence due to its DNA nature. According to the electronic sequence listing standard ST.26 by WIPO, the symbol “t” is used to denote both T in DNA and U in RNA (see “Table 1: List of nucleotides symbols”, in which the definition of symbol “t” is “thymine in DNA/uracil in RNA (t/u)”). Thus, such a guide sequence would be set forth in the same sequence as such a protospacer sequence in the sequence listing. For convenience, a single SEQ ID NO in the sequence listing is used to denote both such guide sequence and protospacer sequence, although such a single SEQ ID NO may be marked as either DNA or RNA in the sequence listing. When a reference is made to a SEQ ID NO that recites a protospacer/guide sequence, it refers to a protospacer sequence that is a DNA sequence or a guide sequence that is an RNA sequence, depending on the context.
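The following sketch illustrates why a single sequence listing entry can stand for both a protospacer sequence and a guide sequence. It assumes the listing writes nucleotides in lower case with “t” for both thymine and uracil, as described above; the helper names and the 20-nt sequence are hypothetical and are not taken from the sequence listing of the disclosure.

```python
def to_listing_form(seq: str) -> str:
    """Write a DNA or RNA sequence with lower-case symbols and 't' for both T and U."""
    return seq.lower().replace("u", "t")


def as_protospacer_dna(listing_seq: str) -> str:
    """Read a listing entry as a DNA protospacer sequence (T, never U)."""
    return listing_seq.upper().replace("U", "T")


def as_guide_rna(listing_seq: str) -> str:
    """Read the same listing entry as an RNA guide sequence (U instead of T)."""
    return listing_seq.upper().replace("T", "U")


protospacer = "GATTACAGGCTTACCGATCA"  # DNA, non-target strand (hypothetical)
guide = "GAUUACAGGCUUACCGAUCA"        # RNA guide with the same base order

entry = to_listing_form(protospacer)
print(entry)
print(to_listing_form(guide) == entry)
print(as_guide_rna(entry))
```

Which of the two readings applies to a given SEQ ID NO depends on the context in which it is cited, as stated above.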
The protospacer sequence on the nontarget strand of the target dsDNA may be associated with a PAM (protospacer adjacent motif) adjacent to the protospacer sequence. PAM is a short motif (short DNA sequence) that can be identified or recognized by a napDNAbp, e.g., Cas9, Cas12.
Depending on the nature and property of the napDNAbp, the PAM may be upstream (5’ to) or downstream (3’ to) of a protospacer sequence. In some embodiments, the PAM is immediately adjacent to the protospacer sequence or, for example, within a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides from the protospacer sequence. In some embodiments, the PAM is 5’ or 3’ to the protospacer sequence. In some embodiments, the PAM is immediately 5’ or 3’ to the protospacer sequence. Generally, the PAM is immediately 3’ to the protospacer sequence for CRISPR-Cas9 systems, and the PAM is immediately 5’ to the protospacer sequence for CRISPR-Cas12 systems. Generally, there is no PAM limitation for CRISPR-Cas13 systems in eukaryotic cells.
Any sequence on the nontarget strand of a target dsDNA adjacent to a PAM may be a potential protospacer sequence for a system of the disclosure comprising a napDNAbp (e.g., Cas9, Cas12) capable of identifying and recognizing the PAM. Thus, a protospacer sequence can be identified by querying a PAM on a target dsDNA (e.g., a gene) by a tool or algorithm in the art, and then the efficacy (e.g., on-target DNA editing activity, off-target DNA editing activity) of the system of the disclosure for the protospacer sequence can be evaluated by a method in the art or in the disclosure.
In some embodiments, the PAM comprises, consists essentially of, or consists of sequence 5’-TTN-3’, wherein N is A, T, G, or C. In some embodiments, the PAM comprises, consists essentially of, or consists of sequence 5’-NGG-3’, wherein N is A, T, G, or C.
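As an illustration of querying a PAM to locate candidate protospacer sequences, the following Python sketch scans one strand for a PAM and returns the adjacent sequence. It is a simplified sketch rather than the tool or algorithm used in the disclosure: the function name, the `pam_side` parameter, and the default length of 20 nt are assumptions, only the supplied strand is scanned, and `N` is the only ambiguity code handled.

```python
import re


def find_protospacers(nontarget_strand, pam="TTN", pam_side="5prime", length=20):
    """Return (pam_start, pam, protospacer) tuples for one strand.

    pam_side="5prime": PAM is immediately 5' to the protospacer (e.g., 5'-TTN-3').
    pam_side="3prime": PAM is immediately 3' to the protospacer (e.g., 5'-NGG-3').
    pam_start is a 0-based index into the supplied strand.
    """
    seq = nontarget_strand.upper()
    # Expand N to any base; the lookahead allows overlapping PAM hits.
    pattern = re.compile("(?=(" + pam.upper().replace("N", "[ACGT]") + "))")
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        pam_end = pam_start + len(pam)
        if pam_side == "5prime":
            start, end = pam_end, pam_end + length
        else:
            start, end = pam_start - length, pam_start
        # Skip PAMs too close to either end to leave a full-length protospacer.
        if start < 0 or end > len(seq):
            continue
        hits.append((pam_start, seq[pam_start:pam_end], seq[start:end]))
    return hits
```

To cover both strands of a target dsDNA, the same function could be run on the reverse complement of the input; candidate protospacers would then still need to be evaluated experimentally for on-target and off-target activity.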
The optimal length for a protospacer sequence may vary depending on the selection of a napDNAbp. For example, a length of at least 16 nt, and preferably 20 nt, would be favorable to a CRISPR-Cas12i system comprising xCas12i or its variant (e.g., Cas12Max, hfCas12Max). Such an optimal length can be determined by a person skilled in the art based on the selection of a napDNAbp and a series of conventional experiments to evaluate how the intended effect (e.g., DNA cleavage activity) changes with the length.
In some embodiments, the protospacer sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides. In some embodiments, the protospacer sequence is about 20 nucleotides in length.
In some embodiments, the protospacer sequence comprises at least about 14 contiguous nucleotides of the nontarget strand of the target dsDNA (e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, or more contiguous nucleotides of the nontarget strand of the target gene, or in a numerical range between any of two preceding values, e.g., from about 14 to about 50 contiguous nucleotides of the target dsDNA) . In some embodiments, the protospacer sequence comprises, consists essentially of, or consists of 20 contiguous nucleotides of the nontarget strand of the target dsDNA.
In some embodiments, the protospacer sequence is immediately 5’ or 3’ to a PAM that comprises, consists essentially of, or consists of sequence 5’-TTN-3’, wherein N is A, T, G, or C.
In some embodiments, the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-2029 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-2029. For example, the protospacer sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence, the protospacer sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence, the protospacer sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
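The enumeration of contiguous sub-sequences above (nucleotide 1 through nucleotide 14, nucleotide 2 through nucleotide 15, and so on) can be sketched in Python as follows. The function name and the 1-based inclusive coordinate convention are assumptions made for illustration, and the example input is a hypothetical sequence rather than one from the sequence listing.

```python
def contiguous_windows(sequence, min_length=14):
    """Yield (first_nt, last_nt, subsequence) for every contiguous window
    of at least min_length nucleotides, using 1-based inclusive positions."""
    n = len(sequence)
    for start in range(n):
        for end in range(start + min_length, n + 1):
            yield (start + 1, end, sequence[start:end])


# Hypothetical 20-nt sequence used only for illustration.
example = "ACGTACGTACGTACGTACGT"
windows = list(contiguous_windows(example))
```

For a 20-nt sequence and a 14-nt minimum, this yields 28 windows: seven of 14 nt, six of 15 nt, and so on down to the single full-length 20-nt window.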
In some embodiments, the sequence of any one of SEQ ID NOs: 1-2029 is identified from the nontarget strand of the target gene by identifying a PAM as described herein on the nontarget strand of the target gene and selecting the sequence immediately 3’ to the PAM.
In some embodiments, the target gene is A2AR, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-73 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-73.
In some embodiments, the target gene is AAVS1, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79.
In some embodiments, the target gene is B2M, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117.
In some embodiments, the target gene is BCL11A, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137.
In some embodiments, the target gene is CCR5, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375.
In some embodiments, the target gene is CD16a, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 521-573; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 521-573.
In some embodiments, the target gene is CD52, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 574-591; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 574-591.
In some embodiments, the target gene is CD7, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 592-608; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 592-608.
In some embodiments, the target gene is CIITA, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610.
In some embodiments, the target gene is CISH, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795.
In some embodiments, the target gene is CTLA4, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842.
In some embodiments, the target gene is CXCR4, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897.
In some embodiments, the target gene is HBB, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011.
In some embodiments, the target gene is HEXB, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1049-1246; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1049-1246.
In some embodiments, the target gene is IL2RG, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253.
In some embodiments, the target gene is KLRG1, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1354-1414; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1354-1414.
In some embodiments, the target gene is LAG3, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418.
In some embodiments, the target gene is NKG2A, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1493-1635; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1493-1635.
In some embodiments, the target gene is PD1, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639.
In some embodiments, the target gene is TGFBR2, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681.
In some embodiments, the target gene is TIGIT, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809.
In some embodiments, the target gene is TIM3, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847.
In some embodiments, the target gene is TRAC, and the protospacer sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937.
In some embodiments, the contiguous nucleotides comprised in the protospacer sequence are immediately 3’ to a PAM as described herein.
Guide Sequence
The guide sequence of the guide RNA of the disclosure is designed to hybridize to a target sequence on the target strand of a target dsDNA. The term “guide sequence” as used herein is interchangeable with “spacer sequence”.
The optimal length for a guide sequence may vary depending on the selection of a napDNAbp. For example, a length of at least 16 nt, and preferably 20 nt, would be favorable to a CRISPR-Cas12i system comprising xCas12i or its variant (e.g., Cas12Max, hfCas12Max). Such an optimal length can be determined by a person skilled in the art based on the selection of a napDNAbp and a series of conventional experiments to evaluate how the intended effect (e.g., DNA cleavage activity) changes with the length.
In some embodiments, the guide sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides. In some embodiments, the guide sequence is about 20 nucleotides in length.
In some embodiments, the guide sequence is about 50% to about 100%, e.g., at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 100%, optionally about 100% (fully), complementary to the target sequence. In some embodiments, the guide sequence is fully (100%) complementary to the target sequence.
In some embodiments, the guide sequence contains no more than 1, 2, 3, 4, or 5 mismatches to the target sequence. In some embodiments, the guide sequence comprises no mismatch with the target sequence in the first 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides from the 5’ end of the guide sequence.
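The complementarity and mismatch criteria above can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not a method of the disclosure: both sequences are written 5’ to 3’ and have equal length, pairing is antiparallel, only Watson-Crick pairs (A-T, U-A, G-C, C-G) count as matches (G-U wobble is treated as a mismatch), and all function names are hypothetical.

```python
# DNA base expected opposite each RNA base of the guide.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def mismatch_positions(guide_rna, target_dna):
    """Return 1-based positions, counted from the 5' end of the guide,
    at which the guide does not pair with the target."""
    guide = guide_rna.upper().replace("T", "U")
    target = target_dna.upper()
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    last = len(target) - 1
    # Antiparallel pairing: guide position i faces target position last - i.
    return [
        i + 1
        for i, base in enumerate(guide)
        if RNA_TO_DNA_PARTNER.get(base) != target[last - i]
    ]


def percent_complementarity(guide_rna, target_dna):
    """Percentage of guide positions that pair with the target."""
    mismatches = len(mismatch_positions(guide_rna, target_dna))
    return 100.0 * (len(guide_rna) - mismatches) / len(guide_rna)


def first_n_mismatch_free(guide_rna, target_dna, first_n):
    """True if no mismatch falls within the first first_n guide nucleotides."""
    return all(pos > first_n for pos in mismatch_positions(guide_rna, target_dna))
```

For example, a candidate guide could be accepted only if `len(mismatch_positions(...))` is at most 5 and `first_n_mismatch_free(...)` holds for a chosen 5’-proximal region; the thresholds themselves are design choices to be validated experimentally.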
In some embodiments, the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-2029 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-2029. For example, the guide sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence, the guide sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence, the guide sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
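The sequence identity thresholds recited above (e.g., at least about 75%) can be illustrated with a minimal Python sketch. This passage does not specify an alignment method, so the sketch assumes the simplest case: two sequences of equal length compared position by position without gaps, with T and U treated as equivalent in the manner of ST.26. The function names are hypothetical, and a gapped alignment tool would be needed for sequences of different lengths.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped positional identity (%) between two equal-length sequences.
    T and U are treated as the same symbol."""
    a = seq_a.upper().replace("U", "T")
    b = seq_b.upper().replace("U", "T")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def meets_identity_threshold(seq_a, seq_b, threshold=75.0):
    """True if the ungapped identity is at least the given threshold (%)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

Under this definition, a 20-nt sequence differing from a reference at up to 5 positions would still meet a 75% threshold.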
In some embodiments, the target gene is A2AR, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-73 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-73.
In some embodiments, the target gene is AAVS1, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79.
In some embodiments, the target gene is B2M, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117.
In some embodiments, the target gene is BCL11A, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137.
In some embodiments, the target gene is CCR5, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375.
In some embodiments, the target gene is CD16a, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 521-573; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 521-573.
In some embodiments, the target gene is CD52, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 574-591; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 574-591.
In some embodiments, the target gene is CD7, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 592-608; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 592-608.
In some embodiments, the target gene is CIITA, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610.
In some embodiments, the target gene is CISH, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795.
In some embodiments, the target gene is CTLA4, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842.
In some embodiments, the target gene is CXCR4, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897.
In some embodiments, the target gene is HBB, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011.
In some embodiments, the target gene is HEXB, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1049-1246; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1049-1246.
In some embodiments, the target gene is IL2RG, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253.
In some embodiments, the target gene is KLRG1, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1354-1414; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1354-1414.
In some embodiments, the target gene is LAG3, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418.
In some embodiments, the target gene is NKG2A, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1493-1635; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1493-1635.
In some embodiments, the target gene is PD1, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, and 1639.
In some embodiments, the target gene is TGFBR2, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681.
In some embodiments, the target gene is TIGIT, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809.
In some embodiments, the target gene is TIM3, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847.
In some embodiments, the target gene is TRAC, and the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937.
Scaffold Sequence
In some embodiments, the scaffold sequence is a direct repeat (DR) sequence, e.g., a DR sequence for CRISPR-Cas12 systems. In some embodiments, the scaffold sequence is a scaffold sequence comprising a tracr mate sequence fused to a tracr sequence with or without a linker, e.g., such a scaffold sequence for CRISPR-Cas9 system.
Without wishing to be bound by theory, it is believed that the scaffold sequence of the guide RNA of the disclosure serves as a binding site to which a napDNAbp of the disclosure can bind, so that the napDNAbp complexes with the guide RNA to form an RNA-protein complex, which is guided by the guide RNA to a target sequence of a target DNA through the hybridization of the guide sequence of the guide RNA to the target sequence.
Any scaffold sequence that can mediate the binding or complexing of the napDNAbp to the guide RNA can be used in the disclosure. Once a napDNAbp is selected, the scaffold sequence can be determined accordingly. For example, if a Cas12i polypeptide is selected as the napDNAbp, a scaffold sequence that can mediate the binding or complexing of the Cas12i polypeptide to the guide RNA comprising the scaffold sequence and a guide sequence (i.e., a scaffold sequence corresponding to the Cas12i polypeptide) can be selected accordingly for use in combination with the Cas12i polypeptide.
In some embodiments, the scaffold sequence is a direct repeat sequence corresponding to xCas12i or its mutants (e.g., Cas12Max, hfCas12Max) . In some embodiments, the scaffold sequence (1) is as set forth in SEQ ID NO: 2032 or 2033; (2) comprises the sequence of SEQ ID NO: 2032 or 2033; (3) comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of SEQ ID NO: 2032 or 2033; or (4) comprises at least about 14 (e.g., at least about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30) contiguous nucleotides of (1) a sequence of SEQ ID NO: 2032 or 2033 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%to SEQ ID NO: 2032 or 2033. For example, the scaffold sequence can comprise nucleotide 1 through nucleotide 14 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 15 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 16 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 17 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 18 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 19 of any indicated sequence, the scaffold sequence can comprise nucleotide 1 through nucleotide 20 of any indicated sequence, the scaffold sequence can comprise nucleotide 2 through nucleotide 15 of any indicated sequence, and so on.
Without wishing to be bound by theory, it is generally believed that the secondary structure of the scaffold sequence plays a role in its binding with a napDNAbp, and the change of one or more nucleotides (e.g., addition, deletion, substitution) of the scaffold sequence may be tolerated and may not significantly affect the functionality of the scaffold sequence as long as the secondary structure of the scaffold sequence is retained. In some embodiments, the scaffold sequence has substantially the same secondary structure as the secondary structure of SEQ ID NO: 2032 or 2033. For example, the DR sequence 2 as set forth in SEQ ID NO: 2033 (23 nt) is an N-terminal truncation of DR sequence 1 as set forth in SEQ ID NO: 2032 (30 nt), and both were demonstrated to work with xCas12i and its mutants (e.g., Cas12Max, hfCas12Max). In some embodiments, the scaffold sequence comprises the stem-loop structure of the secondary structure of SEQ ID NO: 2032 or 2033.
In some embodiments, the scaffold sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides.
Particular embodiments of guide RNA
In some embodiments, the guide RNA of the disclosure comprises (1) a scaffold sequence of the disclosure, (2) a guide sequence of the disclosure, and (3) a scaffold sequence of the disclosure, wherein (1) , (2) , and (3) are in 5’ to 3’ direction. In some embodiments, the guide RNA of the disclosure comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to a sequence comprising (1) a scaffold sequence of the disclosure, (2) a guide sequence of the disclosure, and (3) a scaffold sequence of the disclosure, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
In some embodiments, the guide RNA of the disclosure comprises (1) a scaffold sequence of SEQ ID NO: 2032 or 2033, (2) a guide sequence of any one of SEQ ID NOs: 1-2029, and (3) a scaffold sequence of SEQ ID NO: 2032 or 2033, wherein (1) , (2) , and (3) are in 5’ to 3’ direction. In some embodiments, the guide RNA of the disclosure comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to a sequence comprising (1) a scaffold sequence of SEQ ID NO: 2032 or 2033, (2) a guide sequence of any one of SEQ ID NOs: 1-2029, and (3) a scaffold sequence of SEQ ID NO: 2032 or 2033, wherein (1) , (2) , and (3) are in 5’ to 3’ direction.
In some embodiments, the guide RNA (1) is as set forth in any one of SEQ ID NOs: 2045-2051; (2) comprises the sequence of any one of SEQ ID NOs: 2045-2051; or (3) comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of any one of SEQ ID NOs: 2045-2051.
In some embodiments, the guide RNA of the disclosure comprises a modification as described herein.
Modifications
The sequence (e.g., a guide RNA, an mRNA encoding a napDNAbp) of the disclosure may include one or more covalent modifications with respect to a reference sequence, in particular the parent polyribonucleotide, which are included within the disclosure. In some embodiments, the sequence is a modified guide RNA. In some embodiments, the sequence is a modified mRNA encoding a napDNAbp of the disclosure.
Exemplary modifications can include any modification to the sugar, the nucleobase, the internucleoside linkage (e.g., to a linking phosphate/to a phosphodiester linkage/to the phosphodiester backbone) , and any combination thereof. Some of the exemplary modifications provided herein are described in detail below.
The sequence may include any useful modification, such as to the sugar, the nucleobase, or the internucleoside linkage (e.g., to a linking phosphate/to a phosphodiester linkage/to the phosphodiester backbone). One or more atoms of a pyrimidine nucleobase may be replaced or substituted with optionally substituted amino, optionally substituted thiol, optionally substituted alkyl (e.g., methyl or ethyl), or halo (e.g., chloro or fluoro). In certain embodiments, modifications (e.g., one or more modifications) are present in each of the sugar and the internucleoside linkage. Modifications may be modifications of ribonucleic acids (RNAs) to deoxyribonucleic acids (DNAs), threose nucleic acids (TNAs), glycol nucleic acids (GNAs), peptide nucleic acids (PNAs), locked nucleic acids (LNAs), or hybrids thereof. Additional modifications are described herein.
In some embodiments, the modification may include a chemical or cellular induced modification. For example, some nonlimiting examples of intracellular RNA modifications are described by Lewis and Pan in “RNA modifications and structures cooperate to guide RNA-protein interactions” from Nat Reviews Mol Cell Biol, 2017, 18: 202-210. Different sugar modifications, nucleotide modifications, and/or internucleoside linkages (e.g., backbone structures) may exist at various positions in the sequence. One of ordinary skill in the art will appreciate that the nucleotide analogs or other modification(s) may be located at any position(s) of the sequence, such that the function of the sequence is not substantially decreased. The sequence may include from about 1% to about 100% modified nucleotides (either in relation to overall nucleotide content, or in relation to one or more types of nucleotide, i.e., any one or more of A, G, U/T, or C) or any intervening percentage (e.g., from 1% to 20%, from 1% to 25%, from 1% to 50%, from 1% to 60%, from 1% to 70%, from 1% to 80%, from 1% to 90%, from 1% to 95%, from 10% to 20%, from 10% to 25%, from 10% to 50%, from 10% to 60%, from 10% to 70%, from 10% to 80%, from 10% to 90%, from 10% to 95%, from 10% to 100%, from 20% to 25%, from 20% to 50%, from 20% to 60%, from 20% to 70%, from 20% to 80%, from 20% to 90%, from 20% to 95%, from 20% to 100%, from 50% to 60%, from 50% to 70%, from 50% to 80%, from 50% to 90%, from 50% to 95%, from 50% to 100%, from 70% to 80%, from 70% to 90%, from 70% to 95%, from 70% to 100%, from 80% to 90%, from 80% to 95%, from 80% to 100%, from 90% to 95%, from 90% to 100%, and from 95% to 100%).
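As a non-limiting sketch, the percentage of modified nucleotides may be computed either in relation to overall nucleotide content or in relation to each type of nucleotide, as follows. The input representation (a list of base/flag pairs) and the function name are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def percent_modified(nucleotides: Iterable[Tuple[str, bool]]) -> Tuple[float, Dict[str, float]]:
    """Return (overall %, per-base %) of modified nucleotides.

    Each item is (base, is_modified), e.g. ("U", True) for a modified U.
    """
    total: Counter = Counter()
    modified: Counter = Counter()
    for base, is_modified in nucleotides:
        total[base] += 1
        if is_modified:
            modified[base] += 1
    n = sum(total.values())
    if n == 0:
        raise ValueError("sequence is empty")
    overall = 100.0 * sum(modified.values()) / n
    per_base = {base: 100.0 * modified[base] / count
                for base, count in total.items()}
    return overall, per_base
```

For instance, a six-nucleotide sequence in which one of two A residues and both U residues are modified is 50% modified overall, 50% modified in relation to A, and 100% modified in relation to U.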
In some embodiments, the sequence may include sugar modifications (e.g., at the 2’ position or 4’ position) or replacement of the sugar at one or more ribonucleotides of the sequence, as well as backbone modifications, including modification or replacement of the phosphodiester linkages. Specific examples of a sequence include, but are not limited to, sequences including modified backbones or non-natural internucleoside linkages such as internucleoside modifications, including modification or replacement of the phosphodiester linkages. Sequences having modified backbones include, among others, those that do not have a phosphorus atom in the backbone. For the purposes of the disclosure, and as sometimes referenced in the art, modified RNAs that do not have a phosphorus atom in their internucleoside backbone can also be considered to be oligonucleosides. In particular embodiments, a sequence will include ribonucleotides with a phosphorus atom in its internucleoside backbone.
Modified sequence backbones may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3’-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates such as 3’-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3’-5’ linkages, 2’-5’ linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3’-5’ to 5’-3’ or 2’-5’ to 5’-2’. Various salts, mixed salts and free acid forms are also included. In some embodiments, the sequence may be negatively or positively charged.
The modified nucleotides, which may be incorporated into the sequence, can be modified on the internucleoside linkage (e.g., phosphate backbone). Herein, in the context of the polynucleotide backbone, the phrases “phosphate” and “phosphodiester” are used interchangeably. Backbone phosphate groups can be modified by replacing one or more of the oxygen atoms with a different substituent. Further, the modified nucleosides and nucleotides can include the wholesale replacement of an unmodified phosphate moiety with another internucleoside linkage as described herein. Examples of modified phosphate groups include, but are not limited to, phosphorothioates, phosphoroselenates, boranophosphates, boranophosphate esters, hydrogen phosphonates, phosphoramidates, phosphorodiamidates, alkyl or aryl phosphonates, and phosphotriesters. Phosphorodithioates have both non-linking oxygens replaced by sulfur. The phosphate linker can also be modified by the replacement of a linking oxygen with nitrogen (bridged phosphoramidates), sulfur (bridged phosphorothioates), and carbon (bridged methylene-phosphonates).
The α-thio substituted phosphate moiety is provided to confer stability to RNA and DNA polymers through the unnatural phosphorothioate backbone linkages. Phosphorothioate DNA and RNA have increased nuclease resistance and subsequently a longer half-life in a cellular environment.
In specific embodiments, a modified nucleoside includes an alpha-thio-nucleoside (e.g., 5’-O-(1-thiophosphate)-adenosine, 5’-O-(1-thiophosphate)-cytidine (α-thio-cytidine), 5’-O-(1-thiophosphate)-guanosine, 5’-O-(1-thiophosphate)-uridine, or 5’-O-(1-thiophosphate)-pseudouridine).
Other internucleoside linkages that may be employed according to the disclosure, including internucleoside linkages which do not contain a phosphorus atom, are described herein.
In some embodiments, the sequence may include one or more cytotoxic nucleosides. For example, cytotoxic nucleosides may be incorporated into the sequence, e.g., as a bifunctional modification. Cytotoxic nucleosides may include, but are not limited to, adenosine arabinoside, 5-azacytidine, 4’-thio-aracytidine, cyclopentenylcytosine, cladribine, clofarabine, cytarabine, cytosine arabinoside, 1-(2-C-cyano-2-deoxy-beta-D-arabino-pentofuranosyl)-cytosine, decitabine, 5-fluorouracil, fludarabine, floxuridine, gemcitabine, a combination of tegafur and uracil, tegafur ((RS)-5-fluoro-1-(tetrahydrofuran-2-yl)pyrimidine-2,4(1H,3H)-dione), troxacitabine, tezacitabine, 2’-deoxy-2’-methylidenecytidine (DMDC), and 6-mercaptopurine. Additional examples include fludarabine phosphate, N4-behenoyl-1-beta-D-arabinofuranosylcytosine, N4-octadecyl-1-beta-D-arabinofuranosylcytosine, N4-palmitoyl-1-(2-C-cyano-2-deoxy-beta-D-arabino-pentofuranosyl)cytosine, and P-4055 (cytarabine 5’-elaidic acid ester).
In some embodiments, the sequence includes one or more post-transcriptional modifications (e.g., capping, cleavage, polyadenylation, splicing, poly-A sequence, methylation, acylation, phosphorylation, methylation of lysine and arginine residues, acetylation, and nitrosylation of thiol groups and tyrosine residues, etc.). The one or more post-transcriptional modifications can be any post-transcriptional modification, such as any of the more than one hundred different nucleoside modifications that have been identified in RNA (Rozenski, J, Crain, P, and McCloskey, J. (1999). The RNA Modification Database: 1999 update. Nucl Acids Res 27: 196-197, which is incorporated herein by reference in its entirety). In some embodiments, the sequence comprises at least one nucleoside selected from the group consisting of pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine, dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-dihydropseudouridine, 2-methoxyuridine, 2-methoxy-4-thio-uridine, 4-methoxy-pseudouridine, and 4-methoxy-2-thio-pseudouridine.
In some embodiments, the sequence comprises at least one nucleoside selected from the group consisting of 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, and 4-methoxy-1-methyl-pseudoisocytidine. In some embodiments, the sequence comprises at least one nucleoside selected from the group consisting of 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine, N6,N6-dimethyladenosine, 7-methyladenine, 2-methylthio-adenine, and 2-methoxy-adenine. In some embodiments, the sequence comprises at least one nucleoside selected from the group consisting of inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methylinosine, 6-methoxy-guanosine, 1-methylguanosine, N2-methylguanosine, N2,N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, and N2,N2-dimethyl-6-thio-guanosine.
The sequence may or may not be uniformly modified along the entire length of the sequence. For example, one or more or all types of nucleotides (e.g., naturally-occurring nucleotides, purine or pyrimidine, or any one or more or all of A, G, U, C, I, and pU) may or may not be uniformly modified in the sequence, or in a given predetermined sequence region thereof. In some embodiments, the sequence includes a pseudouridine. In some embodiments, the sequence includes an inosine, which may aid the immune system in characterizing the sequence as endogenous rather than viral RNA. The incorporation of inosine may also mediate improved RNA stability and/or reduced degradation. See, for example, Yu, Z. et al. (2015) RNA editing by ADAR1 marks dsRNA as “self”. Cell Res. 25, 1283-1284, which is incorporated herein by reference in its entirety.
In some embodiments, the guide RNA comprises at its 3’ end a polyU tail. In some embodiments, the polyU tail comprises four to seven uracils, and optionally four uracils.
In some embodiments, the guide RNA comprises a 2’-O-methyl and phosphorothioate modification. In some embodiments, the 2’-O-methyl and phosphorothioate modification is located at one or more of the nucleotides of the guide RNA, e.g., at the first three nucleotides at the 5’ end of the guide RNA, and/or at each U of the polyU tail.
In some embodiments, the guide RNA comprises at its 3’ end a 3’ modified polyU tail containing a 2’-O-methyl and phosphorothioate modification. In some embodiments, the 3’ modified polyU tail comprises three, four, five, six, seven, or more uracils. In some embodiments, the 3’ modified polyU tail comprises a 2’-O-methyl and phosphorothioate modification on its first two, three, four, five, six, seven, or more 5’ end uracils. In some embodiments, the 3’ modified polyU tail comprises a 2’-O-methyl and phosphorothioate modification on each of its first two, three, four, five, six, seven, or more 5’ end uracils. In some embodiments, the 3’ modified polyU tail consists of four uracils with 2’-O-methyl and phosphorothioate modifications on each of its first three 5’ end uracils.
In some embodiments, the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification. In some embodiments, the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on one, two, or three nucleotides of the first three nucleotides. In some embodiments, the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on each of the first three nucleotides.
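As a non-limiting sketch, the placement of 2’-O-methyl and phosphorothioate modifications described above (the first three nucleotides at the 5’ end, and the first three uracils of a four-uracil 3’ tail) may be represented as follows. The function name, the core sequence used in the example, and the token notation "mN*" for a nucleotide bearing both modifications are assumptions introduced for illustration; they are not notation used in the disclosure.

```python
from typing import List


def annotate_guide(core: str, tail_len: int = 4, five_prime_mods: int = 3,
                   tail_mods: int = 3) -> List[str]:
    """Append a polyU tail to core and mark modified positions.

    A token "mN*" denotes nucleotide N carrying a 2'-O-methyl and a
    phosphorothioate modification (illustrative notation only).
    """
    if tail_mods > tail_len:
        raise ValueError("cannot modify more uracils than the tail contains")
    if five_prime_mods > len(core):
        raise ValueError("core is shorter than the number of 5' modifications")
    full = core + "U" * tail_len
    # Positions (0-based) that carry the combined modification.
    modified = set(range(five_prime_mods)) | set(
        range(len(core), len(core) + tail_mods))
    return [f"m{nt}*" if i in modified else nt for i, nt in enumerate(full)]
```

With the defaults, the last uracil of the four-uracil tail is left unmodified, consistent with the embodiment in which only the first three 5’ end uracils of the tail are modified.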
SYSTEM, COMPOSITION, OR COMPLEX
In another aspect, the disclosure provides a system or composition comprising: (1) a nucleic acid programmable DNA binding protein (napDNAbp) or a polynucleotide encoding the napDNAbp; and (2) the guide RNA of the disclosure, or a polynucleotide encoding the guide RNA. In some embodiments, the polynucleotide encoding the napDNAbp and/or the polynucleotide encoding the guide RNA is a DNA or an RNA. In some embodiments, the system comprises two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) guide RNAs of the disclosure. As an example, in some embodiments, the system or composition of the disclosure comprises a first guide RNA and a second guide RNA, wherein the first guide RNA comprises a guide sequence as set forth in SEQ ID NO: 1933, 1934, or 1935, and the second guide RNA comprises a guide sequence as set forth in SEQ ID NO: 80, 90, or 117. In some embodiments, the system or composition of the disclosure comprises a first guide RNA and a second guide RNA, wherein the first guide RNA comprises a guide sequence as set forth in SEQ ID NO: 1934, and the second guide RNA comprises a guide sequence as set forth in SEQ ID NO: 117.
In another aspect, the disclosure provides a complex comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) complexed or bound with (2) the guide RNA of the disclosure. In some embodiments, the guide RNA and the napDNAbp complex or bind to each other in a molar ratio of about 1: 1. In some embodiments, the complex comprising the napDNAbp and the guide RNA binds to a target sequence. In some embodiments, the complex further comprises a target sequence on a target strand of the target gene; optionally hybridized to the guide sequence of the guide RNA. In some embodiments, the complex comprising the napDNAbp and the guide RNA binds to a target sequence at a molar ratio of about 1: 1. In some embodiments, the complex comprises enzymatic activity, such as nuclease activity, that can cleave the target sequence. In some embodiments, the guide RNA, the napDNAbp, and the target sequence, either alone or together, do not naturally occur.
In some aspects, use of the systems, compositions, or complexes disclosed herein has advantages over those of other known nuclease systems. Cas12i polypeptides as exemplary napDNAbp herein are smaller than other nucleases. Cas12f polypeptides are even smaller. For example, xCas12i is 1,080 amino acids in length and Cas12i2 is 1,054 amino acids in length, whereas S. pyogenes Cas9 (SpCas9) is 1,368 amino acids in length, S. thermophilus Cas9 (StCas9) is 1,128 amino acids in length, FnCpf1 is 1,300 amino acids in length, AsCpf1 is 1,307 amino acids in length, and LbCpf1 is 1,246 amino acids in length. Cas12i guide RNAs, which do not require a trans-activating CRISPR RNA (tracrRNA), are also smaller than Cas9 guide RNAs. The smaller Cas12i polypeptide and guide RNA sizes are beneficial for delivery. Systems comprising a Cas12i polypeptide also demonstrate decreased off-target activity compared to systems comprising an SpCas9 polypeptide. See PCT/US2021/025257, which is incorporated by reference in its entirety. Furthermore, indels induced by systems comprising a Cas12i polypeptide differ from indels induced by systems comprising an SpCas9 polypeptide. For example, SpCas9 polypeptides primarily induce insertions and deletions of 1 nucleotide in length. However, Cas12i polypeptides induce larger deletions, which can be beneficial in disrupting a larger portion of a target gene.
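To illustrate the size advantage in terms of coding sequence, the polypeptide lengths recited above can be converted to approximate open reading frame lengths. The following is a minimal sketch; it assumes three nucleotides per amino acid plus one stop codon and ignores any NLS, tag, linker, promoter, or other regulatory element, and the function names are hypothetical.

```python
# Polypeptide lengths (amino acids) as recited in the text above.
LENGTHS_AA = {
    "xCas12i": 1080,
    "Cas12i2": 1054,
    "SpCas9": 1368,
    "StCas9": 1128,
    "FnCpf1": 1300,
    "AsCpf1": 1307,
    "LbCpf1": 1246,
}


def coding_length_nt(length_aa: int, include_stop: bool = True) -> int:
    """Approximate ORF length: 3 nt per codon, optionally plus a stop codon."""
    return 3 * (length_aa + (1 if include_stop else 0))


def savings_vs(reference: str, name: str) -> int:
    """Nucleotides saved by encoding `name` instead of `reference`."""
    return coding_length_nt(LENGTHS_AA[reference]) - coding_length_nt(LENGTHS_AA[name])
```

Under these assumptions, an xCas12i open reading frame is about 3,243 nucleotides, which is 864 nucleotides shorter than that of SpCas9.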
Donor for Knock In
In some embodiments, the system or composition further comprises a donor sequence (e.g., encoding a chimeric antigen receptor (CAR) ) for insertion into the genome of the target cell, e.g., by homologous recombination. In some embodiments, the donor sequence is a donor DNA. The donor sequence may be inserted at the site where the genome is modified by the system or composition of the disclosure, or at a site irrelevant to where the genome is modified by the system or composition of the disclosure.
In some embodiments, the donor sequence encodes a chimeric antigen receptor (CAR) . In some embodiments, the chimeric antigen receptor (CAR) comprises a scFv targeting an antigen on a B cell, a T cell, or a tumor cell, such as, CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA) , ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII) , HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, EpCAM.
In some embodiments, the donor sequence encodes or comprises a sequence of a wild type gene or a non-disease-causing version of a gene. In some embodiments, the insertion of the donor sequence results in introduction of a wildtype or non-disease-causing version of a gene to the target cell. In some embodiments, the insertion of the donor sequence results in introduction of a selection marker or a reporter protein to the target cell. In some embodiments, the insertion of the donor sequence results in knock-in of a gene in the target cell. In some embodiments, the insertion of the donor sequence results in a knockout mutation in the target cell. In some embodiments, the insertion of the donor sequence results in a substitution mutation, such as a single nucleotide substitution, in the target cell. In some embodiments, the insertion induces a phenotypic change to the target cell.
NapDNAbp
The guide RNA of the disclosure can work with any proper nucleic acid programmable DNA binding protein (napDNAbp) that can identify or recognize the PAM adjacent to the protospacer sequence and can be guided by the guide RNA to the target DNA. For example, the napDNAbp can be a Cas12i polypeptide that can identify or recognize the PAM 5’-TTN-3’ immediately 5’ to the protospacer sequence and be guided by the guide RNA to the target DNA.
In some embodiments, the napDNAbp is capable of identifying or recognizing a PAM comprising, consisting essentially of, or consisting of sequence 5’-TTN-3’, and N is A, T, G, or C, e.g., Cas12i polypeptides, Cas12f polypeptides.
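By way of non-limiting illustration, candidate protospacers adjacent to a 5’-TTN-3’ PAM can be located computationally. The sketch below scans only the strand supplied (a complete search would also scan the reverse complement), places the protospacer immediately 3’ of the PAM, and uses a default protospacer length of 20 nucleotides; that length and the function name are assumptions introduced for illustration and are not values taken from the disclosure.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping PAMs (e.g., "TTTA" contains TTT and TTA) are all found.
PAM_PATTERN = re.compile(r"(?=(TT[ACGT]))")


def find_candidate_protospacers(strand: str,
                                protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (pam_start, pam, protospacer) for each 5'-TTN-3' PAM.

    pam_start is a 0-based index into strand (read 5' to 3'); the
    protospacer is the protospacer_len nucleotides immediately 3' of the PAM.
    PAMs too close to the 3' end for a full-length protospacer are skipped.
    """
    strand = strand.upper()
    hits = []
    for match in PAM_PATTERN.finditer(strand):
        proto_start = match.start() + 3
        proto_end = proto_start + protospacer_len
        if proto_end <= len(strand):
            hits.append((match.start(), match.group(1), strand[proto_start:proto_end]))
    return hits
```

A guide sequence would then be designed against the identified protospacer; the exact relationship between the guide sequence and the target strand follows the definitions given elsewhere in the disclosure.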
In some embodiments, the napDNAbp is a CRISPR-associated protein (Cas) . In some embodiments, the napDNAbp is an IscB polypeptide, which is not a Cas. In some embodiments, the napDNAbp is an IsrB polypeptide, which is not a Cas.
In some embodiments, the napDNAbp is a Class 2, Type II CRISPR-associated protein (Cas9) , e.g., spCas9, saCas9. In some embodiments, the napDNAbp is a Class 2, Type V CRISPR-associated protein (Cas12) . In some embodiments, the Cas12 is a Cas12i, Cas12a (Cpf1) , Cas12b (C2c1) , Cas12c (C2c3) , Cas12d (CasY) , Cas12e (CasX) , Cas12f (Cas14) , or Cas12k (C2c10, C2C7) polypeptide. In some embodiments, the Cas is a Cas12i polypeptide.
In some embodiments, the Cas12i is any Cas12i polypeptide in any of the patent applications CN202111290670.8, US17/819,795, CN202111289092.6, CN202210081981.1, PCT/CN2022/089074, PCT/CN2022/129376, PCT/CN2023/073420, the disclosures of which are incorporated herein by reference in their entirety. In some embodiments, the Cas12i is xCas12i or any mutant or variant thereof, e.g., Cas12Max, hfCas12Max. In some embodiments, the Cas12i is a mutant or variant of xCas12i with increased on-target dsDNA cleavage activity, such as Cas12Max (SEQ ID NO: 2031). In some embodiments, the Cas12i is a mutant or variant of xCas12i with decreased off-target dsDNA cleavage activity. In some embodiments, the Cas is a mutant or variant of xCas12i with both increased on-target dsDNA cleavage activity and decreased off-target dsDNA cleavage activity, such as hfCas12Max (SEQ ID NO: 2044).
In some embodiments, the Cas12i is any Cas12i polypeptide in any of the patent applications PCT/US2019/022375, US16/680104, US17/020414, US17/020215, US17/139678, US17/497725, US17/055719, US16/862261, US17/260791, US17/506627, US17/829692, US17/435563, US17/619165, US17/626072, US17/638065, US17/634461, US17/641523, US17/505578, US17/782254, US17/830212, US17/831852, US17/832114, US17/832038, US17/814318, the disclosures of which are incorporated herein by reference in their entirety. In some embodiments, the Cas12i is Cas12i1, Cas12i2, Cas12i3, Cas12i4, or any mutant or variant thereof, e.g., a mutant or variant thereof with increased on-target dsDNA cleavage activity and/or decreased off-target dsDNA cleavage activity.
In some embodiments, the Cas12i (1) is as set forth in SEQ ID NO: 2030 (xCas12i, also known as SiCas12i) , 2031 (Cas12Max) , or 2044 (hfCas12Max) ; (2) comprises the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044; or (3) comprises an amino acid sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044.
In some embodiments, the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) nuclear localization signal (NLS). In some embodiments, the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) nuclear export signal (NES). In some embodiments, the napDNAbp comprises at least one (e.g., two, three, four, five, six, or more) NLS and at least one (e.g., two, three, four, five, six, or more) NES. In some embodiments, the NLS comprises or is SV40 NLS (SEQ ID NO: 2038), bpSV40 NLS (BP NLS, bpNLS), or NP NLS (Xenopus laevis Nucleoplasmin NLS, nucleoplasmin NLS, SEQ ID NO: 2039).
In some embodiments, the napDNAbp can be self-inactivating. See, for example, Epstein et al., “Engineering a Self-Inactivating CRISPR System for AAV Vectors, ” Mol. Ther., 24 (2016) : S50, which is incorporated by reference in its entirety.
Although the changes described herein with respect to a mutant or variant of a napDNAbp may be one or more amino acid changes, changes to the napDNAbp may also be of a substantive nature, such as fusion of polypeptides as amino- and/or carboxyl-terminal extensions. For example, the napDNAbp may contain additional peptides, e.g., one or more peptides. Examples of additional peptides may include epitope peptides for labelling, such as a polyhistidine tag (His-tag), Myc, and FLAG. In some embodiments, the napDNAbp can be fused to a detectable moiety such as a fluorescent protein (e.g., green fluorescent protein (GFP) or yellow fluorescent protein (YFP)).
Polynucleotide
In yet another aspect, the disclosure provides a polynucleotide encoding the guide RNA of the disclosure. In some embodiments, the polynucleotide further comprises a polynucleotide encoding the napDNAbp of the disclosure. In some embodiments, the polynucleotide is a DNA or an RNA. In some embodiments, one or more of the nucleotides of the polynucleotide is modified.
In some embodiments, the polynucleotide can be codon-optimized for use in a particular host cell or organism. For example, the nucleic acid can be codon-optimized for any non-human eukaryote including mice, rats, rabbits, dogs, livestock, or non-human primates. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura et al. Nucl. Acids Res. 28: 292 (2000), which is incorporated herein by reference in its entirety. Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, PA). In some embodiments, the polynucleotide is codon optimized for expression in eukaryotic (e.g., mammalian, such as human) cells.
DELIVERY
The systems or complexes of the disclosure may be formulated, for example, including a carrier, such as a carrier and/or a polymeric carrier, e.g., a liposome, and delivered by known methods to a cell (e.g., a prokaryotic, eukaryotic, plant, or mammalian cell). Such methods include, but are not limited to, transfection (e.g., lipid-mediated, cationic polymers, calcium phosphate, dendrimers); electroporation or other methods of membrane disruption (e.g., nucleofection); viral delivery (e.g., lentivirus, retrovirus, adenovirus, AAV); microinjection; microprojectile bombardment (“gene gun”); fugene; direct sonic loading; cell squeezing; optical transfection; protoplast fusion; impalefection; magnetofection; exosome-mediated transfer; lipid nanoparticle-mediated transfer; and any combination thereof.
In some embodiments, the method comprises delivering one or more nucleic acids (e.g., nucleic acids encoding the napDNAbp, guide RNA, donor DNA, etc. ) , one or more transcripts thereof, and/or a pre-formed guide RNA/napDNAbp complex to a cell, where a ternary complex is formed. Exemplary intracellular delivery methods, include, but are not limited to: viruses or virus-like agents; chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers (e.g., DEAE-dextran or polyethylenimine) ; non-chemical methods, such as microinjection, electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, bacterial conjugation, delivery of plasmids or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet assisted transfection, particle bombardment; and hybrid methods, such as nucleofection. In some embodiments, the disclosure further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells.
In some embodiments, the napDNAbp and the guide RNA are delivered together. For example, in some embodiments, the napDNAbp and the guide RNA are packaged together in a single AAV particle. In another example, in some embodiments, the napDNAbp and the guide RNA are delivered together via lipid nanoparticles (LNPs) . In some embodiments, the napDNAbp and the guide RNA are delivered separately. For example, in some embodiments, the napDNAbp and the guide RNA are packaged into separate AAV particles.
In another example, in some embodiments, the napDNAbp is delivered by a first delivery mechanism and the guide RNA is delivered by a second delivery mechanism.
Vector
In yet another aspect, the disclosure provides a vector comprising the polynucleotide of the disclosure.
In some embodiments, the polynucleotide encoding the guide RNA is operably linked to and under the regulation of a promoter. In some embodiments, the polynucleotide encoding the napDNAbp is operably linked to and under the regulation of a promoter. In some embodiments, the polynucleotide encoding the guide RNA and the polynucleotide encoding the napDNAbp are operably linked to and under the regulation of the same promoter. In some embodiments, the polynucleotide encoding the guide RNA and the polynucleotide encoding the napDNAbp are each operably linked to and under the regulation of a promoter.
In some embodiments, the promoter is selected from the group consisting of a ubiquitous promoter, a tissue-specific promoter, a cell-type specific promoter, a constitutive promoter, and an inducible promoter. In some embodiments, the promoter comprises or is a promoter selected from the group consisting of: a (human) U6 promoter (e.g., SEQ ID NO: 2034), a CBh promoter (e.g., SEQ ID NO: 2035), an elongation factor 1α short (EFS) promoter, a (human) Cbh promoter, a MHCK7 promoter, a Cba promoter, a pol I promoter, a pol II promoter, a pol III promoter, a T7 promoter, a H1 promoter, a retroviral Rous sarcoma virus LTR promoter, a (human) cytomegalovirus (CMV) promoter (e.g., SEQ ID NO: 2042), a SV40 promoter, a dihydrofolate reductase promoter, a β-actin promoter, a β-glucuronidase (GUSB) promoter, a cytomegalovirus (CMV) immediate-early (IE) enhancer and/or promoter, a chicken β-actin (CBA) promoter or derivative thereof such as a CAG promoter, CB promoter, a (human) elongation factor 1α-subunit (EF1α) promoter, a ubiquitin C (UBC) promoter, a prion promoter, a neuron-specific enolase (NSE) promoter, a neurofilament light (NFL) promoter, a neurofilament heavy (NFH) promoter, a platelet-derived growth factor (PDGF) promoter, a platelet-derived growth factor B-chain (PDGF-β) promoter, a synapsin (Syn) promoter, a synapsin 1 (Syn1) promoter, a methyl-CpG binding polypeptide 2 (MeCP2) promoter, a Ca2+/calmodulin-dependent polypeptide kinase II (CaMKII) promoter, a metabotropic glutamate receptor 2 (mGluR2) promoter, a β-globin minigene nβ2 promoter, a preproenkephalin (PPE) promoter, an enkephalin (Enk) promoter, an excitatory amino acid transporter 2 (EAAT2) promoter, a glial fibrillary acidic polypeptide (GFAP) promoter, and a myelin basic polypeptide (MBP) promoter.
In some embodiments, the polynucleotide comprises a Kozak sequence (e.g., SEQ ID NO: 2036) .
In some embodiments, the polynucleotide comprises a bGH polyA coding sequence (e.g., SEQ ID NO: 2040) .
In some embodiments, the polynucleotide comprises a CMV enhancer (e.g., SEQ ID NO: 2041) .
In some embodiments, the polynucleotide encoding the napDNAbp is 5' or 3' to the polynucleotide encoding the guide RNA.
In some embodiments, the vector is a plasmid. In some embodiments, the vector is a mammalian plasmid for expression in eukaryotic cells.
In some embodiments, the vector is a viral vector. In some embodiments, the vector is a retroviral vector, a phage vector, an adenoviral vector, a herpes simplex viral (HSV) vector, an AAV vector, or a lentiviral vector. In some embodiments, the AAV vector is an AAV vector capable of encapsidating a DNA or an AAV vector capable of encapsidating an RNA. In some embodiments, the AAV vector comprises a capsid with a serotype of AAV1, AAV2, AAV3, AAV3A, AAV3B, AAV4, AAV5, AAV6, AAV7, AAVrh74, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV-DJ, AAV.PHP.eB, a member of the clade to which any of the AAV1-AAV13 belong, a functional truncated variant thereof, or a functional mutant thereof. In some embodiments, the AAV vector comprises 5’ ITR and/or 3’ ITR from wild type AAV2. The system of the disclosure may be delivered via a single AAV due to the suitable sizes of the components of the system.
The system of the disclosure can be delivered via ribonucleoprotein (RNP) delivery. For example, an Addgene blog (“CRISPR 101: Ribonucleoprotein (RNP) Delivery” by Andrew Hempstead, https://blog.addgene.org/crispr-101-ribonucleoprotein-rnp-delivery) explains the RNP delivery of a CRISPR-Cas9 system comprising a Cas9 protein and a guide RNA targeting a genomic site of interest. This delivery method can be similarly applied to the system of the disclosure, mutatis mutandis.
RNP
In yet another aspect, the disclosure provides a ribonucleoprotein (RNP) comprising the system or complex of the disclosure comprising the napDNAbp and the guide RNA. In some embodiments, the RNP comprises an excess or supersaturation amount of the guide RNA over the napDNAbp. In some embodiments, the RNP comprises the napDNAbp and the guide RNA in a ratio in a range of about 1: 1 to about 1: 2, e.g., about 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5, 1: 1.6, 1: 1.7, 1: 1.8, 1: 1.9, 1: 2, e.g., a ratio of 1: 1.875, or in a range between any two of the preceding ratios, e.g., a ratio of about 1: 1.7 to about 1: 1.9. In some embodiments, the RNP further comprises the donor sequence as described herein. Methods and materials for production and delivery of such an RNP are known in the art.
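The ratio arithmetic above can be illustrated with a minimal Python sketch that converts a napDNAbp amount into the guide RNA amount for a chosen napDNAbp: guide RNA molar ratio. The function name, the strict range check, and the 100 pmol example are assumptions made for illustration and are not taken from the disclosure.

```python
def guide_rna_pmol(napdnabp_pmol: float, guide_per_protein: float = 1.875) -> float:
    """Return pmol of guide RNA for a napDNAbp:guide RNA molar ratio of 1:guide_per_protein.

    Hypothetical helper: the default 1.875 mirrors the exemplary 1:1.875 ratio.
    """
    if napdnabp_pmol < 0:
        raise ValueError("napDNAbp amount must be non-negative")
    # Illustrative check against the stated range of about 1:1 to about 1:2.
    if not 1.0 <= guide_per_protein <= 2.0:
        raise ValueError("ratio is outside the 1:1 to 1:2 range")
    return napdnabp_pmol * guide_per_protein


print(guide_rna_pmol(100.0))  # 187.5
```

Under these assumptions, 100 pmol of napDNAbp at a ratio of 1: 1.875 corresponds to 187.5 pmol of guide RNA.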
LNP
In yet another aspect, the disclosure provides a lipid nanoparticle (LNP) comprising the system of the disclosure comprising an mRNA encoding the napDNAbp and the guide RNA. In some embodiments, the mRNA comprises a 5’ UTR. In some embodiments, the mRNA comprises a 3’ polyA tail. In some embodiments, the LNP comprises the mRNA and the guide RNA in a ratio in a range of about 1: 1 to about 1: 2, e.g., about 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5, 1: 1.6, 1: 1.7, 1: 1.8, 1: 1.9, 1: 2, e.g., a ratio of 1: 1.875, or in a range between any two of the preceding ratios, e.g., a ratio of about 1: 1.7 to about 1: 1.9. In some embodiments, the LNP further comprises the donor sequence as described herein. Methods and materials for production and delivery of such an LNP are known in the art.
Cells (target cells)
The systems or complexes of the disclosure can be delivered to a variety of cells. In some embodiments, the cell is an isolated cell. In some embodiments, the cell is in cell culture or a co-culture of two or more cell types. In some embodiments, the cell is ex vivo. In some embodiments, the cell is obtained from a living organism and maintained in a cell culture. In some embodiments, the cell is a single-cellular organism.
In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is a bacterial cell or derived from a bacterial cell. In some embodiments, the cell is an archaeal cell or derived from an archaeal cell.
In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a plant cell or derived from a plant cell. In some embodiments, the cell is a fungal cell or derived from a fungal cell. In some embodiments, the cell is an animal cell or derived from an animal cell. In some embodiments, the cell is an invertebrate cell or derived from an invertebrate cell. In some embodiments, the cell is a vertebrate cell or derived from a vertebrate cell. In some embodiments, the cell is a mammalian cell or derived from a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is a zebra fish cell. In some embodiments, the cell is a rodent cell. In some embodiments, the cell is synthetically made, sometimes termed an artificial cell. In some embodiments, the cell is engineered, i.e., an engineered cell, to comprise a modification relative to a corresponding non-engineered cell. In some embodiments, the modification comprises an insertion of a sequence into the genome of the cell. In some embodiments, the modification comprises a deletion of a sequence from the genome of the cell. In some embodiments, the modification comprises both an insertion of a sequence into the genome of the cell and a deletion of a sequence from the genome of the cell.
In some embodiments, the cell is derived from a cell line. A wide variety of cell lines for tissue culture are known in the art. Examples of cell lines include, but are not limited to, HEK293T (293T) , MF7, K562, HeLa, CHO, and transgenic varieties thereof. Cell lines are available from a variety of sources known to those with skill in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, Virginia) ) . In some embodiments, the cell is an immortal or immortalized cell. 
In some embodiments, the cell is a primary cell. In some embodiments, the cell is a stem cell such as a totipotent stem cell (e.g., omnipotent) , a pluripotent stem cell, a multipotent stem cell, an oligopotent stem cell, or a unipotent stem cell. In some embodiments, the cell is an induced pluripotent stem cell (iPSC) or derived from an iPSC. In some embodiments, the cell is a differentiated cell. For example, in some embodiments, the differentiated cell is a muscle cell (e.g., a myocyte) , a fat cell (e.g., an adipocyte) , a bone cell (e.g., an osteoblast, osteocyte, osteoclast) , a blood cell (e.g., a monocyte, a lymphocyte, a neutrophil, an eosinophil, a basophil, a macrophage, an erythrocyte, or a platelet) , a nerve cell (e.g., a neuron) , an epithelial cell, an immune cell (e.g., a lymphocyte, a T cell, a B cell, a NK cell, a neutrophil, a monocyte, or a macrophage) , a liver cell (e.g., a hepatocyte) , a fibroblast, or a sex cell (e.g., an egg, a sperm cell) . In some embodiments, the cell is a terminally differentiated cell. For example, in some embodiments, the terminally differentiated cell is a neuronal cell, an adipocyte, a cardiomyocyte, a skeletal muscle cell, an epidermal cell, or a gut cell. In some embodiments, the cell is an immune cell. In some embodiments, the immune cell is a T cell. In some embodiments, the immune cell is a B cell. In some embodiments, the immune cell is a Natural Killer (NK) cell. In some embodiments, the immune cell is a Tumor Infiltrating Lymphocyte (TIL) . In some embodiments, the cell is a mammalian cell, e.g., a human cell, a monkey cell, or a murine cell. In some embodiments, the murine cell is derived from a wild-type mouse, an immunosuppressed mouse, or a disease-specific mouse model. In some embodiments, the cell is a cell within a living tissue, organ, or organism.
In yet another aspect, the disclosure provides a cell or a progeny thereof comprising the guide RNA, the system, the polynucleotide, the vector, the LNP, and/or the RNP of the disclosure.
In yet another aspect, the disclosure provides a cell or a progeny thereof, wherein the cell is modified by the method of the disclosure, and also termed as a modified cell.
The gRNA, the system, the complex, and the method of the disclosure are applicable for any suitable cell type with respect to the target cell of the disclosure in which a target DNA (e.g., gene) may be located, the cell of the disclosure, or the modified cell of the disclosure (collectively, “the cell” ) .
In some embodiments, the cell is in vivo, ex vivo, or in vitro.
In some embodiments, the cell is a eukaryotic cell (e.g., an animal cell, a vertebrate cell, a mammalian cell, a non-human mammalian cell, a non-human primate cell, a rodent (e.g., mouse or rat) cell, a human cell, a plant cell, or a yeast cell) or a prokaryotic cell (e.g., a bacteria cell) .
In some embodiments, the cell is a cell isolated from natural sources, such as a tissue biopsy. In some embodiments, the cell is a cell isolated from an in vitro cultured cell line. In some embodiments, the cell is from a primary cell line. In some embodiments, the cell is from an immortalized cell line. In some embodiments, the cell is an engineered cell. In some embodiments, the cell is a genetically engineered cell.
In some embodiments, the cell is a cultured cell, an isolated primary cell, or a cell within a living organism.
In some embodiments, the cell is an immune cell. In some embodiments, the cell is a T cell (such as, CAR-T cell, a cytotoxic T cell, a helper T cell, a regulatory T cell, a natural killer (NK) T cell, an iNK-T cell, an NK-T like cell, a γδT cell, a tumor-infiltrating T cell and a dendritic cell (DC) -activated T cell) . In some embodiments, the cell is a B cell. In some embodiments, the cell is a NK cell (such as, CAR-NK cell) . In some embodiments, the cell is a universal CAR-T cell or a universal CAR-NK cell.
In some embodiments, the cell comprises a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC, and any homolog, mutant, or variant thereof, and any intron, exon, complement, or fragment thereof.
In some embodiments, the cell expresses one or more engineered receptors. Exemplary engineered receptors include, but are not limited to, CAR, TCR, TAC receptor, and TFPs. In some embodiments, the engineered receptor comprises an extracellular domain that specifically binds to an antigen (e.g., a tumor antigen) , a transmembrane domain, and an intracellular signaling domain (e.g., CD3zeta domain) . In some embodiments, the intracellular signaling domain comprises a primary intracellular signaling domain and/or a co-stimulatory domain, e.g., CD28, 4-1BB.
In some embodiments, the engineered receptor comprises one or more specific binding domains that target at least one (e.g., tumor) antigen, for example, CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA) , ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII) , HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, EpCAM. In some embodiments, the binding domain is scFv.
In some embodiments, the engineered receptor is a chimeric antigen receptor (CAR) . Many chimeric antigen receptors are known in the art and may be suitable for the modified therapeutic cells of the disclosure. CARs can also be constructed with a specificity for any cell surface marker by utilizing antigen binding fragments or antibody variable domains of, for example, antibody molecules. Any method for producing a CAR may be used herein. See, for example, US6,410,319, US7,446,191, US7,514,537, US9765342B2, WO 2002/077029, WO2015/142675, US2010/065818, US 2010/025177, US 2007/059298, WO2017025038A1, and Berger C. et al., J. Clinical Investigation 118(1): 294-308 (2008) , which are hereby incorporated by reference.
In some embodiments, the cell is a T cell expressing an engineered receptor. In some embodiments, the T cell expresses a CAR (CAR-T cell) .
In some embodiments, the cell is an NK cell expressing an engineered receptor. In some embodiments, the NK cell expresses a CAR (CAR-NK cell) .
In some embodiments, a nucleic acid encoding the CAR is introduced into the cell before the cell is modified according to the methods described herein. In some embodiments, a nucleic acid encoding the CAR is introduced into the cell after the cell is modified according to the methods described herein.
In some embodiments, the cell is an iPSC. In some embodiments, the iPSC is first modified by the methods described herein, and the modified iPSC is then differentiated into an immune cell, such as a T cell or an NK cell, which is optionally modified as a universal T cell, a universal CAR-T cell, a universal NK cell, or a universal CAR-NK cell.
In some embodiments, the cell is modified in vivo. In some embodiments, the cell is modified ex vivo. In some embodiments, the cell is derived from a healthy individual. In some embodiments, the cell is derived from an individual having a disease.
In some embodiments, the cell is derived from an individual, modified according to the methods described herein, and subsequently used to treat the individual from which the cell was derived.
In some embodiments, the cell is derived from a first individual, modified according to the methods described herein, and used for treatment of a second individual different from the first individual. In some embodiments, the cell is modified according to the methods described herein to produce an allogeneic cell with reduced or no potential for graft-versus-host-disease or other immune-mediated rejection of the cells in a recipient individual. Allogeneic cells are also referred to as “off the shelf” cells in the art. In some embodiments, the cell is modified to produce an allogeneic cell. In some embodiments, the allogeneic cell is an allogeneic CAR-T cell. In some embodiments, the allogeneic cell is an allogeneic CAR-NK cell. Allogeneic CAR-T and allogeneic CAR-NK cells are also referred to as “off the shelf” or “universal” CAR-T and CAR-NK cells.
In some embodiments, the cell is a stem cell (such as, iPS cell, hematopoietic stem cell (HSC) ) . In some embodiments, the HSC is CD34+ hematopoietic stem cell.
In some embodiments, the cell is derived from or heterogenous to a subject.
In yet another aspect, the disclosure provides a host comprising the cell or progeny thereof of the disclosure. In some embodiments, the host is a non-human animal or a plant. In some embodiments, the non-human animal is an animal (e.g., rodent or non-human primate) model for a human genetic disorder.
In yet another aspect, the disclosure provides a (e.g., pharmaceutical) composition comprising the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, and/or the cell or progeny thereof of the disclosure. In some embodiments, the composition comprises a pharmaceutically acceptable excipient. In some embodiments, the composition is formulated for delivery by a nanoparticle, e.g., a lipid nanoparticle (LNP) , a ribonucleoprotein (RNP) , a liposome, an exosome, a microvesicle, a nucleic acid (e.g., DNA) nanoassembly, a gene gun, or an implantable device.
In yet another aspect, the disclosure provides a delivery system comprising: (1) a delivery vehicle, and (2) the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, the cell or progeny thereof, and/or the composition of the disclosure. In some embodiments, the delivery vehicle is a nanoparticle, e.g., a lipid nanoparticle (LNP) , a ribonucleoprotein (RNP) , a liposome, an exosome, a microvesicle, a nucleic acid (e.g., DNA) nanoassembly, a gene-gun, or an implantable device.
The disclosure also provides kits that can be used, for example, to carry out a method described herein. In some embodiments, the kits include a system of the disclosure comprising a guide RNA and a napDNAbp herein. In some embodiments, the kits include a polynucleotide of the disclosure that encodes such a napDNAbp, and optionally the polynucleotide is comprised within a vector, e.g., as described herein. In some embodiments, the kits include a polynucleotide of the disclosure that encodes a guide RNA disclosed herein. The napDNAbp and the guide RNA (e.g., as a ribonucleoprotein) can be packaged within the same vial or other vessel within a kit or can be packaged in separate vials or other vessels, the contents of which can be mixed prior to use. The kits can additionally include, optionally, a buffer and/or instructions for use of the system of the disclosure comprising the guide RNA and napDNAbp.
In yet another aspect, the disclosure provides a kit comprising the guide RNA, the system, the complex, the polynucleotide, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, and/or the delivery system of the disclosure. In some embodiments, the kit further comprises an instruction for modifying a target DNA.
PRODUCTION
The disclosure includes methods for production of the guide RNA of the disclosure, methods for production of the napDNAbp of the disclosure, and methods for complexing the guide RNA of the disclosure and the napDNAbp of the disclosure.
Guide RNA
In some embodiments, the guide RNA of the disclosure is made by in vitro transcription of a DNA template. Thus, for example, in some embodiments, the guide RNA is generated by in vitro transcription of a DNA template encoding the guide RNA using an upstream promoter sequence (e.g., a T7 polymerase promoter sequence) . In some embodiments, the DNA template encodes multiple guide RNAs or the in vitro transcription reaction includes multiple different DNA templates, each encoding a different guide RNA. In some embodiments, the guide RNA is made using chemical synthetic methods. In some embodiments, the guide RNA is made by expressing the guide RNA sequence in cells transfected with a plasmid including sequences that encode the guide RNA. In some embodiments, the plasmid encodes multiple different guide RNAs. In some embodiments, multiple different plasmids, each encoding a different guide RNA, are transfected into the cells. In some embodiments, the guide RNA is expressed from a plasmid that encodes the guide RNA and also encodes a napDNAbp. In some embodiments, the guide RNA is expressed from a plasmid that expresses the guide RNA but not a napDNAbp. In some embodiments, the guide RNA is purchased from a commercial vendor. In some embodiments, the guide RNA is synthesized using one or more modified nucleotide, e.g., as described above.
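As a minimal sketch of the template design described above, the following Python code prepends a T7 promoter to the DNA form of a guide RNA sequence and predicts the run-off transcript. It assumes the commonly used 18-nt T7 promoter sequence whose final G is the first transcribed base, so the transcript carries one extra 5’ G; the function names and the short example sequence are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
# Commonly used T7 promoter; the final G is the first transcribed nucleotide (+1).
T7_PROMOTER = "TAATACGACTCACTATAG"


def ivt_template(guide_rna: str) -> str:
    """Return a sense-strand DNA template: T7 promoter followed by the guide as DNA."""
    rna = guide_rna.upper()
    if not rna or set(rna) - set("ACGU"):
        raise ValueError("guide RNA must be a non-empty string of A, C, G, U")
    return T7_PROMOTER + rna.replace("U", "T")


def expected_transcript(template: str) -> str:
    """Predict the run-off transcript, assuming transcription starts at the promoter's final G."""
    start = template.index(T7_PROMOTER) + len(T7_PROMOTER) - 1
    return template[start:].replace("T", "U")


template = ivt_template("ACGU")       # hypothetical 4-nt "guide" for illustration
print(template)                        # TAATACGACTCACTATAGACGT
print(expected_transcript(template))   # GACGU
```

A real design would also account for promoter-proximal sequence preferences of the polymerase and for how the 3’ end of the template is defined; both are outside the scope of this sketch.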
NapDNAbp
In some embodiments, the napDNAbp of the disclosure can be prepared by (a) culturing bacteria which produce the napDNAbp, isolating the napDNAbp, optionally, purifying the napDNAbp, and optionally, complexing the napDNAbp with a guide RNA. The napDNAbp can be also prepared by (b) a known genetic engineering technique, specifically, by isolating a gene encoding the napDNAbp from bacteria, constructing an expression vector based on the gene, and then transferring the vector into an appropriate host cell that expresses a guide RNA for expression of the napDNAbp that complexes with the guide RNA in the host cell. Alternatively, the napDNAbp can be prepared by (c) an in vitro coupled transcription-translation system and then complexing with a guide RNA.
In some embodiments, a host cell is used to express the napDNAbp. The host cell is not particularly limited, and various known cells can be used. Specific examples of the host cell include bacteria such as E. coli, yeasts (including budding yeast, e.g., Saccharomyces cerevisiae, and fission yeast, e.g., Schizosaccharomyces pombe) , nematodes (Caenorhabditis elegans) , Xenopus laevis oocytes, and animal cells (for example, CHO cells, COS cells, and HEK293 cells) . The method for transferring the expression vector described above into host cells, i.e., the transformation method, is not particularly limited, and known methods such as electroporation, the calcium phosphate method, the liposome method and the DEAE dextran method can be used.
After a host cell is transformed with the expression vector, the host cell may be cultured, cultivated, or bred, for production of the napDNAbp. After expression of the napDNAbp, the host cell can be collected and the napDNAbp purified from the cultures according to conventional methods (for example, filtration, centrifugation, cell disruption, gel filtration chromatography, ion exchange chromatography, etc.) .
A variety of methods can be used to determine the level of production of a napDNAbp in a host cell. Such methods include, but are not limited to, for example, methods that utilize either polyclonal or monoclonal antibodies specific for the napDNAbp or a labeling tag as described elsewhere herein. Exemplary methods include, but are not limited to, enzyme-linked immunosorbent assays (ELISA) , radioimmunoassays (RIA) , fluorescent immunoassays (FIA) , and fluorescence-activated cell sorting (FACS) . These and other assays are well known in the art (See, e.g., Maddox et al., J. Exp. Med. 158: 1211 [1983] ) .
The disclosure provides methods of in vivo expression of the napDNAbp in a cell, comprising providing a polynucleotide (e.g., a DNA or an RNA) encoding the napDNAbp to a cell wherein the polynucleotide encodes the napDNAbp, expressing the napDNAbp in the cell, and obtaining the napDNAbp from the cell.
Complexing
In some embodiments, the guide RNA of the disclosure is complexed with the napDNAbp of the disclosure to form a ribonucleoprotein. In some embodiments, the complexation of the guide RNA and the napDNAbp occurs at a temperature lower than about any one of 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, or 55 ℃. In some embodiments, the guide RNA does not dissociate from the napDNAbp at about 37 ℃ over an incubation period of at least about any one of 10 mins, 15 mins, 20 mins, 25 mins, 30 mins, 35 mins, 40 mins, 45 mins, 50 mins, 55 mins, 1 hour, 2 hours, 3 hours, 4 hours, or more hours.
In some embodiments, the guide RNA and napDNAbp are complexed in a complexation buffer. In some embodiments, the napDNAbp is stored in a buffer that is replaced with a complexation buffer to form a complex with the guide RNA. In some embodiments, the napDNAbp is stored in a complexation buffer. In some embodiments, the guide RNA is stored in a buffer that is replaced with a complexation buffer to form a complex with the napDNAbp. In some embodiments, the guide RNA is stored in a complexation buffer.
In some embodiments, the complexation buffer has a pH in a range of about 7.3 to 8.6. In one embodiment, the pH of the complexation buffer is about 7.3. In one embodiment, the pH of the complexation buffer is about 7.4. In one embodiment, the pH of the complexation buffer is about 7.5. In one embodiment, the pH of the complexation buffer is about 7.6. In one embodiment, the pH of the complexation buffer is about 7.7. In one embodiment, the pH of the complexation buffer is about 7.8. In one embodiment, the pH of the complexation buffer is about 7.9. In one embodiment, the pH of the complexation buffer is about 8.0. In one embodiment, the pH of the complexation buffer is about 8.1. In one embodiment, the pH of the complexation buffer is about 8.2. In one embodiment, the pH of the complexation buffer is about 8.3. In one embodiment, the pH of the complexation buffer is about 8.4. In one embodiment, the pH of the complexation buffer is about 8.5. In one embodiment, the pH of the complexation buffer is about 8.6.
In some embodiments, the napDNAbp is overexpressed and complexed with the guide RNA in a host cell prior to purification as described herein. In some embodiments, RNA (e.g., mRNA) or DNA encoding the napDNAbp is introduced into a cell so that the napDNAbp is expressed in the cell. In some embodiments, the guide RNA is also introduced into the cell, whether simultaneously, separately, or sequentially from a single RNA (e.g., mRNA) or DNA construct, such that the ribonucleoprotein complex is formed in the cell.
In some embodiments, the ribonucleoprotein complex is formed with the guide RNA and the napDNAbp in a molar ratio of about 0.5: 1, 0.6: 1, 0.7: 1, 0.8: 1, 0.9: 1, 1: 1, 1: 1.1, 1: 1.2, 1: 1.3, 1: 1.4, 1: 1.5 or a molar ratio in a range composed of any two of the preceding molar ratios.
METHODS
Method of modifying
In yet another aspect, the disclosure provides a method for modifying a target DNA, comprising contacting the target DNA with the system, the complex, the vector, the RNP, or the LNP of the disclosure, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
In yet another aspect, the disclosure provides use of the system, the complex, the vector, the RNP, or the LNP of the disclosure in the manufacture of an agent for modifying a target DNA, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
In yet another aspect, the disclosure provides the system, the complex, the vector, the RNP, or the LNP of the disclosure, for use in modifying a target DNA, wherein the guide sequence is capable of hybridizing to a target sequence of the target DNA, wherein the target DNA is modified (e.g., cleaved) by the complex.
In some embodiments, the napDNAbp has enzymatic activity (e.g., nuclease activity) . In some embodiments, the napDNAbp induces one or more DNA double-stranded breaks in the target DNA. In some embodiments, the napDNAbp induces one or more DNA single-stranded breaks in the target DNA. In some embodiments, the napDNAbp induces one or more DNA nicks in the target DNA. In some embodiments, DNA breaks and/or nicks result in formation of one or more indels (e.g., one or more deletions) in the target DNA.
In some embodiments, a guide RNA disclosed herein forms a complex with a napDNAbp and directs the napDNAbp to a protospacer sequence adjacent to a 5’-TTN-3’ sequence. In some embodiments, the complex induces a deletion (e.g., a nucleotide deletion or DNA deletion) adjacent to the 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the complex induces a deletion adjacent to a T/C-rich sequence.
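To make the 5’-TTN-3’ adjacency concrete, the following Python sketch scans one strand of a DNA sequence for 5’-TTN-3’ sites and reports the stretch immediately 3’ of each site as a candidate protospacer. The function name, the default 20-nt protospacer length, and the assumption that the protospacer lies directly 3’ of the TTN on the scanned strand are illustrative choices, not statements of the disclosure.

```python
import re


def find_ttn_protospacers(target: str, spacer_len: int = 20):
    """List (pam_start, pam, protospacer) for each 5'-TTN-3' site on the given strand.

    Assumption: the protospacer is the spacer_len nucleotides directly 3' of the TTN.
    Only the supplied strand is scanned; the reverse complement is not considered.
    """
    target = target.upper()
    hits = []
    # A lookahead is used so that overlapping sites (e.g. within "TTTA") are all found.
    for match in re.finditer(r"(?=(TT[ACGT]))", target):
        start = match.start()
        protospacer = target[start + 3 : start + 3 + spacer_len]
        if len(protospacer) == spacer_len:  # skip sites too close to the 3' end
            hits.append((start, match.group(1), protospacer))
    return hits


print(find_ttn_protospacers("GGTTACCGGATCCA", spacer_len=5))
# [(2, 'TTA', 'CCGGA')]
```

Candidate sites found this way would still need to be checked against the target strand orientation and any further sequence requirements of the particular napDNAbp.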
In some embodiments, the deletion is downstream of a 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion is downstream of a T/C-rich sequence.
In some embodiments, the deletion alters expression of the target gene. In some embodiments, the deletion alters function of the target gene. In some embodiments, the deletion inactivates the target gene. In some embodiments, the deletion is a frameshifting deletion. In some embodiments, the deletion is a non-frameshifting deletion. In some embodiments, the deletion leads to cell toxicity or cell death (e.g., apoptosis) .
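The distinction between frameshifting and non-frameshifting deletions follows from the triplet reading frame, as the short Python sketch below illustrates. It assumes the deletion lies entirely within a protein-coding sequence; the function name is hypothetical.

```python
def is_frameshifting(deletion_len: int) -> bool:
    """Return True if a deletion of this length shifts the reading frame.

    Assumption: the whole deletion falls inside a coding sequence, so only
    lengths that are not multiples of 3 disturb the downstream codons.
    """
    if deletion_len <= 0:
        raise ValueError("deletion length must be positive")
    return deletion_len % 3 != 0


for length in (4, 15, 16):
    print(length, is_frameshifting(length))
# 4 True
# 15 False
# 16 True
```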
In some embodiments, the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) from the 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
In some embodiments, the deletion ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) from the 5’-NTTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5’-NTTN-3’ sequence, wherein N is A, T, G, or C.
In some embodiments, the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) from the 5’-TTN-3’ sequence and ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) from the 5’-TTN-3’ sequence, wherein N is A, T, G, or C. In some embodiments, the deletion starts within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5’-TTN-3’ sequence and ends within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5’-TTN-3’ sequence, wherein N is A, T, G, or C.
In some embodiments, the deletion is up to about 50 nucleotides in length (e.g., about or up to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides) . In some embodiments, the deletion is between about 4 nucleotides and about 50 nucleotides in length (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides) .
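The start and end positions described above can be sketched in Python as offsets counted from the first nucleotide downstream of the 5’-TTN-3’ sequence. That counting convention, the function name, and the example sequence are assumptions for illustration; the disclosure does not fix a coordinate system here.

```python
def apply_deletion(target: str, pam_start: int, start_offset: int, end_offset: int):
    """Delete the bases from start_offset to end_offset (1-based, inclusive),
    counted from the first nucleotide 3' of a 3-nt TTN site at pam_start.

    Returns (edited_sequence, deletion_length).
    """
    pam_end = pam_start + 3                    # index of first base after the TTN
    first = pam_end + start_offset - 1         # index of first deleted base
    last = pam_end + end_offset - 1            # index of last deleted base
    if not 1 <= start_offset <= end_offset:
        raise ValueError("offsets must satisfy 1 <= start_offset <= end_offset")
    if last >= len(target):
        raise ValueError("deletion runs past the end of the sequence")
    return target[:first] + target[last + 1 :], end_offset - start_offset + 1


target = "TTA" + "ACGTACGTAC" * 3              # hypothetical 33-nt sequence
edited, length = apply_deletion(target, pam_start=0, start_offset=5, end_offset=20)
print(length)        # 16
print(len(edited))   # 17
```

In this example the deletion starts 5 nucleotides and ends 20 nucleotides downstream of the TTN, giving a 16-nt deletion that falls within the length range stated above.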
Method of producing modified cells
In yet another aspect, the disclosure provides a method of producing a modified target cell, comprising: (1) optionally harvesting a target cell from a subject; (2) optionally sorting and/or optionally amplifying the harvested target cell; (3) modifying a target gene in the (optionally sorted and/or optionally amplified) target cell by the method of any preceding claim;
(4) optionally inserting a donor sequence (e.g., a chimeric antigen receptor (CAR) -encoding donor sequence) into the genome of the target cell; and (5) optionally purifying the modified target cell.
In some embodiments, the modified cell is used for CAR-T cell therapy or CAR-NK cell therapy.
Method of diagnosing, preventing, or treating
In yet another aspect, the disclosure provides a method for diagnosing, preventing, or treating a disease or disorder in a subject, comprising administering to the subject (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure.
In yet another aspect, the disclosure provides use of (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure in the manufacture of an agent, a medicament, or a kit for diagnosing, preventing, or treating a disease or disorder in a subject.
In yet another aspect, the disclosure provides (e.g., an effective amount of) the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure, for use in diagnosing, preventing, or treating a disease or disorder in a subject.
Any suitable delivery or administration method known in the art may be used to deliver the system, the complex, the vector, the RNP, the LNP, the cell or progeny thereof, the composition, the delivery system, and/or the kit of the disclosure.
In some embodiments, the disease or disorder is associated with a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC.
In some embodiments, the disease or disorder is associated with a target or an antigen selected from the group consisting of CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA) , ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII) , HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, and EpCAM.
In some embodiments, the disease or disorder is a cancer, e.g., a hematologic malignancy or solid cancer, such as chronic lymphocytic leukemia (CLL) or acute lymphoblastic leukemia (ALL). In some embodiments, the disease or disorder is a hematologic disease or disorder, e.g., thalassemia, sickle cell disease, β-hemoglobinopathy, or β-thalassemia.
In some embodiments, the target cell or the cell is derived from the same subject as the subject to whom the modified cell is administered. In some embodiments, the target cell or the cell is derived from a subject different from the subject to whom the modified cell is administered.
All references and publications cited herein are hereby incorporated by reference in their entirety.
EXAMPLES
The following examples are provided to further illustrate some embodiments of the disclosure but are not intended to limit the scope of the disclosure; it will be understood by their exemplary nature that other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.
EXAMPLE 1. Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in mammalian cells
This example demonstrates the cleavage activity of the CRISPR-Cas12i system at a target gene (e.g., AAVS1, B2M, BCL11A, CCR5, CIITA, CISH, CTLA4, CXCR4, HBB, IL2RG, LAG3, PD1, TGFBR2, TIGIT, TIM3, TRAC) in mammalian cells. The CRISPR-Cas12i system used in this Example is composed of:
(1) a Cas12i polypeptide (SEQ ID NO: 2031, “Cas12Max”, which is the N243R mutant of xCas12i (SiCas12i) of SEQ ID NO: 2030); and
(2) a guide RNA composed of:
(i) one direct repeat (DR) sequence (or scaffold sequence) (SEQ ID NO: 2032) capable of forming a complex with the Cas12i polypeptide;
(ii) one spacer sequence (or guide sequence), 3’ to the DR sequence, capable of hybridizing to a target sequence on a target strand of the target gene, thereby guiding the complex to the target gene; and
(iii) one poly U tail composed of seven uracils (UUUUUUU) (shown as poly T tail TTTTTTT as set forth in SEQ ID NO: 2052), 3’ to the spacer sequence.
xCas12i (SiCas12i) , 1080 aa, SEQ ID NO: 2030
Cas12Max (SiCas12i-N243R) , 1080 aa, SEQ ID NO: 2031

DR sequence 1, 30 nt, SEQ ID NO: 2032
DR sequence 2, 23 nt, SEQ ID NO: 2033
The target gene comprises a protospacer sequence on the nontarget DNA strand (NTS) of the target gene, which is completely complementary to the target sequence on the target DNA strand (TS) of the target gene and a protospacer adjacent motif (PAM) 5’ to the protospacer sequence.
A representative set of 99 protospacer sequences (Table 2) was selected from the protospacer sequences (Table 1) that were identified in the 16 genes listed above by using a 5’-TTN-3’ PAM sequence, in order to verify the cleavage activity of the CRISPR-Cas12i system.
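The PAM-based identification described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used to build Table 1: it assumes a 20-nt protospacer immediately 3’ of a 5’-TTN-3’ PAM, the function names are hypothetical, and the input is a made-up toy sequence rather than any of the listed genes. Either genomic strand can serve as the nontarget strand, so the reverse complement is scanned as well.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_protospacers(strand: str, length: int = 20) -> List[Tuple[int, str, str]]:
    """Scan one strand (read 5'->3') for 5'-TTN-3' PAMs.

    Returns (pam_start_index, pam, protospacer) for every PAM that has
    `length` nucleotides immediately 3' of it on the same strand.
    """
    strand = strand.upper()
    hits = []
    for i in range(len(strand) - 3 - length + 1):
        pam = strand[i:i + 3]
        if pam[:2] == "TT" and pam[2] in "ACGT":
            hits.append((i, pam, strand[i + 3:i + 3 + length]))
    return hits


# Hypothetical toy sequence: 2 nt of flank, a TTA PAM, a 20-nt protospacer, 2 nt of flank.
toy = "CC" + "TTA" + "GACG" * 5 + "GG"

top_hits = find_protospacers(toy)
bottom_hits = find_protospacers(reverse_complement(toy))

for start, pam, protospacer in top_hits:
    print(f"top strand: PAM {pam} at index {start}, protospacer {protospacer}")
print(f"bottom strand hits: {len(bottom_hits)}")
```

Indices are 0-based positions on the strand being scanned; hits on the reverse complement would need to be mapped back to the coordinates of the other strand before use.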
For the purpose of this experiment, the spacer sequence of each guide RNA tested in this Example was designed to be fully complementary to, and capable of hybridizing to, the target sequence, and thus was identical to the corresponding representative protospacer sequence except for the replacement of each thymine (T) with a uracil (U), owing to the difference between DNA and RNA. For convenience and to comply with the electronic sequence listing standard ST.26 (“The symbol ‘t’ will be construed as thymine in DNA and uracil in RNA when it is used with no further description”, see SECTION 1: LIST OF NUCLEOTIDES), the spacer sequence is described as having the same sequence as the corresponding protospacer sequence (collectively “Protospacer/Spacer Sequence” and marked as DNA in the associated sequence listing XML), but it is well understood that each T of the spacer sequence represents uracil (U). The protospacer/spacer sequence of each tested guide RNA is listed below. When reference is made to a SEQ ID NO that recites a protospacer/spacer sequence, it refers to a protospacer sequence that is a DNA sequence or a spacer sequence that is an RNA sequence, depending on the context.
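The relationship between protospacer, spacer, and target sequence, and the 5’-DR-spacer-poly U-3’ layout of the guide RNA used in this Example, can be illustrated as follows. This is a sketch under stated assumptions: the DR is a placeholder of 30 “N” characters standing in for SEQ ID NO: 2032 (the real sequence is not reproduced here), the protospacer is a made-up 20-mer, and all function names are hypothetical.

```python
# Placeholder only: stands in for the 30-nt DR sequence of SEQ ID NO: 2032.
PLACEHOLDER_DR = "N" * 30

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def spacer_from_protospacer(protospacer_dna: str) -> str:
    """The spacer (RNA) has the protospacer sequence with every T read as U."""
    return protospacer_dna.upper().replace("T", "U")


def target_sequence(protospacer_dna: str) -> str:
    """Target sequence on the target strand, written 5'->3'.

    It is fully complementary to the protospacer on the nontarget strand,
    i.e. the reverse complement of the protospacer.
    """
    return protospacer_dna.upper().translate(_COMPLEMENT)[::-1]


def assemble_guide(dr_rna: str, protospacer_dna: str, poly_u: int = 7) -> str:
    """Concatenate 5'-DR, spacer, and a poly U tail-3' (seven U in Example 1)."""
    return dr_rna + spacer_from_protospacer(protospacer_dna) + "U" * poly_u


toy_protospacer = "GACG" * 5  # hypothetical 20-nt protospacer, not from Table 1
guide = assemble_guide(PLACEHOLDER_DR, toy_protospacer)

print("spacer (RNA):     ", spacer_from_protospacer("TTGACG"))
print("target (DNA):     ", target_sequence(toy_protospacer))
print("guide length (nt):", len(guide))
```

Passing `poly_u=4` would reproduce the four-uracil tail layout described for the gRNAs of Example 2.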
Design and Construction:
An all-in-one plasmid was constructed for transfection into HEK293T cells to express Cas12Max and the guide RNA of the CRISPR-Cas12i system. The plasmid comprised, from 5’ to 3’, a U6 promoter (SEQ ID NO: 2034) operably linked to a sequence encoding the guide RNA as described above, a CBh promoter (SEQ ID NO: 2035) , a Kozak sequence (SEQ ID NO: 2036) , a sequence encoding 3xFLAG (SEQ ID NO: 2037) , a sequence encoding SV40 NLS (SEQ ID NO: 2038) , a sequence encoding Cas12Max (SEQ ID NO: 2031) , a sequence encoding NP NLS (SEQ ID NO: 2039) , a sequence encoding a bGH polyA signal (SEQ ID NO: 2040) , a CMV enhancer (SEQ ID NO: 2041) , and a CMV promoter (SEQ ID NO: 2042) operably linked to a sequence encoding mCherry (SEQ ID NO: 2043) (indicating successful transfection and expression of the plasmid in the HEK293T cells) .
U6 promoter (RNA polymerase III promoter for human U6 snRNA) , 241 nt, SEQ ID NO: 2034
CBh promoter, 794 nt, SEQ ID NO: 2035
Kozak sequence, SEQ ID NO: 2036
3xFLAG, 23 aa, SEQ ID NO: 2037
SV40 NLS (nuclear localization signal of SV40 large T antigen) , 7 aa, SEQ ID NO: 2038
NP NLS (nucleoplasmin NLS) (bipartite nuclear localization signal from nucleoplasmin) , 16 aa, SEQ ID NO: 2039
bGH polyA (bovine growth hormone polyadenylation signal) , 208 nt, SEQ ID NO: 2040
CMV enhancer (human cytomegalovirus immediate early enhancer) , 304 nt, SEQ ID NO: 2041
CMV promoter (human cytomegalovirus (CMV) immediate early promoter) , 204 nt, SEQ ID NO: 2042
mCherry, 236 aa, SEQ ID NO: 2043
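For orientation, the 5’ to 3’ order of the annotated plasmid elements can be written down as an ordered list of records. The sketch below only restates the names and lengths given above and adds them up; the `Element` class and the helper function are hypothetical. The total is a lower bound, because the guide RNA coding sequence, the Kozak sequence, stop codons, any linkers, and the plasmid backbone have no length stated here and are not counted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    name: str
    unit: str               # "nt" for nucleic-acid elements, "aa" for encoded polypeptides
    length: Optional[int]   # None where no length is stated in the text


# Elements in the 5'->3' order given in the description.
CASSETTE: List[Element] = [
    Element("U6 promoter", "nt", 241),
    Element("guide RNA coding sequence", "nt", None),
    Element("CBh promoter", "nt", 794),
    Element("Kozak sequence", "nt", None),
    Element("3xFLAG", "aa", 23),
    Element("SV40 NLS", "aa", 7),
    Element("Cas12Max", "aa", 1080),
    Element("NP NLS", "aa", 16),
    Element("bGH polyA signal", "nt", 208),
    Element("CMV enhancer", "nt", 304),
    Element("CMV promoter", "nt", 204),
    Element("mCherry", "aa", 236),
]


def annotated_length_nt(elements: List[Element]) -> int:
    """Sum the stated lengths in nucleotides, counting 3 nt per encoded amino acid."""
    total = 0
    for element in elements:
        if element.length is None:
            continue  # length not stated; skipped, so the result is a lower bound
        total += element.length * 3 if element.unit == "aa" else element.length
    return total


fusion_aa = sum(e.length for e in CASSETTE[4:8])  # 3xFLAG + SV40 NLS + Cas12Max + NP NLS
print("Cas12Max fusion, stated residues:", fusion_aa)
print("annotated elements, lower bound (nt):", annotated_length_nt(CASSETTE))
```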
Transfection and Detection:
HEK293T cells were cultured in 24-well tissue culture plates according to standard methods for 12 hours before the expression plasmid was transfected into the cells by standard polyethyleneimine (PEI) transfection. The transfected cells were then cultured at 37 ℃ under 5% CO2 for 48 hours. The cultured cells expressing mCherry were then sorted by flow cytometry, and the cleavage activity of the CRISPR-Cas12i system was measured as the indel rate (%), determined by sequencing with a pair of primers upstream and downstream of the protospacer sequence followed by TIDE analysis.
Results:
The cleavage activity of the CRISPR-Cas12i system for each tested protospacer/spacer sequence is shown in Table 2. The results show significant cleavage activity of the CRISPR-Cas12i system at the listed genes, for example, up to 91.8% for the TRAC gene, indicating the promise of the CRISPR-Cas12i system for targeted gene editing.
Table 2: Cleavage activity of the CRISPR-Cas12i system for various target genes



A second round of experiments was conducted for the B2M gene to evaluate the cleavage activity of the CRISPR-Cas12i system comprising hfCas12Max (SEQ ID NO: 2044) instead of Cas12Max, showing that a B2M cleavage activity of up to 99.35% was achieved by using this CRISPR-Cas12i system (Table 3).
hfCas12Max (SiCas12i-N243R, E336R, D892R) , 1080 aa, SEQ ID NO: 2044
Table 3: Cleavage activity of the CRISPR-Cas12i system for target gene B2M
EXAMPLE 2. Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in T cells
This example demonstrates the cleavage activity of CRISPR-Cas12i system at a target gene (e.g., TRAC, B2M, PD1, IL2RG, CCR5, KLRG1, CD52, CD7, LAG3, CTLA4, TIM3, CIITA, TIGIT, CD16a, CXCR4) in human primary T cells.
Electroporation
1. Bringing SF /P3 Primary Cell Nucleofector Solution to room temperature.
2. Preparing RNP (ribonucleoprotein) by complexing a purified Cas12i polypeptide and a chemically synthesized guide RNA (Genscript) in a ratio in PBS and then incubating at room temperature for >15 min prior to electroporation.
3. Aspirating supernatant from the T cell pellet. Resuspending cells in 20 μL of SF /P3 Primary Cell Nucleofector Solution and pipetting up and down to mix.
4. Transferring 20 μL of the cell suspension to each 5 μL RNP Complex Mix and pipetting up and down gently to mix, trying not to form air bubbles.
5. Transferring 25 μL of the cell suspension + RNP Complex Mix to a well of a 16-well Nucleocuvette Strip. Gently tapping or using a pipette tip to ensure no air bubbles are present.
6. Placing the Nucleocuvette Strip in the Shuttle device of the 4D-Nucleofector X Unit and choosing the optimal pulse code (HEK293 cells: SF, DS-150; hiPS and T cells: P3, CA-137) , selecting OK to load the strip and selecting Start to begin electroporation.
7. Immediately after electroporation, transferring the cells to the warm (37 ℃) plate.
8. Incubating at 37 ℃ and 5% CO2 for 48-96 hours.
9. Harvesting cells for assessment of gene cleavage.
Measurement of indel frequencies
1. Extracting genomic DNA from the harvested cells.
2. Amplifying DNA product with 50 μL PCR mix, containing 2 μL DNA, 2 μL forward primer, 2 μL reverse primer, 25 μL 2xTaq PCR mix, and 19 μL ddH2O.
3. PCR cycle conditions: 95 ℃ for 3 min; 35 cycles of 95 ℃ for 20 s, 60 ℃ for 20 s, and 72 ℃ for 30 s (500 bp) ; 72 ℃ for 5 min.
4. Sanger sequencing the PCR products.
5. Importing the sequencing traces to TIDE software for indel frequency measurement.
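The PCR set-up in steps 2 and 3 above lends itself to a quick arithmetic check. The sketch below restates the per-reaction volumes and cycling conditions from those steps, confirms that the components add up to 50 μL, scales the mix to several reactions, and estimates the programmed hold time. The function names and the optional overage parameter are assumptions added for illustration, and the time estimate ignores ramping between temperatures.

```python
from typing import Dict

# Per-reaction volumes in microlitres, as listed in step 2.
PCR_MIX_UL: Dict[str, float] = {
    "genomic DNA": 2.0,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "2x Taq PCR mix": 25.0,
    "ddH2O": 19.0,
}


def scaled_mix(n_reactions: int, overage: float = 0.0) -> Dict[str, float]:
    """Scale every component to n reactions.

    `overage` is a hypothetical pipetting allowance (e.g. 0.1 for 10% extra);
    the protocol itself does not specify one.
    """
    if n_reactions < 1 or overage < 0:
        raise ValueError("n_reactions must be >= 1 and overage must be >= 0")
    factor = n_reactions * (1.0 + overage)
    return {name: volume * factor for name, volume in PCR_MIX_UL.items()}


def program_seconds(cycles: int = 35) -> int:
    """Programmed hold time for step 3, excluding temperature ramping."""
    initial_denaturation = 3 * 60   # 95 C for 3 min
    per_cycle = 20 + 20 + 30        # 95 C 20 s, 60 C 20 s, 72 C 30 s
    final_extension = 5 * 60        # 72 C for 5 min
    return initial_denaturation + cycles * per_cycle + final_extension


print("volume per reaction (uL):", sum(PCR_MIX_UL.values()))
print("ddH2O for 4 reactions (uL):", scaled_mix(4)["ddH2O"])
print("programmed time (min):", round(program_seconds() / 60, 1))
```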
Results:
To identify an optimal molar ratio of the Cas12i polypeptide to the gRNA for preparing RNP for a specific test Cas12/gRNA pair, three molar ratios were tested with hfCas12Max and TRAC gRNA-3-long (SEQ ID NO: 2047), with the concentration of hfCas12Max held at 3.2 μM for all three molar ratios. To facilitate gRNA preparation, TRAC gRNA-3-long was prepared in a long form of DR-spacer-DR-spacer. All the gRNAs in Example 2 contain a polyU tail composed of four uracils (UUUU) (shown as polyT tail TTTT) at the 3’-end of the gRNA.
Table 4
wherein * denotes a 2’-O-methyl and phosphorothioate modification on the ribonucleotide, DR sequences (SEQ ID NOs: 2032 and 2033, respectively) are in bold, and spacer sequences are double-underlined.
Table 5

It was observed that a higher cleavage activity was achieved with an excess/supersaturating amount of gRNA (Molar Ratio 2 and Molar Ratio 3, i.e., more moles of gRNA than of hfCas12Max) than with an undersaturating amount of gRNA (Molar Ratio 1, i.e., fewer moles of gRNA than of hfCas12Max). Molar Ratio 3, which achieved the highest cleavage activity, was then used in the subsequent tests.
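The distinction between undersaturating and excess gRNA comes down to simple mole arithmetic, sketched below. The numerical ratios used here (0.5, 2, and 4 mol gRNA per mol protein) are hypothetical placeholders and are not the values of Molar Ratios 1 to 3. The sketch also assumes, for illustration only, that 3.2 μM refers to the 25 μL electroporation volume; the function name is likewise hypothetical.

```python
from typing import Dict, Union


def rnp_amounts(protein_um: float, grna_per_protein: float,
                volume_ul: float) -> Dict[str, Union[float, str]]:
    """Convert a protein concentration and a gRNA:protein molar ratio into pmol.

    1 uM equals 1 pmol/uL, so concentration (uM) x volume (uL) gives pmol.
    """
    if protein_um <= 0 or grna_per_protein <= 0 or volume_ul <= 0:
        raise ValueError("all inputs must be positive")
    protein_pmol = protein_um * volume_ul
    grna_pmol = protein_pmol * grna_per_protein
    if grna_per_protein > 1:
        regime = "excess gRNA"
    elif grna_per_protein < 1:
        regime = "undersaturating gRNA"
    else:
        regime = "equimolar"
    return {"protein_pmol": protein_pmol, "grna_pmol": grna_pmol, "regime": regime}


# Hypothetical ratios (mol gRNA per mol protein), chosen only to show both regimes.
EXAMPLE_RATIOS = {"under": 0.5, "excess_low": 2.0, "excess_high": 4.0}

results = {
    label: rnp_amounts(protein_um=3.2, grna_per_protein=ratio, volume_ul=25.0)
    for label, ratio in EXAMPLE_RATIOS.items()
}
for label, amounts in results.items():
    print(label, amounts)
```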
To identify an optimal amount of the Cas12i polypeptide, five amounts of hfCas12Max in Molar Ratio 3 with TRAC gRNA-3-long were tested.
Table 6
It was observed that the cleavage activity was dose-dependent, and the highest cleavage activity of 90.0% was achieved with 3.2 μM of hfCas12Max.
With Molar Ratio 3, the cleavage activity of the CRISPR-Cas12i system with TRAC gRNA-1, -2, and -3 (synthesized in a long form of DR-spacer-DR-spacer) was tested.
Table 7
It was observed that significant TRAC gene cleavage was achieved for all three TRAC gRNAs, reaching almost 100% for TRAC gRNA-2 and -3 (Exp Groups 11-17). It was also noted that lower off-target gene editing was achieved with TRAC gRNA-2-long than with TRAC gRNA-3-long (data not shown).
Similar experiments were conducted with B2M gRNA-3, -5, and -11 (synthesized in a short form of DR-spacer) with hfCas12Max : gRNA Molar Ratio 3 to test the cleavage activity of the CRISPR-Cas12i system against B2M gene.
Table 8

wherein * denotes a 2’-O-methyl and phosphorothioate modification on the ribonucleotide, DR sequences (SEQ ID NO: 2033) are in bold, and spacer sequences are double-underlined.
Table 9
It was observed that about 90% B2M gene cleavage was achieved with B2M gRNA-11.
Table 10
wherein * denotes a 2’-O-methyl and phosphorothioate modification on the ribonucleotide, DR sequence (SEQ ID NO: 2033) is in bold, and spacer sequence is double-underlined.
Further experiment 21 was conducted with RNP comprising hfCas12Max and both B2M gRNA-11 and TRAC gRNA-2 (both in a short form of DR-spacer) to evaluate the simultaneous knock out of multiple target genes.
Table 11
It was observed that the use of B2M gRNA-11 and TRAC gRNA-2 achieved simultaneous and almost complete knock out of the two target genes TRAC and B2M (Table 11) .
EXAMPLE 3. Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in NK cells
This example provides a strategy for using a CRISPR-Cas12i system to knock out a gene (e.g., PD-1, LAG-3, TIM-3, TIGIT, CISH, TGFBR2, KLRC1 (NKG2A) , A2AR, KLRK1, KLRG1) in NK cells, e.g., to produce universal NK cells.
NK cells from a donor are collected and counted using an automated cell counter. A sample from each donor is collected and analyzed for viability. NK cells are expanded in NK-cell media. Following expansion, cells are collected, counted, and cell density is adjusted in P3 primary cell buffer (Lonza) .
A Cas12i-gRNA mixture is prepared by mixing purified xCas12i or its variant and a gRNA targeting a gene (e.g., PD-1, LAG-3, TIM-3, TIGIT, CISH, TGFBR2, KLRC1 (NKG2A) , A2AR, KLRK1, KLRG1) in NK cells in a ratio. Exemplary protospacer sequence /spacer sequences for such a gRNA may be found in SEQ ID NOs: 1-2029.
NK cells are dispensed into an electroporation plate and the Cas12i-gRNA mixture is added to the cells. Several different final concentrations of the Cas12i-gRNA mixture are used (e.g., a final concentration of 2 μM, 5 μM, 10 μM, or 16 μM). The plate is electroporated using an electroporation device. Following electroporation, replacement media is added to quench the reaction and cells are transferred to a new plate with pre-warmed media. The NK cells are then incubated until further analysis. The NK cells successfully modified according to the method are identified by an insertion or deletion created in the target gene.
A nucleic acid encoding a chimeric antigen receptor (CAR) is optionally introduced into the NK cells before or after the introduction of the CRISPR-Cas12i system.
EXAMPLE 4. Evaluation of gene cleavage (gene knock out) by CRISPR-Cas12i system in hematopoietic stem cells (HSC)
This example provides a strategy for using a CRISPR-Cas12i system to knock out a gene (e.g., BCL11A, HBB, HEXB) in HSC, e.g., to treat a hemoglobinopathy.
Bone marrow CD34+ hematopoietic stem cells are assessed for cell number and viability by acridine orange/propidium iodide staining on a cell counter. CD34+ cells are cultured in serum-free expansion media with the appropriate supplement for approximately 48 hours.
A Cas12i-gRNA mixture is prepared by mixing purified xCas12i or its variant and a gRNA targeting a gene (e.g., BCL11A, HBB, HEXB) in HSC in a ratio. Exemplary protospacer sequence /spacer sequences for such a gRNA may be found in SEQ ID NOs: 1-2029.
A donor DNA template corresponding to the wildtype sequence of the target gene (e.g., BCL11A, HBB, HEXB) may also be added for insertion at the cleavage site of the target gene. The sequence of the donor DNA template is adjusted based on the gRNA used so that the sequence of the modified gene (e.g., BCL11A, HBB, HEXB) of the modified CD34+ cells after modification reflects the corresponding wildtype sequence.
Cells are washed with PBS and resuspended in buffer and supplement (Lonza #VXP-3032) with transfection enhancer oligo. Cells are dispensed into an electroporation plate, and the Cas12i-gRNA mixture with a donor DNA template encoding a wild-type gene (e.g., BCL11A, HBB, HEXB) is added to the cells. Several different final concentrations of the Cas12i-gRNA mixture are used (e.g., a final concentration of 2 μM, 5 μM, 10 μM, or 16 μM). The plate is electroporated using an electroporation device. Following electroporation, replacement media is added to quench the reaction and cells are transferred to a new plate with pre-warmed media. Cells are then incubated until further analysis. Cells successfully modified according to the method are identified as expressing the wild-type gene (e.g., BCL11A, HBB, HEXB).
Various modifications and variations of the described products, methods, and uses of the disclosure will be apparent to those skilled in the art without departing from the scope and spirit of the disclosure. Although the disclosure has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the disclosure as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the disclosure that are obvious to those skilled in the art are intended to be within the scope of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known customary practice within the art to which the disclosure pertains and as may be applied to the essential features hereinbefore set forth.

Claims (56)

  1. A guide RNA comprising:
    (1) a scaffold sequence capable of forming a complex with a nucleic acid programmable DNA binding protein (napDNAbp) ; and
    (2) a guide sequence capable of hybridizing to a target sequence on a target strand of a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC in a target cell, thereby guiding the complex to the target gene;
    wherein the target gene comprises a protospacer sequence on the nontarget strand of the target gene and a protospacer adjacent motif (PAM) adjacent (e.g., 5’) to the protospacer sequence, wherein the protospacer sequence is fully complementary to the target sequence on the target strand of the target gene.
  2. The guide RNA of claim 1, wherein the PAM comprises, consists essentially of, or consists of sequence 5’-TTN-3’, wherein N is A, T, G, or C; optionally wherein the PAM is 5’ or 3’ to the protospacer sequence; optionally wherein the PAM is immediately adjacent to the protospacer sequence; optionally wherein the PAM is immediately 5’ or 3’ to the protospacer sequence.
  3. The guide RNA of claim 1 or 2, wherein the protospacer sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides; optionally wherein the protospacer sequence is about 20 nucleotides in length.
  4. The guide RNA of any one of claims 1-3, wherein the protospacer sequence comprises at least about 14 contiguous nucleotides of the nontarget strand of the target gene (e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, or more contiguous nucleotides of the nontarget strand of the target gene, or in a numerical range between any of two preceding values, e.g., from about 14 to about 50 contiguous nucleotides of the target gene) ; optionally wherein the protospacer sequence comprises, consists essentially of, or consists of 20 contiguous nucleotides of the nontarget strand of the target gene.
  5. The guide RNA of any one of claims 1-4, wherein the guide sequence is at least about 14 nucleotides in length, e.g., about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides in length, or in a length of a numerical range between any of two preceding values, e.g., in a length of from about 16 to about 50 nucleotides; optionally wherein the guide sequence is about 20 nucleotides in length.
  6. The guide RNA of any one of claims 1-5, wherein the guide sequence is about 90 to about 100%, optionally about 100%, complementary to the target sequence; or wherein the guide sequence comprises no mismatch with the target sequence in the first 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nucleotides from the 5’ end of the guide sequence.
  7. The guide RNA of any one of claims 1-6, wherein the protospacer sequence or guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-2029 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-2029.
  8. The guide RNA of claim 7, wherein the sequence of any one of SEQ ID NOs: 1-2029 is identified from the nontarget strand of the target gene by identifying a PAM on the nontarget strand of the target gene and selecting the sequence immediately 3’ to the PAM.
  9. The guide RNA of any one of claims 1-6, wherein the target gene is A2AR, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1-73 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1-73.
  10. The guide RNA of any one of claims 1-6, wherein the target gene is AAVS1, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 74-79; optionally any one of SEQ ID NOs: 74, 75, 76, 77, 78, and 79.
  11. The guide RNA of any one of claims 1-6, wherein the target gene is B2M, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 80-129; optionally any one of SEQ ID NOs: 80, 81, 82, 83, 85, 86, 88, 90, 92, 94, 96, 100, 104, 105, 111, 112, 117, 118, 120, 126, and 129; optionally any one of SEQ ID NOs: 80, 90, and 117.
  12. The guide RNA of any one of claims 1-6, wherein the target gene is BCL11A, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 130-368; optionally any one of SEQ ID NOs: 130, 131, 132, 133, 134, 135, 136, and 137.
  13. The guide RNA of any one of claims 1-6, wherein the target gene is CCR5, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 369-520; optionally any one of SEQ ID NOs: 369, 370, 371, 372, 373, 374, and 375.
  14. The guide RNA of any one of claims 1-6, wherein the target gene is CD16a, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 521-573; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 521-573.
  15. The guide RNA of any one of claims 1-6, wherein the target gene is CD52, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 574-591; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 574-591.
  16. The guide RNA of any one of claims 1-6, wherein the target gene is CD7, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 592-608; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 592-608.
  17. The guide RNA of any one of claims 1-6, wherein the target gene is CIITA, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 609-792; optionally any one of SEQ ID NOs: 609 and 610.
  18. The guide RNA of any one of claims 1-6, wherein the target gene is CISH, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 793-833; optionally any one of SEQ ID NOs: 793, 794, and 795.
  19. The guide RNA of any one of claims 1-6, wherein the target gene is CTLA4, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 834-885; optionally any one of SEQ ID NOs: 834, 835, 836, 837, 838, 839, 840, 841, and 842.
  20. The guide RNA of any one of claims 1-6, wherein the target gene is CXCR4, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 886-1002; optionally any one of SEQ ID NOs: 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, and 897.
  21. The guide RNA of any one of claims 1-6, wherein the target gene is HBB, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1003-1048; optionally any one of SEQ ID NOs: 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, and 1011.
  22. The guide RNA of any one of claims 1-6, wherein the target gene is HEXB, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1049-1246; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1049-1246.
  23. The guide RNA of any one of claims 1-6, wherein the target gene is IL2RG, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1247-1353; optionally any one of SEQ ID NOs: 1247, 1248, 1249, 1250, 1251, 1252, and 1253.
  24. The guide RNA of any one of claims 1-6, wherein the target gene is KLRG1, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1354-1414; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100% to any one of SEQ ID NOs: 1354-1414.
  25. The guide RNA of any one of claims 1-6, wherein the target gene is LAG3, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1415-1492; optionally any one of SEQ ID NOs: 1415, 1416, 1417, and 1418.
  26. The guide RNA of any one of claims 1-6, wherein the target gene is NKG2A, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1493-1635; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1493-1635.
  27. The guide RNA of any one of claims 1-6, wherein the target gene is PD1, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, 1639; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1636-1670; optionally any one of SEQ ID NOs: 1636, 1637, 1638, 1639.
  28. The guide RNA of any one of claims 1-6, wherein the target gene is TGFBR2, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1671-1801; optionally any one of SEQ ID NOs: 1671, 1672, 1673, 1674, 1675, 1676, 1677, 1678, 1679, 1680, and 1681.
  29. The guide RNA of any one of claims 1-6, wherein the target gene is TIGIT, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1802-1845; optionally any one of SEQ ID NOs: 1802, 1803, 1804, 1805, 1806, 1807, 1808, and 1809.
  30. The guide RNA of any one of claims 1-6, wherein the target gene is TIM3, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1846-1932; optionally any one of SEQ ID NOs: 1846 and 1847.
  31. The guide RNA of any one of claims 1-6, wherein the target gene is TRAC, and the protospacer sequence or the guide sequence comprises at least about 14 (e.g., 20) contiguous nucleotides of (1) a sequence of any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937; or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 95%, or 100%to any one of SEQ ID NOs: 1933-2029; optionally any one of SEQ ID NOs: 1933, 1934, 1935, and 1937.
  32. The guide RNA of any one of claims 1-31, wherein the guide RNA comprises one scaffold sequence and one guide sequence in the structure (or configuration) of 5’-scaffold sequence -guide sequence -3’ or 5’-guide sequence -scaffold sequence -3’, and wherein the “-” between the scaffold sequence and the guide sequence represents an optional linker.
  33. The guide RNA of any one of claims 1-31, wherein the guide RNA comprises two scaffold sequences and one guide sequence in the structure (or configuration) of 5’-scaffold sequence -guide sequence -scaffold sequence -3’, wherein the two scaffold sequences are the same or different, and wherein each “-” between the scaffold sequence and the guide sequence represents an optional linker.
  34. The guide RNA of any one of claims 1-33, wherein the guide RNA comprises at its 3’ end a 3’ modified poly U tail containing a 2’-O-methyl and phosphorothioate modification;
    optionally the 3’ modified poly U tail comprises three, four, five, six, seven, or more uracils;
    optionally the 3’ modified poly U tail comprises a 2’-O-methyl and phosphorothioate modification on its first two, three, four, five, six, seven, or more 5’ end uracils;
    optionally the 3’ modified poly U tail comprises a 2’-O-methyl and phosphorothioate modification on each of its first two, three, four, five, six, seven, or more 5’ end uracils;
    optionally the 3’ modified poly U tail consists of four uracils with 2’-O-methyl and phosphorothioate modifications on each of its first three 5’ end uracils.
  35. The guide RNA of any one of claims 1-34, wherein the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification;
    optionally the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on one, two, or three nucleotides of the first three nucleotides;
    optionally the guide RNA comprises at its 5’ end a 2’-O-methyl and phosphorothioate modification on each of the first three nucleotides.
  36. The guide RNA of any one of claims 1-35, wherein the scaffold sequence:
    (1) is as set forth in SEQ ID NO: 2032 or 2033;
    (2) comprises the sequence of SEQ ID NO: 2032 or 2033; or
    (3) comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of SEQ ID NO: 2032 or 2033; or
    (4) comprises at least about 14 (e.g., at least about 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30) contiguous nucleotides of (1) a sequence of SEQ ID NO: 2032 or 2033 or (2) a sequence having a sequence identity of at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% to SEQ ID NO: 2032 or 2033.
  37. The guide RNA of any one of claims 1-36, wherein the scaffold sequence has substantially the same secondary structure as the secondary structure of SEQ ID NO: 2032 or 2033.
  38. The guide RNA of any one of claims 1-37, wherein the guide RNA:
    (1) is as set forth in any one of SEQ ID NOs: 2045-2051;
    (2) comprises the sequence of any one of SEQ ID NOs: 2045-2051; or
    (3) comprises a sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the sequence of any one of SEQ ID NOs: 2045-2051.
  39. A system comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) or a polynucleotide encoding the napDNAbp; and (2) the guide RNA of any one of claims 1-38, or a polynucleotide encoding the guide RNA; optionally wherein the polynucleotide encoding the napDNAbp and/or the polynucleotide encoding the guide RNA is a DNA or an RNA; and optionally wherein the system further comprises a donor sequence (e.g., encoding a chimeric antigen receptor (CAR)) for insertion into the genome of the target cell, e.g., by homologous recombination.
  40. A complex comprising (1) a nucleic acid programmable DNA binding protein (napDNAbp) complexed with (2) the guide RNA of any one of claims 1-38; optionally wherein the complex further comprises a target sequence on a target strand of the target gene; optionally hybridized to the guide sequence of the guide RNA.
  41. The system or complex of claim 39 or 40, wherein the napDNAbp is capable of recognizing a PAM comprising, consisting essentially of, or consisting of sequence 5’-TTN-3’, wherein N is A, T, G, or C.
  42. The system or complex of any one of claims 39-41, wherein the napDNAbp is a Class 2, Type II CRISPR-associated protein (Cas9).
  43. The system or complex of any one of claims 39-41, wherein the napDNAbp is a Class 2, Type V CRISPR-associated protein (Cas12).
  44. The system or complex of claim 43, wherein the Cas12 is Cas12i, Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), Cas12f (Cas14), or Cas12k (C2c10, C2C7); optionally wherein the Cas is Cas12i.
  45. The system or complex of claim 44, wherein the Cas12i:
    (1) is as set forth in SEQ ID NO: 2030, 2031, or 2044;
    (2) comprises the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044; or
    (3) comprises an amino acid sequence having a sequence identity of at least about 60% (e.g., at least about 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or 100%) to the amino acid sequence of SEQ ID NO: 2030, 2031, or 2044.
  46. A polynucleotide encoding the guide RNA of any one of claims 1-38.
  47. A ribonucleoprotein (RNP) comprising the system or complex of any one of claims 39-45 comprising the napDNAbp and the guide RNA; optionally wherein the RNP comprises an excess or supersaturation amount of the guide RNA over the napDNAbp; optionally wherein the RNP comprises the napDNAbp and the guide RNA in a ratio in a range of about 1:1 to about 1:2, e.g., about 1:1.1, 1:1.2, 1:1.3, 1:1.4, 1:1.5, 1:1.6, 1:1.7, 1:1.8, 1:1.9, 1:2, e.g., a ratio of 1:1.875, or in a range between any two of the preceding ratios, e.g., a ratio of about 1:1.7 to about 1:1.9.
  48. A lipid nanoparticle (LNP) comprising the system of any one of claims 39-45 comprising an mRNA encoding the napDNAbp and the guide RNA.
  49. A method for modifying a target gene in a target cell, comprising contacting the target cell with the system or complex of any one of claims 39-45, the RNP of claim 47, or the LNP of claim 48, wherein the guide sequence is capable of hybridizing to a target sequence on a target strand of the target gene, wherein the target gene is modified (e.g., cleaved) by the complex.
  50. A method of producing a modified target cell, comprising:
    (1) optionally harvesting a target cell from a subject;
    (2) optionally sorting and/or optionally amplifying the harvested target cell;
    (3) modifying a target gene in the (optionally sorted and/or optionally amplified) target cell by the method of claim 49;
    (4) optionally inserting a donor sequence (e.g., a chimeric antigen receptor (CAR)-encoding donor sequence) into the genome of the target cell; and
    (5) optionally purifying the modified target cell.
  51. A cell or a progeny thereof, wherein the cell is modified by the method of claim 50; optionally wherein the cell is a modified human CAR-T cell, e.g., a universal modified human CAR-T cell.
  52. A method for preventing or treating a disease or disorder in a subject, comprising administering to the subject (e.g., an effective amount of) the system or complex of any one of claims 39-45, the RNP of claim 47, the LNP of claim 48, or the cell or progeny thereof of claim 51.
  53. The method of claim 52, wherein the disease or disorder is associated with a target gene selected from the group consisting of A2AR, AAVS1, B2M, BCL11A, CCR5, CD16a, CD52, CD7, CIITA, CISH, CTLA4, CXCR4, HBB, HEXB, IL2RG, KLRG1, LAG3, NKG2A, PD1, TGFBR2, TIGIT, TIM3, and TRAC.
  54. The method of claim 52 or 53, wherein the disease or disorder is associated with a target or an antigen selected from the group consisting of CD3, CD5, CD7, CD19, CD20, CD22, CD30, CD33, CD38, CD79b, CD123, CD138, CD269 (BCMA), ROR1, Mesothelin, GPC3, GD2, CEA, EGFR (e.g., EGFR vIII), HER2, PSMA, MUC-1, EPHA2, VEGF, VEGFR, FAP, tenascin, SLAMF7, CLDN18.2, and EpCAM.
  55. The method of any one of claims 52-54, wherein the disease or disorder is a cancer, e.g., a hematologic malignancy or solid cancer, such as chronic lymphocytic leukemia (CLL), acute lymphoblastic leukemia (ALL).
  56. The method of any one of claims 52-54, wherein the disease or disorder is a hematologic disease or disorder, e.g., thalassemia, sickle cell disease, β-hemoglobinopathy, β-thalassemia.
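
As an illustrative sketch only, the sequence criteria recited in the claims above (a run of at least about 14 contiguous nucleotides, a percent sequence identity, a 5’-TTN-3’ PAM, and the 5’-scaffold-guide-3’ or 5’-guide-scaffold-3’ arrangement with an optional linker) could be screened computationally along the following lines. The function names, the placeholder sequences, and the use of ungapped position-by-position identity are assumptions made for illustration. They are not taken from the application, the placeholder strings are not any of the listed SEQ ID NOs, and the specification's own definition of sequence identity would govern in practice.

```python
import re


def normalize(seq: str) -> str:
    """Upper-case a nucleotide string and map RNA 'U' to DNA 'T'."""
    return seq.upper().replace("U", "T")


def shares_contiguous_run(candidate: str, reference: str, min_len: int = 14) -> bool:
    """Return True if candidate contains any min_len-long window of reference."""
    if min_len < 1:
        raise ValueError("min_len must be positive")
    cand, ref = normalize(candidate), normalize(reference)
    # Slide a window of min_len over the reference and look for it in the candidate.
    return any(ref[i:i + min_len] in cand for i in range(len(ref) - min_len + 1))


def ungapped_identity(a: str, b: str) -> float:
    """Percent identity of two equal-length sequences, compared position by position.

    Assumption: no gaps and no alignment step; a real analysis would align first.
    """
    a, b = normalize(a), normalize(b)
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_ttn_pam(pam: str) -> bool:
    """Return True for a 3-nt PAM of the form 5'-TTN-3' where N is A, T, G, or C."""
    return re.fullmatch(r"TT[ACGT]", normalize(pam)) is not None


def assemble_guide(scaffold: str, guide: str, linker: str = "",
                   scaffold_first: bool = True) -> str:
    """Join one scaffold and one guide sequence, 5' to 3', with an optional linker."""
    parts = [scaffold, linker, guide] if scaffold_first else [guide, linker, scaffold]
    return "".join(parts)


if __name__ == "__main__":
    reference = "ATGCCGTTAGCTAAGGCTTA"          # hypothetical 20-nt protospacer
    candidate = "GG" + reference[3:18] + "AA"   # hypothetical guide sharing 15 nt
    print(shares_contiguous_run(candidate, reference))
    print(ungapped_identity("ACGU", "ACGA"))
    print(is_ttn_pam("TTA"))
    print(assemble_guide("SCAFFOLD", "GUIDE"))
```

The window search treats "at least about 14" as exactly 14 by default, so the `min_len` parameter would need adjusting to explore the other lengths recited in the claims.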
PCT/CN2023/077462 2022-02-21 2023-02-21 Guide rna and uses thereof Ceased WO2023155924A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN2022077116 2022-02-21
CNPCT/CN2022/077116 2022-02-21
CNPCT/CN2022/142073 2022-12-26
CN2022142073 2022-12-26

Publications (1)

Publication Number Publication Date
WO2023155924A1 (en) 2023-08-24

Family

ID=87577629

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/077462 Ceased WO2023155924A1 (en) 2022-02-21 2023-02-21 Guide rna and uses thereof

Country Status (1)

Country Link
WO (1) WO2023155924A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180119140A1 (en) * 2015-04-06 2018-05-03 The Board Of Trustees Of The Leland Stanford Junior University Chemically Modified Guide RNAs for CRISPR/CAS-Mediated Gene Regulation
US20180362975A1 (en) * 2015-12-04 2018-12-20 Novartis Ag Compositions and methods for immunooncology
WO2018176009A1 (en) * 2017-03-23 2018-09-27 President And Fellows Of Harvard College Nucleobase editors comprising nucleic acid programmable dna binding proteins
WO2021097521A1 (en) * 2019-11-20 2021-05-27 Cartherics Pty. Ltd. Method for providing immune cells with enhanced function
CN113025613A (en) * 2021-03-11 2021-06-25 中国计量大学 ADORA2A gene knockout cell and construction method and application thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JONES KARLIE R., CHOI UIMOOK, GAO JI-LIANG, THOMPSON ROBERT D., RODMAN LARRY E., MALECH HARRY L., KANG ELIZABETH M.: "A Novel Method for Screening Adenosine Receptor Specific Agonists for Use in Adenosine Drug Development", SCIENTIFIC REPORTS, vol. 7, no. 1, XP093086396, DOI: 10.1038/srep44816 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117460822A (en) * 2022-04-25 2024-01-26 辉大基因治疗(新加坡)私人有限公司 Novel CRISPR-Cas12i system and its uses
CN117460822B (en) * 2022-04-25 2024-08-02 辉大基因治疗(新加坡)私人有限公司 Novel CRISPR-Cas12i system and application thereof
WO2024140737A1 (en) * 2022-12-26 2024-07-04 Huidagene Therapeutics Co., Ltd. Modified guide rna and uses thereof
WO2025038648A1 (en) * 2023-08-14 2025-02-20 Intellia Therapeutics, Inc. Compositions and methods for genetically modifying transforming growth factor beta receptor type 2 (tgfβr2)

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 23755924; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 23755924; Country of ref document: EP; Kind code of ref document: A1)