
WO2025201316A1 - CRISPR-Cas system - Google Patents

CRISPR-Cas system

Info

Publication number
WO2025201316A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
nucleic acid
protein
cell
target nucleic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2025/084684
Other languages
English (en)
Chinese (zh)
Inventor
徐讯
刘传
李百涛
祁琛
陈珂
刘金熙
蓝虹霞
郑越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Publication of WO2025201316A1
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]

Definitions

  • the present invention relates to the field of nucleic acid editing, in particular to nucleic acid editing based on clustered regularly interspaced short palindromic repeats (CRISPR).
  • the present invention relates to a system or composition comprising a Cas effector protein, and a vector system, a delivery composition, and a kit comprising the system or composition.
  • the present invention also relates to the use of these systems or compositions, vector systems, delivery compositions, and kits in nucleic acid editing, as well as methods for nucleic acid editing, nucleic acid detection, and disease treatment.
  • the CRISPR (clustered regularly interspaced short palindromic repeats) system is an adaptive immune system in prokaryotes with RNA-guided endonuclease activity.
  • the earliest discovered gene editing system was the Cas9 system, mediated by crRNA and tracrRNA (trans-activating crRNA).
  • This system uses certain Cas proteins, primarily Cas1 and Cas2, to capture bacteriophage viral DNA or exogenous plasmid DNA and insert it into the bacterium's native direct repeat sequence to form a CRISPR sequence.
  • the CRISPR sequence is transcribed into pre-crRNA, which is then processed and modified into crRNA.
  • the crRNA and tracrRNA combine through local base pairing and form a ribonucleoprotein (RNP) with the Cas9 nuclease, which has RNA-guided DNA endonuclease activity.
  • PAM: protospacer-adjacent motif
  • Type II and V CRISPR-Cas systems encode Cas1, Cas2, and several accessory proteins, such as Cas4.
  • Type VI CRISPR-Cas systems consist solely of the CRISPR array and an effector protein containing two ribonuclease domains of the HEPN superfamily. Furthermore, a notable feature of type II and type V CRISPR-Cas systems is the presence of RuvC-like nuclease domains in their effector proteins.
  • In type II effector proteins, the RuvC domain contains an inserted HNH nuclease domain; these effectors preferentially recognize guanine-rich PAM sequences and, under RNA guidance, cleave phosphodiester bonds to produce blunt ends.
  • various subtypes of type V effector proteins contain only RuvC-like domains and lack HNH nuclease domains. Instead, they preferentially recognize thymine-rich PAM sequences and cleave phosphodiester bonds to produce sticky ends.
  • CRISPR technology has developed rapidly and can be applied to gene editing in bacteria, archaea, and eukaryotic cells, but it has also exposed many problems.
  • the currently widely used Cas9 and Cas12a gene editing systems suffer from drawbacks such as large molecular weight, difficulty in delivery, limited editing sites, and susceptibility to off-target effects.
  • Cas9 and Cas12a are typically larger than 1200 amino acids, whereas the widely used adeno-associated virus (AAV) has a genome of only approximately 4.7 kb and therefore a limited packaging length. Inserting excessively long exogenous sequences may exceed the capacity of AAV, which poses a significant challenge to AAV delivery.
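The size argument can be made concrete with a little arithmetic. The sketch below converts a protein length into the length of its coding sequence (three nucleotides per residue plus one stop codon) and compares it with the approximately 4.7 kb AAV genome cited above. The 1,000 nt allowance for a promoter, guide RNA cassette, and other regulatory elements is an illustrative assumption, not a figure from the application, and the function names are hypothetical.

```python
AAV_GENOME_NT = 4700  # approximate AAV genome size cited in the text


def coding_length_nt(n_amino_acids: int) -> int:
    """Open reading frame length in nucleotides, including one stop codon."""
    return 3 * (n_amino_acids + 1)


def fits_in_aav(n_amino_acids: int, other_elements_nt: int = 1000) -> bool:
    """True if the ORF plus the assumed extra elements fit within AAV_GENOME_NT.

    other_elements_nt is a hypothetical allowance for promoter, guide RNA
    cassette, and polyadenylation signal; adjust it for a real construct.
    """
    return coding_length_nt(n_amino_acids) + other_elements_nt <= AAV_GENOME_NT


# Compare a Cas9-sized effector (SpCas9 is 1368 aa) with a 700 aa effector.
for label, size in [("1368 aa (SpCas9-sized)", 1368), ("700 aa", 700)]:
    print(label, coding_length_nt(size), "nt, fits:", fits_in_aav(size))
```

Under these assumptions a Cas9-sized open reading frame leaves no room for the remaining expression elements in a single AAV genome, while an effector of 700 amino acids or fewer leaves well over 1 kb to spare.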
  • the inventors of this application unexpectedly discovered a new class of nucleases with smaller molecular size and higher editing activity. Based on this discovery, the inventors developed a new CRISPR-Cas system and a gene editing method based on this system.
  • the present application identifies a new class of Cas proteins, which can be distinguished from other classes of Cas proteins by protein size, primary structure (e.g., motif), secondary structure (e.g., α-helical structure, β-sheet structure), functional domain (e.g., containing a zinc finger domain, containing a RuvC domain, and not containing an HNH domain) and/or biological activity.
  • the present invention provides a system or composition comprising:
  • the RuvC domain is capable of unwinding the double-stranded target nucleic acid (e.g., a DNA duplex or a DNA-RNA duplex) into a single-stranded target nucleic acid.
  • the amino acid sequences of the same domain contained in different proteins may be the same or different.
  • the zinc finger domain contained in the Cas protein C2c11 in the system or composition provided herein contains the amino acid sequence Y1-X1-X2-Y2, wherein X1 and X2 are each independently selected from any amino acid, Y1 and Y2 are each independently selected from amino acids other than Cys, and Y1 and Y2 can be the same or different amino acids.
  • the zinc finger domain of the Cas protein C2c11 in the system or composition provided herein contains the conserved amino acids Ser and Arg.
  • the conserved amino acid Ser (S) is located at position 425 of the Cas protein C2c11 corresponding to SEQ ID NO: 18.
  • the conserved amino acid Arg (R) is located at position 442 of the Cas protein C2c11 corresponding to SEQ ID NO: 18.
  • the RuvC domain is capable of cutting or breaking the phosphodiester bond in the target nucleic acid. In certain embodiments, after the RuvC domain cuts or breaks the target nucleic acid, a target nucleic acid with a sticky end is generated.
  • the Cas protein C2c11 comprises a WED domain.
  • the WED domain can participate in recognition of the PAM by the Cas protein C2c11, and/or in the processing of crRNA (e.g., processing pre-crRNA into mature crRNA).
  • the Cas protein C2c11 has cis-cleavage activity for cleaving a target nucleic acid (e.g., a double-stranded target nucleic acid, a single-stranded target nucleic acid).
  • when the Cas protein C2c11 of the present application forms a complex with the guide RNA, it is guided by the guide RNA to hybridize or anneal to the target nucleic acid, and the complex then cuts one or both strands of the target nucleic acid.
  • the Cas protein C2c11 has a trans-cleavage activity that can be activated by a target nucleic acid.
  • the Cas protein C2c11 can cut single-stranded nucleic acids (e.g., ssDNA, ssRNA) after being activated by a target nucleic acid.
  • the Cas protein C2c11 of the present application has a trans-cleavage activity that can be activated by a target nucleic acid.
  • both double-stranded and single-stranded target nucleic acids can activate the trans-cleavage activity of the Cas protein C2c11.
  • when the Cas protein C2c11 forms a complex with the guide RNA and hybridizes or anneals with the target nucleic acid, the complex is activated and has trans-cleavage activity against non-target single-stranded DNA sequences. That is, the complex can indiscriminately cleave single-stranded DNA of any sequence in the reaction system.
  • the amino acid residue at position 250 of the Cas protein C2c11 corresponding to SEQ ID NO: 18 is D. In certain embodiments, the amino acid residue at position 358 of the Cas protein C2c11 corresponding to SEQ ID NO: 18 is E. In certain embodiments, the amino acid residue at position 425 of the Cas protein C2c11 corresponding to SEQ ID NO: 18 is S. In certain embodiments, the amino acid residue at position 442 of the Cas protein C2c11 corresponding to SEQ ID NO: 18 is R. In certain embodiments, the amino acid residue at position 455 of the Cas protein C2c11 corresponding to SEQ ID NO: 18 is D.
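The residue constraints listed above lend themselves to a simple programmatic check. The following is a minimal sketch, assuming the candidate sequence already uses the numbering of SEQ ID NO: 18; a homolog would first need to be aligned to that reference, which is not shown. The function name and the toy sequence are hypothetical, and SEQ ID NO: 18 itself is not reproduced here.

```python
# Residues named in the text, keyed by 1-based position in SEQ ID NO: 18.
EXPECTED = {250: "D", 358: "E", 425: "S", 442: "R", 455: "D"}


def check_conserved(seq: str, expected: dict[int, str] = EXPECTED) -> dict[int, bool]:
    """Report, per position, whether seq carries the expected residue.

    Assumes seq shares the numbering of SEQ ID NO: 18. Positions beyond the
    end of seq are reported as False.
    """
    return {
        pos: len(seq) >= pos and seq[pos - 1] == aa
        for pos, aa in expected.items()
    }


# Toy sequence (not a real protein): alanines with the expected residues placed in.
toy = ["A"] * 500
for pos, aa in EXPECTED.items():
    toy[pos - 1] = aa
toy_seq = "".join(toy)

print(check_conserved(toy_seq))
```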
  • the Cas protein C2c11 comprises a WED domain, a REC domain, a RuvC domain, and a zinc finger domain.
  • the zinc finger domain is adjacent to the RuvC domain.
  • the RuvC domain comprises a RuvC-I subdomain, a RuvC-II subdomain and/or a RuvC-III subdomain. In certain embodiments, the RuvC domain consists of a RuvC-I subdomain, a RuvC-II subdomain and a RuvC-III subdomain.
  • the zinc finger domain comprises a zinc finger domain-I subdomain and/or a zinc finger domain-II subdomain. In certain embodiments, the zinc finger domain consists of a zinc finger domain-I subdomain and a zinc finger domain-II subdomain.
  • the Cas protein C2c11 comprises the amino acid sequence of the following domains and/or subdomains from N-terminus to C-terminus in sequence: WED-I subdomain, REC domain, WED-II subdomain, RuvC-I subdomain, RuvC-II subdomain, zinc finger domain-I subdomain, RuvC-III subdomain and zinc finger domain-II subdomain.
  • the microorganism is selected from bacteria, viruses (e.g., bacteriophages), or any combination thereof.
  • the bacteria is selected from the group consisting of Corynebacterium, Schutella, Legionella, Treponema, Filamentosa, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flavobacterium, Flavobacterium, Glomerella, Azospirillum, Gluconacetobacter, Neisseria, Roseburia, Microbacterium, Staphylococcus, Nitrobacter, Campylobacter, Carnegiea, Rhodobacter, Listeria, Parodia, Clostridium, Lachnospiraceae, Leptotrichia, Francisella, Alicyclobacillus, Methanophilus, Porphyromonas, Prevotella, Bacillus, Helicococcus, Spirochaete, Desulfovibrio, Desulfovibrio, Potassium Fungus, Truffle, Oscillatory Spirospira, Eubacterium, Ruminococcus, Lachnos
  • the microorganism is selected from a bacteriophage (e.g., a tailed phage).
  • the nucleotide sequence of the nucleic acid molecule A1 encoding the amino acid sequence of the Cas protein C2c11 is codon-optimized for expression in prokaryotic and/or eukaryotic cells.
  • the eukaryotic cells can be cells of a mammal or primate, including but not limited to humans, mice, rats, rabbits, and dogs.
  • codon optimization refers to a method of replacing at least one codon of a native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is more frequently or most frequently used in the genes of a host cell while maintaining the native amino acid sequence to enhance expression of the sequence of interest in a host cell.
  • Codon usage tables are readily available, for example, in the Codon Usage Database available at www.kazusa.or.jp/codon/, and these tables can be adapted for use in different ways.
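Codon optimization as defined above can be sketched in a few lines: each codon is swapped for a host-preferred synonymous codon, and the encoded amino acid sequence stays unchanged. The example below builds the standard genetic code programmatically. The `PREFERRED` table holds illustrative placeholder values, not entries from a real codon usage table, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preferences (amino acid -> preferred codon), for illustration only.
PREFERRED = {"L": "CTG", "A": "GCC", "K": "AAG", "S": "AGC"}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons, rejecting incomplete reading frames."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def codon_optimize(cds: str, preferred: dict[str, str]) -> str:
    """Replace codons with preferred synonymous codons; others are kept as-is."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in split_codons(cds))


native = "ATGTTAGCTAAATCTTAA"  # encodes M-L-A-K-S-stop
optimized = codon_optimize(native, PREFERRED)
print(optimized, translate(optimized) == translate(native))
```

A real workflow would derive the preference table from host-specific usage frequencies and would typically also screen for unwanted features such as restriction sites, which this sketch does not attempt.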
  • the Cas protein C2c11 is encoded by a nucleotide sequence shown in any one of SEQ ID NOs: 31-60, or by a nucleotide sequence having at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity thereto.
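To illustrate how such identity thresholds might be evaluated, the sketch below computes percent identity over a pairwise alignment that has already been produced by some alignment tool. The definition used here (identical non-gap columns divided by total alignment columns) is one common convention and is an assumption; the application may define sequence identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a precomputed pairwise alignment.

    Both inputs must be gapped strings of equal length, with '-' marking gaps.
    Columns containing a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy alignment: five identical columns and one gap column out of six.
print(round(percent_identity("ATGC-A", "ATGCTA"), 1))
```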
  • the protein of the present invention can be functionally linked (by chemical coupling, gene fusion, non-covalent linkage or other means) to one or more other molecular groups, such as another protein or polypeptide, a detection reagent or a pharmaceutical agent, etc.
  • the modifying moiety is selected from another protein or polypeptide, a detectable label, a purification tag, or any combination thereof.
  • the modifying moiety is linked to the N-terminus or C-terminus of the protein, optionally through a linker.
  • the protein of the present invention may be linked to a nuclear localization signal (NLS) sequence to enhance the ability of the protein of the present invention to enter the cell nucleus.
  • NLS nuclear localization signal
  • the protein of the present invention can be connected to a detectable label or reporter gene to facilitate detection of the protein of the present invention.
  • detectable labels are well known to those skilled in the art, for example fluorescent dyes such as FITC or DAPI.
  • reporter genes are well known to those skilled in the art, and examples thereof include but are not limited to GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, etc.
  • the protein of the present invention can be linked to an epitope tag to facilitate expression, detection, tracing, and/or purification of the protein of the present invention.
  • epitope tags are well known to those skilled in the art, and examples thereof include, but are not limited to, His, V5, FLAG, HA, Myc, VSV-G, Trx, and the like. Those skilled in the art know how to select an appropriate epitope tag based on the desired purpose (e.g., purification, detection, or tracing).
  • the modifying moiety is connected to the N-terminus or C-terminus of the protein of the present invention via a linker.
  • linkers are well known in the art, and examples thereof include, but are not limited to, linkers comprising one or more (e.g., 1, 2, 3, 4, or 5) amino acids (e.g., Glu or Ser) or amino acid derivatives (e.g., Ahx, β-Ala, GABA, or Ava) or PEG, etc.
  • the protein of the present invention is not limited by the method of production; for example, it can be produced by genetic engineering methods (recombinant technology) or by chemical synthesis methods.
  • the system or composition comprises two or more guide RNAs capable of hybridizing to different target nucleic acids or different regions of the same target nucleic acid.
  • the two or more guide RNAs guide the same Cas protein C2c11 or guide different Cas proteins C2c11 respectively.
  • the guide RNA comprises a guide sequence
  • the guide sequence is capable of hybridizing or annealing to a target nucleic acid under conditions that permit nucleic acid hybridization or annealing.
  • the guide sequence comprises a complementary sequence to the sequence of the target nucleic acid.
  • the guide sequence is at least 10 nt in length, e.g., 10-15 nt, 15-20 nt, 20-25 nt, 25-30 nt, 30-40 nt, 40-50 nt, 50-100 nt, 100-200 nt, or longer.
  • the guide RNA further comprises a backbone sequence.
  • the backbone sequence has a length of at least 20 nt, such as 20-30 nt, 30-40 nt, 40-50 nt, 50-100 nt, 100-200 nt, 200-300 nt or longer.
  • the guide sequence is located 3' to the backbone sequence.
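The layout described above (a backbone sequence followed at its 3' end by a guide sequence complementary to the target) can be sketched as follows. This is a minimal illustration: the backbone string is a hypothetical placeholder rather than a sequence from the application, the guide is assumed to be fully complementary to the given DNA strand, and the minimum lengths simply mirror the 10 nt and 20 nt lower bounds stated above.

```python
# Map each DNA base to its complementary RNA base.
DNA_TO_COMPLEMENTARY_RNA = str.maketrans("ACGT", "UGCA")


def guide_for_target(target_dna: str) -> str:
    """Return the RNA guide (5'->3') fully complementary to target_dna (5'->3')."""
    return target_dna.upper().translate(DNA_TO_COMPLEMENTARY_RNA)[::-1]


def assemble_guide_rna(backbone_rna: str, target_dna: str) -> str:
    """Place the guide sequence 3' of the backbone sequence."""
    guide = guide_for_target(target_dna)
    if len(guide) < 10:
        raise ValueError("guide sequence should be at least 10 nt")
    if len(backbone_rna) < 20:
        raise ValueError("backbone sequence should be at least 20 nt")
    return backbone_rna + guide


PLACEHOLDER_BACKBONE = "G" * 20  # hypothetical stand-in for a real backbone
print(assemble_guide_rna(PLACEHOLDER_BACKBONE, "ATGCATGCATGC"))
```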
  • the sequence of the guide RNA comprises at least one chemical modification.
  • the chemical modification is selected from pseudo-U, 5-methyl-C, methylated nucleotide or nucleotide analog, 2'-O-methyl, 2'-O-methyl 3' phosphorothioate, 2'-O-methyl 3' thio PACE, or any combination thereof.
  • the guide RNA includes crRNA and tracrRNA, wherein a partial sequence of crRNA serves as the guide sequence of the guide RNA, and the remaining sequence of crRNA and tracrRNA together serve as the backbone sequence of the guide RNA.
  • the crRNA comprises a repetitive sequence and a guide sequence, wherein the guide sequence is capable of hybridizing or annealing to the target nucleic acid under conditions that allow nucleic acid hybridization or annealing.
  • the repetitive sequence is located upstream of the guide sequence.
  • tracrRNA comprises a complementary repeat sequence, wherein, under conditions allowing nucleic acid hybridization or annealing, the complementary repeat sequence can hybridize or anneal to the repeat sequence of crRNA. It will be appreciated by those skilled in the art that the complementary repeat sequence and the repeat sequence do not need to be completely complementary. In certain embodiments, when optimally aligned, the degree of complementarity between the complementary repeat sequence and the repeat sequence can be at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or at least 99%.
  • tracrRNA further comprises a non-complementary repeat sequence that can form a stem-loop structure (also referred to as a "hairpin structure") in the secondary structure.
  • the non-complementary repeat sequence is located upstream of the complementary repeat sequence.
  • the tracrRNA consists of complementary repeat sequences and non-complementary repeat sequences.
  • a linker sequence is further included between the complementary repeat sequence of tracrRNA and the repeat sequence of crRNA.
  • the linker sequence is at least 2 nt in length, for example 2 nt, 3 nt, 4 nt, 5 nt, 6 nt, 7 nt, or 8 nt.
  • sequence of any one of (ii) to (v) substantially retains the biological activity of the sequence from which it is derived;
  • the guide RNA comprises or consists of a sequence selected from the group consisting of:
  • sequence of any one of (ii) to (v) substantially retains the biological function of the sequence from which it is derived.
  • the sequence of the target nucleic acid is a DNA and/or RNA sequence from a prokaryotic cell or a eukaryotic cell; or, the sequence of the target nucleic acid is a non-naturally occurring DNA and/or RNA sequence.
  • the target nucleic acid is selected from a double-stranded target nucleic acid, a single-stranded target nucleic acid, or any combination thereof.
  • when the target sequence is RNA, the target sequence is not subject to a PAM requirement.
  • the additional component is selected from:
  • a nuclease (e.g., Cas1, Cas2, or Cas4)
  • a nucleic acid molecule E1 encoding the nuclease
  • the additional component forms a complex with the Cas protein C2c11, or exists independently of the Cas protein C2c11.
  • the vector further comprises a first regulatory element operably linked to the nucleic acid molecule A1;
  • Some vectors can replicate autonomously in the host cell into which they are introduced.
  • Other vectors (e.g., non-episomal mammalian vectors)
  • some vectors can direct the expression of the genes to which they are operably linked.
  • Such vectors are referred to as "expression vectors" herein.
  • the common expression vectors used in recombinant DNA technology are typically in the form of plasmids.
  • the one or more vectors further comprise a nucleic acid molecule E1 encoding a nuclease. In certain embodiments, the vector further comprises a fifth regulatory element operably linked to the nucleic acid molecule E1.
  • the viral vector is selected from a retroviral vector, a lentiviral vector, an adenoviral vector, an adeno-associated viral vector, a herpes simplex viral vector, or any combination thereof.
  • the nucleotide sequences of nucleic acid molecule A1, nucleic acid molecule C1, nucleic acid molecule D1 and/or nucleic acid molecule E1 are codon-optimized according to the preference of the host cell (e.g., eukaryotic cell, prokaryotic cell).
  • the delivery vehicle is selected from lipid particles, sugar particles, metal particles, protein particles, liposomes, exosomes, microvesicles, gene guns, viral vectors (e.g., replication-defective retroviruses, lentiviruses, adenoviruses, or adeno-associated viruses), or any combination thereof.
  • kits of the present invention may be provided in any suitable container.
  • the kit further comprises one or more buffers.
  • the buffer can be any buffer, including but not limited to sodium carbonate buffer, sodium bicarbonate buffer, borate buffer, Tris buffer, MOPS buffer, HEPES buffer and combinations thereof.
  • the buffer is neutral or close to neutral.
  • the buffer has a pH from about 6.0 to 9.0 (e.g., 6.0 to 7.0, 7.0 to 8.0 or 8.0 to 9.0).
  • the present invention provides a complex comprising:
  • the protein component and the nucleic acid component combine with each other to form a complex.
  • the additional protein is selected from:
  • the methods are used to modify a target nucleic acid in vitro or ex vivo.
  • the target nucleic acid is present in an in vitro nucleic acid molecule (e.g., a plasmid or genomic DNA collected in vitro by cell lysis or PCR amplification).
  • the present invention provides a method for modifying a target gene, comprising: delivering the system or composition as described in the first aspect, the vector system as described in the second aspect, the delivery composition as described in the third aspect, the kit as described in the fourth aspect, or the complex as described in the fifth aspect into a cell containing the target nucleic acid.
  • the modification causes a change in the expression product of the target gene (e.g., an increase or decrease in the expression level of the expression product).
  • the present invention provides a cell, cell line or progeny thereof comprising a modified target gene, wherein the cell or cell line has been modified by the method according to the sixth aspect.
  • the cell is a eukaryotic cell. In certain embodiments, the cell is a mammalian cell. In certain embodiments, the cell is a human cell. In certain embodiments, the cell is a non-human mammalian cell, such as a cell of a non-human primate, cattle, sheep, pig, dog, monkey, rabbit, or rodent (such as rat or mouse). In certain embodiments, the cell is a non-mammalian eukaryotic cell, such as a cell of poultry (such as chicken), fish or shellfish (such as clams, shrimp).
  • the cell is a plant cell, such as a cell of a cultivated plant or food crop (e.g., cassava, corn, sorghum, soybean, wheat, oat, or rice), or a cell of an alga, tree, production plant, fruit, or vegetable (e.g., trees such as citrus trees and nut trees; Solanaceae, cotton, tobacco, tomato, grape, coffee, cocoa, etc.).
  • the cell is in vitro, ex vivo, or in vivo.
  • the modification results in an altered expression of at least one gene product of the cell, for example, an increase in the expression of the at least one gene product, or a decrease in the expression of the at least one gene product.
  • the present invention provides a plant or animal model comprising a cell or cell line comprising a modified target gene as described above, or progeny thereof.
  • the present invention relates to the use of the system or composition of the first aspect, the vector system of the second aspect, the delivery composition of the third aspect, the kit of the fourth aspect, or the complex of the fifth aspect for nucleic acid editing (e.g., in vitro or ex vivo nucleic acid editing), or in the preparation of a nucleic acid editing preparation, in the preparation of an in vitro or ex vivo nucleic acid detection preparation, or in the preparation of a medicament for treating a disease or condition in a subject in need thereof.
  • the nucleic acid to be edited is present in a cell. In certain embodiments, the nucleic acid to be edited is a genome. In certain embodiments, the cell is a prokaryotic cell or a eukaryotic cell. In certain embodiments, the nucleic acid to be edited is present in an in vitro nucleic acid molecule (e.g., a plasmid). In certain embodiments, the nucleic acid to be edited includes one or more genes.
  • the nucleic acid editing includes gene editing, for example, the gene editing includes single-site and/or multi-site gene modification, single-site and/or multi-site gene knockout, single-site and/or multi-site gene knock-in, single-site and/or multi-site methylation modification (for example, adding, removing methylation modification), changing the expression of gene products, repairing mutations, and/or inserting polynucleotides.
  • the gene editing does not include the step of modifying human germline genetic characteristics.
  • the purpose is not a method for treating humans or animals by therapy.
  • the system or composition as described in the first aspect, the vector system as described in the second aspect, the delivery composition as described in the third aspect, the kit as described in the fourth aspect, or the complex as described in the fifth aspect can be used to edit a gene associated with the disease or condition in a subject to treat the disease or condition.
  • the disease or disorder is a genetic disease or disorder; for example, a blood disease or disorder, an eye disease or disorder, a liver disease or disorder, a muscle disease or disorder, or a neurological disease or disorder.
  • the disease or condition is a disease or condition caused by a genetic mutation or a pathogenic SNP; for example, cancer.
  • the Cas protein C2c11 of the present application has a trans-cleavage activity that can be activated by a target nucleic acid.
  • double-stranded or single-stranded target nucleic acids can activate the trans-cleavage activity of the Cas protein C2c11. Therefore, when the Cas protein C2c11 of the present application forms a complex with the guide RNA and hybridizes or anneals with the target nucleic acid, the complex will be activated and have trans-cleavage activity against non-target single-stranded DNA sequences. That is, the complex can indiscriminately cleave single-stranded DNA of any sequence in the reaction system.
  • the present invention also relates to a method for detecting the presence of a target nucleic acid in a sample, comprising the following steps:
  • the system or composition, vector system, delivery composition, kit or complex comprises a guide sequence that is capable of hybridizing to a target nucleic acid, and the single-stranded DNA probe does not hybridize to the guide sequence;
  • the detectable signal is determined by one or more methods selected from the group consisting of imaging-based detection, sensor-based detection, color-based detection, gold nanoparticle-based detection, fluorescence polarization-based detection, colloidal phase transition/dispersion-based detection, electrochemical-based detection, and semiconductor-based sensing detection.
  • the target nucleic acid is as defined in the first aspect.
  • one end (e.g., the 5' end) of the single-stranded DNA probe is labeled with a fluorescent group, and the other end (e.g., the 3' end) is labeled with a quencher group.
  • the target nucleic acid is selected from double-stranded DNA, single-stranded DNA, RNA, or any combination thereof.
  • the method further comprises a step of amplifying the target nucleic acid in the sample before step (1).
  • the amplification is selected from nucleic acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, exonuclease III assisted signal amplification, hybridization chain reaction, helicase-dependent amplification, isothermal circular strand displacement polymerization, multiple displacement amplification, primase-based whole genome amplification, rolling circle amplification, whole genome amplification, or any combination thereof.
  • the method further comprises, before step (1), pre-treating the sample to expose the target nucleic acid in the sample.
  • the sequence of the target nucleic acid is a sequence obtained from a pathogen.
  • the sample is a biological sample or an environmental sample.
  • the biological sample is an isolated biological sample.
  • the biological sample is selected from blood, plasma, serum, urine, feces, sputum, mucus, lymph, bile, ascites, pleural effusion, saliva, cerebrospinal fluid, any body secretion, exudate or transudate (e.g., fluid obtained from an abscess or any other site of infection or inflammation), a swab of the skin or mucosal surface, or any combination thereof.
  • the environmental sample is selected from a food sample, a paper surface, a fabric, a metal surface, a wood surface, a plastic surface, a soil sample, a freshwater sample, a wastewater sample, a saltwater sample, or any combination thereof.
  • the "C2c11" of the present application is less than 700 amino acids in length (e.g., 300-700 amino acids, 350-650 amino acids, 400-600 amino acids, 450-550 amino acids) and comprises at least one (e.g., 1, 2, 3 or more) zinc finger domain; the zinc finger domain comprises an α-helical structure and a β-sheet structure and does not contain the amino acid sequence Cys-X1-X2-Cys, wherein X1 and X2 are each independently selected from any amino acid.
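The two sequence-level criteria stated above (length below 700 amino acids, and no Cys-X1-X2-Cys motif in the zinc finger domain) can be checked mechanically. The following is a minimal sketch only; the function names and the toy sequences are hypothetical and are not taken from the application.

```python
import re

# Cys-X1-X2-Cys, where X1 and X2 may be any amino acid (one-letter codes).
CXXC = re.compile(r"C..C")


def meets_size_criterion(protein_seq: str, max_len: int = 700) -> bool:
    """Return True if the protein is shorter than max_len amino acids."""
    return len(protein_seq) < max_len


def has_cxxc_motif(domain_seq: str) -> bool:
    """Return True if the domain contains Cys-X1-X2-Cys anywhere."""
    return CXXC.search(domain_seq.upper()) is not None


# Toy sequences for illustration only (not sequences from the application).
toy_domain_without = "MKCAHLLHGSC"  # two Cys residues, but never spaced C-X-X-C
toy_domain_with = "MKCPECGKSF"      # contains C-P-E-C

if __name__ == "__main__":
    print(has_cxxc_motif(toy_domain_without))
    print(has_cxxc_motif(toy_domain_with))
```

A check of this kind only screens the primary sequence; it says nothing about whether a domain actually coordinates zinc.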
  • the C2c11 of the present invention is a nuclease that binds to and cuts a specific site of a target sequence under the guidance of a guide RNA, and has both DNA and RNA endonuclease activities.
  • zinc finger domain refers to a domain involved in the recognition, binding and/or cutting of Cas proteins, which can recognize, bind and/or cut DNA, RNA or protein.
  • a zinc finger domain is capable of coordinating one or more zinc ions and is coordinated with one zinc ion by four amino acids (e.g., Cys and/or His) as ligands.
  • This structure can form a variety of different geometric configurations, such as “C2H2 type”, “C4 type”, “C6 type” and the like.
  • the diversity and modularity of zinc finger domains allow them to participate in the recognition and binding of various biomolecules in different combinations. This DNA recognition can occur via sequence-specific and nonspecific interactions, which are controlled by amino acids in the ZF-DNA interface (Bulyk, Huang, Choo, & Church, 2001, PNAS, 98:7158-63).
  • the "zinc finger domain” can be involved in recognizing, binding and/or cutting target nucleic acids (e.g., DNA, RNA).
  • other domains of the Cas protein C2c11 e.g., RuvC domain
  • the spatial position of the zinc finger domain in the Cas protein C2c11 is the same as or similar to its spatial position in a closely related family protein (e.g., Type V-U Cas protein).
  • the zinc finger domain is adjacent to the RuvC domain.
  • domain refers to a region in a protein molecule that has a specific structure and/or function and is the basic unit that constitutes the tertiary structure of a protein. Each domain is typically composed of 50 to 300 amino acid residues, contains a unique spatial conformation, and performs the same or different biological functions. Generally speaking, a group of proteins with the same domain is called a family.
  • the term "motif” refers to a relatively conserved amino acid sequence in a protein. Typically, a protein family may or may not contain a specific motif to distinguish it from other protein families. A motif typically consists of 3 to 20 consecutive amino acid residues. In some embodiments, the amino acid sequence of the motif can also be shorter or longer.
  • CRISPR-CRISPR-associated (Cas) system CRISPR-Cas system
  • CRISPR system CRISPR system
  • Such transcripts or other elements can include sequences encoding Cas effector proteins and guide RNAs comprising CRISPR RNA (crRNA), as well as trans-acting crRNA (tracrRNA) sequences contained in the CRISPR-Cas system, or other sequences or transcripts from the CRISPR locus.
  • crRNA CRISPR RNA
  • tracrRNA trans-acting crRNA
  • Cas effector protein and “Cas effector enzyme” are used interchangeably and refer to a protein of a certain length of amino acids that is present in the CRISPR-Cas system. In some cases, such a protein refers to a protein identified from the Cas locus.
  • the degree of complementarity between a guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining optimal alignment is within the capabilities of one of ordinary skill in the art. For example, there are publicly available and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, Smith-Waterman in Matlab, Bowtie, Geneious, Biopython, and SeqMan.
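As a rough illustration of the "degree of complementarity" described above, the sketch below counts Watson-Crick pairs between a guide sequence and an equal-length target strand read antiparallel. It is a simplified, ungapped comparison under stated assumptions (equal lengths, no wobble pairs); the function name is hypothetical, and the alignment programs listed above would be used for a true optimal alignment.

```python
# Watson-Crick partners after converting RNA (U) to the DNA alphabet (T).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that base-pair with the target strand.

    Both sequences are written 5'->3'; the target is reversed so that the
    two strands are compared in antiparallel orientation.
    """
    g = guide.upper().replace("U", "T")
    t = target.upper().replace("U", "T")[::-1]
    if not g or len(g) != len(t):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = sum(1 for a, b in zip(g, t) if DNA_COMPLEMENT.get(a) == b)
    return 100.0 * pairs / len(g)


if __name__ == "__main__":
    print(percent_complementarity("GACU", "AGTC"))
    print(percent_complementarity("GACU", "AGTT"))
```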
  • the guide sequence is at least 10, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 35, at least 40, at least 45, or at least 50 nucleotides in length. In some cases, the guide sequence is no more than 50, 45, 40, 35, 30, 25, 24, 23, 22, 21, 20, 15, 10, or fewer nucleotides in length. In certain embodiments, the guide sequence is 10-15, 15-20, 20-25, 25-30, or 30-40 nucleotides in length.
  • the backbone sequence is no more than 70, 65, 64, 63, 62, 61, 60, 59, 58, 57, 56, 55, 50, 45, 40, 35, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 15, 10 or fewer nucleotides in length. In certain embodiments, the backbone sequence is 20-30, 30-40, 40-50, 50-100, 100-200, or 200-300 nucleotides in length.
  • CRISPR-Cas complex refers to a ribonucleoprotein complex formed by the binding of a guide RNA or mature crRNA to a Cas protein, comprising a guide sequence that hybridizes to a target sequence and binds to the Cas protein.
  • the ribonucleoprotein complex is capable of recognizing and cleaving a polynucleotide that hybridizes to the guide RNA or mature crRNA.
  • the target sequence may be located in an organelle of a eukaryotic cell, such as a mitochondrion or chloroplast.
  • a sequence or template that can be used to recombine into a target locus comprising the target sequence is referred to as an "editing template” or “editing polynucleotide” or “editing sequence”.
  • the editing template is an exogenous nucleic acid.
  • the recombination is homologous recombination.
  • the expression "target sequence” or “target nucleic acid” can be any endogenous or exogenous polynucleotide to a cell (e.g., a eukaryotic cell).
  • the target nucleic acid can be a polynucleotide present in the nucleus of a eukaryotic cell.
  • the target nucleic acid can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA).
  • the target sequence should be associated with a protospacer adjacent motif (PAM).
  • sequence and length requirements for PAM vary depending on the Cas effector enzyme used, but PAM is typically a 2-5 base pair sequence adjacent to the protospacer sequence (i.e., target sequence). Those skilled in the art will be able to identify PAM sequences for use with a given Cas effector protein.
  • target sequences or target nucleic acids include sequences associated with signal transduction biochemical pathways, such as signal transduction biochemical pathway-related genes or polynucleotides.
  • target polynucleotides include disease-associated genes or polynucleotides.
  • Disease-associated genes or polynucleotides refer to any genes or polynucleotides that produce transcriptional or translational products at abnormal levels or in abnormal forms in cells derived from tissues affected by the disease, compared to tissues or cells of non-disease controls.
  • in cases where the altered expression is associated with the emergence and/or progression of the disease, the disease-associated gene can be a gene that is expressed at an abnormally high level; alternatively, it can be a gene that is expressed at an abnormally low level.
  • Disease-associated genes also refer to genes with one or more mutations or genetic variations that are directly responsible for, or are in linkage disequilibrium with, one or more genes responsible for the etiology of the disease.
  • the transcribed or translated products can be known or unknown and can be at normal or abnormal levels.
  • wild type or “native” has the meaning commonly understood by those skilled in the art to refer to the typical form of an organism, strain, gene, or characteristic as it occurs in nature, as distinguished from mutant or variant forms, which can be isolated from a source in nature and has not been intentionally modified by man.
  • non-naturally occurring means the involvement of human effort.
  • when these terms are used to describe a nucleic acid molecule or polypeptide, they mean that the nucleic acid molecule or polypeptide is at least substantially separated from at least one other component with which it is found in nature or associated as found in nature.
  • identity refers to the match between two polypeptides or between two nucleic acids.
  • when a position in both sequences being compared is occupied by the same base or amino acid monomer subunit (e.g., a position in each of the two DNA molecules is occupied by adenine, or a position in each of the two polypeptides is occupied by lysine), the molecules are identical at that position.
  • the "percent identity” between two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions compared x 100. For example, if 6 out of 10 positions in two sequences match, then the two sequences have 60% identity.
  • the DNA sequences CTGACT and CAGGTT share 50% identity (3 out of 6 positions match).
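The percent-identity arithmetic described above can be reproduced in a few lines. This is a minimal sketch that assumes the two sequences are already aligned and of equal length; the function name is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Matching positions divided by positions compared, times 100."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # The example from the text: 3 of 6 positions match.
    print(percent_identity("CTGACT", "CAGGTT"))
```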
  • two sequences are compared when they are aligned for maximum identity.
  • Such an alignment can be achieved, for example, by using the method of Needleman et al. (1970) J. Mol. Biol. 48:443-453, which can be conveniently performed using a computer program such as the Align program (DNAstar, Inc.).
  • the percent identity between two amino acid sequences can also be determined using the algorithm of E. Meyers and W. Miller (Comput. Appl Biosci., 4:11-17 (1988)), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4.
  • the percent identity between two amino acid sequences can be determined using the Needleman and Wunsch (J Mol Biol. 48:444-453 (1970)) algorithm, which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using a BLOSUM62 matrix or a PAM250 matrix and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6.
  • the expression "corresponding to position ... of SEQ ID NO: 1" refers to the fragments at the equivalent position in the compared sequences when the sequences are optimally aligned, i.e., when the sequences are aligned to obtain the highest percentage identity.
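To make the idea of "optimal alignment" concrete, the following is a minimal sketch of Needleman-Wunsch global alignment. It deliberately uses a simple scoring scheme (match +1, mismatch -1, linear gap -1) chosen for illustration; it does not use the substitution matrices or the gap and length weights named above, and it is not the GAP or ALIGN program.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Once two sequences are aligned this way, the column that lines up with a given residue of a reference sequence identifies the "corresponding" position in the other sequence.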
  • the term "host cell” refers to a cell that can be used to introduce a vector, including but not limited to prokaryotic cells such as Escherichia coli or Bacillus subtilis, fungal cells such as yeast cells or Aspergillus, insect cells such as S2 Drosophila cells or Sf9, or animal cells such as fibroblasts, CHO cells, COS cells, NSO cells, HeLa cells, BHK cells, HEK 293 cells or human cells.
  • a vector can be introduced into a host cell to thereby produce a transcript, protein, or peptide, including a protein, fusion protein, isolated nucleic acid molecule, etc. as described herein (e.g., a CRISPR transcript, such as a nucleic acid transcript, protein, or enzyme).
  • regulatory element is intended to include promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences), which are described in detail in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, CA (1990).
  • regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).
  • Tissue-specific promoters can direct expression primarily in a desired tissue of interest, such as muscle, neurons, bone, skin, blood, a specific organ (e.g., liver, pancreas), or a particular cell type (e.g., lymphocytes).
  • regulatory elements can also direct expression in a temporally dependent manner (e.g., in a cell cycle dependent or developmental stage dependent manner), which may or may not be tissue or cell type specific.
  • the term "regulatory element” encompasses enhancer elements such as WPRE, CMV enhancer, the R-U5' segment in the LTR of HTLV-I ((Mol. Cell. Biol., Vol. 8(1), pp.
  • promoter has a meaning well known to those skilled in the art and refers to a non-coding nucleotide sequence located upstream of a gene that can initiate expression of a downstream gene.
  • a constitutive promoter is a nucleotide sequence that, when operably linked to a polynucleotide encoding or defining a gene product, results in the production of the gene product in the cell under most or all physiological conditions of the cell.
  • An inducible promoter is a nucleotide sequence that, when operably linked to a polynucleotide encoding or defining a gene product, results in the production of the gene product in the cell essentially only when an inducer corresponding to the promoter is present in the cell.
  • a tissue-specific promoter is a nucleotide sequence that, when operably linked to a polynucleotide encoding or defining a gene product, results in the production of the gene product in the cell essentially only when the cell is a cell of the tissue type corresponding to the promoter.
  • substantially complementary refers to a degree of complementarity that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides, or to two nucleic acids that hybridize under stringent conditions.
  • stringent conditions for hybridization refer to conditions under which nucleic acids having complementarity to a target sequence predominantly hybridize to the target sequence and do not substantially hybridize to non-target sequences. Stringent conditions are typically sequence-dependent and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology - Hybridization With Nucleic Acid Probes, Part I, Chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, New York.
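The dependence of hybridization temperature on sequence length and composition can be illustrated with the Wallace rule, a rough rule of thumb for short oligonucleotides (Tm ≈ 2 °C per A/T plus 4 °C per G/C). This rule is not part of the application and is shown only as an assumption-laden sketch; the function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature (deg C) of a short oligonucleotide.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). Only a rough estimate,
    intended for short oligos; U is counted as T.
    """
    s = oligo.upper().replace("U", "T")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print(wallace_tm("ATGC"))
    print(wallace_tm("ATGCATGC"))
```

Under this approximation a longer or more GC-rich sequence gives a higher estimate, consistent with the qualitative statement above.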
  • hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding of the bases between the nucleotide residues. Hydrogen bonding can occur by means of Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex can comprise two chains forming a duplex, three or more chains forming a multi-chain complex, a single self-hybridizing chain, or any combination thereof.
  • a hybridization reaction can constitute a step in a broader process (such as the beginning of PCR or the cutting of a polynucleotide via an enzyme). A sequence that can hybridize with a given sequence is referred to as the "complement" of the given sequence.
  • linker refers to a linear polypeptide formed by connecting multiple amino acid residues through peptide bonds.
  • the linker of the present invention can be an artificially synthesized amino acid sequence or a naturally occurring polypeptide sequence, such as a polypeptide having a hinge region function.
  • Such linker polypeptides are well known in the art (see, for example, Holliger, P. et al. (1993) Proc. Natl. Acad. Sci. USA 90: 6444-6448; Poljak, R.J. et al. (1994) Structure 2: 1121-1123).
  • the term "subject” includes, but is not limited to, various animals, such as mammals, such as bovines, equines, ovines, porcines, canines, felines, lagomorphs, rodents (e.g., mice or rats), non-human primates (e.g., macaques or cynomolgus monkeys), or humans.
  • the subject (e.g., a human) suffers from a disorder (e.g., a disorder caused by a disease-associated gene defect).
  • the Cas protein C2c11 and system of the present invention have significant differences in protein size, primary structure (e.g., motif), secondary structure (e.g., α-helix structure, β-sheet structure) and/or functional domain (e.g., containing a zinc finger domain, containing a RuvC domain, and not containing an HNH domain).
  • the Cas protein C2c11 and system of the present invention have significant advantages.
  • the Cas effector protein of the present invention has cis-cleavage activity (e.g., it can cut double-stranded and single-stranded target nucleic acids) and trans-cleavage activity (e.g., it can achieve detection of target nucleic acids).
  • the molecular size of the Cas effector protein of the present invention is significantly smaller than that of Cas9 and Cas12a proteins, so the transfection efficiency is significantly better than that of Cas9 and Cas12a proteins.
  • the editing activity of the Cas effector protein of the present invention is higher.
  • Figure 1 shows the structure of the C2c11 protein.
  • Figure 1A shows the predicted three-dimensional structure of the C2c11 protein;
  • Figure 1B compares the structural domains of C2c11 with those of the closely related C2C9 and C2C10 systems;
  • Figure 1C compares the key zinc ion coordination residues in the zinc finger domains of C2c11 and Cas12f1 (C2C10) (left: purple-red; right: C2c11: green).
  • Figure 2 is a plasmid map constructed based on pET28a(+).
  • Figure 6 shows the fluorescence signal generated after the CRISPR-C2c11 system cuts the targeted single-stranded DNA, where Figure 6A is the fluorescence curve of C2c11-L2 and Figure 6B is the fluorescence curve of C2c11-L3; the abscissa is time (in minutes) and the ordinate is the fluorescence value.
  • Figure 7 shows the electrophoresis results of the products obtained after double-stranded DNA cleavage by the CRISPR-C2c11 system; wherein, the substrate used in Figure 7A is the 7N PAM library and target sequence, the substrate used in Figure 7B is the 5'TTCC PAM sequence and target sequence, and the substrate used in Figure 7C is the 5'TTTC PAM sequence and target sequence.
  • Figure 8 shows the conserved amino acid sites with consistency in the zinc finger domain of the C2c11 protein.
  • Fig. 9 is an exemplary structural diagram of the guide RNA of the present application.
  • the guide RNA comprises in sequence from 5' end to 3' end: the non-complementary repeat sequence of tracrRNA (the part in which the red segment forms a stem-loop structure in Fig. 9), the complementary repeat sequence of tracrRNA (the part complementary to the blue segment in Fig. 9), the linker sequence (linker in Fig. 9), the repeat sequence of crRNA (the part complementary to the red segment in Fig. 9), the guide sequence of crRNA (the yellow segment in Fig. 9).
  • the non-complementary repeat sequence of tracrRNA, the complementary repeat sequence of tracrRNA, the linker sequence and the repeat sequence of crRNA serve together as the backbone sequence of the guide RNA, and the guide sequence of crRNA serves as the guide sequence of the guide RNA.
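The 5'-to-3' layout of the guide RNA described above can be sketched as a small data structure. The class name, field names, and placeholder sequences below are hypothetical; the actual backbone sequences are those given in Table 1 of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuideRNAParts:
    """Guide RNA segments listed in 5'->3' order, as described in the text."""
    tracr_noncomplementary_repeat: str
    tracr_complementary_repeat: str
    linker: str
    crrna_repeat: str
    guide: str

    @property
    def backbone(self) -> str:
        """Backbone = both tracrRNA repeats + linker + crRNA repeat."""
        return (self.tracr_noncomplementary_repeat
                + self.tracr_complementary_repeat
                + self.linker
                + self.crrna_repeat)

    @property
    def full_sequence(self) -> str:
        """Full guide RNA: backbone followed by the guide sequence."""
        return self.backbone + self.guide


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    parts = GuideRNAParts("AAAA", "CCCC", "GAAA", "GGGG", "UUUU")
    print(parts.backbone)
    print(parts.full_sequence)
```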
  • Figure 10 is a map of the C2c11 intracellular editing plasmid pX458-C2c11.
  • Figure 11 is the electrophoresis diagram of T7E1 enzyme digestion of AAVS1 gene editing products.
  • Figure 12 is a map of the CMV-BFP-P2A-mcherry plasmid.
  • exemplary SDS-PAGE results are shown in Figures 4A-4F.
  • the SDS-PAGE figure is the result of gel filtration chromatography protein purification, and the numbers in the figure indicate the order in which the proteins flow out of the chromatography column during the purification process.
  • Figures 4A to 4F are the purification results of C2c11-A4, C2c11-A5, C2c11-L1, C2c11-L2, C2c11-B2, and C2c11-B9, respectively.
  • the position of the target protein in the PAGE figure is indicated by an arrow.
  • bands in the figure that do not match the expected size are contaminating proteins that eluted during the purification process.
  • the detection used 100 ng purified C2c11-gRNA complex, 25 pM ssDNA-FQ, 100 ng TS chain DNA fragment, and 1× buffer (50 mM Tris-HCl, 10 mM MgCl2, 100 μg/ml BSA (pH 8.0)) in a final volume of 20 μL.
  • the reaction was incubated at 37°C.
  • if the specific Cas protein is activated by the target sequence, the ssDNA-FQ will be cleaved.
  • the fluorescence of the detection reaction was measured using a full-wavelength microplate reader with an excitation wavelength of 485nm and an emission wavelength of 520nm. The signal was recorded once every minute for 1 hour.
  • Each reaction was set up with 3 parallel replicates, incubated at 37°C for 1 hour, the FAM fluorescence signal value was detected every 5 minutes, and the fluorescence value-time relationship reaction curve was drawn.
  • the fluorescence signal generated by substrate cleavage upon the action of C2c11 protein and gRNA increased significantly over time, demonstrating that the system possessed trans-cleavage activity that could be activated by target nucleic acids.
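A fluorescence time course of this kind (three parallel replicates per reaction, read at fixed intervals) is typically summarized by averaging the replicates at each time point and comparing the end signal with the starting signal. The sketch below shows that arithmetic only; the function names are hypothetical and the numbers in the usage example are invented for illustration, not data from the application.

```python
from statistics import mean


def mean_curve(replicates):
    """Average replicate readings at each time point.

    replicates: list of equal-length lists, one list per replicate well.
    """
    return [mean(values) for values in zip(*replicates)]


def fold_change(curve):
    """Ratio of the final reading to the first reading."""
    if curve[0] == 0:
        raise ValueError("baseline reading must be non-zero")
    return curve[-1] / curve[0]


if __name__ == "__main__":
    # Invented readings (arbitrary fluorescence units), three replicates.
    wells = [[100, 200, 400], [110, 210, 410], [90, 190, 390]]
    curve = mean_curve(wells)
    print(curve, fold_change(curve))
```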
  • Double-stranded DNA substrate sequence preparation method: Using a 7N PAM region library plasmid constructed in our laboratory as a template, we generated substrate plasmids with PAM sequences of TTTC and TTCC through site-directed PCR. PCR products were recovered and purified to obtain linear dsDNA. After Cas cleavage, the dsDNA formed two DNA fragments of approximately 1500 and 1100 bp in length.
  • Example 6 C2c11 system edits the AAVS1 gene in mammalian cells
  • the C2c11 system with in vitro activity was tested for in vivo editing activity in the human embryonic kidney cell-derived cell line HEK293T.
  • AAVS1 gene (GenBank ID: AC005782.1)
  • The nucleotide sequence of the AAVS1 gene is shown in SEQ ID NO: 107.
  • Target site sequences for the AAVS1 gene were designed based on the TC-rich PAM characteristics of the C2c11 family, as shown in Table 3. These include seven C2c11-A4 editing target sites, eight C2c11-L1 target sites, three each of C2c11-B9, C2c11-B11, and C2c11-B12 target sites, and two each of C2c11-L2, C2c11-B2, and C2c11-B8 target sites.
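Selecting candidate target sites next to a TC-rich PAM can be sketched as a simple sequence scan. The example below assumes a 5' PAM of TTTC or TTCC (the two PAMs tested in the cleavage substrates above), a 20-nt protospacer, and a scan of the given strand only; these parameters and the function name are illustrative assumptions, not the design procedure actually used for Table 3.

```python
def find_target_sites(seq, pams=("TTTC", "TTCC"), protospacer_len=20):
    """Return (pam_start, pam, protospacer) for each 5' PAM on this strand.

    Only sites with a full-length protospacer downstream (3') of the PAM
    are reported. The reverse strand is not scanned in this sketch.
    """
    seq = seq.upper()
    sites = []
    for i in range(len(seq)):
        for pam in pams:
            start = i + len(pam)
            end = start + protospacer_len
            if seq.startswith(pam, i) and end <= len(seq):
                sites.append((i, pam, seq[start:end]))
    return sites


if __name__ == "__main__":
    # Toy sequence for illustration only.
    toy = "GG" + "TTTC" + "A" * 20 + "TTCC" + "GGGGG"
    print(find_target_sites(toy))
```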
  • the corresponding guide RNA scaffold information is shown in Table 1.
  • Gene editing plasmids for the corresponding proteins were designed and synthesized. The corresponding plasmid information is shown in Tables 3 and 4.
  • the C2c11-L1 system editing plasmid px458-C2c11L1-AAVS1-g1 contains the following inserts: the C2c11-L1 protein gRNA backbone sequence (SEQ ID NO: 63) and the guide sequence (C2c11 L1-AAVS1-g1, see Table 3 for the specific sequence), transcribed as RNA from the U6 promoter; and the human codon-optimized C2c11-L1 protein sequence, which carries an NLS nuclear import signal.
  • a map of the synthesized C2c11 intracellular editing plasmid, Px458-C2c11L1-AAVS1-g1, is shown in Figure 10, and the complete sequence is
  • the editing plasmids for the other C2c11 systems were designed similarly to those for the C2c11-L1 system.
  • the codon-optimized protein sequences of C2c11-L2, C2c11-A4, C2c11-A5, C2c11-B2, C2c11-B8, C2c11-B9, C2c11-B11, and C2c11-B12 are shown in Table 1 as SEQ ID NOs: 109-116.
  • the corresponding gRNA backbone sequences for these C2c11 systems are shown in Table 1. All nucleotide sequences were synthesized at Beijing Liuhe BGI Genomics Co., Ltd. The positive control was spCas9.
  • the culture conditions were: DMEM medium (high glucose, Gibco) containing 10% fetal bovine serum (FBS, Gibco), 1% non-essential amino acids (NEAA, Gibco) and 1% glutamine (GlutaMAX, Gibco), 37°C, 5% CO2 concentration.
  • 1000 ng of NC plasmid was digested with 2 μL BsaI (NEB) at 37°C, 10x loading buffer (Takara) was added, and electrophoresis was performed on a 0.5% agarose gel. The target fragment was gel-purified to obtain a vector with sticky ends. Oligos carrying sticky ends and the target sequence were synthesized (Beijing Liuhe BGI) and annealed to form double-stranded insert fragments. The vector backbone and insert fragment were ligated at room temperature for 30 minutes using T4 ligase (NEB), transformed into DH5α Escherichia coli (commissioned to Sangon), and plated on Amp-resistant plates. After overnight culture, single clones were picked and Sanger sequencing was performed using U6 universal primers (commissioned to Beijing Liuhe BGI). The correct single clone was selected for expansion culture, and the plasmid was extracted according to the method shown in step (1).
  • a 24-well cell culture plate was used for plating and culturing, with approximately 0.5-1 × 10^6 cells per well.
  • the cells were transfected with the target plasmid using the Lipofectamine 8000 kit (Biyuntian) according to the instructions (1 ⁇ g of plasmid was added to each well).
  • the cells needed to be cultured for 2-3 days to allow for sufficient gene editing.
  • the transfection efficiency was calculated and the cells were recovered.
  • the cell digestion and resuspension steps were the same as above (the amount of relevant reagents was calculated based on the proportion of cell culture area).
  • GFP green fluorescence
  • Genomic DNA extraction: Genomic DNA was extracted using a genomic DNA extraction kit (Tiangen), and the genomic DNA (gDNA) concentration was quantified using Nanodrop; the gDNA was stored at -20°C.
  • Target region amplification was performed from gDNA using a high-fidelity amplification enzyme (PrimeSTAR GXL DNA Polymerase, Takara). Primers F (SEQ ID NO: 117) and R (SEQ ID NO: 118) were synthesized at Beijing Liuhe BGI. After a distinct, single target band was observed on a 1% agarose (TAE) gel electrophoresis (180 V, 20 min), the gel was excised and purified using a PCR purification and gel extraction kit (NucleoSpin Extract, MN). Concentration was measured using a Nanodrop.
  • the genome editing results are shown in Figure 11.
  • as shown in the electropherogram in Figure 11, five target sites (C2c11-L1-AAVS1-g4 to C2c11-L1-AAVS1-g8), six target sites (C2c11-A4-AAVS1-g4 to C2c11-A4-AAVS1-g9), and C2c11-B9-AAVS1-g3, C2c11-B11-AAVS1-g2, C2c11-B11-AAVS1-g3, and C2c11-B12-AAVS1-g1 showed genome editing activity.
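Band intensities from a T7E1 digest are often converted into an approximate indel frequency with the formula indel % = 100 × (1 − √(1 − f_cut)), where f_cut is the fraction of total band intensity found in the cleaved products. The application does not state how, or whether, the gel in Figure 11 was quantified, so the sketch below is only an illustration of that commonly used estimate; the function name and the intensities in the usage example are hypothetical.

```python
from math import sqrt


def t7e1_indel_percent(uncut: float, cut_1: float, cut_2: float) -> float:
    """Estimate indel frequency (%) from T7E1 band intensities.

    uncut: intensity of the undigested PCR product band.
    cut_1, cut_2: intensities of the two cleavage product bands.
    """
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    f_cut = (cut_1 + cut_2) / total
    return 100.0 * (1.0 - sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Invented intensities (arbitrary units) for illustration only.
    print(round(t7e1_indel_percent(75, 15, 10), 2))
```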
  • the target region was amplified from gDNA using a high-fidelity amplification enzyme (PrimeSTAR GXL DNA Polymerase, Takara).
  • the primers for amplification are shown in Table 6 below and were synthesized at Beijing Liuhe BGI.
  • the PCR products were sequenced by NGS at BGI Co., Ltd. in Wuhan.
  • Next-generation sequencing libraries were constructed according to the instructions for the MGI Easy Fast PCR-FREE Enzyme Digestion Library Preparation Kit V2.0. DNA was prepared into a DNBSEQ platform-specific library and sequenced using the MGISEQ2000 platform using the PE100 sequencing method.
  • C2c11 systems including C2c11-A4, B5, B2, B8, B9, B11, B12, L1, and L2, exhibited gene editing activity in human cells at at least one AAVS1 site.
  • B12 exhibited an editing efficiency of 7.87% at the g1 site
  • L1 had editing efficiencies of 8.27% and 3.51% at the g7 and g8 sites, respectively
  • B11 had an editing efficiency of 1.36% at the g3 site.
  • the remaining A4, B5, B2, B8, B9, B11, and L2 systems exhibited effective editing rates ranging from 0.04% to 0.5%.
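Editing efficiencies from amplicon NGS are usually reported as the share of reads at the target site that carry an edit. The sketch below shows only that ratio; the application does not describe its read-calling pipeline, so the function name and the read counts in the usage example are hypothetical.

```python
def editing_efficiency(edited_reads: int, total_reads: int) -> float:
    """Percentage of reads at a target site that carry an edit."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= edited_reads <= total_reads:
        raise ValueError("edited_reads must be between 0 and total_reads")
    return 100.0 * edited_reads / total_reads


if __name__ == "__main__":
    # Invented counts: 787 edited reads out of 10,000 total reads.
    print(editing_efficiency(787, 10_000))
```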
  • Example 7 C2c11 system edits reporter gene plasmids in mammalian cells
  • the C2c11 system selected above was used to test the in vivo editing activity of exogenous plasmids in the human embryonic kidney cell-derived cell line HEK293T.
  • the design of the editing plasmid is essentially the same as in Example 6. Specifically, it comprises a protein coding sequence (the same as in Example 6), a gRNA backbone sequence (the same as in Example 6), and a gRNA guide sequence (as shown in Table 8).
  • the map of the editing plasmid is the same as in Figure 10.
  • the editing plasmid and the targeting plasmid were constructed separately and co-delivered (transfected) into 293T cells.
  • the cell culture, plasmid delivery and intracellular editing methods were essentially the same as in Example 6.
  • Target region amplification was performed using a high-fidelity amplification enzyme (PrimeSTAR GXL DNA Polymerase, Takara) with primers CMV-F (SEQ ID NO: 70) and CMV-R (SEQ ID NO: 71), synthesized at Beijing Liuhe BGI.
  • the product was detected by electrophoresis as a distinct, single target band, which was then purified and recovered from the gel. Concentration was measured using a Nanodrop.
  • the genome editing results are shown in Figure 13.
  • C2c11A5-g2 and C2c11B2-g1 have obvious editing activity
  • C2c11A4-g3, C2c11A5-g2, and C2c11B9-g1 have weak editing activity
  • C2c11B11 and C2c11B12 have no obvious editing activity.
  • the target sequence DNA was subsequently sequenced by NGS to more accurately evaluate the editing situation.
  • the target site region was amplified.
  • the amplification primers are shown in Table 9 below and synthesized at Liuhe BGI.
  • the PCR recovery product was then sequenced by NGS at Wuhan BGI Co., Ltd.
  • the second-generation sequencing library construction method referred to the instructions for use of BGI's "MGIEasy Fast PCR-FREE Enzyme Digestion Library Preparation Reagent Set V2.0".
  • the DNA was prepared into a DNBSEQ platform-specific library and sequenced using the MGISEQ2000 platform.
  • the sequencing type was PE100.
  • C2c11 systems including C2c11-A4, A5, B1, B2, B8, B9, B11, and B12, all stably edited at at least one mCherry locus on the intracellular plasmid.
  • the editing efficiencies for A4 at the TS3 site, B1 at the ST3 site, and B8 at the ST2 site were 3.52%, 3.26%, and 3.47%, respectively.
  • the effective editing rates for B5, B2, B9, B11, and B12 ranged from 0.43% to 1.96%.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to the field of nucleic acid editing and, in particular, to the technical field of clustered regularly interspaced short palindromic repeats (CRISPR). More particularly, the present invention relates to a system or composition comprising a Cas effector protein, as well as a vector system, a delivery composition, and a kit comprising the system or composition. The present invention further relates to a use of the system or composition, the vector system, the delivery composition, and the kit in nucleic acid editing, as well as methods for nucleic acid editing, nucleic acid detection, and disease treatment.
PCT/CN2025/084684 2024-03-25 2025-03-25 CRISPR-Cas system Pending WO2025201316A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202410349814 2024-03-25
CN202410349814.X 2024-03-25

Publications (1)

Publication Number Publication Date
WO2025201316A1 (fr) 2025-10-02

Family

ID=97215769

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2025/084684 Pending WO2025201316A1 (fr) 2024-03-25 2025-03-25 CRISPR-Cas system

Country Status (1)

Country Link
WO (1) WO2025201316A1 (fr)

Similar Documents

Publication Publication Date Title
JP7216877B2 Novel CRISPR/Cas12f enzyme and system
JP7460178B2 CRISPR-Cas12j enzyme and system
CN114672473B An optimized Cas protein and use thereof
CN113015798B CRISPR-Cas12a enzyme and system
CA3111432A1 Novel CRISPR enzymes and systems
CN112004932A A CRISPR/Cas effector protein and system
WO2019099943A1 Compositions and methods for improving the efficiency of Cas9-based knock-in strategies
WO2019206233A1 RNA-edited CRISPR/Cas effector protein and system
US20230058054A1 CRISPR/Cas system and uses thereof
WO2024198961A1 Cas protein and mutant thereof, corresponding gene editing system, and use thereof
US20230212612A1 Genome editing system and method
CN113930413A Novel CRISPR-Cas12j.23 enzyme and system
CN113930411A Novel CRISPR-Cas12M enzyme and system
WO2025025808A1 Cas protein, corresponding gene editing system, and use
CN113930410A Novel CRISPR-Cas12L enzyme and system
WO2024240053A1 Gene editing protein, corresponding gene editing system, and use thereof
WO2025201316A1 CRISPR-Cas system
EP4631979A1 Novel CRISPR-Cas sigma enzyme and system
EP4632065A1 Novel CRISPR-Cas delta enzyme and system
JP6779513B2 Method for screening cell lines capable of in vivo cloning, method for producing cell lines capable of in vivo cloning, cell line, in vivo cloning method, and kit for performing in vivo cloning
US20250243482A1 Class 2 type V CRISPR-Cas prime editing
CN120712347A CRISPR/Cas effector protein and system
TW202526012A Improved integrase
TW202440913A Cas12 protein, CRISPR-Cas system, and uses thereof
HK40033058A Novel CRISPR/Cas12f enzyme and system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 25776774

Country of ref document: EP

Kind code of ref document: A1