WO2025081686A1 - Method for preparing an engineered DEAR nucleic acid manipulation system - Google Patents

Method for preparing an engineered DEAR nucleic acid manipulation system

Info

Publication number
WO2025081686A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
sequence
recognition region
manipulation system
dear1
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2024/074721
Other languages
English (en)
Chinese (zh)
Inventor
刘俊杰
朱汉舟
张寿悦
刘子贤
李隆骐
陈之航
杨韵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Publication of WO2025081686A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00: Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10: Cells modified by introduction of foreign genetic material

Definitions

  • the invention belongs to the field of biotechnology, and in particular relates to a method for preparing an engineered DEAR nucleic acid manipulation system.
  • the CRISPR-Cas system still has some problems: first, the CRISPR-Cas system has off-target effects, and editing of the Cas protein in non-target areas may cause uncontrollable harmful mutations; second, the CRISPR-Cas system has the problem of too large proteins.
  • the currently used CRISPR-Cas editing tool molecules SpyCas9 and AsCas12a both exceed 1,300 amino acids in size.
  • the excessive molecular weight affects the transfection efficiency of the CRISPR-Cas system tools; at the same time, the CRISPR-Cas system has potential immune responses.
  • the SpyCas9 and AsCas12a proteins currently used are derived from pathogenic bacteria that humans have been exposed to, which may cause human immune responses.
  • the CRISPR-Cas nuclease system is constrained by its protein components. If a new generation of nucleic acid targeting manipulation technology based entirely on RNA can be developed that combines gene sequence-specific targeting with catalytic activity, it is expected to overcome the limitations of protein-based gene editing systems.
  • RNA ribozyme-based DEAR nucleic acid manipulation systems with high cutting activity and editing efficiency in bacteria have been obtained through biochemical means, but their structural basis is still unknown. Therefore, their atomic structure is urgently needed to provide a basis for subsequent engineering and application.
  • the RNA ribozyme-based DEAR nucleic acid manipulation system (Chinese patent application number: 202310424082.1) has RNA, DNA and plasmid cutting activity, and has gene editing activity in cells and bacteria.
  • the present invention intends to improve its specificity and cutting activity by engineering the DEAR nucleic acid manipulation system.
  • the applicant has developed a DEAR nucleic acid manipulation system based on RNA ribozymes and applied it to the targeted modification (e.g., cutting) of nucleic acids (DNA, RNA).
  • the applicant of the present invention provides a method for preparing an engineered DEAR nucleic acid manipulation system, which starts from the original RNA ribozyme-based DEAR nucleic acid manipulation system and yields an engineered DEAR nucleic acid manipulation system with improved activity.
  • a method for preparing an engineered DEAR nucleic acid manipulation system, wherein the original DEAR nucleic acid manipulation system comprises an RNA molecule derived from a bacterial C-type group II intron, wherein the RNA molecule comprises a substrate recognition region that hybridizes with a target sequence in a target nucleic acid, and the RNA molecule comprises at least one of domains I to VI;
  • the preparation method comprises at least one selected from the following (a) to (d):
  • the substrate recognition region in the RNA molecule is extended to an extended substrate recognition region having a length of 7 to 10 nucleotides.
  • the recruitment sequence is 14 to 26 nucleotides in length.
  • the portion of the target nucleic acid that hybridizes to the recruitment sequence and the target sequence in the target nucleic acid that hybridizes to the substrate recognition region are separated by 10 to 60 nucleotides, preferably 20 to 50 nucleotides.
  • the first substrate recognition region of the first RNA molecule and the second substrate recognition region of the second RNA molecule recognize the same portion of the target nucleic acid, or respectively recognize different portions of the target nucleic acid.
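The length and spacing limits quoted above lend themselves to a simple range check. The following is a minimal sketch in Python under stated assumptions: the names `Design`, `violations` and `spacing_is_preferred` are hypothetical and not part of the application, and each field is optional because the preparation method requires only at least one of the listed modifications.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Inclusive nucleotide ranges quoted from the text above.
EXTENDED_RECOGNITION_NT = (7, 10)   # extended substrate recognition region
RECRUITMENT_NT = (14, 26)           # recruitment sequence
SPACING_NT = (10, 60)               # gap between the two hybridized target portions
SPACING_PREFERRED_NT = (20, 50)     # preferred gap


def in_range(value: int, bounds: Tuple[int, int]) -> bool:
    """Inclusive range test (endpoints count as inside)."""
    low, high = bounds
    return low <= value <= high


@dataclass
class Design:
    """Hypothetical container; a field left as None is simply not checked."""
    recognition_len: Optional[int] = None
    recruitment_len: Optional[int] = None
    spacing: Optional[int] = None


def violations(design: Design) -> List[str]:
    """Return the names of the fields that fall outside the quoted ranges."""
    checks = [
        ("recognition_len", design.recognition_len, EXTENDED_RECOGNITION_NT),
        ("recruitment_len", design.recruitment_len, RECRUITMENT_NT),
        ("spacing", design.spacing, SPACING_NT),
    ]
    return [name for name, value, bounds in checks
            if value is not None and not in_range(value, bounds)]


def spacing_is_preferred(design: Design) -> bool:
    """True when the spacing also falls inside the preferred 20 to 50 nt window."""
    return design.spacing is not None and in_range(design.spacing, SPACING_PREFERRED_NT)
```

For example, `violations(Design(recognition_len=12, recruitment_len=20, spacing=5))` flags the recognition length and the spacing, while the recruitment length passes.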
  • the length of the original DEAR nucleic acid manipulation system ranges from 100-5660 nt, preferably 124-3897 nt.
  • the I domain comprises 2-6 stem-loop/hairpin structures, with a length ranging from 50-400 nt; preferably, the I domain comprises 3-5 stem-loop/hairpin structures, with a length ranging from 65-384 nt; and/or,
  • the II domain comprises 1-4 stem-loop/hairpin structures, with a length ranging from 10-300 nt; preferably, the II domain comprises 1-3 stem-loop/hairpin structures, with a length ranging from 10-218 nt; and/or,
  • the III domain comprises 1-3 stem-loop/hairpin structures, with a length ranging from 10-200 nt; preferably, the III domain comprises 1-2 stem-loop/hairpin structures, with a length ranging from 10-140 nt; and/or,
  • the IV domain comprises 0-4 stem-loop/hairpin structures, with a length ranging from 0-4500 nt; preferably, the IV domain comprises 0-4 stem-loop/hairpin structures, with a length ranging from 0-3000 nt; and/or,
  • the V domain comprises one stem-loop/hairpin structure with a length ranging from 20 to 60 nt, preferably, the V domain comprises one stem-loop/hairpin structure with a length ranging from 29 to 43 nt; and/or,
  • the VI domain comprises one stem-loop/hairpin structure with a length ranging from 10 to 200 nt; preferably, the VI domain comprises one stem-loop/hairpin structure with a length ranging from 10 to 112 nt.
  • the C-type group II intron is a C-type group II intron in which an open reading frame encoding an intron-encoded protein is present or absent in the IV domain; optionally, the length of the open reading frame encoding the intron-encoded protein is 0-4000 nt; and/or,
  • the substrate recognition region is located in the I domain.
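The per-domain limits above can be collected into a lookup table and checked mechanically. This is a minimal sketch, assuming the broad (not the preferred) ranges; `DOMAIN_RANGES`, `domain_ok` and `total_length_ok` are hypothetical names introduced for illustration, and the sketch does not model which domains are present in a given RNA molecule.

```python
# Broad ranges quoted from the text: domain -> ((min, max) stem-loops, (min, max) length in nt)
DOMAIN_RANGES = {
    "I": ((2, 6), (50, 400)),
    "II": ((1, 4), (10, 300)),
    "III": ((1, 3), (10, 200)),
    "IV": ((0, 4), (0, 4500)),
    "V": ((1, 1), (20, 60)),
    "VI": ((1, 1), (10, 200)),
}

# Broad range quoted for the whole original DEAR RNA molecule, in nt.
TOTAL_LENGTH_NT = (100, 5660)


def domain_ok(name: str, stem_loops: int, length_nt: int) -> bool:
    """Check one domain against its stem-loop count and length ranges (inclusive)."""
    (min_loops, max_loops), (min_nt, max_nt) = DOMAIN_RANGES[name]
    return min_loops <= stem_loops <= max_loops and min_nt <= length_nt <= max_nt


def total_length_ok(total_nt: int) -> bool:
    """Check the overall RNA length against the quoted 100-5660 nt range."""
    low, high = TOTAL_LENGTH_NT
    return low <= total_nt <= high
```

A domain V hairpin of 35 nt passes (`domain_ok("V", 1, 35)`), whereas a domain I with seven stem-loops does not.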
  • nucleotide sequence of the RNA molecule of the original DEAR nucleic acid manipulation system is selected from any one of the following:
  • nucleotide sequence comprising the reverse complementary sequence of a sequence as shown in any one of SEQ ID NOs: 1 to 9 and 132;
  • the substrate recognition region is programmable to hybridize to different target sequences.
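Reprogramming the substrate recognition region amounts to choosing an RNA sequence that hybridizes to the new target. A minimal sketch follows, assuming full Watson-Crick complementarity over the whole target sequence; the function name `design_recognition_region` is hypothetical, and the application itself may use additional design rules not captured here.

```python
# Complement table producing RNA bases; accepts DNA (T) or RNA (U) targets.
RNA_COMPLEMENT = {"A": "U", "T": "A", "U": "A", "G": "C", "C": "G"}


def design_recognition_region(target_5to3: str) -> str:
    """Return the RNA sequence (5'->3') that is the reverse complement of the target (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target_5to3.upper()))
```

With a DNA target `GATTAC`, the function returns `GUAAUC`, which pairs with the target in antiparallel orientation.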
  • polynucleotide wherein the polynucleotide comprises a nucleotide sequence encoding the engineered DEAR nucleic acid manipulation system as described in [17].
  • nucleic acid construct wherein the nucleic acid construct comprises the isolated polynucleotide described in [18].
  • a vector wherein the vector comprises the isolated polynucleotide described in [18], or the nucleic acid construct described in [19].
  • a cell wherein the cell comprises the engineered DEAR nucleic acid manipulation system as described in [17], the isolated polynucleotide as described in [18], the nucleic acid construct as described in [19] or the vector as described in [20].
  • a reagent or kit wherein the reagent or kit comprises the engineered DEAR nucleic acid manipulation system as described in [17], the isolated polynucleotide as described in [18], the nucleic acid construct as described in [19], the vector as described in [20] or the cell as described in [21].
  • a pharmaceutical composition wherein the pharmaceutical composition comprises the engineered DEAR nucleic acid manipulation system as described in [17], the isolated polynucleotide as described in [18], the nucleic acid construct as described in [19], the vector as described in [20] or the cell as described in [21]; and, optionally, a pharmaceutically acceptable carrier.
  • a method for modifying a target nucleic acid comprising the step of contacting the target nucleic acid with a DEAR nucleic acid manipulation system as described in [17], an isolated polynucleotide as described in [18], a nucleic acid construct as described in [19], a vector as described in [20], a cell as described in [21], or a reagent or kit as described in [22].
  • the applicant has found that the DEAR nucleic acid manipulation system based on RNA ribozymes can achieve DNA and RNA cleavage, and also has DNA cleavage ability in Escherichia coli and mammalian eukaryotic cells. Based on this, the applicant has further improved its specificity and cleavage activity through engineering modification, and provided a method for preparing the engineered DEAR nucleic acid manipulation system.
  • Figure 1A to Figure 1J Secondary structure display of DEAR1 to 10.
  • Figure 1A to Figure 1J show the secondary structure prediction results of DEAR1 to DEAR10, respectively. The prediction was performed using RNAfold, and domains I-VI and the TRS are marked in the figure.
  • Figure 1K to Figure 1M Primary sequence and secondary structure characteristics of C-type group II introns.
  • the C-type group II intron RNAs in the database (http://webapps2.ucalgary.ca/~groupii/) were modeled using LocARNA software. Domains I-VI are marked in the figure, corresponding to the primary sequence and secondary structure characteristics of domain I of the C-type group II intron, three primary sequence and secondary structure characteristics of domains II-III (models 1 to 3), and the primary sequence and secondary structure characteristics of domains V-VI.
  • FIG. 2A Quality identification of RNA ribozyme molecules DEAR1 to 9.
  • FIG. 2B Quality identification of the RNA ribozyme molecule DEAR10.
  • Figure 3 Verification of RNA cleavage activity of RNA ribozyme molecules DEAR1 to 9.
  • Figure 4 Verification of the cleavage activity of RNA ribozyme molecules DEAR1 to 6 on unpaired RNA substrates.
  • Figure 5 Verification of ssDNA cleavage activity of RNA ribozyme molecules DEAR1 to 6.
  • Figure 6 Comparison of cleavage of paired and unpaired DNA substrates by RNA ribozyme molecules DEAR1-6.
  • FIG. 7 Verification of the ssDNA cleavage sites of the RNA ribozyme molecules DEAR1 to 6.
  • Figure 8 Optimization results of reaction conditions for RNA ribozyme molecule DEAR1.
  • Figure 9 Efficiency curve of reaction condition optimization of RNA ribozyme molecule DEAR1.
  • FIG. 10 Comparison of DNA cleavage activity of DEAR1 and RNA-guided protein nucleases.
  • Figure 11 Verification of plasmid cleavage activity of RNA ribozyme molecule DEAR1.
  • Figure 12 Verification of plasmid cleavage activity of RNA ribozyme molecules DEAR1-3 in Escherichia coli.
  • FIG. 13 Further verification of the plasmid cleavage activity of the RNA ribozyme molecule DEAR1 in Escherichia coli.
  • Figure 14 Verification of the plasmid cleavage activity of RNA ribozyme molecules DEAR4-9 in bacteria.
  • FIG. 15 Verification of the ssDNA cleavage activity of RNA ribozyme molecules DEAR1 to 6 that reprogram the TRS region.
  • FIG. 16 Schematic diagram of the survival of cells stably transfected with DEAR1 stable transfection plasmid and DEAR-NT stable transfection plasmid in Example 7.
  • Figures 17A to 17B are schematic diagrams of analysis of sequencing results in Example 7.
  • Figure 18 Schematic diagram of DEARs recognizing substrates
  • TRS substrate recognition sequence (substrate recognition region).
  • Target substrate binding sequence or target sequence, the sequence of DEAR substrate recognized by DEAR (complementary pairing with TRS, i.e., target sequence of target nucleic acid).
  • Figure 19 Schematic diagram of the DEARs structure.
  • the first column is the name of DEARs
  • the second column is the secondary structure of DEARs
  • the third column is the cryo-EM 2D classification diagram of DEARs
  • the fourth column is the cryo-EM structure of DEARs.
  • the structural domains I, II, III, IV, V, and VI are marked as shown in the figure.
  • Figure 20 Schematic diagram of the catalytic active center structure of DEARs, M1 and M2 are magnesium ions in the catalytic active center.
  • the nucleotides closely related to the catalytic activity are marked: G1, U2, G3, C4, G5, A106, C107, A181, A182, G183, A184, C185, A186 from domain I, where the nucleotides at positions 1-5 are called the 5' end, and the range of 181-186 is also called the substrate recognition sequence, i.e., TRS; A337, G338, C339 from domain II, these three nucleotides are also called the junction J2/3; G582, U583 from domain V, A584, C585, C565, C566, G567, C568, wherein positions 566-568 are also referred to as the catalytic triad and nucleotides 584 and 585 are also referred to as the 2-nucleotide bulge; U633 from domain VI.
  • Figure 21 The results of in vitro and in vivo cleavage of TRS after elongation.
  • A in Figure 21 is a schematic diagram of the structure of DEAR1, with the TRS region and dimerization motif marked; B in Figure 21 is a diagram of the cleavage activity after replacement with different TRS sequences.
  • C in Figure 21 is a schematic diagram and cleavage efficiency diagram of TRS sequences of different lengths.
  • D in Figure 21 is a diagram of the effect of plasmid cleavage in bacteria of TRS of different lengths.
  • Figure 22 A in Figure 22 is a schematic diagram of two different modifications of DEARs relative to the original version, wherein V1 represents the version of DEAR6 without D6, and V2 represents the version without D6 and with the recruitment sequence added.
  • B in Figure 22 is a schematic diagram of the atomic model after the modification, in which the recruitment sequence, recruitment sequence binding sequence, linker sequence, substrate recognition sequence, substrate binding sequence, and cleavage site are marked as shown in the figure.
  • C and D in Figure 22 are schematic diagrams of the activity after the modification.
  • FIG. 23 Effects of linker sequences of different lengths on the in vitro cleavage activity of DEAR6 with an added recruitment sequence.
  • Figure 24 Effect of adding recruitment sequence on in vitro cleavage activity of DEAR1.
  • Figure 24 A is a comparison of in vitro cleavage activity of the original version of DEAR1 and DEAR1 with D6 deleted and recruitment sequence added
  • Figure 24 B is the effect of recruitment sequences of different lengths on the in vitro cleavage activity of engineered DEAR1.
  • Figure 25 A in Figure 25 is a schematic diagram of the transformation of DEARs into heterodimers;
  • Figure 25 B is a molecular sieve and cryo-electron microscopy two-dimensional image of the DEAR1 heterodimer;
  • Figure 25 C is a schematic diagram of different forms of substrates of DEAR1 heterodimer cutting double-stranded DNA, from left to right are schematic diagrams of 5' protruding ends, blunt ends, and 3' protruding end products;
  • Figure 25 D is a result diagram of DEAR1 heterodimer cutting double-stranded DNA substrates, which are the results of cutting substrates with 5' protruding ends and spacer sequence lengths of 15, 30, and 45 nucleotides, the results of cutting the cutting sites with a distance of 0 nucleotides, i.e., the cutting products are blunt-end products, and the results of cutting substrates with 3' protruding ends and spacer sequence lengths of -6, 0, 15, 30, and 45 nucleot
  • Figure 26 Schematic diagram of the results of DEARs heterodimer cutting in bacteria.
  • On the left is a survival experiment of DEARs cutting in bacteria on a plate with streptomycin resistance and with arabinose inducing the expression of the CcdB toxic gene. Only bacteria that have undergone cutting can grow.
  • On the right is a resistance cutting experiment. On a plate with ampicillin resistance, when the DEAR system cuts the resistance plasmid, bacteria that have lost their resistance cannot survive.
  • FIG. 27 Schematic diagram of the modified structure of the DEAR system with the addition of an extended recognition region (RS).
  • RS extended recognition region
  • Figure 28 Cleavage activity of the DEAR system after modification with the extended recognition region (RS).
  • A in Figure 28 is a cleavage gel image of the modified DEAR1 RS+1-14
  • B in Figure 28 is a cleavage efficiency diagram of the modified DEAR1 RS+1-14
  • C in Figure 28 is a K value diagram of cleavage by the modified DEAR1 RS+1-14
  • D in Figure 28 is a 24-hour cleavage ratio diagram of the modified DEAR1 RS+1-14;
  • E in Figure 28 is a cleavage gel image of the modified DEAR2 RS+1-14
  • F in Figure 28 is a cleavage efficiency diagram of the modified DEAR2 RS+1-14
  • G in Figure 28 is a K value diagram of cleavage by the modified DEAR2 RS+1-14
  • H in Figure 28 is a 24-hour cleavage ratio diagram of the modified DEAR2 RS+1-14.
  • Figure 29 Comparison of the cleavage of specific substrates and non-specific substrates by the DEAR system with an extended recognition region.
  • Figure 29 A shows the comparison of the cleavage of RS-paired substrates and RS-unpaired substrates by the wild type and the extended recognition region (RS)-inserted DEAR1.
  • Figure 29 B compares the cleavage of RS-paired and RS-unpaired substrates by wild-type and extended recognition region (RS)-inserted DEAR2.
  • Figure 30 Comparison of the cleavage activity of the DEAR1 system after modification with a 14-nt extended recognition region (RS).
  • Figure 30 A is the cleavage gel image comparing single point mutations after the RS of DEAR1 is extended to 14 nt, where a single point mutation is denoted as, e.g., SM1 (the number represents the unpaired region on the DNA substrate)
  • Figure 30 B is the cleavage efficiency diagram of the single point mutations after the RS of DEAR1 is extended to 14 nt
  • Figure 30 C is the K value diagram of the single point mutations after the RS of DEAR1 is extended to 14 nt
  • Figure 30 D is the 24-hour cleavage ratio diagram of the single point mutations after the RS of DEAR1 is extended to 14 nt.
  • FIG. 31 Verification of RNA cleavage activity of DEAR10.
  • FIG. 32 Verification of ssDNA cleavage activity of DEAR10.
  • FIG. 33 Comparison of cleavage of paired and unpaired DNA substrates by DEAR10.
  • FIG. 34 Verification of ssDNA cleavage activity of DEAR10 in reprogramming TRS region.
  • FIG. 35 Verification of plasmid cleavage activity of DEAR1 to 6 and DEAR10.
  • Figure 36 Toxicity testing of DEAR in E. coli.
  • Figure 37 Verification of DEAR's cutting activity on mammalian cell genomic DNA, wherein A in Figure 37 is a schematic diagram of the DEAR cutting mammalian cell genome activity verification system, and B in Figure 37 is the survival of mammalian cells edited by DEAR under resistance screening.
  • Figures 38A to 38C Detection of the editing pattern of DEAR on mammalian cells, wherein Figure 38A: Detection of the editing pattern of DEAR1 at three targeting sites; Figure 38B: Detection of the editing pattern of DEAR1 on the entire targeting sequence; Figure 38C: Detection of the editing of DEAR1 on the upstream and downstream sequences of the targeting site.
  • the numerical range expressed using "a numerical value A to a numerical value B" means a range including the endpoints numerical values A and B.
  • the word “may” means both performing a certain process and not performing a certain process.
  • references to “some specific/preferred embodiments”, “other specific/preferred embodiments”, “embodiments”, etc. mean that the specific elements (e.g., features, structures, properties and/or characteristics) described in connection with the embodiments are included in at least one embodiment described herein, and may or may not exist in other embodiments.
  • In various embodiments, the elements may be combined in any suitable manner.
  • the term “multiple” refers to two or more than two.
  • "And/or” describes the association relationship of associated objects, indicating that three relationships may exist.
  • A and/or B can represent the following three situations: A exists alone, A and B exist at the same time, and B exists alone.
  • the character “/” generally indicates that the associated objects are in an "or” relationship.
  • polynucleotide and “nucleic acid” used interchangeably refer to a polymeric form of nucleotides (ribonucleotides or deoxyribonucleotides) of any length. Therefore, this term includes, but is not limited to, single-stranded, double-stranded or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or polymers containing purine bases and pyrimidine bases or other natural, chemically or biochemically modified, non-natural or derived nucleotide bases.
  • “G”, “C”, “A”, “T” and “U” generally represent the bases of guanine, cytosine, adenine, thymine and uracil, respectively, but it is also generally known in the art that “G”, “C”, “A”, “T” and “U” each generally represent nucleotides containing guanine, cytosine, adenine, thymine and uracil as bases, respectively, which is a common way to represent deoxyribonucleic acid sequences and/or ribonucleic acid sequences, so in the context of the present invention, the meanings represented by “G”, “C”, “A”, “T”, “U” include the above-mentioned various possible situations.
  • ribonucleotide or “nucleotide” can also refer to a modified nucleotide or an alternative replacement part.
  • guanine, cytosine, adenine and uracil can be replaced by other parts without substantially changing the base pairing properties of an oligonucleotide (including a nucleotide having such a replacement part).
  • nucleic acid manipulation includes binding, nicking one strand, or cutting (i.e., severing) two strands of a nucleic acid, or includes modifying or editing a nucleic acid.
  • Nucleic acid manipulation can silence, activate, or regulate (increase or decrease) the expression of an RNA or polypeptide encoded by the nucleic acid.
  • hybridizable or “complementary” or “substantially complementary” means that a nucleic acid (e.g., RNA, DNA) comprises a nucleotide sequence that enables the nucleic acid to non-covalently bind (i.e., form Watson-Crick base pairs and/or G/U base pairs), “anneal” or “hybridize” with another nucleic acid in a sequence-specific, antiparallel manner (i.e., the nucleic acid specifically binds to the complementary nucleic acid) under appropriate in vitro and/or in vivo temperature and solution ionic strength conditions.
  • Standard Watson-Crick base pairing includes: adenine (A) pairs with thymine (T), adenine (A) pairs with uracil (U), and guanine (G) pairs with cytosine (C).
  • for hybridization between two RNA molecules (e.g., dsRNA), guanine (G) can also pair with uracil (U).
  • G/U base pairing is at least partially responsible for the degeneracy of the genetic code.
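The pairing rules above (Watson-Crick pairs plus the G/U pair, read in antiparallel orientation) can be expressed as a short mismatch counter. This is a minimal sketch; the names `can_pair` and `count_mismatches` are hypothetical, both strands are assumed to have equal length, and no thermodynamic or stringency effects are modeled.

```python
# Standard Watson-Crick pairs (DNA and RNA) and the G/U wobble pair.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(x: str, y: str, allow_wobble: bool = True) -> bool:
    """True if bases x and y form a Watson-Crick pair or, optionally, a G/U pair."""
    pair = (x.upper(), y.upper())
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


def count_mismatches(strand_a_5to3: str, strand_b_5to3: str, allow_wobble: bool = True) -> int:
    """Count unpaired positions when the two strands are aligned antiparallel."""
    if len(strand_a_5to3) != len(strand_b_5to3):
        raise ValueError("strands must have the same length")
    # Antiparallel alignment: read strand B from its 3' end.
    return sum(
        1
        for x, y in zip(strand_a_5to3, reversed(strand_b_5to3))
        if not can_pair(x, y, allow_wobble)
    )
```

Here `count_mismatches("GUAAUC", "GATTAC")` is 0, and `count_mismatches("GU", "GC")` is 0 with the G/U pair allowed but 1 with `allow_wobble=False`.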
  • Hybridization and washing conditions are well known and described in Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 of that reference; and Sambrook, J. and Russell, W., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the "stringency" of hybridization.
  • In the present invention, “moderate stringency conditions”, “moderate-high stringency conditions”, “high stringency conditions” or “very high stringency conditions” describe conditions for nucleic acid hybridization and washing. Guidance for conducting hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, NY (1989), 6.3.1-6.3.6, which is incorporated herein by reference. Aqueous and non-aqueous methods are described in this document, and either can be used.
  • specific hybridization conditions are as follows: (1) low stringency hybridization conditions are in 6× sodium chloride/sodium citrate (SSC) at about […], followed by two washes in 0.2× SSC, 0.1% SDS at at least […] (for low stringency conditions, the wash temperature can be increased to […]); (2) medium stringency hybridization conditions are in 6× SSC at about […], followed by one or more washes in 0.2× SSC, 0.1% SDS at […]; (3) high stringency hybridization conditions are in 6× SSC at about […], followed by one or more washes in 0.2× SSC, 0.1% SDS at […]; and preferably (4) very high stringency hybridization conditions are 0.5 M sodium phosphate, 7% SDS at […], followed by one or more washes in 0.2× SSC, 1% SDS at […].
  • Hybridization requires that the two nucleic acids contain complementary sequences, but mismatches between bases are possible.
  • Conditions suitable for hybridization between two nucleic acids depend on the length of the nucleic acids and the degree of complementarity, which are variables well known in the art.
  • a DNA sequence that "encodes" a specific RNA is a DNA nucleotide sequence that is transcribed into RNA.
  • a DNA polynucleotide may encode an RNA (mRNA) that is translated into a protein (thus both DNA and mRNA encode a protein), or a DNA polynucleotide may encode an RNA that is not translated into a protein (e.g., tRNA, rRNA, microRNA (miRNA), "non-coding” RNA (ncRNA), and the DEAR nucleic acid manipulation system provided by the present invention, etc.).
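The relationship between an encoding DNA sequence and its RNA transcript can be illustrated in a few lines. This is a minimal sketch with hypothetical function names; it only models the base-level correspondence (T replaced by U, or reverse complement of the template strand) and ignores promoters, terminators and processing.

```python
def transcribe_from_coding(coding_dna_5to3: str) -> str:
    """The RNA has the same sequence as the coding (sense) strand, with U in place of T."""
    return coding_dna_5to3.upper().replace("T", "U")


def transcribe_from_template(template_dna_5to3: str) -> str:
    """The RNA is the reverse complement of the template (antisense) strand, written 5'->3'."""
    complement = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(template_dna_5to3.upper()))
```

Both views agree: the coding strand `ATGC` and its template strand `GCAT` give the same transcript `AUGC`.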
  • the term “naturally occurring”, as applied to nucleic acids, polypeptides, cells or organisms, refers to nucleic acids, polypeptides, cells or organisms that exist in nature.
  • a polypeptide or polynucleotide sequence that is present in an organism and can be isolated from a source in nature is naturally occurring.
  • recombination means that a specific nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, polymerase chain reaction (PCR) and/or ligation steps, which produce a construct having a structural coding sequence or non-coding sequence that can be distinguished from an endogenous nucleic acid present in a natural system.
  • the DNA sequence encoding a polypeptide can be assembled from a cDNA fragment or from a series of synthetic oligonucleotides to provide a synthetic nucleic acid that can be expressed by a recombinant transcription unit contained in a cell or in a cell-free transcription and translation system.
  • Genomic DNA containing related sequences can also be used in the formation of recombinant genes or transcription units.
  • the sequence of non-translated DNA can be present at the 5' end or 3' end of the open reading frame, wherein such sequences do not interfere with the manipulation or expression of the coding region, and can actually play a role in regulating the production of the desired product through various mechanisms (see "DNA regulatory sequence").
  • a DNA sequence encoding an untranslated RNA e.g., the DEAR nucleic acid manipulation system provided by the present invention
  • the term "recombinant" nucleic acid refers to a non-naturally occurring polynucleotide or nucleic acid, such as a polynucleotide or nucleic acid made by an artificial combination of two otherwise separated segments of a sequence through human intervention.
  • This artificial combination is often accomplished by chemical synthesis or by artificial manipulation of isolated segments of nucleic acid (e.g., by genetic engineering techniques). This is usually done to replace a codon with a codon encoding the same amino acid, a conservative amino acid or a non-conservative amino acid.
  • alternatively, this operation is performed to join nucleic acid segments with desired functions to produce a desired combination of functions.
  • isolated means a substance that is in a form or environment that does not exist in nature.
  • isolated substances include (1) any non-naturally occurring substance, (2) any substance including but not limited to any enzyme, mutant, nucleic acid, protein, peptide or cofactor that is at least partially removed from one or more or all of the naturally occurring components with which it is essentially associated; (3) any substance that has been artificially modified relative to the substance found in nature; or (4) any substance that has been modified by increasing the amount of the substance relative to other components with which it is naturally associated (e.g., recombinant production in a host cell; multiple copies of a gene encoding the substance; and use of a stronger promoter than the promoter naturally associated with the gene encoding the substance).
  • nucleic acid construct comprises a polynucleotide encoding a polypeptide or a domain or a module operatively linked to a suitable regulatory sequence, which is necessary for the expression of the polynucleotide in a selected cell or strain.
  • the transcriptional regulatory element comprises a promoter, and on this basis, may also comprise enhancers, silencers, insulators and other elements.
  • vector refers to a genetic element, such as a plasmid, cosmid, bacmid, phage or virus, to which another genetic sequence or element (DNA or RNA) can be attached.
  • a vector can be a replicon, thereby causing the replication of the attached sequence or element.
  • An "expression vector" is a vector that promotes the expression of a nucleic acid or a nucleic acid sequence encoding a polypeptide in a host cell or organism.
  • the terms "recombinant expression vector" and "DNA construct" are used interchangeably herein to refer to a DNA molecule comprising a vector and an insert.
  • Recombinant expression vectors are generally produced for the purpose of expressing and/or propagating one or more inserts, or for the purpose of constructing other recombinant nucleotide sequences.
  • the one or more inserts may or may not be operably linked to a promoter sequence, and may or may not be operably linked to a DNA regulatory sequence.
  • operably linked refers to a nucleic acid sequence that is placed in a functional relationship with another nucleic acid sequence.
  • nucleic acid sequences that can be operably linked include, but are not limited to, promoters, transcription terminators, enhancers or activators, and heterologous genes that, when transcribed and, if appropriate, translated, will produce a functional product, such as a protein, ribozyme, or RNA molecule.
  • nucleic acid derived from an original nucleic acid may partially or completely comprise the original nucleic acid, and may be a fragment or variant of the original nucleic acid.
  • ribozyme refers to an RNA molecule that can catalyze a specific biochemical reaction. Common examples of such reactions include the cutting or ligation and modification of RNA and DNA.
  • a "target nucleic acid” is a polynucleotide (e.g., DNA such as genomic DNA, RNA, etc.) that includes a site (“target site” or “target sequence”) targeted by the DEAR nucleic acid manipulation system provided by the present invention.
  • the target sequence is the sequence with which the substrate recognition region of the DEAR nucleic acid manipulation system will hybridize. For example, the target site (or target sequence) 5'-UGUCUU-3' or 5'-TGTCTT-3' in the target nucleic acid is targeted (or bound by, or hybridized or complementary to) the sequence 5'-AAGACA-3'.
  • Suitable hybridization conditions include physiological conditions that normally exist in cells.
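  • As an illustration of the pairing example above (target site 5'-UGUCUU-3' or 5'-TGTCTT-3' targeted by 5'-AAGACA-3'), the following minimal sketch derives the fully complementary RNA sequence for a target site. The function names are hypothetical and not part of the invention, and only Watson-Crick pairs are considered (no G-U wobble pairs), which is a simplifying assumption.

```python
# Illustrative sketch only; names are hypothetical and not taken from the patent.
# Assumption: only Watson-Crick pairs are considered (no G-U wobble).
_COMPLEMENT_TO_RNA = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def recognition_sequence(target_site: str) -> str:
    """Return the RNA sequence (5'->3') fully complementary to a target site.

    The target site may be RNA (containing U) or DNA (containing T).
    """
    site = target_site.upper()
    # Hybridization is antiparallel, so read the target backwards and complement it.
    return "".join(_COMPLEMENT_TO_RNA[base] for base in reversed(site))


def is_targeted_by(target_site: str, recognition_region: str) -> bool:
    """True if the recognition region is the exact complement of the target site."""
    region = recognition_region.upper().replace("T", "U")
    return recognition_sequence(target_site) == region


print(recognition_sequence("UGUCUU"))
print(is_targeted_by("TGTCTT", "AAGACA"))
```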
  • cleavage means the breakage of the covalent backbone of the target nucleic acid molecule (e.g., RNA, DNA). Both single-stranded and double-stranded cleavage are possible, and double-stranded cleavage can occur due to two different single-stranded cleavage events.
  • Major cleavage site refers to the DNA/RNA cleavage site corresponding to the cleavage product with obvious bands.
  • Secondary cleavage site refers to the DNA/RNA cleavage site corresponding to the cleavage product with no obvious bands.
  • "palindrome sequence" or "palindrome structure" refers, in genetics, to a specific nucleotide segment in a double-stranded DNA or RNA molecule in which the sequence read from 5' to 3' on one strand is identical to the sequence read from 5' to 3' on its complementary strand.
  • Single-stranded DNA or RNA with a palindromic sequence has a symmetry center, and the bases on both sides of the symmetry center are symmetrical about the symmetry center and can form complementarity. Therefore, a palindrome sequence can form a hairpin structure (stem-loop structure).
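  • The definition of a palindrome sequence given above can be checked mechanically: a segment is palindromic when it equals its own reverse complement. The sketch below is illustrative only; the function names and the example sequences (GAATTC, GATTACA) are chosen here for demonstration and do not come from the present invention.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned in the DNA alphabet."""
    dna = seq.upper().replace("U", "T")
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(dna))


def is_palindromic(seq: str) -> bool:
    """True if the 5'->3' sequence equals the 5'->3' sequence of its complementary strand."""
    dna = seq.upper().replace("U", "T")
    return dna == reverse_complement(dna)


print(is_palindromic("GAATTC"))
print(is_palindromic("GATTACA"))
```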
  • stem-loop also known as “hairpin”, “hairpin loop”, “stem-loop structure” or “stem-loop/hairpin structure” refers to the secondary structure formed by a single-stranded oligonucleotide when a complementary base in a first portion of a linear chain hybridizes with a base in a second portion of the same chain.
  • homodimer refers to a dimerized molecule that is formed by the same molecule and can exist stably.
  • heterodimer refers to a stable dimerized molecule formed by different molecules.
  • RNA ribozymes based on bacterial class II intron elements.
  • Group II introns are composed of two parts: RNA ribozymes and intron-encoded proteins (IEPs).
  • RNA ribozymes can catalyze the self-splicing maturation of the original transcript, while protein IEPs play an auxiliary role.
  • the RNA ribozyme part includes six domains, I to VI. Domain I is the largest of all domains and plays an important stabilizing role in the formation of the overall structure of the intron. It contains an exon binding site (EBS) for binding to exons. Domains II and III are also involved in the formation of the ribozyme structure.
  • Domain IV contains an open reading frame (ORF), and the protein it encodes is IEP.
  • Domain V is the catalytic center of the RNA ribozyme, and domain VI performs auxiliary catalytic functions.
  • type C is considered to be a more ancient type of intron (D.M.Simon et al., Group II introns in eubacteria and archaea: ORF-less introns and new variations. RNA 14, 1704-1713 (2008); A.M.Lambowitz, S.Zimmerly, Mobile group II introns. Annu Rev Genet 38, 1-35 (2004); J.S.Rest, D.P.Mindell, Retroids in archaea: phylogeny and lateral origins. Mol Biol Evol 20, 1134-1142 (2003).).
  • the EBS of the C-type group II intron and its surrounding sequences can serve as the ribozyme's substrate recognition element for the target nucleic acid (called the target recognition site (TRS)); the TRS has been found and shown to be programmable, and the target nucleic acid (RNA or DNA) is hydrolytically cleaved with the help of domain V of the intron RNA ribozyme.
  • the applicant refers to the system with programmable nucleic acid recognition and cleavage ability, constructed from bacteria-derived C-type group II introns according to the presence or absence of an open reading frame encoding an intron-encoded protein in domain IV, as the RNA ribozyme-based DEAR (Dr) nucleic acid manipulation system or the RNA ribozyme-based HYER (Hr) nucleic acid manipulation system; in the present invention it is also called the original DEAR (HYER) nucleic acid manipulation system or the original RNA ribozyme-based DEAR (HYER) nucleic acid manipulation system.
  • the original DEAR nucleic acid manipulation system still has certain limitations for gene editing due to its short substrate recognition window.
  • the present invention intends to engineer the DEAR nucleic acid manipulation system to improve its specificity and cutting activity, thereby obtaining an engineered DEAR nucleic acid manipulation system.
  • the object of the present invention is to provide a method for preparing an engineered DEAR nucleic acid manipulation system, wherein the original DEAR nucleic acid manipulation system comprises an RNA molecule derived from a bacterial C-type group II intron, wherein the RNA molecule comprises a substrate recognition region that hybridizes with a target sequence in a target nucleic acid, and the RNA molecule comprises domains I to VI;
  • the preparation method comprises at least one selected from the following (a) to (d):
  • the engineered DEAR nucleic acid manipulation system has improved specificity and/or cleavage activity compared to the original DEAR nucleic acid manipulation system.
  • the original RNA ribozyme-based DEAR nucleic acid manipulation system comprises an (isolated) RNA molecule derived from a bacterial C-type group II intron, wherein the RNA molecule comprises a substrate recognition region that hybridizes to a target sequence in a target nucleic acid.
  • the original DEAR nucleic acid manipulation system comprises at least one domain of domain I, domain II, domain III, domain IV, domain V and domain VI.
  • the original DEAR nucleic acid manipulation system contains 6 domains (i.e., domain I, domain II, domain III, domain IV, domain V, and domain VI; the 6 domains can also be expressed as domains I to VI, or simply referred to as D1 to D6), with a length ranging from 100-5660nt; preferably 124-3897nt.
  • domain I comprises 2-6 stem-loop/hairpin structures, with a length range of 50-400 nt, and a TRS sequence responsible for substrate recognition, preferably 3-5 stem-loop/hairpin structures, with a length range of 65-384 nt;
  • domain II comprises 1-4 stem-loop/hairpin structures, with a length range of 10-300 nt, preferably 1-3 stem-loop/hairpin structures, with a length range of 10-218 nt;
  • domain III comprises 1-3 stem-loop/hairpin structures, with a length range of 10-200 nt, preferably 1-2 stem-loop/hairpin structures, with a length range of 10-140 nt;
  • domain IV comprises 0-4 stem-loop/hairpin structures, with a length range of 0-4000 nt;
  • domain V contains 1 stem-loop/hairpin structure containing the catalytic reaction core, with a length range of 20-60 nt, preferably 29-43 nt;
  • domain VI contains 1 stem-loop/hairpin structure, with a length range of 10-200nt, preferably 1 stem-loop/hairpin structure, with a length range of 10-112nt.
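  • The broad per-domain ranges listed above (number of stem-loop/hairpin structures and length in nt) can be collected into a small lookup table and used to sanity-check a domain annotation. The sketch below is only an illustration: the function name, the annotation format and the domain lengths in the example are hypothetical assumptions, and only the broad ranges (not the preferred ones) are encoded.

```python
from typing import Dict, List, Tuple

# Broad ranges from the text: (min hairpins, max hairpins, min length nt, max length nt).
DOMAIN_RANGES: Dict[str, Tuple[int, int, int, int]] = {
    "I": (2, 6, 50, 400),
    "II": (1, 4, 10, 300),
    "III": (1, 3, 10, 200),
    "IV": (0, 4, 0, 4000),
    "V": (1, 1, 20, 60),
    "VI": (1, 1, 10, 200),
}


def check_domains(annotation: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return a list of range violations.

    `annotation` maps a domain name to (hairpin count, length in nt).
    This annotation format is an assumption made for this sketch.
    """
    problems: List[str] = []
    for name, (h_min, h_max, l_min, l_max) in DOMAIN_RANGES.items():
        if name not in annotation:
            problems.append(f"domain {name}: missing")
            continue
        hairpins, length = annotation[name]
        if not h_min <= hairpins <= h_max:
            problems.append(f"domain {name}: {hairpins} hairpins outside {h_min}-{h_max}")
        if not l_min <= length <= l_max:
            problems.append(f"domain {name}: length {length} nt outside {l_min}-{l_max} nt")
    return problems


# Hypothetical annotation; the lengths are made up for demonstration.
example = {"I": (4, 300), "II": (2, 80), "III": (1, 60),
           "IV": (2, 100), "V": (1, 34), "VI": (1, 40)}
print(check_domains(example))
```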
  • The primary sequence and secondary structure characteristics of the RNA molecule of the C-type group II intron are shown in Figures 1K to 1M.
  • the C-type group II intron is a C-type group II intron in which an open reading frame encoding an IEP is not present in the IV domain.
  • the C-type group II intron is a C-type group II intron in which an open reading frame encoding an IEP is present in the IV domain.
  • the open reading frame encoding the IEP is deleted from the C-type group II intron in which such an open reading frame is present.
  • the primary cleavage site of the original DEAR nucleic acid manipulation system is 0-1 nt downstream of the 3’ end of the target sequence in the target nucleic acid, that is, the primary cleavage site is located 0-1 nt downstream of the 3’ end of the region pairing with the substrate recognition region on the target nucleic acid.
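  • To make the position of the primary cleavage site concrete, the sketch below locates a target sequence in a target nucleic acid and reports the candidate cut positions 0 and 1 nt downstream of its 3' end. This is an illustrative reading only: the function name, the example sequence, and the convention that a cut position is expressed as the number of nucleotides in the resulting 5' fragment are all assumptions made here.

```python
from typing import List, Tuple


def primary_cleavage_positions(target_nucleic_acid: str,
                               target_sequence: str) -> List[Tuple[int, List[int]]]:
    """Find each occurrence of the target sequence and its candidate cut positions.

    Returns (0-based start of the target sequence, [cut positions]) for every match.
    A cut position p means the backbone is broken after the first p nucleotides
    (assumed convention); the positions are 0 and 1 nt downstream of the 3' end
    of the target sequence.
    """
    seq = target_nucleic_acid.upper()
    site = target_sequence.upper()
    if not site:
        raise ValueError("target_sequence must not be empty")
    results: List[Tuple[int, List[int]]] = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)  # index just past the 3' end of the target sequence
        cuts = [end + offset for offset in (0, 1) if end + offset < len(seq)]
        results.append((start, cuts))
        start = seq.find(site, start + 1)  # continue, allowing overlapping matches
    return results


# Hypothetical 13-nt substrate containing the 6-nt target TGTCTT.
print(primary_cleavage_positions("GGGTGTCTTAACC", "TGTCTT"))
```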
  • the nucleotide sequence of the RNA molecule of the original DEAR nucleic acid manipulation system is selected from any one of the following:
  • the structure of the second RNA molecule monomer 2 is:
  • when cutting double-stranded DNA, the two TRS of the heterodimer can respectively recognize and cut the two strands of the double-stranded DNA to form a double-stranded DNA break. A double-stranded DNA break is formed only when both strands of the double-stranded DNA carry substrate binding sequences (targets) that can be recognized by the two TRS of the heterodimer, which broadens the range of DNA recognition by the DEAR nucleic acid manipulation system and extends the original 6 nt recognition to 12 nt, as shown in Figure 25.
  • an extended recognition region is inserted downstream of the 3' end (primary structure, i.e., nucleotide sequence) of the substrate recognition region in the RNA molecule of the original DEAR nucleic acid manipulation system, and the extended recognition region hybridizes with at least a portion of the target nucleic acid.
  • an extended recognition region is inserted into a region spatially close to the 5' end of the substrate recognition region TRS in the RNA molecule of the original DEAR nucleic acid manipulation system (in the tertiary structure, the region adjacent to the 5' end of the substrate recognition region TRS), and the extended recognition region hybridizes with at least a portion of the target nucleic acid.
  • a sequence is inserted into a position spatially (tertiary structure) close to the substrate recognition region TRS (i.e., the region spatially close to the 5' end of the substrate recognition region TRS, or referred to as the spatial position) (the spatial position is in the tertiary structure, the region adjacent to the first nucleotide of TRS.
  • the spatial position is in a conserved region, and the conserved region is located in a region that can be aligned with the 223rd nucleotide of DEAR1, for example, by the Clustal Omega (1.2.4) method (F.
  • the sequences of DEAR1-10 are aligned, for example, by the alignment method described above, the 223rd nucleotide of DEAR1 corresponds to the 226th nucleotide of DEAR2), which can effectively improve the specificity and efficiency of DEAR for substrate cleavage, as shown in Figure 27.
  • the examples have confirmed that by extending the region (recognition region) in the RNA molecule of the original DEAR nucleic acid manipulation system that hybridizes with the target nucleic acid, the cleavage activity and specificity of the original DEAR nucleic acid manipulation system can be improved.
  • the length of the inserted extended recognition region ranges from 0 to 24 nt, so that the number of bases involved in complementary pairing in the DEAR system increases from the original 6 to 20 or more, thereby greatly improving the specificity.
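  • The effect of a longer recognition region on specificity can be illustrated with simple arithmetic: under the simplifying assumption of a random sequence with equal base frequencies and exact matching on a single strand, a given n-nt site is expected about once every 4^n positions. The function below is a hypothetical sketch of that estimate, not a measurement from the examples.

```python
def expected_random_matches(site_length: int, sequence_length: int) -> float:
    """Expected number of exact occurrences of one specific site in a random sequence.

    Assumptions (not from the patent): bases are independent and equally frequent,
    only one strand is scanned, and only perfect matches count.
    """
    if site_length <= 0:
        raise ValueError("site_length must be positive")
    if sequence_length < site_length:
        return 0.0
    windows = sequence_length - site_length + 1
    return windows / 4 ** site_length


# Compare 6-nt, 12-nt and 20-nt recognition over a hypothetical 1,000,000-nt sequence.
for n in (6, 12, 20):
    print(n, expected_random_matches(n, 1_000_000))
```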
  • the extended recognition region is inserted into the original DEAR nucleic acid manipulation system by replacing any one nucleotide located 20 to 60 nt, preferably 30 to 50 nt, downstream of the 3' end (primary structure, i.e., nucleotide sequence) of the substrate recognition region.
  • the extended recognition region replaces the 223rd nucleotide in SEQ ID NO:1, that is, the 223rd nucleotide of SEQ ID NO:1 is deleted, and the extended recognition region is inserted between the 222nd nucleotide and the 224th nucleotide of SEQ ID NO:1.
  • the extended recognition region replaces the 226th nucleotide in SEQ ID NO:2, that is, the 226th nucleotide of SEQ ID NO:2 is deleted, and the extended recognition region is inserted between the 225th nucleotide and the 227th nucleotide of SEQ ID NO:2.
  • the extended recognition region replaces the 222nd nucleotide in SEQ ID NO:3, that is, the 222nd nucleotide of SEQ ID NO:3 is deleted, and the extended recognition region is inserted between the 221st nucleotide and the 223rd nucleotide of SEQ ID NO:3.
  • the extended recognition region replaces the 223rd nucleotide in SEQ ID NO:5, that is, the 223rd nucleotide of SEQ ID NO:5 is deleted, and the extended recognition region is inserted between the 222nd and 224th nucleotides of SEQ ID NO:5.
  • the extended recognition region replaces the 249th nucleotide in SEQ ID NO:7, that is, the 249th nucleotide of SEQ ID NO:7 is deleted, and the extended recognition region is inserted between the 248th nucleotide and the 250th nucleotide of SEQ ID NO:7.
  • the extended recognition region replaces the 224th nucleotide in SEQ ID NO:8, that is, the 224th nucleotide of SEQ ID NO:8 is deleted, and the extended recognition region is inserted between the 223rd nucleotide and the 225th nucleotide of SEQ ID NO:8.
  • the extended recognition region replaces the 223rd nucleotide in SEQ ID NO:9, that is, the 223rd nucleotide of SEQ ID NO:9 is deleted, and the extended recognition region is inserted between the 222nd nucleotide and the 224th nucleotide of SEQ ID NO:9.
  • the extended recognition region replaces the 223rd nucleotide in SEQ ID NO:132, that is, the 223rd nucleotide of SEQ ID NO:132 is deleted, and the extended recognition region is inserted between the 222nd and 224th nucleotides of SEQ ID NO:132.
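  • The replacement described above (delete the nucleotide at a given 1-based position and insert the extended recognition region in its place) can be sketched as a simple string operation. The position table is taken from the preceding paragraphs; the function name and the toy scaffold are hypothetical, since the actual sequences of the SEQ ID NOs are not reproduced here.

```python
# 1-based position of the nucleotide replaced by the extended recognition region,
# keyed by SEQ ID NO, as listed in the text.
REPLACED_POSITION = {1: 223, 2: 226, 3: 222, 5: 223, 7: 249, 8: 224, 9: 223, 132: 223}


def insert_extended_region(scaffold: str, position: int, extended_region: str) -> str:
    """Replace the nucleotide at 1-based `position` with `extended_region`.

    Equivalent to deleting nucleotide `position` and inserting the extended
    region between nucleotides `position - 1` and `position + 1`.
    """
    if not 1 <= position <= len(scaffold):
        raise ValueError("position is outside the scaffold")
    return scaffold[:position - 1] + extended_region + scaffold[position:]


# Toy scaffold for demonstration only (not a real DEAR sequence).
print(insert_extended_region("ACGUACGU", 3, "NNNN"))
```

  • With a real scaffold, the position would be looked up first, for example `REPLACED_POSITION[1]` for SEQ ID NO:1.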
  • the target sequence in the target nucleic acid that hybridizes to the substrate recognition region and the sequence in the target nucleic acid that hybridizes to the extended recognition region are continuous nucleotide sequences in the target nucleic acid.
  • the specificity and cutting efficiency of DEAR for substrate cutting are improved.
  • the target sequence in the target nucleic acid that hybridizes to the substrate recognition region is located at the 5’ end of the sequence in the target nucleic acid that hybridizes to the extended recognition region; that is, in the target nucleic acid, the nucleotide immediately downstream of the 3’ end of the target sequence that hybridizes to the substrate recognition region is the 5’ end of the sequence that hybridizes to the extended recognition region.
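  • Given this arrangement (the sequence paired with the extended recognition region begins immediately downstream of the 3' end of the target sequence), a fully complementary extended recognition region can be derived from the target nucleic acid. The following is a minimal sketch under stated assumptions: the function name and example sequence are hypothetical, full Watson-Crick complementarity is assumed, and the extension length is limited to the 1 to 24 nt range given in the text.

```python
from typing import Tuple

_COMPLEMENT_TO_RNA = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def design_extended_region(target_nucleic_acid: str, target_sequence: str,
                           extension_length: int) -> Tuple[str, str]:
    """Return (extended target segment, fully complementary RNA region, 5'->3').

    The extended target segment is the `extension_length` nucleotides immediately
    downstream (3') of the first occurrence of `target_sequence`.
    """
    if not 1 <= extension_length <= 24:
        raise ValueError("extension_length must be between 1 and 24 nt")
    seq = target_nucleic_acid.upper()
    start = seq.find(target_sequence.upper())
    if start == -1:
        raise ValueError("target_sequence not found in target_nucleic_acid")
    end = start + len(target_sequence)
    segment = seq[end:end + extension_length]
    if len(segment) < extension_length:
        raise ValueError("not enough sequence downstream of the target sequence")
    # Antiparallel pairing: reverse the segment and complement each base.
    region = "".join(_COMPLEMENT_TO_RNA[base] for base in reversed(segment))
    return segment, region


# Hypothetical substrate: 6-nt target TGTCTT followed by AACC...
print(design_extended_region("GGGTGTCTTAACCGGTT", "TGTCTT", 4))
```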
  • the length of the extended recognition region is 1 to 24 nt, preferably 1 to 14 nt, for example, 1 nt, 2 nt, 3 nt, 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, but not limited thereto.
  • the percentage of complementarity between the extended recognition region and the sequence of the target nucleic acid it recognizes is 60% or more (e.g., 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, 99% or more, or 100%). In some embodiments, the percentage of complementarity between the extended recognition region and the sequence of the target nucleic acid it recognizes is 80% or more (e.g., 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, 99% or more, or 100%).
  • the percentage of complementarity between the extended recognition region and the sequence of the target nucleic acid it recognizes is 90% or more (e.g., 95% or more, 97% or more, 98% or more, 99% or more, or 100%). In some embodiments, the percentage of complementarity between the substrate recognition region and the target sequence of the target nucleic acid is 100%.
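  • The percentage of complementarity referred to above can be computed position by position over an antiparallel alignment. The sketch below is one possible way to do this, assuming equal-length sequences, no gaps, and Watson-Crick pairs only (G-U wobble pairs are not counted); the function names are hypothetical.

```python
_WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(recognition_region: str, target_segment: str) -> float:
    """Percentage of positions that form Watson-Crick pairs in antiparallel alignment.

    Both sequences are given 5'->3'; T is treated as U. Equal lengths and an
    ungapped alignment are assumptions of this sketch.
    """
    region = recognition_region.upper().replace("T", "U")
    target = target_segment.upper().replace("T", "U")
    if not region or len(region) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, b) in _WATSON_CRICK for a, b in zip(region, reversed(target)))
    return 100.0 * paired / len(region)


def meets_threshold(recognition_region: str, target_segment: str,
                    threshold: float = 60.0) -> bool:
    """True if complementarity is at or above the threshold (60% is the broadest range)."""
    return percent_complementarity(recognition_region, target_segment) >= threshold


print(percent_complementarity("AAGACA", "TGTCTT"))
print(percent_complementarity("GGUA", "AACC"))
```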
  • Some aspects of the present invention provide an engineered DEAR nucleic acid manipulation system, which is prepared by the method for preparing the engineered DEAR nucleic acid manipulation system described in the present invention.
  • an isolated polynucleotide comprising a nucleotide sequence encoding the engineered DEAR nucleic acid manipulation system of the present invention.
  • nucleic acid construct comprises the isolated polynucleotide of the present invention.
  • the polynucleotide is operably linked to one or more regulatory sequences, which are nucleotide sequences comprising a promoter and/or a ribosome binding site, and the regulatory sequences direct the expression of the genes of the engineered DEAR nucleic acid manipulation system in the host cell.
  • a vector comprising the isolated polynucleotide of the present invention, or the nucleic acid construct of the present invention.
  • the vector is a recombinant expression vector.
  • Suitable recombinant expression vectors include viral expression vectors (e.g., viral vectors based on the following viruses: vaccinia virus, polio virus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, human immunodeficiency virus; retroviral vectors (e.g., murine leukemia virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous sarcoma virus, Harvey sarcoma virus, avian leukemia virus, lentivirus, human immunodeficiency virus, myeloproliferative sarcoma virus, and mammary tumor virus)), etc.
  • the present invention provides a cell comprising the engineered DEAR nucleic acid manipulation system of the present invention, the isolated polynucleotide of the present invention, the nucleic acid construct of the present invention, or the vector of the present invention.
  • the cell can be any of a variety of cells, including, for example, in vitro cells, in vivo cells, ex vivo cells, primary cells, cancer cells, animal cells, plant cells, algae cells, fungal cells, and the like.
  • the cell is a recipient of the engineered DEAR nucleic acid manipulation system, isolated polynucleotide, nucleic acid construct or vector provided by the present invention, and may also be referred to as a "host cell" or a "target cell".
  • non-limiting examples of cells include: prokaryotic cells, eukaryotic cells, bacterial cells, archaeal cells, cells of unicellular eukaryotic organisms, protozoan cells, cells from plants, algal cells, fungal cells, animal cells, cells from invertebrates, cells from vertebrates, cells from mammals (e.g., ungulates; rodents; non-human primates; humans; cats; dogs, etc.), etc.
  • the cell is a cell that is not derived from a natural organism (e.g., the cell can be a synthetic cell; also known as an artificial cell).
  • any of a number of suitable transcription and/or translation control elements including constitutive and inducible promoters, transcription enhancer elements, transcription terminators, and the like may be used in the recombinant expression vector.
  • nucleic acids e.g., recombinant expression vectors, isolated polynucleotides, nucleic acid constructs, engineered DEAR nucleic acid manipulation systems provided by the present invention
  • Suitable methods include, for example, viral infection, transfection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran-mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery, etc.
  • the present invention provides a reagent or kit comprising the engineered DEAR nucleic acid manipulation system described in the present invention, the isolated polynucleotide described in the present invention, the nucleic acid construct described in the present invention, the vector described in the present invention, or the cell described in the present invention.
  • the present invention provides a pharmaceutical composition comprising the engineered DEAR nucleic acid manipulation system described in the present invention, the isolated polynucleotide described in the present invention, the nucleic acid construct described in the present invention, the vector described in the present invention, or the cell described in the present invention, and optionally, a pharmaceutically acceptable carrier.
  • the present invention provides a method for modifying a target nucleic acid, the method comprising the step of contacting the target nucleic acid with the engineered DEAR nucleic acid manipulation system of the present invention, the isolated polynucleotide of the present invention, the nucleic acid construct of the present invention, the vector of the present invention, the cell of the present invention, the reagent or the kit of the present invention.
  • the contacting results in modification of the target nucleic acid by the engineered DEAR nucleic acid manipulation system.
  • the present invention provides uses of the engineered DEAR nucleic acid manipulation system, the isolated polynucleotide, the nucleic acid construct, the vector, and the cell of the present invention in modifying target nucleic acids or preparing reagents or kits for modifying target nucleic acids.
  • the modification is cleavage of the target nucleic acid.
  • the target nucleic acid is selected from the group consisting of: DNA, RNA, genomic DNA and extrachromosomal DNA.
  • the contacting occurs in vitro or in vivo. In some specific embodiments, the contacting occurs inside a cell or outside a cell.
  • the cell is a eukaryotic cell or a prokaryotic cell.
  • the cell is selected from the group consisting of: plant cells, fungal cells, mammalian cells, reptile cells, insect cells, avian cells, fish cells, parasite cells, arthropod cells, invertebrate cells, vertebrate cells, rodent cells, mouse cells, rat cells, primate cells, non-human primate cells and human cells.
  • said contacting results in genome editing.
  • the contacting comprises introducing the engineered DEAR nucleic acid manipulation system into a cell.
  • this embodiment uses 92 C-type group II introns from public databases and constructs sequence and structure covariance models for the RNA sequences of the conserved domains I-III and domains V-VI, respectively. For the potential IEP protein, a hidden Markov model of its amino acid sequence characteristics is also constructed. The length of a C-type group II intron usually does not exceed 4000 nt, so a 4000 bp recognition window is set in this embodiment. A potential C-type group II intron must have high-confidence hits for domains I-III and domains V-VI within a range of 4000 bp. If no IEP protein can be identified in domain IV, the intron is regarded as a C-type group II intron without an ORF.
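  • The decision rule described above (both conserved regions found within a 4000 bp window, then classification by whether an IEP is identified) can be sketched as follows. This is not the applicant's pipeline: the `Hit` type, the function name, the requirement that the domain V-VI hit lies downstream of the domain I-III hit, and the use of a boolean flag for the IEP search result are all assumptions made for illustration, and the hits are assumed to have already passed a high-confidence filter on the same strand.

```python
from dataclasses import dataclass
from typing import Optional

WINDOW_BP = 4000  # recognition window stated in the text


@dataclass
class Hit:
    """A model hit on one strand; 0-based start (inclusive) and end (exclusive)."""
    start: int
    end: int


def classify_candidate(d1_3: Optional[Hit], d5_6: Optional[Hit],
                       iep_found: bool) -> Optional[str]:
    """Classify a candidate C-type group II intron, or return None if rejected."""
    if d1_3 is None or d5_6 is None:
        return None  # both conserved regions are required
    if d5_6.start < d1_3.end:
        return None  # assumed ordering: domains V-VI follow domains I-III
    if d5_6.end - d1_3.start > WINDOW_BP:
        return None  # whole candidate must fit in the 4000 bp window
    return "ORF-containing" if iep_found else "ORF-less"


print(classify_candidate(Hit(100, 500), Hit(900, 1050), iep_found=False))
print(classify_candidate(Hit(0, 400), Hit(4500, 4650), iep_found=False))
```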
  • this example identified 5,684 C-type group II introns in the Earth metagenome dataset. Active C-type group II introns should have multiple highly similar copies in the genome of the same strain. Therefore, this example clustered highly similar candidate C-type group II introns in the metagenomes of the same genus, and identified a total of 469 potentially active C-type group II introns with multiple copies.
  • this example sorted the candidate C-type group II introns (GII-C introns) according to the predicted thermal stability of their secondary structures.
  • this example also used RNA secondary structure prediction to further screen for candidate C-type group II introns with conserved secondary structures in the substrate recognition region (TRS).
  • DEAR1 to 10 were selected as the DEAR nucleic acid manipulation system, and the substrate cleavage activity was verified.
  • the secondary structure predictions of the selected DEAR1-10 (RNA secondary structures were predicted with the RNAfold WebServer: http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) and their domain annotations are shown in Figures 1A to 1J. The secondary structures of DEAR1-10 are relatively similar: all consist of domains I to VI, each domain exists as a stem-loop structure and is naturally separated from the others, and the programmable TRS regions are all located in the apical loop region of domain I, where they recognize nucleic acid substrates.
  • the sequences of DEAR1-10 are shown in Table 1 below, where the underlined and bold parts are TRS.
  • DEAR1 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 2 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR2 contains 6 domains (domains I to VI).
  • Domain I contains 3 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 1 stem-loop/hairpin structure;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 4 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR3 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 4 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR4 contains 6 domains (domains I to VI).
  • Domain I contains 6 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 3 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 3 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR5 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 2 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR6 contains 6 domains (domains I to VI).
  • Domain I contains 6 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 1 stem-loop/hairpin structure;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 3 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR7 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 2 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR8 contains 6 domains (domains I to VI).
  • Domain I contains 5 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 3 stem-loop/hairpin structures;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR9 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 1 stem-loop/hairpin structure;
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
  • DEAR10 contains 6 domains (domains I to VI).
  • Domain I contains 4 stem-loop/hairpin structures and the TRS sequence responsible for substrate recognition;
  • domain II contains 2 stem-loop/hairpin structures;
  • domain III contains 1 stem-loop/hairpin structure;
  • domain IV contains 1 stem-loop/hairpin structure and a 1260-nt ORF region (deleted in DEAR10);
  • domain V contains 1 stem-loop/hairpin structure, which contains the catalytic reaction core;
  • domain VI contains 1 stem-loop/hairpin structure.
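  • As a compact restatement of the list above, the stem-loop/hairpin counts for DEAR2 to DEAR10 can be tabulated and compared. The sketch below is illustrative only: the dictionary layout and the helper name `conserved_domains` are assumptions made for this example, and DEAR1 is left out here.

```python
# Stem-loop/hairpin counts per domain, transcribed from the list above.
# The table layout and helper name are illustrative assumptions.
DOMAINS = ("I", "II", "III", "IV", "V", "VI")

STEM_LOOPS = {
    "DEAR2": (3, 1, 1, 4, 1, 1),
    "DEAR3": (4, 2, 1, 4, 1, 1),
    "DEAR4": (6, 3, 1, 3, 1, 1),
    "DEAR5": (4, 2, 1, 2, 1, 1),
    "DEAR6": (6, 1, 1, 3, 1, 1),
    "DEAR7": (4, 2, 1, 2, 1, 1),
    "DEAR8": (5, 2, 1, 3, 1, 1),
    "DEAR9": (4, 2, 1, 1, 1, 1),
    "DEAR10": (4, 2, 1, 1, 1, 1),  # domain IV ORF region is deleted in DEAR10
}


def conserved_domains(table=STEM_LOOPS):
    """Return the domains whose stem-loop count is identical in every DEAR."""
    conserved = []
    for i, name in enumerate(DOMAINS):
        if len({counts[i] for counts in table.values()}) == 1:
            conserved.append(name)
    return conserved


if __name__ == "__main__":
    print(conserved_domains())
    print({name: sum(counts) for name, counts in STEM_LOOPS.items()})
```

  • In this table, domains III, V and VI each carry a single stem-loop/hairpin structure in every listed DEAR, while the counts in domains I, II and IV vary.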
  • T7 promoter (TAATACGACTCACTATA; SEQ ID NO: 19) was added upstream of each DEAR by PCR.
  • the PCR amplification products were purified using DNA purification magnetic beads (VAHTS DNA Clean Beads, Vazyme, catalog number N411-01), and the products were used as templates for in vitro transcription (IVT).
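  • Adding the T7 promoter by PCR is commonly done with a forward primer that carries the promoter as a 5' overhang. The sketch below shows that idea under stated assumptions: the 20-nt annealing length is an arbitrary choice, and the example template 5' end is a made-up placeholder, not a DEAR sequence.

```python
T7_PROMOTER = "TAATACGACTCACTATA"  # SEQ ID NO: 19


def t7_forward_primer(template_5prime: str, anneal_len: int = 20) -> str:
    """Build a forward primer: T7 promoter overhang + template-annealing part.

    anneal_len is an assumed design choice, not a value from the text.
    """
    anneal = template_5prime[:anneal_len].upper()
    if len(anneal) < anneal_len:
        raise ValueError("template is shorter than the annealing length")
    return T7_PROMOTER + anneal


if __name__ == "__main__":
    # Placeholder 5' end of a template (hypothetical, for illustration only).
    placeholder_5prime = "GTGCGACAAGAAGTTCAGGAAGGGATT"
    print(t7_forward_primer(placeholder_5prime))
```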
  • IVT in vitro transcription
  • the in vitro transcription reaction was carried out in 30 mM Tris pH 8.1, 25 mM MgCl2, 0.01% Triton X-100, 2 mM spermidine, and 5 mM DTT, with each NTP added at 5 mM.
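  • The final concentrations above can be converted into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below is only a worked example: the stock concentrations in `ASSUMED_STOCK` and the 100 µL reaction volume are assumptions, not values from the text.

```python
# Final concentrations taken from the IVT buffer described above.
FINAL = {
    "Tris pH 8.1 (mM)": 30,
    "MgCl2 (mM)": 25,
    "Triton X-100 (%)": 0.01,
    "spermidine (mM)": 2,
    "DTT (mM)": 5,
    "ATP (mM)": 5,
    "CTP (mM)": 5,
    "GTP (mM)": 5,
    "UTP (mM)": 5,
}

# Hypothetical stock concentrations (same units as FINAL); adjust to the bench.
ASSUMED_STOCK = {
    "Tris pH 8.1 (mM)": 1000,
    "MgCl2 (mM)": 1000,
    "Triton X-100 (%)": 1,
    "spermidine (mM)": 100,
    "DTT (mM)": 1000,
    "ATP (mM)": 100,
    "CTP (mM)": 100,
    "GTP (mM)": 100,
    "UTP (mM)": 100,
}


def stock_volumes_ul(total_ul: float) -> dict:
    """Volume of each stock needed so the final mix matches FINAL (C1*V1 = C2*V2)."""
    return {name: FINAL[name] * total_ul / ASSUMED_STOCK[name] for name in FINAL}


if __name__ == "__main__":
    volumes = stock_volumes_ul(100)
    for name, vol in volumes.items():
        print(f"{name}: {vol:.2f} uL")
    print(f"remaining volume for template, enzyme and water: {100 - sum(volumes.values()):.2f} uL")
```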
  • Example 3 Cleavage of single-stranded RNA, DNA and plasmids in vitro using the DEAR nucleic acid manipulation system
  • ssRNA single-stranded RNA
  • DEAR target sequences where the underlined and bold parts are target sequences recognized by DEAR
  • -Cy5 single-stranded RNA substrates with DEAR target sequences (where the underlined and bold parts are target sequences recognized by DEAR)
  • -Cy5 single-stranded RNA substrates with DEAR target sequences (where the underlined and bold parts are target sequences recognized by DEAR)
  • the ssRNA substrates were incubated for 1 hour at 50°C in 500 mM NH4Cl, 125 mM MgCl2, 40 mM MOPS pH 7.5. After the reaction was terminated, the products were resolved by Urea-PAGE electrophoresis, and the gel fluorescence signal was scanned on a fluorescence imager.
  • as shown in Figure 3, the products obtained by cutting ssRNA are at the bottom, and it can be seen that DEAR1 to DEAR9 can all cut single-stranded RNA.
  • I represents the
  • DEAR1-6 (1.5 μM each) were each incubated with a single-stranded RNA substrate (100 nM) that cannot pair with the TRS region (the substrate for DEAR1 is the sequence shown in SEQ ID NO: 21; for DEAR2, SEQ ID NO: 23; for DEAR3, SEQ ID NO: 23; for DEAR4, SEQ ID NO: 22; for DEAR5, SEQ ID NO: 21; for DEAR6, SEQ ID NO: 23) in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 min, 5 min, 10 min, 30 min, 60 min, 120 min).
  • single-stranded DNA (ssDNA) substrates with target sequences corresponding to DEAR1 to 6 were synthesized (the underlined and bold parts are target sequences recognized by DEAR), and their 3' ends were labeled with -Cy5.
  • each DEAR (1.5 μM) and the corresponding single-stranded DNA substrate (100 nM) were mixed in 500 mM NH4Cl, 125 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 min, 5 min, 10 min, 20 min, 40 min, 60 min, 120 min; shown as 0-2 h in the figure).
  • T represents a paired substrate
  • T* represents a cleavage product of a paired substrate
  • N represents an unpaired substrate
  • N* represents a cleavage product of an unpaired substrate
  • M represents a marker.
  • the product obtained by cleaving ssDNA is below the substrate. It can be seen that DEAR1 to DEAR6 can all cause cleavage of single-stranded DNA, and the substrate cannot be cleaved when it cannot be paired with the TRS region.
  • single-stranded DNA (ssDNA) substrates with target sequences corresponding to DEAR1-6 were synthesized (the underlined and bold parts are target sequences recognized by DEAR), and their 3' ends were labeled with -Cy5. Each DEAR (1.5 μM) and the corresponding single-stranded DNA substrate (100 nM) were then mixed in 500 mM NH4Cl, 125 mM MgCl2, 40 mM MOPS pH 7.5, and the reaction was incubated for 24 hours. After the reaction was terminated, Urea-PAGE electrophoresis was performed and the gel fluorescence signal was scanned on a fluorescence imager. The gel image is shown in Figure 7.
  • the larger triangle indicates the main cleavage site
  • the smaller triangle indicates the secondary cleavage site
  • I indicates the input substrate
  • Dr1 to Dr6 indicate the cleavage products of DEAR1 to DEAR6
  • L is the ladder generated by random digestion of ssDNA using DNase I (Promega, Catalog No. M6101), which is used to indicate the product length
  • M indicates a marker
  • the product obtained by cutting ssDNA is below the substrate. It can be seen that the main cleavage site is located 0-1nt downstream of the 3' end of the TRS pairing region.
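  • Because the substrates carry the Cy5 label at the 3' end, the fluorescent band corresponds to the fragment downstream of the cut. The sketch below estimates the expected labeled product lengths for a cut 0 or 1 nt downstream of the 3' end of the paired region. It assumes the substrate is written 5' to 3'; the function name and the example substrate are hypothetical, chosen only to illustrate the arithmetic.

```python
def labeled_product_lengths(substrate: str, paired_site: str, offsets=(0, 1)):
    """Lengths of the 3'-labeled fragment for cuts `offsets` nt after the paired site.

    Assumes `substrate` is written 5'->3' and `paired_site` occurs in it.
    """
    start = substrate.find(paired_site)
    if start < 0:
        raise ValueError("paired site not found in substrate")
    site_end = start + len(paired_site)  # index just past the paired region
    return [len(substrate) - (site_end + off) for off in offsets]


if __name__ == "__main__":
    # Made-up substrate: 5 nt flank + 6 nt paired site + 8 nt flank.
    example = "AAAAA" + "TGTCTT" + "CCCCCCCC"
    print(labeled_product_lengths(example, "TGTCTT"))
```

  • Comparing such predicted lengths with a size ladder, such as the DNase I ladder (L) described above, is one way to read the cleavage position from the gel.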
  • DEAR1 (1.5 μM)
  • SEQ ID NO: 35 100 nM
  • samples were taken at different time points (0 min, 5 min, 10 min, 20 min, 40 min, 1 h, 2 h, 4 h, 8 h, 12 h, 24 h).
  • Urea-PAGE electrophoresis was used, and the gel fluorescence signal was scanned on a fluorescence imager. See Figures 8 and 9 for the gel images and efficiency curve results.
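  • An efficiency curve of this kind is typically derived from band intensities as product / (substrate + product) for each lane. The sketch below shows that calculation; the function names and the intensity values are hypothetical, and the patent does not state how its curves were quantified.

```python
def fraction_cleaved(substrate_signal: float, product_signal: float) -> float:
    """Fraction of labeled substrate converted to product in one lane."""
    total = substrate_signal + product_signal
    if total <= 0:
        raise ValueError("lane has no signal")
    return product_signal / total


def efficiency_curve(lanes):
    """lanes: iterable of (time, substrate_signal, product_signal) tuples."""
    return [(t, fraction_cleaved(s, p)) for t, s, p in lanes]


if __name__ == "__main__":
    # Made-up band intensities (arbitrary units) at a few time points in minutes.
    demo_lanes = [(0, 100.0, 0.0), (60, 80.0, 20.0), (240, 50.0, 50.0)]
    for t, frac in efficiency_curve(demo_lanes):
        print(f"{t} min: {frac:.0%} cleaved")
```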
  • DEAR1 (1.5 μM) and the single-stranded DNA substrate 1X-DEAR1 (SEQ ID NO: 35; 100 nM) were mixed in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 min, 10 min, 30 min, 1 h, 2 h, 4 h, 8 h, 16 h).
  • a plasmid with a target sequence corresponding to DEAR1 (TGTCTTAAGACA; SEQ ID NO: 41) was designed and synthesized (the backbone was the commercially available pUC19 plasmid from Addgene, Plasmid #50005).
  • DEAR1 (1.5 μM) and the plasmid substrate (0.03 μM) were mixed in 150 mM KCl, 10 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 h, 3 h, 8 h, 24 h). After the reaction was terminated, agarose gel electrophoresis was performed, and the gel image was captured on a UV imager.
  • the RNA substrate sequence used in this experiment (RNA-DEAR10) is:
  • DEAR10 RNA (1.5 μM)
  • single-stranded RNA substrate 100 nM, SEQ ID NO: 24
  • the reaction was incubated for 1 hour under the conditions of .
  • Urea-PAGE electrophoresis was performed and the gel fluorescence signal was scanned on a fluorescence imager.
  • as shown in Figure 31, the product obtained by cutting ssRNA is at the bottom, and it can be seen that DEAR10 can cut single-stranded RNA.
  • the DNA substrate sequence used in this experiment is:
  • DEAR10 RNA (1.5 μM) and the corresponding single-stranded DNA substrate (100 nM, SEQ ID NO: 33) were mixed in 500 mM NH4Cl, 125 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 min, 5 min, 10 min, 20 min, 40 min, 60 min, 120 min; shown as 0-2 h in the figure). After the reaction was terminated, Urea-PAGE electrophoresis was performed, and the gel fluorescence signal was scanned on a fluorescence imager. The gel image is shown in Figure 32: the product obtained by cutting ssDNA is below the substrate, which shows that DEAR10 can cut single-stranded DNA.
  • DEAR10 (1.5 μM) was incubated with single-stranded DNA substrates (100 nM) that could or could not pair with its TRS region (sequences shown in SEQ ID NO: 33 and SEQ ID NO: 30, respectively) in 500 mM NH4Cl, 125 mM MgCl2, 40 mM MOPS pH 7.5 for 1 hour. After the reaction was terminated, Urea-PAGE electrophoresis was performed and the gel fluorescence signal was scanned on a fluorescence imager. The gel image is shown in Figure 33. The product obtained by cutting ssDNA is below the substrate. It can be seen that DEAR10 can cut single-stranded DNA, and that the substrate is not cut when it cannot pair with the TRS region.
  • the ccdB toxic gene inducible expression plasmid with the corresponding target sequences of DEAR1-3 at the plasmid replication origin (ori) was used as the targeting plasmid (addgene sequence number: 69056).
  • the J23119 promoter (its specific sequence is (SEQ ID NO: 42): TTGACAGCTAGCTCAGTCCTAGGTATAATACTAGT) is used to promote the expression of each DEAR sequence.
  • the construction method is as follows: the J23119 promoter is connected to each of DEAR1 to DEAR3, and each promoter-linked DEAR sequence is then inserted into the pCDFDuet1 plasmid vector (Novagen catalog number: 71340-3) by homologous recombination, replacing the entire sequence in the 410-3765 interval of the plasmid.
  • Trc promoter (its specific sequence is (SEQ ID NO:43): TTGACAATTAATCATCCGGCTCGTATAATG) is used to start the expression of Cas9 nuclease
  • J23119 promoter (its specific sequence is the same as above) is used to start the expression of its corresponding guide nucleic acid (sgRNA) sequence.
  • the sgRNA expressed by the positive control group (marked as PC in Figure 12, i.e., PC group) contains a 20-base target sequence (its specific sequence is (SEQ ID NO:44): GCGATAAGTCGTGTCTTACC), and cuts the targeted plasmid under the guidance of sgRNA;
  • the sgRNA expressed by the negative control group (marked as NC in Figure 12, i.e., NC group) does not contain a 20-base target sequence and cannot cut the targeted plasmid.
  • the construction method is as follows: after the Trc promoter is connected to the Cas9 sequence, the J23119 promoter is connected to the sgRNA, and then the Trc-Cas9-J23119-sgRNA sequence is inserted into the pCDFDuet1 plasmid vector (Novagen catalog number: 71340-3) by homologous recombination, replacing the entire sequence in the 410-3765 interval of the plasmid.
  • the targeting plasmid in step 1 and the different expression plasmids constructed in steps 2 and 3 were combined and introduced into Escherichia coli BW25141 strain (CGSC strain preservation number: 7635). After a certain period of culture, the bacterial solution samples were cultured on plates containing ccdB inducer (10mM arabinose, Biotechnology Product Number: A610071) and targeting plasmid resistance plates (ampicillin).
  • when the bacteria contain only the targeting plasmid (BC group), they survive and form colonies on the ampicillin plate but cannot grow on the ccdB-induction plate; the same survival pattern is seen with an expression plasmid that does not cut the targeting plasmid (NC group, expressing Cas9 without cutting ccdB).
  • when the expression plasmid cuts the targeting plasmid (PC group, expressing Cas9 to cut ccdB; DEAR1 to DEAR3 groups, each expressing the corresponding intron RNA sequence), the ccdB toxic gene cannot be expressed normally, so the bacteria survive on the ccdB-induction plate; at the same time, the bacteria lose ampicillin resistance because the targeting plasmid is cut, and die on the ampicillin plate.
  • the results of bacterial plating and the analysis of ccdB gene expression levels showed that DEAR1 to DEAR3 can all cut the plasmid in E. coli cells.
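  • The readout logic of this plate assay can be summarized as a small truth table. The sketch below encodes only the outcomes described in the text; the function name and the group dictionary are illustrative assumptions.

```python
def expected_growth(targeting_plasmid_cut: bool) -> dict:
    """Expected growth on each plate, following the assay logic described above.

    If the targeting plasmid is cut, ccdB is not expressed (growth under
    induction) but ampicillin resistance is lost (no growth on ampicillin).
    """
    return {
        "ampicillin plate": not targeting_plasmid_cut,
        "ccdB-induction plate": targeting_plasmid_cut,
    }


# Whether each group cuts the targeting plasmid, as reported in the text.
GROUPS = {
    "BC": False,
    "NC": False,
    "PC": True,
    "DEAR1": True,
    "DEAR2": True,
    "DEAR3": True,
}

if __name__ == "__main__":
    for group, cut in GROUPS.items():
        print(group, expected_growth(cut))
```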
  • the DEAR1 expression plasmid was subjected to PCR using primers GGATGAGTTTGCAAACAAAGTCCTTTCTGCCG (SEQ ID NO: 45) and AGGACTTTGTTTGCAAACTCATCCAATGATACCTAGC (SEQ ID NO: 46), and the ΔTRS mutant expression plasmid was constructed by homologous recombination, denoted as Dr1_ΔTRS (a DEAR1 expression plasmid constructed by deleting the 6 nucleotides of the TRS sequence), as one of the expression plasmids;
  • AGTACAGCATCGGCCTGGCCATCGGCACCAACTCTGTGG (SEQ ID NO: 47); and GGCCAGGCCGATGCTGTACTTCTTGTCAGAACCGTGGTGA (SEQ ID NO: 48) were used for PCR.
  • the PCR products were further PCR-polymerized with primers:
  • dCas9 expression plasmid (all 2 enzyme cleavage active centers of Cas9 were mutated and inactivated), denoted as dCas9, as one of the expression plasmids;
  • an nCas9 expression plasmid (one cleavage active center of Cas9, H840, was mutated and inactivated) was constructed by homologous recombination, denoted as nCas9, as one of the expression plasmids;
  • the plasmid used in the PC group in step 3 was used as the wtCas9 expression plasmid without modification.
  • the targeting plasmid constructed in step 1 and the above-mentioned different expression plasmids (dCas9, nCas9, wtCas9, Dr1_ΔTRS, and the DEAR1 expression plasmid used in step 2) were combined and introduced into the Escherichia coli BW25141 strain (CGSC strain preservation number: 7635). After a certain period of culture, the bacterial solution samples were cultured on the targeting plasmid resistance plate (ampicillin).
  • when the bacteria contain only the targeting plasmid, they survive and form colonies on the ampicillin plate (Blank group); bacteria expressing dCas9, or carrying the TRS-deleted DEAR expression plasmid Dr1_ΔTRS, show essentially the same survival.
  • the DEAR1 expression plasmid cuts the targeting plasmid, which is basically consistent with the results of expressing nCas9 and wtCas9.
  • the bacteria die on the ampicillin plate because the targeting plasmid is cut and the ampicillin resistance is lost.
  • the results of bacterial plate spreading and analysis of AmpR gene expression levels showed that DEAR1 can cut the plasmid in Escherichia coli cells through the TRS region.
  • a ccdB toxic gene inducible expression plasmid with the corresponding intron RNA targeting sequences of DEAR1 and DEAR4-9 at the plasmid replication origin site (ori) was used as the targeting plasmid (addgene sequence number: 69056).
  • the J23119 promoter (its specific sequence is (SEQ ID NO: 42): TTGACAGCTAGCTCAGTCCTAGGTATAATACTAGT) is used to start the expression of each DEAR sequence.
  • the construction method is as follows: the J23119 promoter is connected to each of Dr1_ΔTRS, DEAR1, and DEAR4 to DEAR9 (Dr1_ΔTRS and DEAR1 are the same as in Example 4), and each promoter-linked sequence is then inserted into the pCDFDuet1 plasmid vector (Novagen catalog number: 71340-3) by homologous recombination, replacing the entire sequence in the 410-3765 interval of the plasmid.
  • the targeting plasmid in step 1 and the different expression plasmids constructed in step 2 were combined and introduced into Escherichia coli BW25141 strain (CGSC strain deposit number: 7635), and after a certain period of culture, the bacterial solution samples were cultured on the plates containing the targeting plasmid resistance (ampicillin).
  • Example 5-1 In vitro cleavage of plasmids by the DEAR nucleic acid manipulation system
  • the ccdB toxic gene inducible expression plasmid (addgene sequence number: 69056) with the corresponding target sequences of DEAR1-6 and DEAR10 was used as the plasmid substrate for in vitro cleavage experiments.
  • DEAR1-6 and DEAR10 (1.5 μM each) and the plasmid substrate (0.03 μM) were each mixed in 150 mM KCl, 10 mM MgCl2, 40 mM MOPS pH 7.5, and samples were taken at different time points (0 h, 3 h, 8 h, 24 h). After the reaction was terminated, agarose gel electrophoresis was performed, and the gel image was captured on a UV imager.
  • Example 5-2 Toxicity test of the DEAR nucleic acid manipulation system in Escherichia coli
  • this example uses turbidimetry to determine the growth curve of E. coli. Take 50 ng each of the Blank, dCas9, Cas9, Dr1_ΔTRS and DEAR1 expression plasmids (same as Example 4) and chemically transform E. coli BW25141 competent cells: add the plasmid to the competent cell suspension and mix well; incubate on ice for 30 minutes; heat-shock in a 42°C water bath for 60 seconds; place on ice for 2 minutes; then add 1 mL of liquid LB medium and recover the cells in a constant-temperature shaker at 220 rpm for 1 h.
  • a single-stranded DNA substrate with a new DEAR target sequence (the underlined and bold parts are the target sequences recognized by DEAR), with a -Cy5 label at its 3' end.
  • DEAR1-6 (1.5 μM) with the TRS sequence changed to CGAUAG were each mixed with this single-stranded DNA substrate (100 nM) in 50 mM MgCl2, 10 mM KCl, 40 mM MOPS pH 7.5, and the reaction was incubated for 8 hours. After the reaction was terminated, Urea-PAGE electrophoresis was performed, and the gel fluorescence signal was scanned on a fluorescence imager. The results are shown in Figure 15.
  • DEAR10 (1.5 μM) with a reprogrammed TRS, in which the TRS sequence was changed to CGAUAG, was incubated with a single-stranded DNA substrate (100 nM, SEQ ID NO: 51) carrying the corresponding new target sequence in 50 mM MgCl2, 10 mM KCl, 40 mM MOPS pH 7.5 at 37°C for 8 h. After the reaction was terminated, Urea-PAGE electrophoresis was performed, and the gel fluorescence signal was scanned on a fluorescence imager. The results are shown in Figure 34.
  • the product obtained by cutting ssDNA is below the substrate, and it can be seen that DEAR10 with a reprogrammed TRS can cut the new single-stranded DNA.
  • I represents the input ssDNA substrate
  • Dr10* represents the cutting product of ssDNA by DEAR10 after the TRS is changed.
  • the PiggyBac™ Transposon Vector System (from System Biosciences) was used to construct the DEAR1 targeting sequence stable transfection plasmid and the DEAR stable transfection plasmid:
  • HEK-293T cells (ATCC CRL-11268) were cultured in DMEM high-glucose medium containing 10% fetal bovine serum under 5% CO2 to the logarithmic phase, digested with 0.25% trypsin, washed twice with PBS (pH 7.0-7.2), and resuspended in Opti-MEM™ medium (Gibco, Cat. No.: 31985070), and the cell density was adjusted to 5 × 10⁴/μL.
  • 2 μg Integration PB transposase plasmid (System Biosciences) and 2 μg DEAR1 targeting sequence stable transfection plasmid were added to 20 μL of cell suspension, and the cell suspension was electroporated at 450 V (Celetrix biotechnologies, Model: LE+).
  • the electroporated cells were added to DMEM high-glucose medium containing 10% fetal bovine serum; 24 h after electroporation, the medium was replaced with medium containing 10 μg/mL Blasticidin, and the cells were selected for one week. During this period, the cells were subcultured according to their growth. Once the cells were stable, a stable cell line containing the DEAR1 targeting sequence was obtained.
  • the same method was used to electroporate 2 μg Integration PB transposase plasmid (System Biosciences) and 2 μg DEAR stable plasmid (DEAR1 stable plasmid or DEAR-NT stable plasmid) into the stable cell line containing the DEAR1 targeting sequence.
  • the culture medium was replaced with a culture medium containing 50 μg/mL Hygromycin B, and drug selection was performed for one week, during which the cells were passaged according to their growth status.
  • the culture medium was replaced with a culture medium containing 10 μg/mL Puromycin, and drug selection was performed for one week.
  • the PuroR gene integrated in the cells is in a frameshift state and cannot express the correct protein, thus not having resistance to Puromycin.
  • DEAR1 (DEAR1 stable plasmid) can cut the DEAR1 targeting sequence to cause DNA double-strand breaks, and the insertion or deletion mutation introduced by the break repair can restore the frameshifted PuroR gene to normal expression, resulting in cell survival under Puromycin screening; while DEAR-NT (DEAR-NT stable plasmid) cannot cut the DEAR1 targeting sequence, and the cells cannot express the correct PuroR gene, resulting in cell death under Puromycin screening.
  • DEAR1 DEAR1 stable plasmid
  • DEAR-NT DEAR-NT stable plasmid
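  • The reporter logic above rests on reading-frame arithmetic: a repair indel rescues PuroR only if the net length change brings the coding sequence back to a multiple of three. The sketch below illustrates this under stated assumptions: the size of the initial frameshift is not given in the text (a +1 shift is assumed in the example), and effects such as altered amino acids or new stop codons are ignored.

```python
def restores_reading_frame(initial_shift_nt: int, indel_net_nt: int) -> bool:
    """True if an indel of net length `indel_net_nt` (insertions positive,
    deletions negative) cancels an initial frameshift of `initial_shift_nt`."""
    return (initial_shift_nt + indel_net_nt) % 3 == 0


if __name__ == "__main__":
    assumed_shift = 1  # hypothetical: reporter carries a +1 frameshift
    for indel in (-4, -3, -2, -1, 1, 2, 3):
        status = "in frame" if restores_reading_frame(assumed_shift, indel) else "out of frame"
        print(f"indel {indel:+d} nt: {status}")
```

  • Under this simple model only about one in three indel lengths restores the frame, so Puromycin survival would be expected to report only a subset of the cleavage and repair events.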
  • Example 7-1 Genomic DNA cleavage in mammalian cells-2
  • the PiggyBac™ Transposon Vector System (from System Biosciences) was used to construct the DEAR1 targeting sequence stable transfection plasmid and the DEAR stable transfection plasmid:
  • 2 μg Integration PB transposase plasmid (System Biosciences) and 2 μg DEAR1 targeting sequence stable transfection plasmid were added to 20 μL of the cell suspension, and the cell suspension was electroporated at 450 V (Celetrix biotechnologies, Model: LE+).
  • the electroporated cells were added to DMEM high-glucose medium containing 10% fetal bovine serum.
  • the medium was replaced with a medium containing 10 μg/mL Blasticidin, and the cells were selected for one week. During this period, the cells were subcultured according to their growth. Once the cells were stable, the stable cell line containing the DEAR1 targeting sequence was obtained.
  • the same method was used to electroporate 2 μg Integration PB transposase plasmid (System Biosciences) and 2 μg DEAR stable plasmid (DEAR1 stable plasmid or DEAR1-NT stable plasmid) into the stable cell line containing the DEAR1 targeting sequence.
  • the medium was replaced with a medium containing 50 μg/mL Hygromycin B, and the cells were selected for one week. During this period, the cells were subcultured according to their growth. Once the cells were stable, the medium was replaced with a medium containing 10 μg/mL Puromycin, and the cells were selected for one week.
  • the PuroR gene integrated in the cells is in a frameshift state and cannot express the correct protein, thus not having resistance to Puromycin.
  • DEAR1 (DEAR1 stable plasmid) can cut the DEAR1 targeting sequence to cause DNA double-strand breaks, and the insertion or deletion mutation (INDEL) introduced by the break repair can restore the frameshifted PuroR gene to normal expression, resulting in cell survival under Puromycin screening, as shown in A in Figure 37; while DEAR1-NT (DEAR1-NT stable plasmid) cannot cut the DEAR1 targeting sequence, and the cells cannot express the correct PuroR gene, resulting in cell death under Puromycin screening.
  • INDEL insertion or deletion mutation
  • cells stably transfected with DEAR1 (DEAR1 stable plasmid) survive, while cells stably transfected with DEAR1-NT (DEAR1-NT stable plasmid) die (scale bar: 500 μm).
  • Target 1 detected 9.18% of the insertion or deletion mutations
  • Target 2 detected 7.35% of the insertion or deletion mutations
  • Target 3 detected 0.01% of the insertion or deletion mutations.
  • this embodiment detected insertion mutations with a length of between 1 and 2 nt and deletion mutations with a length of between 1 and 25 nt near the three target sites.
  • this example also observed a deletion mutation with a maximum length of 85 nt spanning Target 1 and Target 2.
  • Example 8 Improving the cleavage activity and specificity of the DEAR nucleic acid manipulation system by extending the TRS region
  • DEAR1-10 has 6 domains, domains I to VI, also referred to as D1-D6, of which D1 is the largest domain, the TRS substrate recognition sequence is located on D1, D1-D4 and D6 are its structural scaffolds, used to stabilize the overall configuration, and D5 is the catalytic active structural center, which forms the catalytic active center by binding to 2 magnesium ions. From the secondary structure and tertiary structure, it was found that DEARs all have conserved catalytic active centers and substrate recognition regions (see Figure 20), so in this embodiment, one of the DEARs (DEAR1) is used as an example for explanation.
  • the nucleotides corresponding to D1-D6 in DEAR1-10 are as follows:
  • the I domain (D1) in DEAR1 corresponds to nucleotide 1 to nucleotide 266 of the nucleotide sequence of DEAR1 (SEQ ID NO: 1).
  • the secondary structure is predicted with RNAfold (R. Lorenz et al., Vienna RNA Package 2.0. Algorithms Mol Biol 6, 26 (2011). doi: 10.1186/1748-7188-6-26). Due to the dynamic nature of RNA structure, the boundaries of each domain may deviate by 0-5 nucleotides.
  • DEAR9 predicts that domain IV contains a stem-loop/hairpin structure, and it is reasonable that there is no corresponding primary sequence in the primary structure confirmed by sequence alignment with Clustal and DEAR1, 2, 3, 5, 6.
  • this example tests the adaptability of the TRS region by replacing the TRS region (nucleotides 181-186 of exemplary DEAR1) with different sequences.
  • the sequences of different TRSs and the corresponding substrates are listed in Table 7 below:
  • RNA was prepared.
  • the cleavage activity of each RNA was assayed in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS pH 7.5. The different TRS sequences all produced cleavage, with slightly different activities.
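  • Swapping the TRS amounts to a substitution at fixed coordinates of the RNA sequence (nucleotides 181-186 in the exemplary DEAR1). The sketch below shows that operation; the function name is hypothetical and the scaffold string is a placeholder, not the sequence of SEQ ID NO: 1.

```python
def replace_trs(rna: str, new_trs: str, start: int = 181, end: int = 186) -> str:
    """Replace nucleotides start..end (1-based, inclusive) with `new_trs`.

    Passing an empty string deletes the TRS; a longer string extends it.
    """
    if not (1 <= start <= end <= len(rna)):
        raise ValueError("TRS coordinates fall outside the sequence")
    return rna[: start - 1] + new_trs + rna[end:]


if __name__ == "__main__":
    # Placeholder scaffold: N = any nucleotide, X marks the 6-nt TRS slot.
    scaffold = "N" * 180 + "XXXXXX" + "N" * 100
    reprogrammed = replace_trs(scaffold, "CGAUAG")
    deleted = replace_trs(scaffold, "")
    print(reprogrammed[175:191], len(reprogrammed), len(deleted))
```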
  • this example demonstrates the potential of extending the TRS region to improve the specificity of the DEAR nucleic acid manipulation system.
  • 0-nt TRS i.e., TRS in the full-length sequence of DEAR was deleted
  • 2-nt TRS i.e., TRS in the full-length sequence of DEAR was replaced with GA
  • 4-nt TRS i.e., TRS in the full-length sequence of DEAR was replaced with AGAC
  • 6-nt TRS i.e., the original version of the full-length sequence of DEAR
  • 8-nt TRS i.e., TRS in the full-length sequence of DEAR was replaced with CUAAGACA
  • 10-nt TRS i.e., TRS in the full-length sequence of DEAR was replaced with CGCUAAGACA (SEQ ID NO: 89)
  • 12-nt TRS i.e., TRS in the full-length sequence of DEAR was replaced with UCC
  • KCl 50mM MgCl2, 40mM MOPS 7.5
  • the cleavage activity of the corresponding RNA was detected under the conditions, and the substrates used were all ssDNA substrates with a sequence of SEQ ID NO:35 and a -Cy5 label at the 3' end.
  • as shown in C in Figure 21, DEAR1 retains DNA cleavage activity when its TRS region is extended up to 12 nt. The original 6-nt TRS gives the highest cleavage activity. This shows that the TRS region is highly modifiable.
  • a plasmid cutting experiment in bacteria also showed that the TRS-extended DEAR1 retains cutting activity.
  • 0-nt TRS i.e., TRS in the full-length sequence of DEAR is deleted
  • 2-nt TRS i.e., TRS in the full-length sequence of DEAR is replaced with GA
  • 4-nt TRS i.e., TRS in the full-length sequence of DEAR is replaced with AGAC
  • 6-nt TRS i.e., the original version of the full-length sequence of DEAR
  • 8-nt TRS i.e., TRS in the full-length sequence of DEAR is replaced with GUAAGACA
  • 10-nt TRS i.e., TRS in the full-length sequence of DEAR is replaced with CGGUAAGACA (SEQ ID NO:92)
  • 12-nt TRS i.e., TRS in the full-length sequence of DEAR is replaced with CCCGGUAAGACA (SEQ ID NO:93)
  • following step 4 of Example 4, the plasmid interference ability of DEAR1 molecules with different TRS lengths was detected; the targeting plasmid in every case was the ccdB toxic gene inducible expression plasmid described in step 1 of Example 4 (Addgene sequence number: 69056; DEAR1 molecules with different TRS lengths can target the ori of the plasmid).
  • in D of Figure 21, it can be observed that TRS-extended DEAR1 retains cleavage activity in bacteria at TRS lengths of 4-10 nt.
  • TRS can be preferably extended to 7-10 nt in vivo and in vitro, and has basic cleavage activity against DNA in vitro and plasmids in bacteria with improved specificity.
  • extension to 12 nt, 14 nt, etc. also has cleavage activity and improved specificity.
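  • The TRS variants used in the bacterial experiment can be checked for length and mapped to the DNA site each would pair with. The sketch below assumes simple antiparallel Watson-Crick pairing between the RNA TRS and the DNA target, which this passage does not state explicitly; the function and variable names are illustrative.

```python
# TRS variants from the bacterial plasmid-cutting experiment (keyed by length).
TRS_VARIANTS = {
    2: "GA",
    4: "AGAC",
    8: "GUAAGACA",
    10: "CGGUAAGACA",    # SEQ ID NO: 92
    12: "CCCGGUAAGACA",  # SEQ ID NO: 93
}

CAS9_TARGET = "GCGATAAGTCGTGTCTTACC"  # SEQ ID NO: 44 (Example 4, positive control)

_RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def dna_site_for_trs(trs_rna: str) -> str:
    """DNA sequence (5'->3') that would pair with an RNA TRS, assuming
    antiparallel Watson-Crick pairing (reverse complement)."""
    return "".join(_RNA_TO_DNA_COMPLEMENT[base] for base in reversed(trs_rna.upper()))


if __name__ == "__main__":
    for length, trs in TRS_VARIANTS.items():
        site = dna_site_for_trs(trs)
        print(length, trs, site, site in CAS9_TARGET)
```

  • Under this assumption the 8-nt variant maps to TGTCTTAC, which occurs within the 20-base Cas9 target sequence (SEQ ID NO: 44) used against the same plasmid in Example 4.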
  • Example 9 Improving the cleavage activity and specificity of the DEAR nucleic acid manipulation system by adding a recruitment sequence at the 3' end
  • this embodiment designs a modification method to improve its substrate (target nucleic acid) recognition specificity and thus improve the cutting activity by designing a DNA sequence that can be complementary to the substrate (target nucleic acid). As shown in Figures 22-24, this embodiment tested two schemes: 1) by deleting the VI domain (D6) in DEAR and adding a recruiter sequence; or, 2) directly adding a recruiter sequence after D6, thereby increasing the substrate recognition specificity and improving the cutting activity.
  • A in Figure 22 is a schematic diagram of the two transformations and the original DEAR nucleic acid manipulation system
  • B in Figure 22 is a schematic diagram of the recruitment sequence helping to identify the substrate sequence, wherein the recruitment sequence is located at the 3' end of DEAR and can hybridize with the recruitment sequence binding sequence in the substrate
  • TRS substrate recognition sequence, for DEAR6, TRS is ACAUCA
  • the corresponding RNA was prepared according to the method in Example 2.
  • following the experimental method in Examples 3-6, substrate ssDNAs with a Cy5-labeled 3' end and linker sequences of different lengths (the sequences are shown in Table 8 below; the substrate binding sequence is bold, the linker sequence is italicized, and the recruitment sequence binding sequence is underlined) were cut separately in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS pH 7.5, and the cleavage activity of the modified DEAR6 molecules on substrates with different linker sequence lengths was detected.
  • A in Figure 23 is a schematic diagram of the substrate ssDNA
  • B in Figure 23 is a schematic diagram of the cutting efficiency of DEAR6WT, DEAR6_ΔDVI, and DEAR6_ΔDVI-recruitment sequence for substrates with different linker sequence lengths at 24 h
  • C, D, and E in Figure 23 are schematic diagrams of the cutting efficiency of DEAR6WT, DEAR6_ΔDVI, and DEAR6_ΔDVI-recruitment sequence for substrates with different linker sequence lengths at different time points, respectively
  • F in Figure 23 is a statistical table of the cutting efficiency of the above experimental groups. The results show that the appropriate linker sequence length ranges from 20nt to 50nt, and a length of 30-nt is preferred.
  • DEAR1 was modified by the VI domain (597-633) and the recruitment sequence ATGAGCATGATTAGGCCTAG (SEQ ID NO: 66) was added to the 3' end.
  • the corresponding DNA sequence of DEAR1 was prepared according to the method in Example 2.
  • With the original DEAR1 RNA as a control, the substrate ssDNA (3' end labeled with Cy5, connection sequence length 12 nt) was cut according to the experimental methods in Examples 3-6 in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS 7.5.
  • the corresponding DNA sequences of DEAR1 with 14nt recruitment sequence (sequence ATGAGCATGATTAG (SEQ ID NO: 68)), 20nt recruitment sequence (sequence ATGAGCATGATTAGGCCTAG (SEQ ID NO: 66)) and 26nt recruitment sequence (sequence ATGAGCATGATTAGGCCTAGCTCTTC (SEQ ID NO: 96)) added to the 3' end were synthesized in this example.
  • the corresponding RNA was prepared according to the method in Example 2.
  • The above substrate ssDNA sequence is SEQ ID NO: 67, wherein the substrate binding sequence is bold and underlined, and the recruitment sequence binding sequence is italicized. The 14nt recruitment sequence recognizes the portion shown by the dotted underline; the 20nt recruitment sequence recognizes the portions shown by the double underline and the dotted underline; and the 26nt recruitment sequence recognizes the portions shown by the single underline, double underline and dotted underline.
  • DEAR1 with 14nt, 20nt and 26nt recruitment sequences at the 3' end showed similar cutting efficiency, so the recruitment sequence length of 14nt-26nt is optional, and the length of 20nt is preferred (the 6nt TRS combined with the 20nt recruitment sequence, a total of 26nt recognition sequences, has met the requirements of most of the recognition of specific genomic sites, referring to the commonly used SpyCas9 and AsCas12a gRNA spacer length of about 20nt).
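The length argument above can be illustrated with a rough back-of-the-envelope estimate of how often a site of a given length would occur by chance. This is only a sketch: it assumes a uniformly random sequence with equal base frequencies, and the genome size used in the demo is a hypothetical placeholder, not a value taken from this document.

```python
def expected_random_sites(site_len: int, genome_len: int, both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one specific site.

    Assumes a random genome with equal base frequencies (a simplification).
    """
    if site_len <= 0 or genome_len <= 0:
        raise ValueError("site_len and genome_len must be positive")
    positions = genome_len - site_len + 1  # number of possible start positions
    if positions <= 0:
        return 0.0
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** site_len


if __name__ == "__main__":
    HYPOTHETICAL_GENOME_LEN = 3_000_000_000  # placeholder size, assumption only
    # 6 nt: TRS alone; 12 nt: two TRSs; 20 nt: typical gRNA spacer; 26 nt: TRS + 20 nt recruiter
    for n in (6, 12, 20, 26):
        print(n, expected_random_sites(n, HYPOTHETICAL_GENOME_LEN))
```

Under these assumptions a 6 nt site is expected to occur a very large number of times by chance, while a 26 nt site is expected far less than once, which is the intuition behind combining the 6nt TRS with a 20nt recruitment sequence.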
  • The recruitment sequence binding sequence is at the 3' end of the substrate binding sequence, and the connecting sequence that separates them is preferably 20-50nt long (Figure 23), with the highest cleavage rate achieved at about 30nt; the length of the recruitment sequence is preferably 14-26nt.
  • the Gibbs free energy of its secondary structure is less than 7 when the recruitment sequence is designed.
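The length rules stated above can be sketched as a small design check. The function names are hypothetical, the recruitment sequence is handled in the DNA letters used in this document, and the Gibbs free energy criterion is deliberately left out because evaluating it would require an RNA folding tool that is not part of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def recruiter_binding_site(recruiter: str) -> str:
    """Reverse complement of a recruitment sequence, i.e. the sequence
    it would hybridize with in the substrate (Watson-Crick pairs only)."""
    dna = recruiter.upper().replace("U", "T")
    return dna.translate(_COMPLEMENT)[::-1]


def check_recruiter_design(recruiter: str, linker_len: int) -> list:
    """Return a list of rule violations; an empty list means both
    preferred length ranges from the text are satisfied."""
    problems = []
    if not 14 <= len(recruiter) <= 26:
        problems.append("recruitment sequence should be 14-26 nt")
    if not 20 <= linker_len <= 50:
        problems.append("linker (connecting) sequence should be 20-50 nt")
    return problems


if __name__ == "__main__":
    seq66 = "ATGAGCATGATTAGGCCTAG"  # SEQ ID NO: 66 as given in the text
    print(recruiter_binding_site(seq66))
    print(check_recruiter_design(seq66, linker_len=30))
```

A linker length of 30 is used in the demo only because the text reports the highest cleavage rate at about 30nt.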
  • Example 10 Improving substrate recognition specificity by transforming originally homodimeric DEARs into heterodimeric DEARs
  • DEAR1 can form homodimers in natural conditions (Figure 21 A is a schematic diagram of DEAR1 homodimers). Moreover, this dimerization depends on the dimerization motif of its III domain (D3) (position 361-366, sequence UCUAGA).
  • In this embodiment, the dimerization motif was transformed into two different sequences that can hybridize with each other, so that DEAR1 molecules with two different dimerization motifs and two different TRSs (referred to as monomer 1 and monomer 2, with TRSs called TRS1 and TRS2, respectively) form heterodimers (as shown in A in Figure 25). This improves substrate recognition specificity from the original 6nt DNA sequence to a 12nt sequence.
  • sequences of the DEAR1 heterodimers constructed in this example are:
  • TRS1 and TRS2 in monomer 1 and monomer 2 are both bolded and underlined, and the dimerization motifs in the two monomers are both italicized and underlined.
  • the corresponding DNA sequences were synthesized respectively, and the corresponding RNA was prepared according to the method in Example 2.
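The homodimer-to-heterodimer idea can be made concrete with a short base-pairing check. The wild-type motif UCUAGA is its own reverse complement, which is consistent with two identical monomers pairing with each other. The replacement pair used below (UCUGGA and UCCAGA) is purely hypothetical and is not the pair used in this example; only Watson-Crick pairs are considered, so G-U wobble pairing is ignored.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def is_self_complementary(motif: str) -> bool:
    """True if the motif can pair with an identical copy of itself."""
    return rna_reverse_complement(motif) == motif.upper()


def pairs_only_as_heterodimer(motif1: str, motif2: str) -> bool:
    """True if the two motifs pair with each other but neither pairs with itself."""
    return (
        rna_reverse_complement(motif1) == motif2.upper()
        and not is_self_complementary(motif1)
        and not is_self_complementary(motif2)
    )


if __name__ == "__main__":
    print(is_self_complementary("UCUAGA"))                # wild-type motif from the text
    print(pairs_only_as_heterodimer("UCUGGA", "UCCAGA"))  # hypothetical engineered pair
```

A check of this kind only screens the motif itself; whether a given pair actually supports dimerization in the folded RNA still has to be tested experimentally.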
  • DEAR has three possible recognition modes for double-stranded DNA substrates (such as plasmids, bubble DNA, etc.), forming 5' protruding ends, blunt ends and 3' protruding ends respectively.
  • the cleavage products generated in these three cases are different, as shown in C in Figure 25, where the DEAR1 heterodimer is represented by a butterfly-shaped cartoon, TRS1, TRS2 and the corresponding substrate binding sequences are shown in the figure respectively, and the cleavage site is represented by a triangle.
  • This embodiment cuts different types of DNA substrates, including substrates that give 5' protruding ends, blunt ends and 3' protruding ends at different cutting distances.
  • the sequences of the two chains of different substrates are shown in Table 9 below.
  • the substrates used are all double-stranded DNA with bubbles, wherein the 3' end of the sense chain is labeled with -Cy5 and has a Target of TRS1 (shown in bold in the table below); the 3' end of the antisense chain is labeled with FAM and has a Target of TRS2 (shown underlined in the table below); the corresponding region where the sense chain and the antisense chain can be complementary is shown in italics.
  • the sense strand and antisense strand were mixed in equal amounts and denatured and annealed to form double-stranded DNA with bubbles.
  • DEAR1 monomers at an equimolar concentration of 1.5 μM were incubated at room temperature for 30 minutes to form heterodimers, and the cleavage reaction was then carried out with 100 nM double-stranded DNA substrate in 50 mM MgCl2, 10 mM KCl, 40 mM MOPS 7.5. Time points: 0, 3, 6, 18 h.
  • the fluorescence images obtained by scanning the corresponding fluorescence channels of Cy5 and FAM were merged using ImageJ software. As shown in D in Figure 25, it can be seen that this heterodimer has effective cleavage of both chains of different types of double-stranded DNA substrates. This shows that the modification of the heterodimer is biochemically active.
  • the construction method is as follows: The J23119 promoter is connected to DEAR1 monomer 1 and DEAR1 monomer 2 respectively, and then the two coding frames are connected in sequence, and then the whole is inserted into the pCDFDuet1 plasmid vector (Novagen item number: 71340-3) by homologous recombination, and the sequence of the plasmid 410-3765 interval is replaced as a whole.
  • the cutting activity of the heterodimer DEAR in bacteria was tested. As shown in Figure 26, this indicates that this design also has plasmid interference activity in bacteria.
  • Since DEAR has editing effects in eukaryotic cells and bacteria (Figures 12-17), these modification methods for improving specificity and activity are also applicable to editing in eukaryotic cells and bacteria.
  • Example 11 Improving the cleavage activity and specificity of the DEAR nucleic acid manipulation system by inserting an extended recognition region
  • a modified DEAR nucleic acid manipulation system was obtained by inserting extended recognition regions of different lengths into DEAR1 and DEAR2, and then tested.
  • In each modified DEAR1 below, the extended recognition region replaces the 223rd nucleotide in SEQ ID NO: 10, i.e., the 223rd nucleotide is deleted and the extended recognition region is inserted between the 222nd nucleotide and the 224th nucleotide. The sequences of the extended recognition regions are:
  • RS+1: u;
  • RS+2: gu;
  • RS+3: ggu;
  • RS+4: cggu;
  • RS+6: cccggu;
  • RS+8: aacccggu;
  • RS+10: ccaacccggu (SEQ ID NO: 97);
  • RS+12: guccaacccggu (SEQ ID NO: 98);
  • RS+14: gaguccaacccggu (SEQ ID NO: 99);
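The extended recognition regions listed above are nested: each shorter region is the 3'-most part of the 14 nt region. The following sketch derives the variants from the RS+14 sequence and shows the "replace one nucleotide with an insert" operation on a short placeholder string. The function names and the placeholder scaffold are assumptions for illustration; SEQ ID NO: 10 itself is not reproduced here.

```python
RS14_DEAR1 = "gaguccaacccggu"  # SEQ ID NO: 99 as given in the text


def rs_variant(full_region: str, n: int) -> str:
    """Return the n 3'-most nucleotides of the full extended recognition region."""
    if not 1 <= n <= len(full_region):
        raise ValueError("n must be between 1 and the length of the full region")
    return full_region[-n:]


def replace_position(rna: str, position: int, insert: str) -> str:
    """Delete the nucleotide at a 1-based position and put `insert` in its place."""
    if not 1 <= position <= len(rna):
        raise ValueError("position is outside the sequence")
    return rna[: position - 1] + insert + rna[position:]


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 6, 8, 10, 12, 14):
        print(f"RS+{n}: {rs_variant(RS14_DEAR1, n)}")
    # Placeholder scaffold, NOT the real DEAR1 sequence: replace position 4.
    print(replace_position("AAAXBBB", 4, rs_variant(RS14_DEAR1, 4)))
```

Because one nucleotide is removed and n are inserted, the modified RNA is n - 1 nucleotides longer than the original.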
  • nucleotide sequence of DEAR1 RS+14 is as follows (SEQ ID NO: 100; wherein the single underlined region is the TRS region, and the double underlined region is the RS region):
  • the substrate sequence used is (SEQ ID NO: 101):
  • the single underlined part is the region (target sequence) that hybridizes with the substrate recognition region;
  • the double underlined part (or some nucleotides therein) is the region that hybridizes with different extended recognition regions.
  • RS+1 its extended recognition region u hybridizes with the first nucleotide a at the 5' end of the double underlined part;
  • RS+3 its extended recognition region ggu hybridizes with the first three nucleotides acc at the 5' end of the double underlined part;
  • RS+14 its extended recognition region hybridizes with the double underlined part.
  • The target sequence in the target nucleic acid that hybridizes with the substrate recognition region (for example, TGTCTT) and the sequence in the target nucleic acid that hybridizes with the extended recognition region (for example, accgggttggactc recognized by RS+14; SEQ ID NO: 102) are contiguous nucleotide sequences in the target nucleic acid.
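The pairing between an extended recognition region (RNA) and its site in the DNA substrate follows from ordinary antiparallel Watson-Crick pairing, which can be sketched as below. The function name is hypothetical; the sequences are the ones quoted in the text.

```python
_RNA_TO_DNA_PARTNER = {"a": "t", "c": "g", "g": "c", "u": "a"}


def dna_site_for_rna_region(region: str) -> str:
    """DNA sequence (written 5'->3') that pairs with an RNA region (written 5'->3')."""
    return "".join(_RNA_TO_DNA_PARTNER[base] for base in reversed(region.lower()))


if __name__ == "__main__":
    rs14 = "gaguccaacccggu"  # DEAR1 RS+14 region from the text
    full_site = dna_site_for_rna_region(rs14)
    print(full_site)
    # Shorter regions pair with the 5'-most part of the same site.
    for n in (1, 3):
        site = dna_site_for_rna_region(rs14[-n:])
        print(n, site, full_site.startswith(site))
```

This reproduces the statements above that RS+1 pairs with the first nucleotide and RS+3 with the first three nucleotides at the 5' end of the site recognized by RS+14.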
  • The corresponding RNAs with different RS extensions were prepared according to the method in Example 2. Wild-type DEAR1 and DEAR1 RS+1-14 were each used to cut the substrate ssDNA (sequence shown above) labeled with Cy5 at the 3' end. According to the experimental method in Examples 3-6, the cleavage activity of DEAR1 molecules before and after modification on the substrate was detected in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS 7.5. The cleavage time points were 0, 1, 6, and 24 h.
  • the cleavage ratio at 24 hours was taken from the last time point in the cleavage curve. Each group was repeated 3 times.
  • the TRS region and the RS region are shown in the figure, and the DNA substrate and TRS and RS sequences of different lengths match each other.
  • the experimental results are shown in A-D in Figure 28.
  • the cutting rate reaches the maximum at RS+14, and it is active when the length of the inserted extended recognition region is 1-14nt.
  • In each modified DEAR2 below, the extended recognition region replaces the 226th nucleotide in SEQ ID NO: 11, i.e., the 226th nucleotide is deleted and the extended recognition region is inserted between the 225th nucleotide and the 227th nucleotide. The sequences of the extended recognition regions are:
  • RS+1: a;
  • RS+2: ga;
  • RS+3: cga;
  • RS+4: gcga;
  • RS+6: uggcga;
  • RS+8: gguggcga;
  • RS+10: acgguggcga (SEQ ID NO: 103);
  • RS+12: cgacgguggcga (SEQ ID NO: 104);
  • RS+14: accgacgguggcga (SEQ ID NO: 105).
  • nucleotide sequence of DEAR2RS+14 is as follows (SEQ ID NO: 106; wherein the single underlined region is the TRS region, and the double underlined region is the RS region):
  • the substrate sequence used for testing DEAR2 and modified DEAR2 is (SEQ ID NO: 107):
  • the single underlined part is the region (target sequence) that hybridizes with the substrate recognition region;
  • the double underlined part (or some nucleotides therein) is the region that hybridizes with different extended recognition regions.
  • RS+1 its extended recognition region a hybridizes with the first nucleotide t at the 5' end of the double underlined part;
  • RS+3 its extended recognition region cga hybridizes with the first three nucleotides tcg at the 5' end of the double underlined part;
  • RS+14 its extended recognition region hybridizes with the double underlined part.
  • The target sequence in the target nucleic acid that hybridizes with the substrate recognition region (for example, TGCCTA) and the sequence in the target nucleic acid that hybridizes with the extended recognition region (for example, tcgccaccgtcggt recognized by RS+14; SEQ ID NO: 108) are contiguous nucleotide sequences in the target nucleic acid.
  • The corresponding RNAs with different RS extensions (i.e., DEAR2RS+1-14) were prepared according to the method in Example 2. Wild-type DEAR2 and DEAR2RS+1-14 were each used to cut the substrate ssDNA (sequence shown above) labeled with Cy5 at the 3' end. According to the experimental method in Examples 3-6, the cleavage activity of DEAR2 molecules before and after modification on the substrate was detected in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS 7.5. The cleavage time points were 0, 1, 6, and 24 h.
  • E refers to the cleavage ratio (%);
  • E0 refers to the cleavage ratio at the 0 h time point (%);
  • Plateau refers to the maximum cleavage ratio (%);
  • k refers to the rate constant (h⁻¹);
  • t refers to the cleavage time (h).
  • the cleavage ratio at 24 h was taken from the last time point in the cleavage curve. Each group was repeated 3 times.
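The fitted equation itself is not reproduced in this text; only its symbols (E, E0, Plateau, k, t) are defined. These symbols match the common one-phase association model, E = E0 + (Plateau - E0)(1 - e^(-kt)), so the sketch below assumes that form. The fitting routine is a deliberately simple grid search over k with E0 and Plateau held fixed, and the demo data are synthetic values generated from the model, not measurements from this document.

```python
import math


def cleavage_ratio(t_h: float, e0: float, plateau: float, k: float) -> float:
    """Assumed one-phase association model: cleavage ratio (%) after t_h hours."""
    return e0 + (plateau - e0) * (1.0 - math.exp(-k * t_h))


def fit_rate_constant(times_h, ratios, e0: float, plateau: float) -> float:
    """Least-squares estimate of k (h^-1) on a log-spaced grid from 1e-4 to 1e2.

    E0 and Plateau are treated as known here, which is a simplification;
    a full analysis would normally fit all three parameters together.
    """
    def sum_squared_error(k: float) -> float:
        return sum(
            (cleavage_ratio(t, e0, plateau, k) - y) ** 2
            for t, y in zip(times_h, ratios)
        )

    candidates = [10 ** (-4 + 6 * i / 2000) for i in range(2001)]
    return min(candidates, key=sum_squared_error)


if __name__ == "__main__":
    times = [0, 1, 6, 24]  # time points used in the text (h)
    synthetic = [cleavage_ratio(t, 0.0, 80.0, 0.3) for t in times]  # made-up curve
    print(fit_rate_constant(times, synthetic, e0=0.0, plateau=80.0))
```

With only four time points per curve, the estimate of k is sensitive to the choice of Plateau, which is one reason the 24 h cleavage ratio is reported alongside the rate constant.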
  • the experimental results are shown in A-H in Figure 28.
  • the cutting rate is improved when the RS is extended (i.e., the extended recognition region is inserted), and the inserted extended recognition region is active when the length is 1-14 nt.
  • For DEAR1, the cutting rate is the highest when the RS length is 14 nt (i.e., RS+14), and the relative cutting ratio is the highest at RS+2 and RS+3.
  • For DEAR2, the cutting rate is the highest at RS+6, and the relative cutting ratio is the highest at RS+8.
  • RNAs with different RS extensions were prepared according to the method in Example 2. Wild-type DEAR1, DEAR1 RS+14, wild-type DEAR2, and DEAR2RS+14 were used to cut the substrate ssDNA with Cy5 labeling at the 3' end (i.e., the substrates in (1) and (2) above and the following DEAR1-58-1-new-ssDNA-NT and DEAR2-58-1-new-ssDNA-NT), respectively.
  • The cleavage activity of DEAR1 and DEAR2 molecules before and after modification on the above substrates was detected in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS 7.5.
  • The cleavage time points were 0, 1, 6, and 24 hours.
  • The k values were calculated as described in (1) and (2) above. Each group was repeated 3 times.
  • the non-specific substrate sequence used is as follows:
  • DEAR1-58-1-new-ssDNA-NT (SEQ ID NO: 109; DEAR1 and the extended recognition region do not match the substrate, and the unmatched region is underlined):
  • DEAR2-58-1-new-ssDNA-NT (SEQ ID NO: 110; DEAR2 and the extended recognition region do not match the substrate, and the unmatched region is underlined):
  • both the modified DEAR1 and the modified DEAR2 have a significant improvement in the cleavage of specific substrates compared to the cleavage of non-specific substrates.
  • For DEAR1 before modification, both RS-paired and non-paired substrates are cleaved, whereas the modified DEAR1 (RS inserted) has a high cleavage efficiency for paired substrates but no cleavage effect on non-paired substrates.
  • For DEAR2 before modification, both RS-paired and non-paired substrates are cleaved, whereas the modified DEAR2 (RS inserted) has a high cleavage efficiency for paired substrates but a weaker cleavage effect on non-paired substrates.
  • the above results show that the modified DEAR has good specificity.
  • The RNA with the RS extension was prepared according to the method in Example 2 (i.e., DEAR1 RS+14 in (1) above).
  • The ssDNA substrates labeled with Cy5 at the 3' end (the SM1-20 substrates below; the substrate in (1) above; DEAR1-58-1-new-ssDNA-NT in (3) above, serving as the RS unpaired substrate; and the DEAR1-58-1-new-ssDNA-TRS-NT substrate below, serving as the TRS unpaired substrate) were cleaved according to the experimental method in Examples 3-6 in 10 mM KCl, 50 mM MgCl2, 40 mM MOPS 7.5, and the substrate cleavage activity of DEAR1 before and after modification was detected. The cleavage time points were 0, 1, 6, and 24 h. The k values were calculated as described in (1) and (2) above. Each group was repeated twice.
  • the substrate sequence for single base mutation is shown below, with the single mutation position underlined:
  • Dr1-58-ms1-ssDNA-TRS3 (SEQ ID NO: 111; SM1 substrate):
  • Dr1-58-ms2-ssDNA-TRS3 (SEQ ID NO: 112; SM2 substrate):
  • Dr1-58-ms3-ssDNA-TRS3 (SEQ ID NO: 113; SM3 substrate):
  • Dr1-58-ms4-ssDNA-TRS3 (SEQ ID NO: 114; SM4 substrate):
  • Dr1-58-ms5-ssDNA-TRS3 (SEQ ID NO: 115; SM5 substrate):
  • Dr1-58-ms6-ssDNA-TRS3 (SEQ ID NO: 116; SM6 substrate):
  • Dr1-58-ms7-ssDNA-TRS3 (SEQ ID NO: 117; SM7 substrate):
  • Dr1-58-ms8-ssDNA-TRS3 (SEQ ID NO: 118; SM8 substrate):
  • Dr1-58-ms9-ssDNA-TRS3 (SEQ ID NO: 119; SM9 substrate):
  • Dr1-58-ms10-ssDNA-TRS3 (SEQ ID NO: 120; SM10 substrate):
  • Dr1-58-ms11-ssDNA-TRS3 (SEQ ID NO: 121; SM11 substrate):
  • Dr1-58-ms12-ssDNA-TRS3 (SEQ ID NO: 122; SM12 substrate):
  • Dr1-58-ms14-ssDNA-TRS3 (SEQ ID NO: 124; SM14 substrate):
  • Dr1-58-ms15-ssDNA-TRS3 (SEQ ID NO: 125; SM15 substrate):
  • Dr1-58-ms16-ssDNA-TRS3 (SEQ ID NO: 126; SM16 substrate):
  • Dr1-58-ms17-ssDNA-TRS3 (SEQ ID NO: 127; SM17 substrate):
  • Dr1-58-ms19-ssDNA-TRS3 (SEQ ID NO: 129; SM19 substrate):
  • Dr1-58-ms20-ssDNA-TRS3 (SEQ ID NO: 130; SM20 substrate):
  • DEAR1-58-1-new-ssDNA-TRS-NT (SEQ ID NO: 131; i.e., TRS unpaired substrate):
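A panel of single-base-mutation substrates like the SM series above can be generated programmatically. The sketch below is an illustration only: the actual substrate sequences are not reproduced in this text, the substitution rule (swapping each base for its complement) is an assumption, and the example site is simply the TRS target quoted earlier.

```python
_SWAP = {"A": "T", "T": "A", "C": "G", "G": "C"}


def single_mismatch_variants(site: str) -> list:
    """Return one variant per position, each differing from `site` at exactly
    that position (the base is replaced by its complement, an assumed rule)."""
    site = site.upper()
    return [site[:i] + _SWAP[base] + site[i + 1:] for i, base in enumerate(site)]


def mismatch_positions(reference: str, variant: str) -> list:
    """1-based positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(reference.upper(), variant.upper())) if a != b]


if __name__ == "__main__":
    trs_target = "TGTCTT"  # DEAR1 TRS target quoted in the text
    for index, variant in enumerate(single_mismatch_variants(trs_target), start=1):
        print(f"SM{index}", variant, mismatch_positions(trs_target, variant))
```

Applied to a 20 nt recognition site (6 nt TRS target plus 14 nt RS target), the same function yields 20 variants, one per position.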

Abstract

The present invention relates to a method for preparing a modified DEAR nucleic acid manipulation system. The preparation method comprises at least one of the following: extending a substrate recognition region of an RNA molecule of an original DEAR nucleic acid manipulation system to a length of 7 to 14 nucleotides; adding a recruitment sequence to the 3' end of the RNA molecule of the original DEAR nucleic acid manipulation system; and providing, respectively, a first dimerization motif and a second dimerization motif in the III domains of a first RNA molecule and a second RNA molecule in the original DEAR nucleic acid manipulation system, such that the first and second RNA molecules form a heterodimer, etc. The method for preparing the modified DEAR nucleic acid manipulation system according to the present invention can further improve the specificity and cleavage activity of RNA-ribozyme-based DEAR nucleic acid manipulation systems.
PCT/CN2024/074721 2023-10-17 2024-01-30 Method for preparing modified DEAR nucleic acid manipulation system Pending WO2025081686A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202311344851.3 2023-10-17
CN202311344851.3A CN119842702B (zh) 2023-10-17 2023-10-17 Method for preparing an engineered DEAR nucleic acid manipulation system

Publications (1)

Publication Number Publication Date
WO2025081686A1 (fr) 2025-04-24

Family

ID=95356993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/074721 Pending WO2025081686A1 (fr) 2023-10-17 2024-01-30 Method for preparing modified DEAR nucleic acid manipulation system

Country Status (2)

Country Link
CN (1) CN119842702B (fr)
WO (1) WO2025081686A1 (fr)

Also Published As

Publication number Publication date
CN119842702A (zh) 2025-04-18
CN119842702B (zh) 2025-11-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24878343

Country of ref document: EP

Kind code of ref document: A1