WO2023232109A1 - Novel CRISPR gene editing system - Google Patents
- Publication number
- WO2023232109A1 (application PCT/CN2023/097783)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- guide rna
- trac
- sequence
- protein
- effector protein
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/90—Vectors containing a transposable element
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
Definitions
- The invention belongs to the field of genetic engineering. Specifically, the present invention relates to a new CRISPR gene editing system and its application. More specifically, the present invention provides a transposon and CRISPR-Cas12 intermediate (TraC) effector protein or a functional variant thereof, as well as a gene editing system based thereon and its application.
- The Type V effector protein is a Cas12 protein with multiple functional domains. Its hallmark feature is a RuvC-like domain, which is generally responsible for cleaving the target DNA.
- Type V subtypes are very abundant. The currently discovered and classified subtypes include Cas12a-k, a total of 11 subtypes. Among them, Cas12a and Cas12b have been developed into efficient eukaryotic gene editing systems.
- Cas12a, also known as Cpf1, includes a RuvC-like domain similar to that of the Cas9 protein or the TnpB protein. Unlike Cas9, however, Cas12a family proteins lack the HNH domain and use only the RuvC domain to cut both strands of the DNA.
- Cas12b was called C2c1 (Class 2 Candidate 1) when it was first discovered. Its C-terminal sequence is very similar to the TnpB protein of the IS605 family, but it has no significant sequence similarity with other Class 2 family proteins. Its Cas genes include a Cas1/Cas4 fusion gene, Cas2, and the Cas12b gene. Maturation of its crRNA also requires the participation of tracrRNA. Cas12c was originally called C2c3 (Class 2 Candidate 3), and its Cas genes include only the Cas1 and Cas12c genes. The Cas12c gene has only limited similarity with the TnpB homologous sequence of Cpf1.
- Type V subtypes have increased explosively in recent years.
- A total of 10 Type V subtypes have been discovered, including Cas12a, Cas12b and Cas12c proteins.
- The nucleic acid interference activities of these subtypes have also been gradually demonstrated experimentally.
- Scientists from Arbor Biotechnologies have demonstrated through in vitro experiments the double-stranded DNA cleavage activity of the effector proteins Cas12c, Cas12g, Cas12h and Cas12i from Type V-C, Type V-G, Type V-H, and Type V-I.
- The effector proteins of the Type V-D and Type V-E subtypes are CasY and CasX, also known as Cas12d and Cas12e, respectively.
- The effector protein of the Type V-F subtype, which was previously considered to be one of the subtypes of the Type V-U family, is Cas14 (also known as Cas12f), which can cleave single-stranded DNA and RNA and is only one-third the size of the Cas9 protein.
- The Cas14 protein was first developed into the nucleic acid detection tool DETECTR, and has recently been proven to have double-stranded DNA cleavage activity in prokaryotes and eukaryotes.
- The CasΦ protein (also known as Cas12j), recently discovered in huge bacteriophages, has also been proven able to cut double-stranded DNA in prokaryotes, animal cells, and plant cells.
- The Type V-K effector protein Cas12k is "hijacked" by the transposon Tn7; it can generate an R-loop at the target site and use the targeting ability of crRNA to achieve site-specific transposition of the transposon.
- This hijacking protein provides a new strategy for targeted insertion into DNA.
- Embodiment 1 An engineered clustered regularly interspaced short palindromic repeats (CRISPR) system, comprising:
- the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA;
- the TraC effector protein can form a CRISPR complex with guide RNA
- the TraC effector protein can target-bind to a target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, or can target-bind to a target DNA sequence under the guidance of a guide RNA containing tracrRNA and/or crRNA.
- Embodiment 2 The engineered CRISPR system of embodiment 1, wherein the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to a non-targeting strand (NTS).
- Embodiment 3 An engineered clustered regularly interspaced short palindromic repeats (CRISPR) vector system comprising one or more constructs, comprising:
- a second regulatory element operably linked to one or more nucleotide sequences encoding one or more guide RNAs selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA containing tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) containing tracrRNA and crRNA;
- the TraC effector protein can form a CRISPR complex with guide RNA
- the TraC effector protein can target-bind to a target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, or can target-bind to a target DNA sequence under the guidance of a guide RNA containing tracrRNA and/or crRNA.
- Embodiment 4 The engineered CRISPR vector system of embodiment 3, wherein the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to a non-targeting strand (NTS).
- Embodiment 5 The system of embodiment 2 or 4, wherein the guide RNA is a guide RNA comprising tracrRNA and crRNA, wherein the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to the non-targeting strand (NTS), and wherein the guide RNA hybridizes to the targeting strand (TS) of the target DNA sequence via the crRNA and to the non-targeting strand (NTS) via the NTB.
- Embodiment 6 The system of embodiment 4, wherein, when transcribed, the one or more guide RNAs hybridize to the target DNA, and the guide RNA forms a complex with the TraC effector protein, which causes distal cleavage of the target DNA sequence.
- Embodiment 7 The system of any one of embodiments 1-6, wherein the target DNA sequence is within a cell, preferably within a eukaryotic cell.
- Embodiment 8 The system of any one of embodiments 1-7, wherein the effector protein comprises one or more nuclear localization sequences (NLS), cytoplasmic localization sequences, chloroplast localization sequences or mitochondrial localization sequences.
- Embodiment 9 The system of any one of embodiments 1-8, wherein the nucleic acid sequences encoding the effector protein are codon optimized for expression in eukaryotic cells.
- Embodiment 10 The system of any one of embodiments 1-9, wherein components a) and b) or their nucleotide sequences are constructed on the same or different vectors.
- Embodiment 11 A method for modifying a DNA sequence of interest, the method comprising delivering the system of any one of embodiments 1-10 to the DNA sequence of interest or to a cell containing the DNA sequence of interest.
- Embodiment 12 A method of modifying a DNA sequence of interest, the method comprising delivering a composition of a TraC effector protein and one or more nucleic acid components to the DNA sequence of interest, wherein the effector protein can target-bind to the target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, and can also target-bind to the target DNA sequence under the guidance of a guide RNA containing tracrRNA and crRNA; the effector protein and the one or more nucleic acid components form a CRISPR complex, and after the complex is targeted to bind to a DNA sequence of interest that is 3' to the protospacer adjacent motif (PAM), the effector protein induces modification of the DNA sequence of interest.
- Embodiment 13 The method of embodiment 12, wherein the gene of interest is in a cell, preferably a eukaryotic cell.
- Embodiment 14 The method of embodiment 13, wherein the cell is an animal cell or a human cell.
- Embodiment 15 The method of embodiment 13, wherein the cell is a plant cell.
- Embodiment 16 The method of embodiment 12, wherein the effector protein comprises one or more nuclear localization sequences (NLS), cytoplasmic localization sequences, chloroplast localization sequences or mitochondrial localization sequences.
- Embodiment 17 The method of embodiment 12, wherein the effector protein and nucleic acid component, or a construct expressing the effector protein and nucleic acid component, are comprised in a delivery system.
- Embodiment 18 The method of embodiment 17, wherein the delivery system comprises a virus, virus-like particles, virions, liposomes, vesicles, exosomes, lipid nanoparticles (LNP), N-acetylgalactosamine (GalNAc) or engineered bacteria.
- Embodiment 19 A transposon and CRISPR-Cas12 intermediate (TraC) effector protein or functional variant thereof for genome editing in an organism or an organism cell, wherein the TraC effector protein or functional variant thereof can form a CRISPR complex with a guide RNA;
- the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA;
- the TraC effector protein or a functional variant thereof can target-bind to a target DNA sequence under the guidance of a guide RNA derived from the right end element of a transposon, and can also target-bind to a target DNA sequence under the guidance of a guide RNA containing tracrRNA and crRNA.
- Embodiment 20 Transposon and CRISPR-Cas12 intermediate protein (TraC) effector protein or functional variant thereof for genome editing in an organism or organism cell, the TraC effector protein or a functional variant thereof and
- (ii) comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity to one of SEQ ID NOs: 1-37, or comprises an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to one of SEQ ID NOs: 1-37.
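The sequence-identity thresholds above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external tool (gaps written as `-`) and that identity is counted as identical residues divided by the number of aligned columns. That denominator is only one of several conventions in use and is an assumption here, not something the source specifies. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (assumed convention, see lead-in)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 30.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy pre-aligned sequences (hypothetical, not taken from SEQ ID NOs: 1-37).
identity = percent_identity("MKT-LV", "MKSALV")
print(f"{identity:.1f}% identity")
```

In practice the alignment step and the identity definition would come from an established alignment program; this sketch only shows how a threshold such as "at least 30%" is applied once an alignment exists.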
- Embodiment 21 The TraC effector protein of embodiment 20 or a functional variant thereof, wherein the effector protein functional variant is derived from SEQ ID NO: 25 and includes one or more amino acid substitutions, relative to SEQ ID NO: 25, selected from K78R, D86R, S137R, V145R, I147R, P148R, D150R, V228R, V254R, A510R, A278R, K315R, S334R, L343R, A369R, H392R, L394R, S408R, N456R, V500R, A510R, and T573R.
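Substitutions written in the form `K78R` name the original residue, its 1-based position in the reference sequence, and the replacement residue. A minimal Python sketch of how such a list could be applied to a reference sequence follows. The function name and the four-residue toy sequence are hypothetical; the actual SEQ ID NO: 25 sequence is not reproduced here.

```python
import re

# Pattern for substitutions such as "K78R": original residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Return a copy of `sequence` with each point substitution applied.

    Positions are 1-based. Each substitution is checked against the
    original (unmutated) sequence, so a repeated entry is harmless.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        original, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(sequence):
            raise ValueError(f"{sub}: position outside the sequence")
        if sequence[position - 1] != original:
            raise ValueError(
                f"{sub}: reference has {sequence[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy reference sequence (hypothetical), mutated at positions 2 and 4.
print(apply_substitutions("MKDA", ["K2R", "A4R"]))
```

Checking the stated original residue before substituting catches numbering mismatches between a variant list and the reference it is supposed to describe.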
- Embodiment 22 The TraC effector protein of embodiment 20 or 21, or a functional variant thereof, wherein the effector protein functional variant is derived from SEQ ID NO: 25 and includes, relative to SEQ ID NO: 25, the mutations of any one of the sets shown in Table 3 or Table 4.
- Embodiment 23 The TraC effector protein or functional variant thereof of embodiment 20, wherein the effector protein functional variant comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 80-87.
- Embodiment 24 The TraC effector protein of any one of embodiments 20-23, or a functional variant thereof, which has at least guide RNA-mediated sequence-specific targeting ability.
- Embodiment 25 The TraC effector protein of any one of embodiments 20-23, or a functional variant thereof, which has guide RNA-mediated sequence-specific targeting ability, and double-stranded nucleic acid cleavage activity.
- Embodiment 26 The TraC effector protein of any one of embodiments 20-23, or a functional variant thereof, having guide RNA-mediated sequence-specific targeting ability, and nickase activity.
- Embodiment 27 The TraC effector protein of any one of embodiments 20-23, or a functional variant thereof, which has guide RNA-mediated sequence-specific targeting ability but does not have double-stranded nucleic acid cleavage activity and/or nickase activity.
- Embodiment 28 The TraC effector protein of any one of embodiments 24-27, or a functional variant thereof, wherein the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA.
- Embodiment 29 The TraC effector protein of embodiment 28 or a functional variant thereof, wherein the TraC effector protein or functional variant thereof can target the target DNA sequence under the guidance of the guide RNA derived from the right end element of the transposon, or can target the target DNA sequence under the guidance of the guide RNA containing tracrRNA and crRNA.
- Embodiment 30 The TraC effector protein of embodiment 28 or a functional variant thereof, wherein the guide RNA is a reRNA derived from the TnpB system, for example, the reRNA comprises the scaffold sequence shown in SEQ ID NO: 77 or 78.
- Embodiment 31 The TraC effector protein of embodiment 28 or a functional variant thereof, wherein the guide RNA is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, for example, the sgRNA comprises the scaffold sequence shown in SEQ ID NO: 75 or 76.
- Embodiment 32 The TraC effector protein of any one of embodiments 19-31, or a functional variant thereof, further comprising at least one nuclear localization sequence (NLS), cytoplasmic localization sequence, chloroplast localization sequence or mitochondrial localization sequence.
- Embodiment 33 A fusion protein comprising the TraC effector protein or functional variant thereof according to any one of embodiments 19-32, and at least one other functional protein.
- Embodiment 34 The fusion protein of embodiment 33, wherein the other functional protein is a deaminase.
- Embodiment 35 The fusion protein of embodiment 34, wherein the deaminase is a cytosine deaminase, for example, the cytosine deaminase is selected from the group consisting of APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, CDA1, human APOBEC3A deaminase, double-stranded DNA deaminase (Ddd), single-stranded DNA deaminase (Sdd) or their functional variants.
- Embodiment 36 The fusion protein of embodiment 35, further comprising a uracil DNA glycosylase inhibitor (UGI).
- Embodiment 37 The fusion protein of embodiment 34, wherein the deaminase is an adenine deaminase, e.g., a DNA-dependent adenine deaminase derived from the E. coli tRNA adenine deaminase TadA (ecTadA).
- Embodiment 38 The fusion protein of any one of embodiments 34-37, wherein the fusion protein includes cytosine deaminase and adenine deaminase.
- Embodiment 39 The fusion protein of embodiment 33, wherein the other functional protein is selected from the group consisting of a transcription activator protein, a transcription repressor protein, a DNA methylase, a DNA demethylase, and a reverse transcriptase.
- Embodiment 40 The fusion protein of any one of embodiments 33-39, wherein different parts of the fusion protein can be connected independently through a linker or directly.
- Embodiment 41 The fusion protein of any one of embodiments 33-40, further comprising at least one nuclear localization sequence (NLS), cytoplasmic localization sequence, chloroplast localization sequence or mitochondrial localization sequence.
- Embodiment 42 Use of the TraC effector protein of any one of embodiments 19-32 or a functional variant thereof, or of the fusion protein of any one of embodiments 33-41, for genome editing in cells, preferably eukaryotic cells, more preferably plant cells.
- Embodiment 43 The use of embodiment 42, wherein the genome editing includes base editing (Base Editor), prime editing (Prime Editor), and PrimeRoot editing (PrimeRoot Editor).
- Embodiment 44 A genome editing system for site-directed modification of a target nucleic acid sequence in a cellular genome, which includes:
- An expression construct comprising a nucleotide sequence encoding the TraC effector protein of any one of embodiments 19-32, or a functional variant thereof, or the fusion protein of any one of embodiments 33-41.
- Embodiment 45 The genome editing system of embodiment 44, further comprising at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
- Embodiment 46 The genome editing system of embodiment 45, wherein the genome editing system comprises any one selected from:
- an expression construct comprising a nucleotide sequence encoding the TraC effector protein of any one of embodiments 19-32 or a functional variant thereof or the fusion protein of any one of embodiments 33-41, and the at least one guide RNA;
- an expression construct comprising a nucleotide sequence encoding the TraC effector protein of any one of embodiments 19-32 or a functional variant thereof or the fusion protein of any one of embodiments 33-41, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
- v) an expression construct comprising a nucleotide sequence encoding the TraC effector protein of any one of embodiments 19-32 or a functional variant thereof or the fusion protein of any one of embodiments 33-41 and a nucleotide sequence encoding the at least one guide RNA.
- Embodiment 47 The genome editing system of any one of embodiments 45-46, wherein the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA.
- Embodiment 48 The genome editing system of embodiment 47, wherein the guide RNA is a reRNA derived from the TnpB system, for example, the reRNA comprises the scaffold sequence shown in SEQ ID NO: 77 or 78.
- Embodiment 49 The genome editing system of any one of embodiments 45-46, wherein the guide RNA is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, for example, the sgRNA comprises the scaffold sequence shown in SEQ ID NO: 75 or 76.
- Embodiment 50 The genome editing system of embodiment 47 or 49, wherein the guide RNA comprises tracrRNA and crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA, wherein the crRNA comprises a sequence identical to the target sequence immediately adjacent to the PAM, and the tracrRNA contains a sequence (non-targeting strand binding sequence, NTB) complementary to a sequence located on the PAM-distal side of the target sequence.
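The relationship between the PAM, the crRNA targeting sequence, and the NTB can be sketched in Python. This is only an illustration under stated assumptions: the input is the non-targeting strand written 5' to 3', the target sequence lies immediately 3' of the PAM, and the NTB pairs with the stretch directly following the target sequence on the PAM-distal side. The PAM string, the lengths, and the toy DNA below are all hypothetical and are not taken from the patent.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by substituting U for T."""
    return dna.replace("T", "U")


def design_guide_parts(nts: str, pam: str, spacer_len: int, ntb_len: int):
    """Return (crRNA targeting sequence, NTB), both as RNA.

    nts is the non-targeting strand, 5'->3'. The first occurrence of
    `pam` is used; the placement of the NTB region is an assumption.
    """
    pam_start = nts.find(pam)
    if pam_start < 0:
        raise ValueError("PAM not found on the non-targeting strand")
    target_start = pam_start + len(pam)
    target_end = target_start + spacer_len
    distal_end = target_end + ntb_len
    if distal_end > len(nts):
        raise ValueError("sequence too short for the requested lengths")
    # crRNA part: same sequence as the PAM-adjacent target on the NTS,
    # so it can hybridize to the targeting strand.
    spacer = to_rna(nts[target_start:target_end])
    # NTB: complementary to the PAM-distal stretch of the NTS itself.
    ntb = to_rna(reverse_complement(nts[target_end:distal_end]))
    return spacer, ntb


# Hypothetical PAM "TTA", 10-nt target, 6-nt NTB region.
spacer, ntb = design_guide_parts("GGTTAACGTACGTACGGATCAAA", "TTA", 10, 6)
print(spacer, ntb)
```

The point of the sketch is the strand logic: one part of the guide matches the non-targeting strand (and so pairs with the targeting strand), while the NTB is complementary to the non-targeting strand at a PAM-distal position.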
- Embodiment 51 The genome editing system of any one of embodiments 44-50, wherein the genome editing system further comprises a donor nucleic acid molecule comprising a nucleotide sequence to be site-specifically inserted into the genome, e.g., wherein the nucleotide sequence to be inserted includes sequences homologous to the sequences flanking the target sequence in the genome.
- Embodiment 52 The genome editing system of any one of embodiments 44-51, wherein the nucleotide sequence encoding the TraC effector protein or a functional variant thereof or the fusion protein and/or the nucleotide sequence encoding the at least one guide RNA is operably linked to an expression control element such as a promoter.
- Embodiment 53 The genome editing system of any one of embodiments 44-52, wherein the components of the genome editing system are comprised in a delivery system selected from the group consisting of viruses, virus-like particles, virions, liposomes, vesicles, exosomes, lipid nanoparticles (LNP), N-acetylgalactosamine (GalNAc) or engineered bacteria.
- Embodiment 54 A method of producing a genetically modified cell, comprising introducing the genome editing system of any one of embodiments 44-53 into the cell.
- Embodiment 55 The method of embodiment 54, wherein the cells are from prokaryotes or eukaryotes, preferably from mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, geese; plants, including monocots and dicots, such as rice, corn, wheat, sorghum, barley, soybean, peanut, and Arabidopsis.
- The present invention obtains new CRISPR effector proteins and their genome editing systems, enriching the selection and application scenarios of genome editing tools;
- The TraC sub-branch CRISPR effector protein obtained by the present invention has a dual-guide mechanism and combines the targeted cleavage pathways of the TnpB system and the CRISPR system. That is, the TraC effector protein can target and bind target DNA under the guidance of either reRNA or sgRNA, which helps to achieve multiple kinds of genome editing with the same gene editing tool;
- The TraC effector protein obtained by the present invention is the smallest among the currently known monomeric Cas12 proteins, which is helpful for achieving delivery and editing in vivo;
- The TraC effector protein obtained in the present invention, under the guidance of an sgRNA containing a non-targeting strand binding sequence (NTB) complementary to the non-targeting strand (NTS), interacts with the non-targeting strand of the target dsDNA to form a bubble structure, which facilitates the opening and editing of PAM-distal DNA.
- Figure 1 Shows three structural motifs conserved in 86 Cas12 proteins.
- Figure 2 Shows the prokaryotic expression system of TraC protein.
- Figure 3 A flow chart showing the use of a fluorescent reporter system to screen CRISPR systems with DNA double-strand binding ability.
- Figure 5 Flow chart for screening CRISPR systems with DNA double-strand cutting ability.
- Figure 6 Test results of DNA double-stranded cleavage ability.
- A: Test results for TraC-875, TraC-365, TraC-655, and TraC-445; B: Test results for TraC-297, TraC-459, TraC-466, and TraC-949. LbCpf1 serves as the positive control.
- Figure 7 Flowchart of using the plasmid interference system to detect the DNA double-stranded cleavage ability of the new CRISPR system.
- Figure 8 Results of testing TraC-459, TraC-875 and TraC-297 proteins using the plasmid interference system.
- Figure 9 A: Prediction of secondary structure of accessory RNA and structural folding model analysis of V-type CRISPR and TnpB systems; B: Model of co-evolution of effectors and accessory RNA.
- Figure 10 Shows optimization of TraC protein sgRNA.
- A: sgRNA predicted for the TraC-459 protein;
- B: Effects of the truncation length of the tracrRNA:crRNA complementary region, the truncation length of the tracrRNA 5' region, and the spacer length on the editing efficiency of the TraC-459 protein.
- Figure 11 Shows that optimized sgRNA-opt can significantly improve the editing efficiency of TraC-459.
- Figure 12 shows the use of plasmid interference experiments to analyze the dsDNA cleavage ability of TraC-459 on E. coli under different guide RNAs.
- Figure 13 shows the prediction results of the three-dimensional structure folding of TraC-459 protein.
- Figure 14 Shows TraC-459 variant screen.
- TraC effector protein targets DNA in a bubble-like structure guided by reprogrammed sgRNA.
- Reprogrammed sgRNA can improve editing efficiency.
- Figure 18 Shows that TraC protein is affected by temperature in plant cells.
- the editing efficiency of TraC-5M-7 at 32°C is 1-29 times higher than that at 25°C.
- The protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still possess the activity described in the present invention.
- those skilled in the art know that the methionine encoded by the start codon at the N-terminus of the polypeptide will be retained under certain practical circumstances (such as when expressed in a specific expression system), but will not substantially affect the function of the polypeptide.
- “Genome” as used herein encompasses not only chromosomal DNA present in the nucleus, but also organellar DNA present in subcellular components of the cell (e.g., mitochondria, plastids).
- organism includes any organism suitable for genome editing, preferably eukaryotes.
- organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; plants including monocots and dicots, For example, rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis thaliana, etc.
- “Genetically modified organism” or “genetically modified cell” means an organism or cell that contains exogenous polynucleotides or modified genes or expression control sequences within its genome.
- exogenous polynucleotides can be stably integrated into the genome of an organism or cell and inherited for successive generations.
- Exogenous polynucleotides can be integrated into the genome alone or as part of a recombinant DNA construct.
- a modified gene or expression control sequence is one in which the sequence contains single or multiple deoxynucleotide substitutions, deletions, and additions in the genome of an organism or cell.
- "Exogenous" with respect to a sequence means a sequence from a foreign species or, if from the same species, a sequence that has undergone significant changes in composition and/or locus from its native form by deliberate human intervention.
- "nucleic acid sequence" and its synonyms are used interchangeably and refer to single- or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
- Nucleotides are referred to by their single-letter names as follows: "A" for adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" for cytidine or deoxycytidine, "G" for guanosine or deoxyguanosine, "U" for uridine, "T" for deoxythymidine, "R" for purine (A or G), "Y" for pyrimidine (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
- although nucleotide sequences may be represented herein as DNA sequences (including T), when referring to RNA, one skilled in the art can readily determine the corresponding RNA sequence (i.e., by substituting U for T).
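The single-letter codes and the T-to-U convention described above can be illustrated with a minimal Python sketch. The function names `dna_to_rna` and `matches` are hypothetical and chosen only for illustration; the table covers only the codes listed in the text, and "I" (inosine) is left out because it denotes a modified base rather than a set of alternatives.

```python
# Ambiguity codes as listed in the text (DNA notation).
# Assumption: "I" (inosine) is omitted because it is a modified base, not an ambiguity code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a sequence written in DNA notation (U substituted for T)."""
    return dna.upper().replace("T", "U")


def matches(pattern: str, seq: str) -> bool:
    """True if a plain A/C/G/T sequence is compatible with a pattern of the same length."""
    if len(pattern) != len(seq):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern.upper(), seq.upper()))


print(dna_to_rna("TTCGAT"))     # UUCGAU
print(matches("YTTC", "CTTC"))  # True
print(matches("YTTC", "GTTC"))  # False
```

The same lookup can be reused wherever a motif is written with ambiguity codes, for example a PAM such as 5'-TTTN-3'.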
- "Polypeptide," "peptide," and "protein" are used interchangeably herein and refer to a polymer of amino acid residues. The term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
- the terms "polypeptide," "peptide," "amino acid sequence," and "protein" may also include modified forms including, but not limited to, glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
- Sequence "identity" has an art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide or along a region of the molecule.
- the term "identity" is well known to those skilled in the art (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).
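As a rough illustration of how a percentage of sequence identity might be computed, the following Python sketch counts identical positions in two sequences that have already been aligned. It is not the method of the cited reference: the function name `percent_identity` is hypothetical, and the choice of the alignment length as the denominator is an assumption (other conventions, such as dividing by the shorter sequence length, are also in use).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns, excluding
    columns in which both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns)


print(percent_identity("MKTA", "MKSA"))  # 75.0
print(percent_identity("MK-A", "MKTA"))  # 75.0
```

Restricting the inputs to an aligned region rather than the full-length alignment gives identity "along a region of the molecule" in the sense used above.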
- "construct" or "expression construct" refers to a vector, such as a recombinant vector, suitable for expression of a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product.
- expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., transcription to produce mRNA or functional RNA) and/or translation of the RNA into a precursor or mature protein.
- the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, an RNA capable of translation (such as mRNA).
- An "expression construct" of the present invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or control sequences and nucleotide sequences of interest from the same source but arranged in a manner different from that which normally occurs in nature.
- "regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leaders, introns, and polyadenylation recognition sequences.
- a promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
- a promoter is a promoter capable of controlling the transcription of a gene in a cell, whether or not it is derived from said cell.
- the promoter may be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
- "tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed primarily, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a specific cell or cell type.
- Developmentally regulated promoter refers to a promoter whose activity is determined by developmental events.
- inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
- operably linked means that a regulatory element (e.g., but not limited to, a promoter sequence, a transcription termination sequence, etc.) is linked to a nucleic acid sequence (e.g., a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element.
- "Introducing" a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism means transforming the organism's cells with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
- Transformation as used in the present invention includes stable transformation and transient transformation.
- “Stable transformation” refers to the introduction of exogenous nucleotide sequences into the genome, resulting in stable inheritance of the exogenous nucleotide sequences. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
- Transient transformation refers to the introduction of a nucleic acid molecule or protein into a cell to perform its function without stable inheritance of the exogenous nucleotide sequence. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
- "Trait" refers to the physiological, morphological, biochemical or physical characteristics of a cell or organism.
- Agronomic traits specifically refer to measurable indicator parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, total plant nitrogen content, fruit nitrogen content, seed nitrogen content, plant vegetative tissue nitrogen content, plant total free amino acid content, fruit free amino acid content, seed free amino acid content, plant vegetative tissue free amino acid content, total plant protein content, fruit protein content, seed protein content, plant vegetative tissue protein content, herbicide resistance and drought resistance, nitrogen absorption, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and number of tillers, etc.
- the present invention provides a new type of CRISPR effector protein, which has the targeted cleavage activity of both the TnpB system and the CRISPR system; that is, it can bind target DNA under the guidance of reRNA, and can also bind target DNA under the guidance of a guide RNA comprising tracrRNA and/or crRNA, such as an sgRNA.
- This subtype of CRISPR nuclease is also referred to herein as transposon and CRISPR-Cas12 intermediate (TraC) effector proteins.
- the invention provides an engineered Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system, comprising:
- the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA;
- the TraC effector protein can form a CRISPR complex with guide RNA
- the TraC effector protein can target-bind to a target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, or can target-bind to a target DNA sequence under the guidance of a guide RNA containing tracrRNA and crRNA.
- the engineered clustered regularly interspaced short palindromic repeats (CRISPR) system is a genome editing system for genome editing in an organism or cells of an organism.
- the TraC effector protein is as defined below.
- the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to a non-targeting strand (NTS).
- the invention also provides an engineered clustered regularly interspaced short palindromic repeats CRISPR vector system comprising one or more constructs, comprising:
- a second regulatory element operably linked to one or more nucleotide sequences encoding one or more guide RNAs selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA containing tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) containing tracrRNA and crRNA;
- the TraC effector protein can form a CRISPR complex with guide RNA
- the TraC effector protein can target-bind to a target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, or can target-bind to a target DNA sequence under the guidance of a guide RNA containing tracrRNA and/or crRNA.
- the TraC effector protein is as defined below.
- the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to a non-targeting strand (NTS).
- the guide RNA is a guide RNA comprising tracrRNA and/or crRNA, wherein the tracrRNA contains a non-targeting strand binding sequence (NTB) complementary to a non-targeting strand (NTS), and wherein the crRNA of the guide RNA hybridizes to the targeting strand (TS) of the target DNA sequence and the NTB hybridizes to the non-targeting strand (NTS).
- the one or more guide RNAs hybridize to the target DNA when transcribed, and the guide RNA forms a complex with the TraC effector protein, the complex causing distal cleavage of the target DNA sequence.
- the target DNA sequence is within a cell, preferably within a eukaryotic cell.
- the effector protein contains one or more nuclear localization signals.
- the nucleic acid sequences encoding the effector protein are codon optimized for expression in eukaryotic cells.
- components a) and b) or their nucleotide sequences are constructed on the same or different vectors.
- the present invention provides a method for modifying a DNA sequence of interest, which method comprises delivering the system described herein to the DNA sequence of interest or a cell containing the DNA sequence of interest.
- the invention provides a method of modifying a DNA sequence of interest, the method comprising delivering a composition of a TraC effector protein and one or more nucleic acid components to the DNA sequence of interest, wherein the effector protein is capable both of targeted binding to the target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon and of targeted binding to the target DNA sequence under the guidance of a guide RNA containing tracrRNA and/or crRNA; the effector protein forms a CRISPR complex with the one or more nucleic acid components, and after the complex targets a DNA sequence of interest that is 3' to the protospacer adjacent motif (PAM), the effector protein induces modification of the DNA sequence of interest.
- the DNA sequence of interest is within a cell, preferably a eukaryotic cell.
- the cell is an animal cell or a human cell.
- the cell is a plant cell.
- the effector protein comprises one or more nuclear localization sequences (NLS), cytoplasmic localization sequences, chloroplast localization sequences or mitochondrial localization sequences.
- the effector protein and nucleic acid component, or a construct expressing the effector protein and nucleic acid component, are comprised in a delivery system.
- the delivery system includes viruses, virus-like particles, virions, liposomes, vesicles, exosomes, lipid nanoparticles (LNP), N-acetylgalactosamine (GalNAc), or engineered bacteria.
- the invention provides a transposon and CRISPR-Cas12 intermediate (TraC) effector protein or a functional variant thereof for genome editing in an organism or an organism cell, wherein the TraC effector protein or its functional variant is capable of forming a CRISPR complex with guide RNA;
- the guide RNA is selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA comprising tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) comprising tracrRNA and crRNA;
- the TraC effector protein or its functional variant can target the target DNA sequence either under the guidance of a guide RNA derived from the right end element of the transposon or under the guidance of a guide RNA containing tracrRNA and crRNA.
- the invention provides transposon and CRISPR-Cas12 intermediate (TraC) effector proteins or functional variants thereof for genome editing in organisms or cells of organisms, the TraC effector protein or functional variant thereof
- (ii) comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity with one of SEQ ID NOs: 1-37, or an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to one of SEQ ID NOs: 1-37.
- the effector protein or functional variant thereof is derived from SEQ ID NO: 25.
- the effector protein or functional variant thereof comprises, relative to the sequence of SEQ ID NO: 25, one or more amino acid substitutions selected from the group consisting of K78R, D86R, S137R, V145R, I147R, P148R, D150R, V228R, V254R, A510R, A278R, K315R, S334R, L343R, A369R, H392R, L394R, S408R, N456R, V500R, A510R, T573R.
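Substitutions such as K78R follow the usual notation of original residue, 1-based position, and replacement residue. The Python sketch below shows one way such a list could be applied to a reference sequence while checking that each stated original residue actually occurs at the stated position. The function name `apply_substitutions` and the four-residue toy sequence are hypothetical; the sequence of SEQ ID NO: 25 is not reproduced here.

```python
import re


def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <original><1-based position><new>, e.g. 'K78R'."""
    residues = list(protein)
    for sub in substitutions:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if m is None:
            raise ValueError(f"unrecognised substitution: {sub}")
        original, position, new = m.group(1), int(m.group(2)), m.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{sub}: reference has {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


# Toy reference sequence (hypothetical, for illustration only)
print(apply_substitutions("MKDS", ["K2R", "S4R"]))  # MRDR
```

The residue check is a deliberate design choice: it catches position offsets, for example when a variant is numbered with or without the initiator methionine.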
- the effector protein or functional variant thereof comprises any set of mutations selected from the group shown in Table 3 or Table 4 relative to the sequence of SEQ ID NO:25.
- the effector protein or functional variant thereof comprises an amino acid sequence selected from SEQ ID NOs: 80-87.
- the TraC effector protein or functional variant thereof has at least guide RNA-mediated sequence-specific targeting capabilities. That is, the TraC effector protein or its functional variant can form a complex with the guide RNA and bind to a specific target sequence (such as a DNA target sequence).
- the TraC effector protein or functional variant thereof has guide RNA-mediated sequence-specific targeting capabilities, as well as double-stranded nucleic acid (e.g., double-stranded DNA) cleavage activity.
- after the TraC effector protein or its functional variant forms a complex with the guide RNA and binds to a specific target sequence (such as a DNA target sequence), it can cleave double-stranded nucleic acids (such as double-stranded DNA) within or near the target sequence, forming a double-strand break (DSB).
- the TraC effector protein or functional variant thereof has guide RNA-mediated sequence-specific targeting capabilities, as well as nickase activity. For example, after the TraC effector protein or its functional variant forms a complex with the guide RNA and binds to a specific target sequence (such as a DNA target sequence), it can generate a nick in or near the target sequence.
- TraC effector proteins with nickase activity or functional variants thereof are also called TraC nickases.
- the TraC effector protein or functional variant thereof has guide RNA-mediated sequence-specific targeting capabilities but does not have double-stranded nucleic acid cleavage activity and/or nickase activity.
- Such TraC effector proteins or functional variants thereof that do not have double-stranded nucleic acid cleavage activity and/or nickase activity are also called dead TraC effector proteins.
- a guide RNA (gRNA) is an RNA molecule that targets a target sequence. Generally speaking, the gRNA of the CRISPR system targets the target sequence through base pairing between the crRNA and the complementary strand of the target sequence.
- the guide RNA can be selected from i) a guide RNA (reRNA) derived from the right end element of the transposon and/or ii) a guide RNA containing tracrRNA and/or crRNA, such as a single guide RNA (sgRNA) containing tracrRNA and crRNA.
- the TraC effector protein of the present invention or a functional variant thereof is capable of targeting and binding to a target DNA sequence under the guidance of a guide RNA derived from the right end element of the transposon, and is also capable of targeting and binding to the target DNA sequence under the guidance of a guide RNA containing tracrRNA and/or crRNA.
- the guide RNA is a guide RNA (reRNA) derived from the right end element of a transposon, for example, the reRNA comprises the scaffold sequence set forth in SEQ ID NO: 77 or 78.
- the specific form or sequence of reRNA can vary according to the specific TraC effector protein.
- the design can refer to Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692-696 (2021).
- the guide RNA comprises tracrRNA and/or crRNA.
- the guide RNA is a guide RNA formed by complementation of tracrRNA and crRNA.
- the guide RNA is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, wherein tracrRNA and crRNA are fused.
- the guide RNA may comprise only crRNA, which may also be referred to as sgRNA. The specific gRNA form or sequence may vary depending on the specific TraC nuclease.
- the guide RNA containing tracrRNA and/or crRNA may also be called CRISPR system guide RNA.
- Guide RNA containing tracrRNA and/or crRNA is a conventional form of guide RNA for CRISPR systems.
- the sequence of the tracrRNA and/or crRNA can be obtained by analyzing sequences near the CRISPR effector protein locus. It is within the ability of those skilled in the art to analyze and obtain tracrRNA- and/or crRNA-containing guide RNAs for CRISPR effector proteins.
- the guide RNA comprising tracrRNA and/or crRNA is derived or matured from the nucleotide sequence of one of SEQ ID NOs: 38-74.
- the guide RNA includes tracrRNA and crRNA, such as a single guide RNA (sgRNA) including tracrRNA and crRNA.
- the crRNA comprises the same sequence as the target sequence immediately adjacent to the PAM (e.g., 3' to the PAM), and thereby binds by complementarity to the strand opposite the PAM (the targeting strand).
- the tracrRNA contains a sequence complementary to a sequence distal to the PAM (in the direction of the target sequence) (non-targeting strand binding sequence, NTB).
- the non-targeting strand binding sequence is located at the 5' end of the tracrRNA.
- binding of the NTB in the tracrRNA to the PAM-distal sequence can help the effector protein-guide RNA complex open the PAM-distal DNA region and improve editing efficiency.
- the complement of the non-targeting strand binding sequence is from about 10 to about 50 nucleotides from the PAM, such as about 10, about 16, about 20, about 24, about 28, about 30, about 40, or about 50 nucleotides, preferably about 20 nucleotides from the PAM.
- the non-targeting strand binding sequence is about 5 to about 20 nucleotides in length, preferably about 8 to 12 nucleotides in length, and more preferably about 10 nucleotides in length.
- the complementary sequence of the non-targeting strand binding sequence overlaps at least partially with the target sequence.
- the complement of the non-targeting strand binding sequence is included in the target sequence.
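The relationship between the crRNA spacer, the NTB and the non-targeting strand can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `design_guide_parts` and the example sequence are hypothetical, the non-targeting strand is written 5' to 3' starting with the PAM, and the distance of the NTB site from the PAM is measured here from the 3' end of the PAM to the first nucleotide of the site, a convention the text does not specify.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_guide_parts(nts: str, pam_len: int, spacer_len: int,
                       ntb_offset: int, ntb_len: int) -> dict:
    """Derive a crRNA spacer and an NTB from the non-targeting strand (NTS).

    Assumptions: `nts` starts with the PAM; `ntb_offset` counts nucleotides from
    the 3' end of the PAM to the start of the site bound by the NTB.
    """
    downstream = nts.upper()[pam_len:]  # NTS sequence 3' of the PAM
    if len(downstream) < max(spacer_len, ntb_offset + ntb_len):
        raise ValueError("sequence too short for the requested spacer/NTB")
    # The spacer has the same sequence as the target (NTS), so it pairs with the TS.
    spacer = downstream[:spacer_len].replace("T", "U")
    # The NTB is complementary to a PAM-distal stretch of the NTS.
    ntb_site = downstream[ntb_offset:ntb_offset + ntb_len]
    ntb = reverse_complement(ntb_site).replace("T", "U")
    return {"spacer": spacer, "ntb_site": ntb_site, "ntb": ntb}


# Hypothetical example: 3-nt PAM, 20-nt target, 10-nt NTB site inside the target
parts = design_guide_parts("TTC" + "ACGTACGTAC" + "GGGGGAAAAA" + "TT",
                           pam_len=3, spacer_len=20, ntb_offset=10, ntb_len=10)
print(parts["spacer"])  # ACGUACGUACGGGGGAAAAA
print(parts["ntb"])     # UUUUUCCCCC
```

In this example the NTB site lies entirely within the target sequence; moving `ntb_offset` further from the PAM yields the partially overlapping or non-overlapping arrangements.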
- a "target sequence" refers to a sequence of about 20 nucleotides in length in the genome that is characterized by a flanking (e.g., 5' flanking) PAM (protospacer adjacent motif) sequence.
- PAM is necessary for the complex formed by CRISPR nuclease, such as the TraC effector protein of the present invention or its functional variant, and the guide RNA to recognize the target sequence.
- the target sequence can be located on any strand of the genomic DNA molecule.
- the strand bound by crRNA is called the targeting strand (TS), and the strand complementary to the targeting strand is called the non-targeting strand (NTS).
- the sgRNA comprises the scaffold sequence set forth in SEQ ID NO: 75 or 76.
- nucleotides 154-209 in the scaffold sequence shown in SEQ ID NO:75 or nucleotides 92-147 in the scaffold sequence shown in SEQ ID NO:76 are reprogrammable regions, and the regions can be reprogrammed to contain non-targeting strand binding sequences (NTB).
- the invention also provides a protein complex of the TraC effector protein or a functional variant thereof and at least one other functional protein.
- the TraC effector protein or functional variant thereof and the other functional protein form a protein complex via an affinity tag that mediates specific binding.
- the other functional protein forms a protein complex with the TraC effector protein or a functional variant thereof by specifically binding to the guide RNA.
- the invention also provides a fusion protein of the TraC effector protein or a functional variant thereof and at least one other functional protein.
- the other functional protein is a deaminase.
- the protein complex or fusion protein can be used for base editing in organisms or organism cells.
- the protein complex or fusion protein containing the TraC effector protein or its functional variant and a deaminase is also called a base editor.
- the protein complex or fusion protein can comprise one or more of the deaminase enzymes.
- the deaminase is cytosine deaminase.
- Cytosine deaminase refers to a deaminase that can accept single-stranded DNA as a substrate and catalyze the deamination of cytidine or deoxycytidine to uracil or deoxyuracil, respectively.
- examples of cytosine deaminase include, but are not limited to, APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, CDA1, human APOBEC3A deaminase, double-stranded DNA deaminase (Ddd), single-stranded DNA deaminase (Sdd) (for Ddd and Sdd, see CN202310220057.1, PCT/CN2023/080052) or their functional variants.
- a cytidine deaminase in a protein complex or fusion protein is capable of deaminating cytidine in the single-stranded DNA produced during formation of the protein complex or fusion protein-guide RNA-DNA complex into U, after which a C to T base substitution is achieved through base mismatch repair.
- the protein complex or fusion protein comprising cytosine deaminase further comprises a uracil DNA glycosylase inhibitor (UGI).
- Uracil DNA glycosylase can catalyze the removal of U from DNA and initiate base excision repair (BER), resulting in the repair of U:G to C:G. Therefore, without being bound by any theory, inclusion of a uracil DNA glycosylase inhibitor (UGI) in the fusion protein of the invention will be able to increase the efficiency of C to T base editing.
- the deaminase is adenine deaminase.
- Adenine deaminase refers to a domain that can accept single-stranded DNA as a substrate and catalyze the formation of inosine (I) from adenosine or deoxyadenosine (A).
- adenine deaminase in the protein complex or fusion protein can deaminate adenosine of the single-stranded DNA generated during formation of the protein complex or fusion protein-guide RNA-DNA complex into inosine (I). Because DNA polymerase treats inosine (I) as guanine (G), A to G substitution can be achieved through base mismatch repair.
- the adenine deaminase is a DNA-dependent adenine deaminase derived from E. coli tRNA adenine deaminase TadA (ecTadA).
- the protein complex or fusion protein includes cytosine deaminase and adenine deaminase.
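The net sequence outcome of the two deaminase activities described above (C to U, read as T; A to I, read as G) can be illustrated with a short Python sketch. The function name `edit_outcome` and the notion of a fixed "window" of exposed single-stranded positions are assumptions made for illustration; the text does not define an editing window, and real outcomes depend on the deaminase and on cellular repair.

```python
def edit_outcome(strand: str, window: range,
                 cytosine: bool = True, adenine: bool = False) -> str:
    """Return the sequence expected after deamination and repair.

    `window` lists the 0-based positions assumed to be exposed as single-stranded
    DNA (a hypothetical parameter). Inside it, C -> T when a cytosine deaminase is
    present and A -> G when an adenine deaminase is present.
    """
    edited = []
    for i, base in enumerate(strand.upper()):
        if i in window and cytosine and base == "C":
            edited.append("T")  # C deaminated to U, fixed as T after repair
        elif i in window and adenine and base == "A":
            edited.append("G")  # A deaminated to I, read as G by DNA polymerase
        else:
            edited.append(base)
    return "".join(edited)


print(edit_outcome("ACCAGT", range(1, 4)))                # ATTAGT
print(edit_outcome("ACCAGT", range(1, 4), adenine=True))  # ATTGGT
```

Setting both flags models a complex that carries a cytosine deaminase and an adenine deaminase at the same time.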
- the other functional proteins may be transcription activator proteins, transcription repressor proteins, DNA methylases, DNA demethylases, etc., thereby enabling transcriptional regulation functions and/or epigenetic modification functions.
- the other functional protein may be reverse transcriptase.
- the protein complex or fusion protein containing the TraC effector protein or its functional variant and reverse transcriptase can be used for large fragment DNA insertion, such as prime editor (Anzalone, A.V., Randolph, P.B., Davis, J.R.
- the linkers described herein can be non-functional amino acid sequences, without secondary or higher-order structure, that are 1-50 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 20-25, 25-50) or more amino acids in length.
- the linker may be a flexible linker.
- the TraC effector protein or functional variant thereof or the other functional protein forming a protein complex or the fusion protein is recombinantly produced.
- the TraC effector protein or functional variant thereof or the other functional protein forming a protein complex or the fusion protein further contains a fusion tag, e.g., a tag for the isolation and/or purification of the TraC effector protein or functional variant thereof, the other functional protein forming a protein complex, or the fusion protein.
- Methods for recombinantly producing proteins are known in the art, and many tags that can be used to isolate and/or purify proteins are known in the art, including but not limited to His tags, GST tags, etc. Generally speaking, these tags do not alter the activity of the protein of interest.
- the TraC effector protein of the invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein further comprises a nuclear localization sequence (NLS), for example, connected to the nuclear localization sequence through a linker.
- the one or more NLSs in the TraC effector protein or functional variant thereof or the other functional protein forming a protein complex or the fusion protein should be of sufficient strength to drive accumulation, in the nucleus, of the TraC effector protein or functional variant thereof or the other functional protein forming a protein complex or the fusion protein in an amount that can achieve its genome editing function.
- the strength of the nuclear localization activity is determined by the number and position of the NLSs, by the one or more specific NLSs used, or by a combination of these factors.
- Exemplary nuclear localization sequences include, but are not limited to, SV40 nuclear localization signal sequence and nucleoplasmin nuclear localization signal sequence.
- the TraC effector protein or its functional variant or the fusion protein of the present invention can also include other localization sequences, such as a cytoplasmic localization sequence, chloroplast localization sequence, mitochondrial localization sequence, etc.
- the invention provides the TraC effector protein of the invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein in a cell, preferably a eukaryotic cell, more preferably a plant.
- the invention provides a genome editing system for site-directed modification of a target nucleic acid sequence in a cell genome, which includes the TraC effector protein of the invention or a functional variant thereof, or said other functional protein forming a protein complex, or said fusion protein, and/or an expression construct comprising a nucleotide sequence encoding said TraC effector protein of the invention or its functional variant or said fusion protein.
- the terms "genome editing system" and "gene editing system" are used interchangeably and refer to a combination of components required for genome editing of the genome within the cells of an organism, wherein the individual components of the system, e.g., the TraC effector protein or its functional variant or the other functional protein forming a protein complex or the fusion protein, the gRNA or corresponding expression constructs, etc., can exist independently or can exist in any combination in the form of a composition.
- the components of the genome editing system are comprised in a delivery system selected from viruses, virus-like particles, virions, liposomes, vesicles, exosomes, lipid nanoparticles (LNP), N-acetylgalactosamine (GalNAc) or engineered bacteria.
- the genome editing system further includes at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
- the guide RNA is selected from i) a guide RNA derived from the right end element of a transposon (reRNA) and/or ii) a guide RNA comprising tracrRNA and/or crRNA, e.g., a single guide RNA comprising tracrRNA and crRNA (sgRNA).
- the guide RNA is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, for example, the sgRNA comprises the scaffold sequence shown in SEQ ID NO: 75 or 76.
- the guide RNA derived from the CRISPR system includes tracrRNA and crRNA, such as a single guide RNA (sgRNA) including tracrRNA and crRNA.
- the crRNA comprises the same sequence as the target sequence immediately adjacent to the PAM, and thereby binds by complementarity to the strand opposite the PAM.
- the tracrRNA includes a sequence complementary to a stretch of the PAM distal to the target sequence (non-targeting strand binding sequence, NTB).
- the non-targeting strand binding sequence is located at the 5' end of the tracrRNA.
- the complement of the non-targeting strand binding sequence is from about 10 to about 50 nucleotides from the PAM, such as about 10, about 16, about 20, about 24, about 28, about 30, about 40, or about 50 nucleotides, preferably about 20 nucleotides from the PAM.
- the non-targeting strand binding sequence is about 5 to about 20 nucleotides in length, preferably about 8 to 12 nucleotides in length, and more preferably about 10 nucleotides in length.
- the complementary sequence of the non-targeting strand binding sequence overlaps at least partially with the target sequence.
- the complement of the non-targeting strand binding sequence is included in the target sequence.
- nucleotides 154-209 of the scaffold sequence shown in SEQ ID NO: 75, or nucleotides 92-147 of the scaffold sequence shown in SEQ ID NO: 76, are reprogrammable regions that can be reprogrammed to contain a non-targeting strand binding sequence (NTB).
- the 5' or 3' end of the target sequence targeted by the genome editing system of the present invention needs to contain a protospacer adjacent motif (PAM).
- the specific gRNA form or sequence will vary depending on the specific nuclease.
- when used for prime editing, the gRNA can be a so-called pegRNA.
- the pegRNA additionally adds a reverse transcription template (RT) sequence and a primer binding site (PBS) sequence to the sgRNA.
- the PAM recognized by the nucleases of the invention or functional variants thereof is a T-rich PAM. In some embodiments, the PAM recognized by the nucleases of the invention or functional variants thereof is a G-rich PAM.
- the PAM can be, for example, 5'-TTTN-3', 5'-TGTNNN-3', PolyT, PolyG, 5'-TTTG-3', 5'-TTC-3', 5'-TGA-3', 5'-YTTC-3', 5'-CTCGTG-3', 5'-GTTG-3', 5'-CTTG-3', 5'-TCTG-3', 5'-TTTA-3', or 5'-TTAG-3' (where N represents A, G, C or T and Y represents C or G).
- Based on the presence of a PAM, those skilled in the art can easily determine target sequences in the genome that can be targeted and optionally edited, and design appropriate guide RNAs accordingly. For example, if a PAM sequence 5'-TTC-3' is present in the genome, then about 18 to about 35, preferably 20, 21, 22 or 23, consecutive nucleotides immediately adjacent to its 5' or 3' side can be used as the target sequence.
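As a non-limiting illustration of how a target sequence can be located from a PAM, the following sketch scans one strand of a sequence for a PAM and reports the nucleotides immediately 3' of it. The function name `find_targets`, the demo sequence and the choice to scan only the given strand and only the 3' side are assumptions made for this example; they are not part of the described system.

```python
# Sketch: enumerate candidate target sequences next to a PAM.
# Assumptions (not from the source): the target lies 3' of the PAM on the
# scanned strand, and only the given strand is scanned.

def find_targets(genome, pam="TTC", target_len=20):
    """Return (pam_start, target) for each PAM hit followed by a full-length target.

    An 'N' in the PAM matches any base.
    """
    genome = genome.upper()
    hits = []
    for i in range(len(genome) - len(pam) - target_len + 1):
        window = genome[i:i + len(pam)]
        if all(p == "N" or p == b for p, b in zip(pam, window)):
            start = i + len(pam)
            hits.append((i, genome[start:start + target_len]))
    return hits


# Hypothetical demo sequence containing a single 5'-TTC-3' PAM.
demo = "AATTCGTGATCGACAGCAACAAGTGAGCGAA"
for pam_start, target in find_targets(demo):
    print(pam_start, target)
```

The target length defaults to 20 nucleotides, one of the preferred lengths stated above; scanning the reverse complement or the 5' side of the PAM would follow the same pattern.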
- the at least one guide RNA is encoded by different expression constructs. In some embodiments, the at least one guide RNA is encoded by the same expression construct. In some embodiments, the at least one guide RNA and the TraC effector protein of the invention or a functional variant thereof or the fusion protein are encoded by the same expression construct.
- the genome editing system may comprise any one selected from:
- the TraC effector protein of the present invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein, and the at least one guide RNA, wherein optionally the TraC effector protein or functional variant thereof or the fusion protein and the at least one guide RNA form a complex;
- an expression construct comprising a nucleotide sequence encoding the TraC effector protein of the invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein, and the at least one guide RNA;
- an expression construct comprising a nucleotide sequence encoding the TraC effector protein of the invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
- v) an expression construct comprising a nucleotide sequence encoding the TraC effector protein of the invention or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein, and a nucleotide sequence encoding the at least one guide RNA.
- the genome editing system further comprises a donor nucleic acid molecule comprising a nucleotide sequence to be site-specifically inserted into the genome.
- the nucleotide sequence to be site-specifically inserted into the genome is flanked by sequences homologous to the sequences flanking the target sequence in the genome. After editing, the nucleotide sequence can be integrated into the genome at the specific site through homologous recombination.
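As a non-limiting illustration of the donor layout described above, the following sketch flanks an insert with the genomic sequence on either side of an insertion point. The function name `build_donor`, the toy genome, the insert and the 8-nucleotide arm length are hypothetical and chosen only for readability; homology arms used in practice are typically much longer.

```python
# Sketch: assemble a donor as left homology arm + insert + right homology arm.
# All sequences and the arm length below are hypothetical toy values.

def build_donor(genome, cut_index, insert, arm_len=8):
    """Flank `insert` with the genomic sequence on either side of `cut_index`."""
    if cut_index < arm_len or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm


toy_genome = "ACGTACGTTTGACCAGTCAAGGCT"
donor = build_donor(toy_genome, cut_index=12, insert="GGATCC")
print(donor)
```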
- the nucleic acid encoding the TraC effector protein or a functional variant thereof or the other functional protein forming a protein complex or the fusion protein is codon-optimized for the organism from which the cells to be genome edited are derived.
- Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is more frequently or most frequently used in the genes of the host cell, while maintaining the native amino acid sequence. Different species display specific preferences for certain codons of a given amino acid. Codon bias (differences in codon usage between organisms) is often related to the efficiency of messenger RNA (mRNA) translation, which in turn is thought to depend on the nature of the codons being translated and the availability of specific transfer RNA (tRNA) molecules.
- the predominance of selected tRNAs within a cell generally reflects the codons most frequently used for peptide synthesis.
- accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.
- Codon usage tables are readily available, for example in the Codon Usage Database available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
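As a non-limiting illustration of codon optimization, the following sketch replaces each codon with the most frequent synonymous codon from a usage table while leaving the encoded amino acid sequence unchanged. The usage frequencies in `TOY_USAGE` are hypothetical values invented for this example and are not taken from any real codon usage table; the function names are likewise assumptions.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical usage frequencies (per thousand codons); NOT real data.
TOY_USAGE = {"CTG": 40.0, "CTT": 13.0, "TTA": 7.0,
             "AAG": 32.0, "AAA": 24.0,
             "GCC": 28.0, "GCG": 7.0}


def translate(cds):
    """Translate a coding sequence codon by codon."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, usage=TOY_USAGE):
    """Swap each codon for the most frequent synonymous codon listed in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TO_AA[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return "".join(best.get(CODON_TO_AA[cds[i:i + 3]], cds[i:i + 3])
                   for i in range(0, len(cds), 3))


native = "ATGTTAAAAGCGTAA"
optimized = codon_optimize(native)
print(optimized, translate(native) == translate(optimized))
```

For a real host organism, the toy table would be replaced by frequencies from a codon usage table for that organism.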
- the cells that can be genome-edited by the TraC effector protein or functional variant thereof or the fusion protein or genome editing system of the present invention can be derived from prokaryotes or eukaryotes, preferably eukaryotes, including but not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, such as rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis, etc.
- the nucleotide sequence encoding the TraC effector protein or a functional variant thereof or the fusion protein, and/or the nucleotide sequence encoding the at least one guide RNA, is operably linked to an expression control element such as a promoter.
- examples of promoters that can be used include, but are not limited to, polymerase (pol) I, pol II or pol III promoters.
- pol I promoters include the chicken RNA pol I promoter.
- pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and the simian virus 40 (SV40) immediate early promoter.
- pol III promoters include the U6 and H1 promoters. Inducible promoters such as metallothionein promoters can be used.
- promoters include the T7 phage promoter, the T3 phage promoter, the β-galactosidase promoter, and the Sp6 phage promoter.
- the promoter may be cauliflower mosaic virus 35S promoter, corn Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, corn U3 promoter, rice actin promoter.
- the 5' end of the guide RNA coding sequence is linked to the 3' end of a first ribozyme coding sequence, and the first ribozyme is designed to cleave the first ribozyme-guide RNA fusion generated by intracellular transcription at the 5' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotides at the 5' end.
- the 3' end of the guide RNA coding sequence is linked to the 5' end of a second ribozyme coding sequence, and the second ribozyme is designed to cleave the guide RNA-second ribozyme fusion generated by intracellular transcription at the 3' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotides at the 3' end.
- the 5' end of the guide RNA coding sequence is linked to the 3' end of the first ribozyme coding sequence, and the 3' end of the guide RNA coding sequence is linked to the 5' end of the second ribozyme coding sequence.
- the first ribozyme is designed to cleave the first ribozyme-guide RNA-second ribozyme fusion generated by intracellular transcription at the 5' end of the guide RNA, and the second ribozyme is designed to cleave the same fusion at the 3' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotides at the 5' and 3' ends.
- the design of the first or second ribozyme is within the capabilities of those skilled in the art. For example, see Gao et al., JIPB, Apr, 2014; Vol 56, Issue 4, 343-349.
- the 5' end of the guide RNA coding sequence is linked to the 3' end of a first tRNA coding sequence, and the first tRNA is designed so that the first tRNA-guide RNA fusion generated by intracellular transcription is cleaved at the 5' end of the guide RNA (i.e., by the precise tRNA processing machinery present within the cell, which precisely excises the 5' and 3' additional sequences of the precursor tRNA to form the mature tRNA), thereby forming a guide RNA that does not carry extra nucleotides at the 5' end.
- the 3' end of the guide RNA coding sequence is linked to the 5' end of a second tRNA coding sequence, and the second tRNA is designed so that the guide RNA-second tRNA fusion generated by intracellular transcription is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotides at the 3' end.
- the 5' end of the guide RNA coding sequence is linked to the 3' end of the first tRNA coding sequence, and the 3' end of the guide RNA coding sequence is linked to the 5' end of the second tRNA coding sequence.
- the first tRNA is designed so that the first tRNA-guide RNA-second tRNA fusion generated by intracellular transcription is cleaved at the 5' end of the guide RNA, and the second tRNA is designed so that the same fusion is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotides at the 5' and 3' ends.
- the design of tRNA-guide RNA fusions is within the capabilities of those skilled in the art. For example, see Xie et al., PNAS, Mar 17, 2015; vol.112, no.11, 3570-3575.
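As a non-limiting illustration of the orientation of the ribozyme- or tRNA-flanked constructs described above, the following sketch concatenates an upstream processing element, the guide RNA coding sequence and a downstream processing element, and then models idealised processing as cleavage exactly at both guide boundaries. The placeholder element sequences are hypothetical stand-ins, not real ribozyme or tRNA sequences, and the guide sequence and function names are illustrative assumptions.

```python
# Placeholder sequences: hypothetical stand-ins, NOT real ribozyme or tRNA sequences.
UPSTREAM_ELEMENT = "N" * 10     # first ribozyme or first tRNA coding sequence
DOWNSTREAM_ELEMENT = "N" * 12   # second ribozyme or second tRNA coding sequence
GUIDE = "GTGATCGACAGCAACAAGTG"  # illustrative guide RNA coding sequence


def build_cassette(guide, upstream="", downstream=""):
    """Return the 5'-upstream-guide-downstream-3' sequence and the guide's span."""
    cassette = upstream + guide + downstream
    start = len(upstream)
    return cassette, (start, start + len(guide))


def processed_guide(transcript, guide_span):
    """Idealised processing: cleavage exactly at both boundaries of the guide."""
    start, end = guide_span
    return transcript[start:end]


cassette, span = build_cassette(GUIDE, UPSTREAM_ELEMENT, DOWNSTREAM_ELEMENT)
print(cassette)
print(processed_guide(cassette, span) == GUIDE)
```

Omitting `upstream` or `downstream` gives the single-element variants, in which only one end of the guide RNA is trimmed.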
- the present invention provides a method for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising introducing the genome editing system of the present invention into the cell.
- the invention also provides a method of producing genetically modified cells, comprising introducing the genome editing system of the invention into the cells.
- the invention also provides genetically modified organisms comprising genetically modified cells or progeny cells thereof produced by the methods of the invention.
- the target sequence to be modified can be located anywhere in the genome, such as within a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter region or enhancer region, thereby modifying the function or the expression of the gene. Modifications in the target sequence of the cell can be detected by T7EI, PCR/RE or sequencing methods.
- the gene editing system can be introduced into cells through various methods well known to those skilled in the art.
- Methods that can be used to introduce the gene editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), biolistics, PEG-mediated protoplast transformation, and Agrobacterium-mediated transformation.
- the methods of the invention are performed in vitro.
- the cells are isolated cells, or cells in an isolated tissue or organ.
- the methods of the present invention can also be performed in vivo.
- the cells are cells in an organism, and the system of the present invention can be introduced into the cells in vivo by, for example, virus- or Agrobacterium-mediated methods.
- Cells that can be genome edited by the method of the present invention can be from prokaryotes or eukaryotes, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, such as rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis, etc.
- the invention provides a method of producing a genetically modified plant, comprising introducing a genome editing system of the invention into at least one said plant, thereby causing a modification in the genome of said at least one plant.
- the genome editing system can be introduced into the plant by various methods well known to those skilled in the art.
- Methods that can be used to introduce the genome editing system of the present invention into plants include, but are not limited to: the biolistic method, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube channel method and the ovary injection method.
- the modification of the target sequence can be achieved by simply introducing or producing the TraC effector protein or its functional variant or the fusion protein and the guide RNA in plant cells, and the modification can be stably inherited, so there is no need to stably transform plants with the genome editing system. This avoids potential off-target effects of a stably present genome editing system and avoids the integration of foreign nucleotide sequences into the plant genome, resulting in higher biosafety.
- the introduction is performed in the absence of selection pressure, thereby avoiding integration of exogenous nucleotide sequences into the plant genome.
- the introduction includes transforming the genome editing system of the invention into isolated plant cells or tissues, and then regenerating the transformed plant cells or tissues into intact plants.
- the regeneration is performed in the absence of selection pressure, that is, without the use of any selection agent against the selection gene carried on the expression vector during tissue culture. Not using a selection agent can improve plant regeneration efficiency and obtain herbicide-resistant plants that do not contain exogenous nucleotide sequences.
- the genome editing system of the present invention can be transformed into specific parts of an intact plant, such as leaves, shoot tips, pollen tubes, young ears, or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to regenerate in tissue culture.
- in vitro expressed proteins and/or in vitro transcribed RNA molecules are directly transformed into the plant.
- the protein and/or RNA molecules can achieve genome editing in plant cells and are subsequently degraded by the cells, avoiding the integration of exogenous nucleotide sequences in the plant genome.
- the method further includes treating (e.g., culturing) the plant cells, tissues, or intact plants into which the genome editing system has been introduced at an elevated temperature (relative to a conventional culture temperature, e.g., room temperature), the elevated temperature being, for example, 32°C.
- the plant is rice.
- genetic modification of plants using the methods of the present invention can result in plants whose genomes are free of exogenous polynucleotide integration, that is, non-transgenic (transgene-free) modified plants.
- the modification is associated with a plant trait, such as an agronomic trait, e.g., the modification results in the plant having altered (preferably improved) traits, e.g., agronomic traits, relative to a wild-type plant.
- the method further includes the step of screening plants for desired modifications and/or desired traits, such as agronomic traits.
- the method further includes obtaining progeny of the genetically modified plant.
- the genetically modified plant or its progeny has the desired modifications and/or desired traits such as agronomic traits.
- the present invention also provides a genetically modified plant or a progeny thereof or a part thereof, wherein said plant is obtained by the above-mentioned method of the present invention.
- the genetically modified plant or progeny thereof or parts thereof are non-transgenic.
- the genetically modified plant or its progeny has the desired genetic modification and/or the desired traits such as agronomic traits.
- the present invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the above-mentioned method of the present invention with a second plant that does not contain the modification, thereby introducing the modification into the second plant.
- the genetically modified first plant has desirable traits such as agronomic traits.
- the invention also encompasses the use of the genome editing system of the invention in disease treatment.
- the up-regulation, down-regulation, inactivation, activation or mutation correction of disease-related genes can be achieved, thereby achieving prevention and/or treatment of diseases.
- the genome modification described in the present invention can be located in the protein coding region of a disease-related gene, or in a gene expression regulatory region such as a promoter region or enhancer region, thereby modifying the function or the expression of the disease-related gene. Therefore, modification of disease-related genes described herein includes modifications to the disease-related genes themselves (such as protein coding regions), as well as modifications to their expression regulatory regions (such as promoters, enhancers, introns, etc.).
- a “disease-associated” gene refers to any gene that produces a transcription or translation product at abnormal levels or in an abnormal form in cells derived from disease-affected tissue as compared to non-disease control tissues or cells. Where altered expression is associated with the emergence and/or progression of a disease, it may be a gene that is expressed at an abnormally high level; it may be a gene that is expressed at an abnormally low level.
- Disease-associated genes also refer to genes that have one or more mutations or genetic variants that are directly responsible for or in linkage disequilibrium with one or more genes responsible for the etiology of the disease. The mutation or genetic variation is, for example, a single nucleotide variation (SNV).
- the invention also provides methods of treating a disease in a subject in need thereof, comprising delivering to said subject an effective amount of a genome editing system of the invention to modify a gene associated with said disease.
- the present invention also provides the use of a genome editing system for the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease.
- the present invention also provides a pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of the present invention, and optionally a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease.
- the "subject" of the present invention is a mammal, such as a human.
- the genome editing systems described herein are used to introduce point mutations into nucleic acids.
- the genome editing systems described herein are used for the correction of genetic defects, such as in the correction of point mutations that result in loss of function in a gene product.
- the genetic defect is associated with a disease or condition (e.g., a lysosomal storage disease or a metabolic disease such as type I diabetes).
- the methods provided herein can be used to introduce inactivating point mutations into a gene or allele encoding a gene product associated with a disease or disorder.
- the protocols described herein are intended for the treatment of patients with diseases associated with or caused by point mutations that can be corrected by the genome editing systems provided herein.
- the disease is a proliferative disease.
- the disease is a genetic disease.
- the disease is a de novo disease.
- the disease is a metabolic disease.
- the disease is a lysosomal storage disease.
- a mitochondrial disease or disorder refers to a disease caused by abnormal mitochondria, e.g., by mitochondrial gene mutations, enzymatic pathways, etc.
- such disorders include, but are not limited to: neurological disorders, loss of motor control, muscle weakness and pain, gastrointestinal disorders and difficulty swallowing, poor growth, heart disease, liver disease, diabetes, respiratory complications, epilepsy, vision/hearing problems, lactic acidosis, developmental delay, and susceptibility to infection.
- diseases described in the present invention include, but are not limited to, genetic diseases, circulatory system diseases, muscle diseases, brain, central nervous system and immune system diseases, Alzheimer's disease, secretase disorders, amyotrophic lateral sclerosis (ALS), autism, trinucleotide repeat expansion disorders, hearing disorders, gene-targeted therapy of non-dividing cells (neurons, muscles), liver and kidney diseases, epithelial cell and lung diseases, cancer, Usher syndrome or retinitis pigmentosa-39, cystic fibrosis, HIV and AIDS, beta thalassemia, sickle cell disease, herpes simplex virus, drug addiction, age-related macular degeneration, and schizophrenia.
- related diseases to which the genome editing system can be applied also include those listed in WO2015089465A1 (PCT/US2014/070135), WO2016205711A1 (PCT/US2016/038181), WO2018141835A1 (PCT/EP2018/052491), WO2020191234A1 (PCT/US2020/023713), WO2020191233A1 (PCT/US2020/023712), WO2019079347A1 (PCT/US2018/056146) and WO2021155065A1 (PCT/US2021/015580).
- Administration of the genome editing systems or pharmaceutical compositions of the invention can be tailored to the body weight and species of the patient or subject.
- the frequency of administration is within the purview of the medical or veterinary practitioner and depends on general factors including the patient or subject's age, sex, general health, other conditions, and the specific condition or symptom being addressed.
- kits for use in the methods of the invention are also provided, the kit comprising the genome editing system of the invention and instructions for use.
- Kits generally include labels indicating the intended use and/or method of use of the contents of the kit.
- the term label includes any written or recorded material on or provided with the kit or otherwise provided with the kit.
- the next step is to filter out candidate proteins that are of the same type as annotated proteins or have sequences similar to those of annotated proteins, through CRISPR type analysis and protein similarity analysis. After removing redundancy, 37 new proteins containing conserved domains were obtained (SEQ ID NO: 1-37). These proteins are defined as intermediates between transposons and CRISPR-Cas12 (TraC for short). Correspondingly, the CRISPR system using TraC as the effector protein is defined as the CRISPR-TraC system.
- FIG. 2 illustrates the prokaryotic expression of TraC-N483 protein.
- 483 represents the name of the new protein, and repeat is the CRISPR locus region.
- NC1 and NC2 are non-coding RNA regions where tracrRNA may exist.
- Example 2 Using a fluorescent reporter system to screen new CRISPR systems with DNA binding ability in prokaryotic cells
- the inventor used a fluorescent reporter system to screen the function of the new CRISPR system. This system can screen CRISPR systems with DNA double-strand binding ability.
- the specific experimental design is shown in Figure 3 and Figure 4:
- a plasmid with p15a as the backbone is used to express the Cas12 protein, the miniCRISPR (repeat-spacer-repeat) and the non-coding RNA sequence (ncRNA), and another plasmid with pBR322 as the backbone is used to express yellow fluorescent protein (YFP) (pUC-PAM-YFP); the 5' untranslated region of the YFP gene contains a target site complementary to the spacer sequence, with a random PAM library upstream of it.
- the sequence of the target site with its upstream random PAM library is: nnnnnnGTGATCGACAGCAACAAGTGAGCG or nnnnGTGATCGACAGCAACAAGTGAGCG, where nnnnnn and nnnn are PAM libraries of different lengths; the six-nucleotide library covers 4096 (4^6) sequences.
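As a non-limiting illustration of the size of the random PAM libraries described above, the following sketch enumerates every PAM of a given length and prepends it to the target site. The target sequence is the one given in the text; the function names are assumptions made for this example.

```python
from itertools import product

# Target site downstream of the random PAM, as given in the text.
TARGET = "GTGATCGACAGCAACAAGTGAGCG"


def pam_library(n):
    """Return all 4**n PAM sequences of length n."""
    return ["".join(p) for p in product("ACGT", repeat=n)]


def library_inserts(n, target=TARGET):
    """Return every PAM of length n fused to the 5' side of the target site."""
    return [pam + target for pam in pam_library(n)]


print(len(pam_library(6)), len(pam_library(4)))
print(library_inserts(4)[0])
```

A six-nucleotide random region gives 4^6 = 4096 library members and a four-nucleotide region gives 4^4 = 256.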
- FIG. 4 shows the screening results of dLbCas12a protein. Bacteria with extremely low YFP expression in the P2 region (B box) were sorted by flow cytometry.
- the above system can be used to screen the DNA double-stranded binding characteristics of candidate proteins of the new CRISPR system, and the inventors screened some representative candidate proteins.
- screening of TraC-N287, TraC-445, TraC-483, and TraC-655 all yielded T-rich PAMs.
- screening of TraC-N701 yielded a G-rich PAM, which implies that most of these proteins recognize PAMs that are T-rich or contain a small amount of G.
- the enriched PAMs are consistent with the previously reported finding that most Cas12 family proteins recognize T-rich PAMs.
- Example 3 Detailed determination of the PAMs of proteins with double-strand cleavage function by next-generation sequencing
- this system can be used to screen CRISPR systems with DNA double-strand cutting ability.
- the specific experimental design is as follows:
- the plasmid containing the PAM library and the plasmid expressing the protein were co-transfected (this is the treatment group).
- the protein expression vector with the crRNA expression cassette deleted was co-transfected with the plasmid of the PAM library to form a control group.
- PAMs that can be recognized and cleaved by the protein to be tested will be lost, resulting in a decrease in the proportion of targeted PAMs relative to the control group. Therefore, the PAM sequence of the protein to be tested can be obtained by comparing the depletion status of the two PAM libraries through next-generation sequencing.
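As a non-limiting illustration of the depletion comparison described above, the following sketch normalises PAM read counts within the treatment and control libraries and reports a log2 depletion score per PAM. The read counts, the pseudocount and the function name are hypothetical and chosen only for this example; they are not results from the described experiments.

```python
from math import log2


def pam_depletion(treated, control, pseudocount=1.0):
    """Return log2(control frequency / treated frequency) for each PAM.

    Positive scores indicate PAMs depleted in the treatment group.
    A pseudocount avoids division by zero for PAMs with no reads.
    """
    pams = set(treated) | set(control)
    treated_total = sum(treated.get(p, 0) + pseudocount for p in pams)
    control_total = sum(control.get(p, 0) + pseudocount for p in pams)
    scores = {}
    for p in pams:
        t = (treated.get(p, 0) + pseudocount) / treated_total
        c = (control.get(p, 0) + pseudocount) / control_total
        scores[p] = log2(c / t)
    return scores


# Hypothetical read counts, for illustration only.
control_counts = {"TTC": 1000, "GGC": 1000, "ACA": 1000}
treated_counts = {"TTC": 50, "GGC": 1100, "ACA": 1050}

scores = pam_depletion(treated_counts, control_counts)
for pam in sorted(scores, key=scores.get, reverse=True):
    print(pam, round(scores[pam], 2))
```

In this toy data set the PAM with the highest score is the one whose relative abundance falls most in the treatment group, which corresponds to a PAM that is recognized and cleaved.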
- Example 4 Using plasmid interference system to screen new CRISPR systems with DNA cutting ability in prokaryotic cells
- this example uses a plasmid interference system as a detection model.
- the specific experimental design is shown in Figure 7.
- the plasmid interference experimental system was used to verify the specific PAM information of the candidate protein with obvious PAM obtained in Example 3.
- the specific implementation process is as follows: taking the candidate protein TraC-459 as an example, Example 3 showed that the protein recognizes a typical 5'-TTC-3' PAM motif, the 3' side of which is adjacent to the GFP-T1 target site (SEQ ID NO: 79). Using the pUC-polyT-YFP vector as a template, a series of target vectors carrying PAM sequences potentially recognized by TraC-459 were constructed (pUC-TTC-YFP, pUC-GTC-YFP, pUC-TCC-YFP, pUC-TTG-YFP, pUC-TGC-YFP, pUC-CTTC-YFP, pUC-GTTC-YFP and pUC-TTTC-YFP), and the Y53-459 vector and each of the above target vectors were co-transformed into E. coli.
- TraC-875 protein has strong cleavage activity with the 5'-CTCGTG-3' PAM motif, and its detailed PAM sequence needs further exploration; TraC-297 protein can broadly and efficiently cleave target sequences with the 5'-GTTG-3', 5'-CTTG-3', 5'-TCTG-3', 5'-TTTA-3', and 5'-TTAG-3' PAM motifs; TraC-949 protein can cleave target sequences with the 5'-NTGA-3' PAM motif, with the highest cleavage efficiency for target sequences with the 5'-TTGA-3' PAM motif, while the cleavage efficiency for 5'-TTGA-3', 5'-ATGA-3', 5'-GTGA-3', and 5'-CTGA-3' PAM targets is relatively low.
- the results are shown in Figure 8B.
- the inventors predicted the secondary structure of the accessory RNA of the V-type CRISPR and TnpB systems and analyzed its structural folding model (Figure 9A).
- the study found that different protein subtypes can be divided into three categories according to the folding model.
- the folding model reflects the characteristics of the three types of CRISPR loci, namely the distance between the CRISPR protein and the tracrRNA, or the absence of tracrRNA.
- the classification results indicate that the TnpB protein may have experienced a transposon jump to the CRISPR site, or reRNA split into tracrRNA and CRISPR RNA.
- the diversity of accessory RNA types also supports the model of coevolution of effectors and accessory RNAs (see Figure 9B for the evolutionary model).
- Example 6 Editing activity of TraC protein using sgRNA as guide RNA
- the TraC-459 protein in the TraC system was selected to verify its editing activity on DNA.
- a predicted sgRNA (sgRNA-predicted, referred to as sgRNA-pre) targeting the VEGFA-T1 site in HEK293T cells was constructed by recombining tracrRNA and crRNA (see Figure 10A).
- truncating the tracrRNA-crRNA complementary region to a length of 11-15 bp, truncating the tracrRNA 5' region to a length of 19-21 bp, or using a spacer length of 22-27 bp gave better editing effects.
- the optimized sgRNA (sgRNA-optimal, referred to as sgRNA-opt) was obtained.
- This optimization strategy is called the second generation sgRNA optimization method of the TraC system (referred to as sgRNA-v2).
- Figure 11 shows that sgRNA-opt, as a guide RNA, can significantly improve the editing efficiency of TraC-459.
- Example 7 Editing activity of TraC protein using reRNA as guide RNA
- the coevolution model of Example 5 predicts that the TraC protein is an evolved descendant of TnpB. Because the TnpB system uses the 3' flanking sequence as a guide RNA for DNA cleavage (Karvelis, T. et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease. Nature 599, 692-696 (2021)), this example examines the use of TnpB guide RNA in the CRISPR system of the TraC protein.
- the inventors selected reRNAs of TnpB mutant proteins (882-TnpB-reRNA, 966-TnpB-reRNA) with similar structures as guide RNAs to be verified.
- the inventor aimed at the GFP-T1 target and fused the scaffold sequences of 882-TnpB-reRNA and 966-TnpB-reRNA with the targeting sequence of GFP-T1.
- the plasmid interference experiment in Example 4 was then used to analyze the dsDNA cutting ability of TraC-459 in E. coli under different guide RNAs.
- the experimental results are shown in Figure 12.
- Experimental results show that TraC-459 shows varying degrees of DNA interference activity under different types of guide RNA relative to the blank vector control (shown as pEmpty in Figure 12).
- TraC-459 has a dual-guide mechanism and has the targeted cleavage pathways of both the TnpB system and the CRISPR system. That is, the TraC effector protein can bind target DNA in a targeted manner both under the guidance of reRNA and under the guidance of sgRNA.
- the inventor constructed a dimer sequence of the TraC-459 protein and used the multimer v3 model of AlphaFold2 to predict its three-dimensional structure.
- the results showed that none of the five best predicted TraC-459 structures (Rank 1 to 5) displayed a dimer interaction (Figure 13).
- a predicted aligned error (PAE) heatmap provides a distance error for each pair of residues: it gives AlphaFold2's estimate of the positional error at residue x when the predicted and true structures are aligned at residue y. Values range from 0 to 35 Angstroms (white to black).
- This example obtained a series of optimized TraC-459 variants through arginine scanning mutagenesis, directed evolution, and artificial intelligence-assisted evolution.
- The screening process is shown in Figure 14a-c.
- Some of the selected TraC-459 mutants have higher editing efficiency.
- The experiment tested the editing efficiency of five mutants from the mutant library, and a total of three sets of parallel experiments were conducted (Table 2).
- The ratio of each mutant's editing efficiency to that of wild-type TraC-459 is greater than 1, indicating that the mutant has higher editing efficiency.
- Mutants with improved editing efficiency screened according to this method are shown in Table 3.
- A representative five-arginine mutant obtained through arginine scanning mutagenesis is TraC-5M-7 (S137R, P148R, D150R, K315R and A369R), shown as the TraC-5M-7 mutant in Figure 14b.
- The results show that the editing efficiency of TraC-5M-7 at the VEGFA-T1 site is 24.02-fold higher than that of the original TraC-459.
- The TraC mutants with improved editing efficiency designed according to this method are shown in Table 3.
- The inventors trained a deep learning model on data from a series of TraC variants and obtained 7 representative mutants, TraC-B22, -B24, -B26, -B32, -B34, -B35, and -B36, which have enhanced editing activity in human cells (the editing activity is shown in Figure 14d, and the mutation sites are shown in Table 4).
- TraC-459 is a highly compact monomeric Cas12-like protein.
- The inventors found that it is the smallest monomeric CRISPR effector protein currently known, and that it has a unique sgRNA and reRNA dual-guide mechanism and dual-pairing function that does not exist in other Cas12 subtypes.
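The substitution labels used for the arginine-scanning mutants above (for example S137R, P148R, D150R, K315R and A369R for TraC-5M-7) follow the usual "wild-type residue, position, new residue" convention. The following is a minimal Python sketch of how such labels could be parsed and applied to a reference sequence. It assumes 1-based positions relative to the reference protein; the helper names `parse_substitution` and `apply_substitutions` and the toy sequence are hypothetical illustrations, and the actual TraC-459 sequence is not reproduced here.

```python
import re
from typing import Iterable, Tuple

# One-letter wild-type residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'S137R' into ('S', 137, 'R')."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference: str, labels: Iterable[str]) -> str:
    """Return a copy of `reference` with every substitution applied.

    Raises ValueError if a position is out of range or if the residue
    found in the reference does not match the stated wild-type residue.
    """
    residues = list(reference)
    for label in labels:
        wild_type, position, new = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: reference has {residues[position - 1]} at {position}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy sequence for illustration only (hypothetical, not a TraC sequence).
toy_reference = "MSKPD"
print(apply_substitutions(toy_reference, ["S2R", "P4R"]))  # MRKRD
```

Checking the stated wild-type residue before substituting guards against numbering a mutation against the wrong reference sequence.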
Landscapes
- Genetics & Genomics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Chemical & Material Sciences (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Plant Pathology (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Crystallography & Structural Chemistry (AREA)
- Mycology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
Claims (55)
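- An engineered clustered regularly interspaced short palindromic repeats (CRISPR) system, comprising: a) a transposon and CRISPR-Cas12 intermediate (TraC) effector protein, or one or more nucleotide sequences encoding the effector protein; and b) one or more guide RNAs, or nucleotide sequences encoding the one or more guide RNAs, wherein the guide RNA is selected from i) a guide RNA derived from a transposon right-end element (reRNA) and/or ii) a guide RNA comprising tracrRNA and/or crRNA, for example a single guide RNA (sgRNA) comprising tracrRNA and crRNA; the TraC effector protein is capable of forming a CRISPR complex with the guide RNA; and the TraC effector protein is capable of targeted binding to a target DNA sequence both under the guidance of the guide RNA derived from the transposon right-end element and under the guidance of the guide RNA comprising tracrRNA and/or crRNA.
- The engineered CRISPR system according to claim 1, wherein the tracrRNA contains a non-target strand binding sequence (NTB) that pairs complementarily with the non-target strand (NTS).
- An engineered clustered regularly interspaced short palindromic repeats (CRISPR) vector system comprising one or more constructs, comprising: a) a first regulatory element operably linked to a nucleotide sequence encoding a transposon and CRISPR-Cas12 intermediate (TraC) effector protein; and b) a second regulatory element operably linked to one or more nucleotide sequences encoding one or more guide RNAs, the guide RNA being selected from i) a guide RNA derived from a transposon right-end element (reRNA) and/or ii) a guide RNA comprising tracrRNA and/or crRNA, for example a single guide RNA (sgRNA) comprising tracrRNA and crRNA; the TraC effector protein is capable of forming a CRISPR complex with the guide RNA; and the TraC effector protein is capable of targeted binding to a target DNA sequence both under the guidance of the guide RNA derived from the transposon right-end element and under the guidance of the guide RNA comprising tracrRNA and/or crRNA.
- The engineered CRISPR vector system according to claim 3, wherein the tracrRNA contains a non-target strand binding sequence (NTB) that pairs complementarily with the non-target strand (NTS).
- The system according to claim 2 or 4, wherein the guide RNA is a guide RNA comprising tracrRNA and crRNA, wherein the tracrRNA contains a non-target strand binding sequence (NTB) that pairs complementarily with the non-target strand (NTS), and wherein the guide RNA hybridizes to the target strand (TS) of the target DNA sequence through the crRNA and hybridizes to the non-target strand (NTS) through the NTB.
- The system according to claim 4, wherein, when transcribed, the one or more guide RNAs hybridize to the target DNA and the guide RNA forms a complex with the TraC effector protein, the complex causing distal cleavage of the target DNA sequence.
- The system according to any one of claims 1-6, wherein the target DNA sequence is within a cell, preferably within a eukaryotic cell.
- The system according to any one of claims 1-7, wherein the effector protein comprises one or more nuclear localization sequences (NLS), cytoplasmic localization sequences, chloroplast localization sequences, or mitochondrial localization sequences.
- The system according to any one of claims 1-8, wherein the nucleic acid sequences encoding the effector protein are codon-optimized for expression in eukaryotic cells.
- The system according to any one of claims 1-9, wherein components a) and b), or their nucleotide sequences, are constructed on the same vector or on different vectors.
- A method of modifying a DNA sequence of interest, the method comprising delivering the system according to any one of claims 1-10 to the DNA sequence of interest or to a cell containing the DNA sequence of interest.
- A method of modifying a DNA sequence of interest, the method comprising delivering a composition of a TraC effector protein and one or more nucleic acid components to the DNA sequence of interest, wherein the effector protein is capable of targeted binding to a target DNA sequence both under the guidance of a guide RNA derived from a transposon right-end element and under the guidance of a guide RNA comprising tracrRNA and crRNA; the effector protein forms a CRISPR complex with the one or more nucleic acid components; and, upon targeted binding of the complex to the DNA sequence of interest that is 3' of a protospacer adjacent motif (PAM), the effector protein induces modification of the DNA sequence of interest.
- The method according to claim 12, wherein the gene of interest is within a cell, preferably a eukaryotic cell.
- The method according to claim 13, wherein the cell is an animal cell or a human cell.
- The method according to claim 13, wherein the cell is a plant cell.
- The method according to claim 12, wherein the effector protein comprises one or more nuclear localization sequences (NLS), cytoplasmic localization sequences, chloroplast localization sequences, or mitochondrial localization sequences.
- The method according to claim 12, wherein the effector protein and the nucleic acid components, or constructs expressing the effector protein and the nucleic acid components, are contained in a delivery system.
- The method according to claim 17, wherein the delivery system comprises a virus, a virus-like particle, a virosome, a liposome, a vesicle, an exosome, a lipid nanoparticle (LNP), N-acetylgalactosamine (GalNAc), or an engineered bacterium.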
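- A transposon and CRISPR-Cas12 intermediate (TraC) effector protein, or a functional variant thereof, for genome editing in an organism or in a cell of an organism, wherein the TraC effector protein or functional variant thereof is capable of forming a CRISPR complex with a guide RNA; the guide RNA is selected from i) a guide RNA derived from a transposon right-end element (reRNA) and/or ii) a guide RNA comprising tracrRNA and crRNA, for example a single guide RNA (sgRNA) comprising tracrRNA and crRNA; and the TraC effector protein or functional variant thereof is capable of targeted binding to a target DNA sequence both under the guidance of the guide RNA derived from the transposon right-end element and under the guidance of the guide RNA comprising tracrRNA and crRNA.
- A transposon and CRISPR-Cas12 intermediate (TraC) effector protein, or a functional variant thereof, for genome editing in an organism or in a cell of an organism, wherein the TraC effector protein or functional variant thereof (i) comprises at least one, at least two, or all three amino acid sequence motifs selected from "TSxxCxxCx", "GIDRG" and "CxxCGxxxxADxxAA", wherein x represents any amino acid, for example any naturally encoded amino acid; and (ii) comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity to one of SEQ ID NO:1-37, or comprises an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to SEQ ID NO:1-37.
- The TraC effector protein or functional variant thereof of claim 20, wherein the functional variant of the effector protein is derived from SEQ ID NO:25 and comprises, relative to the sequence of SEQ ID NO:25, one or more amino acid substitutions selected from K78R, D86R, S137R, V145R, I147R, P148R, D150R, V228R, V254R, A510R, A278R, K315R, S334R, L343R, A369R, H392R, L394R, S408R, N456R, V500R, A510R and T573R.
- The TraC effector protein or functional variant thereof of claim 20 or 21, wherein the functional variant of the effector protein is derived from SEQ ID NO:25 and comprises, relative to the sequence of SEQ ID NO:25, any one set of mutations selected from those shown in Table 3 or Table 4.
- The TraC effector protein or functional variant thereof of claim 20, wherein the functional variant of the effector protein comprises an amino acid sequence selected from SEQ ID NO:80-87.
- The TraC effector protein or functional variant thereof of any one of claims 20-23, which has at least guide RNA-mediated sequence-specific targeting ability.
- The TraC effector protein or functional variant thereof of any one of claims 20-23, which has guide RNA-mediated sequence-specific targeting ability and double-stranded nucleic acid cleavage activity.
- The TraC effector protein or functional variant thereof of any one of claims 20-23, which has guide RNA-mediated sequence-specific targeting ability and nickase activity.
- The TraC effector protein or functional variant thereof of any one of claims 20-23, which has guide RNA-mediated sequence-specific targeting ability but does not have double-stranded nucleic acid cleavage activity and/or nickase activity.
- The TraC effector protein or functional variant thereof of any one of claims 24-27, wherein the guide RNA is selected from i) a guide RNA derived from a transposon right-end element (reRNA) and/or ii) a guide RNA comprising tracrRNA and/or crRNA, for example a single guide RNA (sgRNA) comprising tracrRNA and crRNA.
- The TraC effector protein or functional variant thereof of claim 28, wherein the TraC effector protein or functional variant thereof is capable of targeted binding to a target DNA sequence both under the guidance of the guide RNA derived from the transposon right-end element and under the guidance of the guide RNA comprising tracrRNA and crRNA.
- The TraC effector protein or functional variant thereof of claim 28, wherein the guide RNA is a reRNA derived from a TnpB system, for example, the reRNA comprises the scaffold sequence shown in SEQ ID NO:77 or 78.
- The TraC effector protein or functional variant thereof of claim 28, wherein the guide RNA is a single guide RNA (sgRNA) of tracrRNA and crRNA, for example, the sgRNA comprises the scaffold sequence shown in SEQ ID NO:75 or 76.
- The TraC effector protein or functional variant thereof of any one of claims 19-31, further comprising at least one nuclear localization sequence (NLS), cytoplasmic localization sequence, chloroplast localization sequence, or mitochondrial localization sequence.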
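- A fusion protein comprising the TraC effector protein or functional variant thereof of any one of claims 19-32 and at least one other functional protein.
- The fusion protein of claim 33, wherein the other functional protein is a deaminase.
- The fusion protein of claim 34, wherein the deaminase is a cytosine deaminase, for example, the cytosine deaminase is selected from APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, CDA1, human APOBEC3A deaminase, double-stranded DNA deaminase (Ddd), single-stranded DNA deaminase (Sdd), or functional variants thereof.
- The fusion protein of claim 35, further comprising a uracil DNA glycosylase inhibitor (UGI).
- The fusion protein of claim 34, wherein the deaminase is an adenine deaminase, for example, a DNA-dependent adenine deaminase derived from the E. coli tRNA adenine deaminase TadA (ecTadA).
- The fusion protein of any one of claims 34-37, wherein the fusion protein comprises a cytosine deaminase and an adenine deaminase.
- The fusion protein of claim 33, wherein the other functional protein is selected from a transcriptional activator protein, a transcriptional repressor protein, a DNA methylase, a DNA demethylase, and a reverse transcriptase.
- The fusion protein of any one of claims 33-39, wherein the different parts of the fusion protein may each independently be connected through a linker or directly.
- The fusion protein of any one of claims 33-40, further comprising at least one nuclear localization sequence (NLS), cytoplasmic localization sequence, chloroplast localization sequence, or mitochondrial localization sequence.
- Use of the TraC effector protein or functional variant thereof of any one of claims 19-32, or of the fusion protein of any one of claims 33-41, for genome editing of a cell, preferably a eukaryotic cell, more preferably a plant cell.
- The use of claim 42, wherein the genome editing includes base editing (Base Editor), prime editing (Prime Editor), and PrimeRoot editing (PrimeRoot Editor).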
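- A genome editing system for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising: the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41; and/or an expression construct comprising a nucleotide sequence encoding the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41.
- The genome editing system of claim 44, further comprising at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
- The genome editing system of claim 45, wherein the genome editing system comprises any one selected from the following: i) the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41, and the at least one guide RNA, wherein, optionally, the TraC effector protein or functional variant thereof or the fusion protein forms a complex with the at least one guide RNA; ii) an expression construct comprising a nucleotide sequence encoding the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41, and the at least one guide RNA; iii) the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA; iv) an expression construct comprising a nucleotide sequence encoding the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA; v) an expression construct comprising both a nucleotide sequence encoding the TraC effector protein or functional variant thereof of any one of claims 19-32 or the fusion protein of any one of claims 33-41 and a nucleotide sequence encoding the at least one guide RNA.
- The genome editing system of any one of claims 45-46, wherein the guide RNA is selected from i) a guide RNA derived from a transposon right-end element (reRNA) and/or ii) a guide RNA comprising tracrRNA and/or crRNA, for example a single guide RNA (sgRNA) comprising tracrRNA and crRNA.
- The genome editing system of claim 47, wherein the guide RNA is a reRNA derived from a TnpB system, for example, the reRNA comprises the scaffold sequence shown in SEQ ID NO:77 or 78.
- The genome editing system of any one of claims 45-46, wherein the guide RNA is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, for example, the sgRNA comprises the scaffold sequence shown in SEQ ID NO:75 or 76.
- The genome editing system of claim 47 or 49, wherein the guide RNA comprises tracrRNA and crRNA, for example is a single guide RNA (sgRNA) comprising tracrRNA and crRNA, wherein the crRNA comprises a sequence identical to the target sequence immediately adjacent to the PAM, and the tracrRNA comprises a sequence complementary to a sequence located distal to the PAM in the direction of the target sequence (the non-target strand binding sequence, NTB).
- The genome editing system of any one of claims 44-50, wherein the genome editing system further comprises a donor nucleic acid molecule, the donor nucleic acid molecule comprising a nucleotide sequence to be site-specifically inserted into the genome, for example, the nucleotide sequence to be site-specifically inserted into the genome is flanked on both sides by sequences homologous to the sequences flanking the target sequence in the genome.
- The genome editing system of any one of claims 44-51, wherein the nucleotide sequence encoding the TraC effector protein or functional variant thereof or the fusion protein, and/or the nucleotide sequence encoding the at least one guide RNA, is operably linked to an expression regulatory element such as a promoter.
- The genome editing system of any one of claims 44-52, wherein the components of the genome editing system are contained in a delivery system, the delivery system being selected from a virus, a virus-like particle, a virosome, a liposome, a vesicle, an exosome, a lipid nanoparticle (LNP), N-acetylgalactosamine (GalNAc), or an engineered bacterium.
- A method of producing a genetically modified cell, comprising introducing the genome editing system of any one of claims 44-53 into the cell.
- The method of claim 54, wherein the cell is from a prokaryote or a eukaryote, preferably from a mammal such as human, mouse, rat, monkey, dog, pig, sheep, cattle or cat; poultry such as chicken, duck or goose; or a plant, including monocotyledonous and dicotyledonous plants, for example rice, maize, wheat, sorghum, barley, soybean, peanut or Arabidopsis thaliana.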
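The claims above recite three amino acid sequence motifs, "TSxxCxxCx", "GIDRG" and "CxxCGxxxxADxxAA", in which x stands for any amino acid. As an illustration only, the following minimal Python sketch shows one way a candidate protein sequence could be scanned for these motifs by turning each x into a single-residue wildcard. The function names `motif_to_regex` and `find_motifs` and the toy sequence are hypothetical; treating x as any uppercase one-letter code is an assumption, and the sketch does not address the sequence-identity condition.

```python
import re
from typing import Dict

# Motifs as written in the claims; lowercase x is a wildcard position.
MOTIFS = ("TSxxCxxCx", "GIDRG", "CxxCGxxxxADxxAA")


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif, mapping each 'x' to one arbitrary residue letter."""
    parts = ["[A-Z]" if char == "x" else re.escape(char) for char in motif]
    return re.compile("".join(parts))


def find_motifs(sequence: str) -> Dict[str, int]:
    """Map each motif found to the 1-based start of its first occurrence."""
    seq = sequence.upper()
    hits: Dict[str, int] = {}
    for motif in MOTIFS:
        match = motif_to_regex(motif).search(seq)
        if match is not None:
            hits[motif] = match.start() + 1
    return hits


# Made-up toy sequence for illustration (not a real TraC protein).
toy = "MA" + "TSAACKKCL" + "GG" + "GIDRG" + "PP" + "CAACGLLLLADVVAA" + "K"
hits = find_motifs(toy)
print(hits)
print(f"{len(hits)} of {len(MOTIFS)} motifs present")
```

The number of keys in the returned dictionary gives the "at least one, at least two, or all three" count directly.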
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23815286.2A EP4534683A1 (en) | 2022-06-01 | 2023-06-01 | Novel crispr gene editing system |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210620277.9 | 2022-06-01 | ||
| CN202210620277 | 2022-06-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023232109A1 true WO2023232109A1 (zh) | 2023-12-07 |
Family
ID=88991305
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/097783 Ceased WO2023232109A1 (zh) | 2022-06-01 | 2023-06-01 | 新的crispr基因编辑系统 |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP4534683A1 (zh) |
| CN (1) | CN117187213A (zh) |
| AR (1) | AR129505A1 (zh) |
| WO (1) | WO2023232109A1 (zh) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP4399305A4 (en) * | 2021-09-08 | 2025-10-01 | Metagenomi Inc | CLASS II, TYPE V CRISPR SYSTEMS |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119752847B (zh) * | 2024-12-26 | 2025-12-05 | 崖州湾国家实验室 | 一种高温耐受型的TnpB核酸酶在核酸检测及基因编辑中的应用 |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US2371320A (en) | 1945-03-13 | Temt office | ||
| WO2015089465A1 (en) | 2013-12-12 | 2015-06-18 | The Broad Institute Inc. | Delivery, use and therapeutic applications of the crispr-cas systems and compositions for hbv and viral diseases and disorders |
| WO2016205711A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
| WO2018141835A1 (en) | 2017-02-03 | 2018-08-09 | The Broad Institute, Inc. | Compounds, compositions and methods for cancer treatment |
| WO2019079347A1 (en) | 2017-10-16 | 2019-04-25 | The Broad Institute, Inc. | USES OF BASIC EDITORS ADENOSINE |
| US20200023712A1 (en) | 2018-07-17 | 2020-01-23 | Eberspächer Climate Control Systems GmbH & Co. KG | Vehicle heater |
| WO2020191233A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
| WO2021031085A1 (zh) * | 2019-08-19 | 2021-02-25 | 南方医科大学 | 一种高保真CRISPR/AsCpf1突变体的构建及其应用 |
| US20210115421A1 (en) * | 2019-10-17 | 2021-04-22 | Pairwise Plants Services, Inc. | Variants of cas12a nucleases and methods of making and use thereof |
| WO2021155065A1 (en) | 2020-01-28 | 2021-08-05 | The Broad Institute, Inc. | Base editors, compositions, and methods for modifying the mitochondrial genome |
| CN113373130A (zh) * | 2021-05-31 | 2021-09-10 | 复旦大学 | Cas12蛋白、含有Cas12蛋白的基因编辑系统及应用 |
| CN114075559A (zh) * | 2020-09-14 | 2022-02-22 | 珠海舒桐医疗科技有限公司 | 一种2型CRISPR/Cas9基因编辑系统及其应用 |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA3093580A1 (en) * | 2018-03-14 | 2019-09-19 | Arbor Biotechnologies, Inc. | Novel crispr dna and rna targeting enzymes and systems |
| US11384344B2 (en) * | 2018-12-17 | 2022-07-12 | The Broad Institute, Inc. | CRISPR-associated transposase systems and methods of use thereof |
| CN110029194A (zh) * | 2019-04-24 | 2019-07-19 | 安邦(厦门)生物科技有限公司 | 基于CRISPR-Cas基因编辑技术的连续荧光监测检测方法及装置 |
2023
- 2023-06-01 EP EP23815286.2A patent/EP4534683A1/en active Pending
- 2023-06-01 CN CN202310646033.2A patent/CN117187213A/zh active Pending
- 2023-06-01 WO PCT/CN2023/097783 patent/WO2023232109A1/zh not_active Ceased
- 2023-06-01 AR ARP230101397A patent/AR129505A1/es unknown
Patent Citations (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US2371320A (en) | 1945-03-13 | Temt office | ||
| WO2015089465A1 (en) | 2013-12-12 | 2015-06-18 | The Broad Institute Inc. | Delivery, use and therapeutic applications of the crispr-cas systems and compositions for hbv and viral diseases and disorders |
| WO2016205711A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
| WO2018141835A1 (en) | 2017-02-03 | 2018-08-09 | The Broad Institute, Inc. | Compounds, compositions and methods for cancer treatment |
| WO2019079347A1 (en) | 2017-10-16 | 2019-04-25 | The Broad Institute, Inc. | USES OF BASIC EDITORS ADENOSINE |
| US20200023712A1 (en) | 2018-07-17 | 2020-01-23 | Eberspächer Climate Control Systems GmbH & Co. KG | Vehicle heater |
| WO2020191233A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
| WO2020191234A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
| WO2021031085A1 (zh) * | 2019-08-19 | 2021-02-25 | 南方医科大学 | 一种高保真CRISPR/AsCpf1突变体的构建及其应用 |
| US20210115421A1 (en) * | 2019-10-17 | 2021-04-22 | Pairwise Plants Services, Inc. | Variants of cas12a nucleases and methods of making and use thereof |
| WO2021155065A1 (en) | 2020-01-28 | 2021-08-05 | The Broad Institute, Inc. | Base editors, compositions, and methods for modifying the mitochondrial genome |
| CN114075559A (zh) * | 2020-09-14 | 2022-02-22 | 珠海舒桐医疗科技有限公司 | 一种2型CRISPR/Cas9基因编辑系统及其应用 |
| CN113373130A (zh) * | 2021-05-31 | 2021-09-10 | 复旦大学 | Cas12蛋白、含有Cas12蛋白的基因编辑系统及应用 |
Non-Patent Citations (15)
| Title |
|---|
| "Biocomputing: Informatics and Genome Projects", 1993, ACADEMIC PRESS |
| "Computer Analysis of Sequence Data", 1994, HUMANA PRESS |
| "Sequence Analysis Primer", 1991, M STOCKTON PRESS |
| ANZALONE, A.VRANDOLPH, PB.DAVIS, J.R. ET AL.: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, vol. 576, 2019, pages 149 - 157, XP038006531, DOI: 10.1038/s41586-019-1711-4 |
| CARRILLO, H.LIPMAN, D., SIAM J APPLIED MATH, vol. 48, 1988, pages 1073 |
| GAO ET AL., JIPB, vol. 56, April 2014 (2014-04-01), pages 343 - 349 |
| KARVELIS TAUTVYDAS; DRUTEIKA GYTIS; BIGELYTE GRETA; BUDRE KAROLINA; ZEDAVEINYTE RIMANTE; SILANSKAS ARUNAS; KAZLAUSKAS DARIUS; VENC: "Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease", NATURE, vol. 599, no. 7886, 7 October 2021 (2021-10-07), pages 692 - 696, XP037627757, DOI: 10.1038/s41586-021-04058-1 * |
| KARVELIS, T. ET AL.: "Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease", NATURE, vol. 599, 2021, pages 692 - 696, XP037627757, DOI: 10.1038/s41586-021-04058-1 |
| MA WANG; XU YING-SHUANG; SUN XIAO-MAN; HUANG HE: "Transposon-Associated CRISPR-Cas System: A Powerful DNA Insertion Tool", TRENDS IN MICROBIOLOGY, ELSEVIER SCIENCE LTD., KIDLINGTON., GB, vol. 29, no. 7, 18 February 2021 (2021-02-18), GB , pages 565 - 568, XP086604210, ISSN: 0966-842X, DOI: 10.1016/j.tim.2021.01.017 * |
| NAKAMURA, Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292 |
| SAMBROOK, J.FRITSCH, E.F.MANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS |
| SUN, C.LEI, YLI, B. ET AL.: "Precise integration of large DNA sequences in plant genomes using PrimeRoot editors", NAT BIOTECHNOL, 2023 |
| VON HEINJE, G.: "Sequence Analysis in Molecular Biology", 1987, THE BENJAMIN/CUMMINGS PUB. CO., pages: 224 |
| XIE ET AL., PNAS, vol. 112, no. 11, 17 March 2015 (2015-03-17), pages 3570 - 3575 |
| ZHANG MENG-SI, ZHU DE-KANG, WANG MING-SHU: "Transposases in Bacterial Insertion Sequences and Their Transposition Mechanisms", CHINESE JOURNAL OF BIOCHEMISTRY AND MOLECULAR BIOLOGY, vol. 34, no. 10, 1 October 2018 (2018-10-01), pages 1057 - 1064, XP093115635, DOI: 10.13865/j.cnki.cjbmb.2018.10.06 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4534683A1 (en) | 2025-04-09 |
| CN117187213A (zh) | 2023-12-08 |
| AR129505A1 (es) | 2024-09-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Chen et al. | Prime editing for precise and highly versatile genome manipulation | |
| CN111742051A (zh) | 延伸的单向导rna及其用途 | |
| WO2021032155A1 (zh) | 一种碱基编辑系统和其使用方法 | |
| CN111836894A (zh) | 使用CRISPR/Cpf1系统的基因组编辑组合物及其用途 | |
| EP4491723A1 (en) | Cytosine deaminase and use thereof in base editing | |
| CN107922949A (zh) | 用于通过同源重组的基于crispr/cas的基因组编辑的化合物和方法 | |
| US20210095271A1 (en) | System and method for genome editing | |
| Broothaerts et al. | New genomic techniques: State-of-the-art review | |
| CN115427564B (zh) | 改进的胞嘧啶碱基编辑系统 | |
| WO2023232109A1 (zh) | 新的crispr基因编辑系统 | |
| WO2023169454A1 (zh) | 腺嘌呤脱氨酶及其在碱基编辑中的用途 | |
| WO2024235293A1 (zh) | 基于环状rna的引导编辑系统 | |
| Song et al. | Generation of new β-conglycinin-deficient soybean lines by editing the lincRNA lincCG1 using the CRISPR/Cas9 system | |
| WO2020087631A1 (zh) | 基于C2c1核酸酶的基因组编辑系统和方法 | |
| JP2024501892A (ja) | 新規の核酸誘導型ヌクレアーゼ | |
| US20250043313A1 (en) | Cas9 variants with improved specificity | |
| WO2023165613A1 (zh) | 5'→3'核酸外切酶在基因编辑系统中的用途和基因编辑系统及其编辑方法 | |
| CN119998447A (zh) | 新型腺嘌呤脱氨酶变体及使用其进行碱基编辑的方法 | |
| CN117126827A (zh) | 一种融合蛋白及含有由尿嘧啶-n-糖基化酶突变体介导的碱基编辑系统和应用 | |
| WO2024230760A1 (zh) | 一种可作用于dna的腺苷脱氨酶及其应用 | |
| CN119591727B (zh) | 优化的qbe碱基编辑系统及其应用 | |
| CN120026006A (zh) | 一种基于TraC效应蛋白的碱基编辑系统及其应用 | |
| WO2025051243A1 (zh) | 一种模块化的基因编辑工具及其应用 | |
| WO2025085787A1 (en) | Engineered components of crispr and crispr-associated transposons systems | |
| US20230407278A1 (en) | Compositions and methods for cas9 molecules with improved gene editing properties |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23815286; Country of ref document: EP; Kind code of ref document: A1 |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112024025106; Country of ref document: BR |
| | WWE | WIPO information: entry into national phase | Ref document number: 2023815286; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023815286; Country of ref document: EP; Effective date: 20250102 |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01E; Ref document number: 112024025106; Country of ref document: BR; Free format text: Submit new sheets of the specification and abstract adapted to Arts. 26 and 40 of Ordinance No. 14/2024, since the content submitted in petition No. 870250000342 does not conform to the standard: the documents must begin with the centered title without the use of additional words (RELATORIO DESCRITIVO DE?, PATENTE DE INVENCAO...). The requirement must be answered within 60 (sixty) days of its publication and must be made by means of the GRU petition, service code 207. |
| | WWP | WIPO information: published in national office | Ref document number: 2023815286; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 112024025106; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20241129 |