WO2025077734A1 - Constructs and methods for preparing circular rnas and uses thereof - Google Patents
Constructs and methods for preparing circular RNAs and uses thereof
- Publication number
- WO2025077734A1 (PCT/CN2024/123677)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- rna
- intron
- fragment
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/67—General methods for enhancing the expression
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/0008—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
- A61K48/0025—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid
- A61K48/0041—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid the non-active part being polymeric
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/42—Vector systems having a special element relevant for transcription being an intron or intervening sequence for splicing and/or stability of RNA
Definitions
- The present invention relates to the field of molecular biology, in particular to constructs and methods for preparing circular RNAs and uses of the circular RNAs in, for example, expressing a protein of interest in a eukaryotic cell or functioning as noncoding RNA.
- Circular RNAs (circRNAs) are a category of RNA molecules formed by head-to-tail ligation, which have been shown in recent years to have multiple biological functions (Yang et al., Cell Research, 27(5): 626-641 (2017); Abe et al., Scientific Reports, 5: 16435 (2015); Gao et al., Nature Cell Biology, 23(3): 278-291 (2021); Pamudurti et al., Molecular Cell, 66(1): 9-21 (2017)). Compared with linear RNAs, circRNAs have better stability and therefore provide a promising new platform for RNA drugs.
- compositions, methods and systems provided herein address this need and provide related advantages.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- the circRNA consists of the target sequence.
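As an illustration only, the element order described above can be sketched in Python. The function names and the placeholder sequences are hypothetical and are not taken from the disclosure; the sketch only shows how the permuted arrangement places the two target sequence fragments, and why the spliced product can be read as one continuous circle.

```python
def build_precursor(intron_3p: str, target_3p: str, target_5p: str, intron_5p: str) -> str:
    """Concatenate the elements 5'->3': 3' intron fragment, 3' target
    sequence fragment, 5' target sequence fragment, 5' intron fragment."""
    return intron_3p + target_3p + target_5p + intron_5p


def circular_product(target_3p: str, target_5p: str) -> str:
    """Return one linear reading of the circle, starting at the 5' target
    fragment, so that the new junction (3'-end of the 5' fragment joined to
    the 5'-end of the 3' fragment) lies inside the string."""
    return target_5p + target_3p


def same_circle(a: str, b: str) -> bool:
    """Two strings describe the same circle if one is a rotation of the other."""
    return len(a) == len(b) and a in (b + b)


if __name__ == "__main__":
    # Placeholder sequences (assumptions, for illustration only).
    precursor = build_precursor("ii3", "GGC", "AU", "ii5")
    circle = circular_product("GGC", "AU")
    print(precursor, circle, same_circle(circle, "GGCAU"))
```

Because a circle has no fixed start, `same_circle` compares rotations rather than raw strings; this is a convenience of the sketch, not a feature of the constructs.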
- the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1.
- Z1 encodes a therapeutic product.
- Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
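The identity thresholds recited above can be illustrated with a short Python sketch. It assumes a simple position-by-position comparison of two equal-length sequences with no gaps; this text does not specify how identity is computed, and practical comparisons normally rely on a sequence alignment tool. The function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (assumption:
    no alignment step, positions are compared one to one)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """True if the ungapped identity is at least `threshold` percent."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    reference = "AUGGCUAGCUAGCUAGCUAG"  # 20-nt placeholder
    candidate = "AUGGCUAGCUAGCUAGCUAC"  # one substitution at the last position
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 98.0))
```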
- the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
- TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
- TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof. In some embodiments, TI is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
- the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI. In some embodiments, the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
- the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1. In some embodiments, the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
- the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TI B ) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TI A ) .
- the 3’ target sequence further comprises two linkers (L) flanking Z1.
- the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1 B ) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1 A ) .
- the 3’ target sequence further comprises two linkers (L) flanking TI.
- RNAs provided herein have a structure selected from the group consisting of Formulae (I) - (IV) :
- 3’ IF is the 3’ intron fragment
- 5’ IF is the 5’ intron fragment
- TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TI A ) and a 3’ fragment (TI B )
- Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1 A ) and a 3’ fragment (Z1 B )
- the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in a target sequence. In some embodiments, the complementarily paired regions are located at one or both ends of the target sequence.
- RNAs provided herein comprise a structure selected from the group consisting of Formulae (1)-(12):
- TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
- RNAs provided herein comprise a structure selected from the group consisting of Formulae (1)-(8):
- TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D3L is D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
- the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, and 205.
- the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, and 209.
- the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, and 210.
- the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, and 206.
- the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, and 207.
- RNAs provided herein further comprise a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- the 5’ and the 3’ homology arms have up to 10% base mismatches.
- (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
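The length and mismatch limits stated for the homology arms can be checked with a small Python sketch. Several points are assumptions made for illustration: the two arms are taken to be of equal length, a mismatch is counted when a position of the 5’ arm differs from the reverse complement of the 3’ arm (Watson-Crick pairs only), and both arms are required to fall in the 15 to 60 nucleotide range, although the text also allows only one of them to do so. The names and example sequence are hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string containing only A, C, G, U."""
    return "".join(PAIR[base] for base in reversed(rna.upper()))


def arm_mismatch_fraction(arm5: str, arm3: str) -> float:
    """Fraction of positions at which the 5' arm fails to pair with the
    3' arm when the two arms are aligned antiparallel."""
    if not arm5 or len(arm5) != len(arm3):
        raise ValueError("arms must be non-empty and of equal length")
    partner = reverse_complement(arm3)
    mismatches = sum(x != y for x, y in zip(arm5.upper(), partner))
    return mismatches / len(arm5)


def arms_ok(arm5: str, arm3: str, min_len: int = 15, max_len: int = 60,
            max_mismatch: float = 0.10) -> bool:
    """Check both the length window and the mismatch ceiling."""
    lengths_ok = all(min_len <= len(arm) <= max_len for arm in (arm5, arm3))
    return lengths_ok and arm_mismatch_fraction(arm5, arm3) <= max_mismatch


if __name__ == "__main__":
    arm5 = "GGCAUCGAUCGGAUCCAUGC"    # 20-nt placeholder
    arm3 = reverse_complement(arm5)  # perfectly complementary partner
    print(arms_ok(arm5, arm3))
```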
- the RNAs provided herein have group IIB intron activity.
- the 5’ intron fragment and the 3’ intron fragment of the RNAs provided herein are obtained by segmenting a group II intron at an unpaired region into two fragments. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
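The segmentation step described above can be sketched as follows. The sketch assumes the intron's secondary structure is available in dot-bracket notation, where `.` marks an unpaired nucleotide; that representation, the function names, and the toy hairpin are assumptions introduced here for illustration and do not come from the disclosure.

```python
from typing import Tuple


def split_at_unpaired(intron: str, dot_bracket: str, cut: int) -> Tuple[str, str]:
    """Split `intron` between positions cut-1 and cut (0-based) and return
    (5' intron fragment, 3' intron fragment). The cut is accepted only if
    the nucleotides on both sides of it are unpaired."""
    if len(intron) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    if not 0 < cut < len(intron):
        raise ValueError("cut must fall strictly inside the intron")
    if dot_bracket[cut - 1] != "." or dot_bracket[cut] != ".":
        raise ValueError("cut site must lie between two unpaired nucleotides")
    return intron[:cut], intron[cut:]


def permuted_precursor(frag_5p: str, frag_3p: str, target: str) -> str:
    """Arrange the pieces 5'->3' as 3' intron fragment, target sequence,
    5' intron fragment."""
    return frag_3p + target + frag_5p


if __name__ == "__main__":
    # Toy hairpin: three base pairs closing a four-nucleotide loop.
    frag_5p, frag_3p = split_at_unpaired("GGGAAAACCC", "(((....)))", 5)
    print(frag_5p, frag_3p, permuted_precursor(frag_5p, frag_3p, "UUU"))
```

A cut inside the stem (for example `cut=2` in the toy hairpin) is rejected, which mirrors the requirement that the intron be segmented at an unpaired region.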
- the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
- the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
- the modification comprises a deletion of an open reading frame (ORF) .
- the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- the complementarily paired regions are located at one or both ends of the target sequence.
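The "at least 60% complementarily paired" criterion used for the exon-binding sequences can be illustrated with a short Python sketch. It assumes the intron site and the target region have the same length and pair antiparallel, and it counts Watson-Crick pairs by default; the optional G-U wobble setting is an assumption added here, since the text does not say whether wobble pairs count. All names and sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def paired_fraction(site: str, region: str, allow_wobble: bool = False) -> float:
    """Fraction of positions in `site` that pair with `region` when the two
    strands are aligned antiparallel (both strings written 5'->3')."""
    if not site or len(site) != len(region):
        raise ValueError("site and region must be non-empty and of equal length")
    allowed = (WATSON_CRICK | WOBBLE) if allow_wobble else WATSON_CRICK
    pairs = zip(site.upper(), reversed(region.upper()))
    return sum(pair in allowed for pair in pairs) / len(site)


def sufficiently_paired(site: str, region: str, minimum: float = 0.60,
                        allow_wobble: bool = False) -> bool:
    """True if the paired fraction reaches the stated minimum."""
    return paired_fraction(site, region, allow_wobble) >= minimum


if __name__ == "__main__":
    print(paired_fraction("GGAUCC", "GGAUCC"))  # self-complementary toy site
    print(sufficiently_paired("GGAUCC", "GGAUCU"))
```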
- RNAs provided herein further comprise a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- the 5’ and the 3’ homology arms have up to 10% base mismatches.
- (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
- the group II intron is a group II intron derived from a microorganism. In some embodiments, the group II intron is a group IIB intron. In some embodiments, the group II intron is Cte 1. In some embodiments, the group II intron has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
- the 3’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228.
- the 5’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
- Provided herein are also cells comprising the RNAs disclosed herein, the circRNAs disclosed herein, or the vectors disclosed herein.
- Provided herein are also methods of making a circRNA comprising subjecting the RNA disclosed herein to conditions sufficient for it to self-splice.
- the cell is a hepatocyte, epithelial cell, hematopoietic cell, epithelial cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron) , photoreceptor cell (e.g., rod and cone) , retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell,
- Embodiment 1 A non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- Embodiment 2 The RNA of embodiment 1, wherein the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1.
- Embodiment 3 The RNA of embodiment 2, wherein Z1 encodes a therapeutic product.
- Embodiment 4 The RNA of embodiment 2, wherein Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
- Embodiment 5 The RNA of embodiment 3, wherein the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
- Embodiment 6 The RNA of any one of embodiments 2 to 5, wherein TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
- Embodiment 7 The RNA of embodiment 6, wherein TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof.
- Embodiment 8 The RNA of embodiment 6, wherein TI is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
- Embodiment 9 The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI.
- Embodiment 10 The RNA of embodiment 8, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
- Embodiment 11 The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1.
- Embodiment 12 The RNA of embodiment 11, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
- Embodiment 13 The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TI B ) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TI A ) .
- Embodiment 14 The RNA of embodiment 13, wherein the 3’ target sequence further comprises two linkers (L) flanking Z1.
- Embodiment 15 The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1 B ) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1 A ) .
- Embodiment 16 The RNA of embodiment 15, wherein the 3’ target sequence further comprises two linkers (L) flanking TI.
- Embodiment 17 The RNA of embodiment 1, having a structure selected from the group consisting of Formulae (I) - (IV) :
- 3’ IF is the 3’ intron fragment
- 5’ IF is the 5’ intron fragment
- TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TI A ) and a 3’ fragment (TI B )
- Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1 A ) and a 3’ fragment (Z1 B )
- Embodiment 18 The RNA of any one of embodiments 2 to 14, or of embodiment 17 having a structure selected from the group consisting of Formulae (I) - (III) , further comprising (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence, (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) .
- Embodiment 19 The RNA of embodiment 18, wherein E2 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63.
- Embodiment 39 The RNA of embodiment 38, wherein: (1) the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, and 205; (2) the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, and 209; (3) the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180
- Embodiment 40 The RNA of any one of embodiments 1 to 39, further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- Embodiment 41 The RNA of embodiment 40, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- Embodiment 42 The RNA of embodiment 40 or 41, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
- Embodiment 43 The RNA of embodiment 42, wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
- Embodiment 44 The RNA of any one of embodiments 1 to 43, wherein the RNA has group IIB intron activity.
- Embodiment 45 The RNA of any one of embodiments 1 to 22, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
- Embodiment 46 The RNA of embodiment 45, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6.
- Embodiment 47 The RNA of embodiment 45, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
- Embodiment 48 The RNA of any one of embodiments 45 to 47, wherein the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
- Embodiment 49 The RNA of embodiment 48, wherein the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
- Embodiment 50 The RNA of embodiment 48, wherein the modification comprises a deletion of an open reading frame (ORF) .
- Embodiment 51 The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- Embodiment 52 The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- Embodiment 53 The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- Embodiment 54 The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- Embodiment 55 The RNA of embodiment 53 or 54, wherein the complementarily paired regions are located at one or both ends of the target sequence.
- Embodiment 56 The RNA of any one of embodiments 45 to 55, further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- Embodiment 57 The RNA of embodiment 56, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- Embodiment 58 The RNA of embodiment 56 or 57, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
- Embodiment 60 The RNA of any one of embodiments 45 to 59, wherein the group II intron is a group II intron derived from a microorganism.
- Embodiment 69 The RNA of embodiment 67, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine) , m5U (5-methyluridine) , m6A (N6-methyladenosine) , Y (pseudouridine) , or m1A (1-methyladenosine) .
- Embodiment 70 The RNA of any one of embodiments 67 to 69, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is introduced at in vitro transcription (IVT)
- Embodiment 71 A circRNA produced by the self-splicing of the RNA of any one of embodiments 1 to 70.
- Embodiment 72 A vector encoding the RNA of any one of embodiments 1 to 70.
- Embodiment 73 A cell comprising the RNA of any one of embodiments 1 to 70, the circRNA of embodiment 71, or the vector of embodiment 72.
- Embodiment 74 A method of making a circRNA comprising subjecting the RNA of any one of embodiments 1 to 70 under conditions sufficient for it to self-splice.
- Embodiment 75 A method of expressing a protein in a cell comprising transfecting the cell with the circRNA of embodiment 71.
- Embodiment 76 The method of embodiment 75 wherein the cell is a hepatocyte, epithelial cell, hematopoietic cell, epithelial cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron) , photoreceptor cell (e.g., rod and cone) , retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver starlet cell, HEK293, HEK293T, HeLa, MCF
- Embodiment 77 A method of expressing a protein in vivo comprising administering to a subject the circRNA of embodiment 71 or the vector of embodiment 72.
- Embodiment 78 A method of expressing an RNA in vivo comprising administering to a subject the vector of embodiment 72.
- Embodiment 79 The method of embodiment 77 or 78 wherein the subject is a human.
- Embodiment 80 A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence of SEQ ID NO. 263.
- Embodiment 81 A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence comprising the following elements: 5’ HBB UTR, CDS, 3’ HBB UTR and polyA.
- the nucleotide sequence of CDS is selected from SEQ ID NOs. 258-259.
- the nucleotide sequence of 5’ HBB UTR is selected from SEQ ID NO. 260.
- the nucleotide sequence of 3’ HBB UTR is selected from SEQ ID NO. 261.
- the nucleotide sequence of polyA is selected from SEQ ID NO. 262.
- RNA comprising the following operably linked elements from 5’ to 3’ :
- RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- RNA of paragraph [00120] wherein the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1.
- RNA of paragraph [00121] wherein Z1 encodes a therapeutic product.
- RNA of paragraph [00121] wherein Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
- RNA of paragraph [00122] wherein the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
- TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
- TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof.
- RNA of any one of paragraphs [00121] to [00126] wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI.
- RNA of any one of paragraphs [00120] to [00136] wherein the circRNA consists of the target sequence.
- RNA of any one of paragraphs [00120] to [00141] wherein the 3’ intron fragment comprises a D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence.
- the D1-like sequence comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- the D1-like sequence comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking a target sequence.
- the D1-like sequence comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in a target sequence.
- RNA of paragraph [00145] or [00146] wherein the complementarily paired regions are located at one or both ends of the target sequence.
- RNA of any one of paragraphs [00142] to [00147] wherein the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265.
- RNA of any one of paragraphs [00142] to [00148] wherein the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269.
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine or an atypical bulged adenosine.
- RNA of paragraph [00150] wherein the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270.
- RNA of any one of paragraphs [00142] to [00151] wherein the 5’ intron fragment further comprises a D2-like sequence or a D3-like sequence at the 3’ end of the D1-like sequence.
- RNA of paragraph [00152] wherein the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- RNA of any one of paragraphs [00142] to [00154] further comprising a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment, wherein the pair of D4 stem-like sequences each has a region that is 10-200 or 30-60 nucleotides in length, and the two regions are at least 60% complementarily paired (a complementary region).
- RNA of paragraph [00155] wherein the 3’ and 5’ D4 stem-like sequences have two or more complementary regions.
- RNA of one of paragraphs [00120] to [00127] comprising a structure selected from the group consisting of Formulae (1) - (12) :
- TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D3L is D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
- the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265;
- the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269;
- the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270; or
- (a) the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, 206, and 266; or (b) the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, 207, and 267; or both (a) and (b); or any combination of (1) - (4).
- RNA of any one of paragraphs [00120] to [00159] further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- RNA of paragraph [00160] wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- RNA of paragraph [00160] or [00161] wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
- RNA of paragraph [00162] wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
- RNA of any one of paragraphs [00120] to [00163] wherein the RNA has group IIB intron activity.
- RNA of any one of paragraphs [00120] to [00141] wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
- RNA of paragraph [00165] wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6.
- RNA of any one of paragraphs [00165] to [00167] wherein the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
- RNA of paragraph [00168] wherein the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
- RNA of any one of paragraphs [00168] to [00170] wherein the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- a method of expressing a protein in vivo comprising administering to a subject the circRNA of paragraph [00192] or the vector of paragraph [00193] .
- a method of expressing an RNA in vivo comprising administering to a subject the vector of paragraph [00193] .
- RNA has a nucleotide sequence of SEQ ID NO. 263.
- RNA has a nucleotide sequence comprising the following elements: 5’ HBB UTR, CDS, 3’ HBB UTR and polyA, wherein the CDS encodes a protein that is not HBB.
- FIGs. 1A-1C depict the secondary structure of group II introns.
- FIG. 1A provides a schematic diagram of group II intron’s general structure.
- a typical group II intron can have six stem-loop structures, referred to as Domains 1-6, or D1-D6.
- D4 contains an open reading frame.
- the 6 domains are sequentially arranged and comprise multiple exon binding sequences (EBSs) , such as EBS1, EBS2, and EBS3.
- IBSs intron binding sequences
- FIG. 1B provides the secondary structure of exemplary group II intron Cte.
- FIG. 1C provides the secondary structure of an exemplary synthetic group II intron based on sequence elements of Cte: Cte-Syn1.
- FIGs. 2A-2F depict the structure of group II introns in which the sequence elements that provide the long-range interactions essential for their tertiary structure and function are denoted with Greek letters.
- FIG. 2A depicts a general group II intron.
- FIG. 2B and FIG. 2C depict the exemplary group IIB intron Cte.
- FIG. 2B marks the sequence elements participating in self-splicing.
- FIG. 2C also marks sequence elements that contribute to the tertiary structure of the intron.
- FIG. 2D depicts the structure of a group IIA intron.
- FIG. 2E depicts the structure of a group IIC intron.
- FIG. 2F identifies the relevant nucleotides in exemplary group II intron Cte.
- FIG. 2G identifies the relevant nucleotides in a synthetic group II intron derived from Cte: Cte-syn1, another exemplary group II intron.
- FIG. 2H identifies the relevant nucleotides in LtrB, another exemplary group II intron.
- FIG. 2I identifies the relevant nucleotides in a synthetic group II intron derived from LtrB: LtrB-syn1.
- FIG. 2J identifies the relevant nucleotides in Pli, another exemplary group II intron.
- FIG. 2K identifies the relevant nucleotides in a synthetic group II intron derived from Pli: Pli-syn1.
- FIGs. 3A-3B depict two mechanisms for group II intron self-splicing. As shown, Group II introns catalyze self-splicing via two consecutive transesterification reactions.
- FIG. 3A depicts the hydrolysis pathway, which uses an external water molecule as the first-step nucleophile, resulting in liberation of a linear intron molecule.
- FIG. 3B depicts the branching pathway, which uses the 2’ -OH group in an adenosine (branch site) in D6 as the first-step nucleophile, resulting in liberation of a lariat intron molecule.
- FIGs. 4A-4D provide schematic diagrams identifying the exon-intron interactions essential for the near-scarless or scarless splicing of the cRNAzymes disclosed herein.
- FIG. 4A depicts near-scarless splicing based on the interactions between IBS1 and EBS1; and IBS3 and EBS3; optionally also between IBS2 and EBS2.
- a group II intron with flanking exon sequences E1 and E2 is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments.
- Arrows indicate the interactions between IBS1 and EBS1; IBS2 and EBS2; and IBS3 and EBS3.
- the self-splicing of the construct produces a circRNA consisting of the target sequence, E1 and E2.
- FIG. 4B depicts near-scarless splicing based on the interactions between IBS1 and EBS1; and the ⁇ nucleotide and IBS3; optionally also between IBS2 and EBS2.
- a group II intron with flanking exon sequences E1 and E2 is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments.
- Arrows indicate the interactions between IBS1 and EBS1; IBS2 and EBS2; and IBS3 and ⁇ .
- the self-splicing of the construct produces a circRNA consisting of the target sequence, E1 and E2.
- FIG. 4C depicts scarless splicing based on the interactions between IBS1’ and EBS1’ ; and IBS3’ and EBS3’ .
- a group II intron is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments.
- Arrows indicate the interactions between IBS1’ and EBS1’ ; and IBS3’ and EBS3’ .
- the self-splicing of the construct produces a circRNA consisting of the target sequence.
- RNAs that are engineered ribozymes with self-splicing activity and that, upon self-splicing, form circRNAs are also referred to herein as “cRNAzymes.”
- the cRNAzymes can have in vitro self-splicing activity.
- novel non-naturally occurring RNAs that have group II intron self-splicing activity and that, upon self-splicing, form circRNAs, or “group II cRNAzymes.”
- vectors comprising polynucleotides encoding these group II cRNAzymes, methods of preparing the group II cRNAzymes disclosed herein by transcribing these vectors, and uses of these group II cRNAzymes in making circRNAs.
- the term “about” is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.
- the term “about” encompasses the exact number recited.
- “about” means within plus or minus 10% of a given value or range.
- “about” means that the variation is ±5%, ±4%, ±3%, ±2%, ±1%, ±0.5%, ±0.2%, or ±0.1% of the value to which “about” refers.
- “about” means that the variation is ±1%, ±0.5%, ±0.2%, or ±0.1% of the value to which “about” refers.
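- As an illustration only, the tolerance reading of “about” given above can be written as a small check. The function name `is_about` and its default tolerance of 10% are assumptions chosen for this sketch; the narrower tolerances listed above are passed explicitly.

```python
def is_about(value: float, reference: float, tolerance: float = 0.10) -> bool:
    """Return True if `value` lies within reference +/- tolerance * |reference|.

    `tolerance` is a fraction (0.10 means plus or minus 10%). The name and the
    default are illustrative assumptions, not terms defined by the document.
    """
    return abs(value - reference) <= tolerance * abs(reference)


# "about 100" under the 10% reading, and under the stricter 1% reading
print(is_about(109, 100))          # within 10%
print(is_about(109, 100, 0.01))    # outside 1%
```

The exact recited number always satisfies the check, which matches the statement that “about” encompasses the exact number recited.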
- the peptides may be about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1000, about 1250, about 1500, about 1750, about 2000, about 2250, about 2500, about 2750, about 3000, about 3250, about 3500, about 3750, about 4000, about 4250, about 4500, about 4750, or about 5000 amino acid residues in length.
- the nucleic acids or polynucleotides can be heterogenous or homogenous in composition, can be isolated from naturally occurring sources, or can be artificially or synthetically produced.
- the nucleic acids may be DNA or RNA, or a mixture thereof, and can exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states.
- Nucleic acid structures also include, for instance, a DNA/RNA helix, peptide nucleic acid (PNA), morpholino nucleic acid (see, e.g., Braasch and Corey, Biochemistry, 41(14): 4503-4510 (2002) and U.S.
- a nucleic acid strand is inherently directional, as the carbon atoms in the sugar ring are numbered from 1’ to 5’ and the “5’-end” has a free hydroxyl (or phosphate) on a 5’ carbon and the “3’-end” has a free hydroxyl (or phosphate) on a 3’ carbon.
- a nucleic acid having certain sequence elements “from 5’ to 3’ ” means that these sequence elements are arranged linearly from the 5’ end to the 3’ end of the nucleic acid.
- sequence similarity is used to denote similarity between two sequences. Sequence similarity or identity may be determined using standard techniques known in the art, including, but not limited to, the local sequence identity algorithm of Smith & Waterman, Adv. Appl. Math. 2, 482 (1981), by the sequence identity alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48, 443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci.
- complementary and complementarity refer to the relationship between two nucleic acid molecules having the capacity to form hydrogen bond(s) with one another by either traditional Watson-Crick base-pairing or other non-traditional types of pairing.
- the two DNA/RNA strands with complementary sequences bind to form a duplex that follows the Watson-Crick base-pairing rules: A binds to T (U) with two hydrogen bonds; G binds to C with three hydrogen bonds.
- the degree of complementarity between two nucleotide sequences can be indicated by the percentage of nucleotides in a nucleotide sequence which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleotide sequence (e.g., about 50%, about 60%, about 70%, about 80%, about 90%, and 100% complementary).
- Two nucleotide sequences are “perfectly complementary” or “100% complementary” if all the contiguous nucleotides of a nucleotide sequence will hydrogen bond with the same number of contiguous nucleotides in a second nucleotide sequence.
- Two nucleotide sequences are “substantially complementary” if the degree of complementarity between the two nucleotide sequences is at least 60% (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%) over a region of at least 8 nucleotides (e.g., at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more nucleotides) , or if the two nucleotide sequences hybridize under at least moderate, or, in some embodiments high, stringency conditions.
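- The percentage-based definition of complementarity above can be sketched in a few lines. This is a minimal illustration under stated assumptions: only Watson-Crick pairs (A-U, G-C) are counted, the two regions are of equal length, and the second sequence is read antiparallel to the first. The function names are hypothetical, and the hybridization-based alternative in the definition is not modeled.

```python
# Watson-Crick pairs only; non-traditional pairs (e.g., G-U wobble) are
# deliberately ignored in this sketch (an assumption).
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions in seq_a (5'->3') that pair with seq_b read 3'->5'."""
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((x, y) in WC_PAIRS for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)


def is_substantially_complementary(seq_a: str, seq_b: str,
                                   min_percent: float = 60.0,
                                   min_length: int = 8) -> bool:
    """At least `min_percent` pairing over a region of at least `min_length` nt."""
    return (len(seq_a) >= min_length
            and percent_complementarity(seq_a, seq_b) >= min_percent)


print(percent_complementarity("GGAUCCGA", "UCGGAUCC"))
print(is_substantially_complementary("AAAAAAAAAA", "UUUUUUGGGG"))
```

The same check applies wherever the text requires regions to be "at least 60% complementarily paired", for example between an EBS and its IBS or between the two D4 stem-like sequences.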
- Exemplary moderate stringency conditions include overnight incubation at 37°C in a solution comprising 20% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt’s solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1x SSC at about 37-50°C, or substantially similar conditions, e.g., the moderately stringent conditions described in Sambrook, J., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press; 4th edition (June 15, 2012).
- High stringency conditions are conditions that, for example, (1) use low ionic strength and high temperature for washing, such as 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate (SDS) at 50°C, (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin (BSA)/0.1% Ficoll/0.1% polyvinylpyrrolidone (PVP)/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride and 75 mM sodium citrate at 42°C, or (3) employ 50% formamide, 5x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt’s solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS
- exogenous refers to a protein, gene, nucleic acid, or polynucleotide that has been introduced into the cell or organism by artificial or natural means; or in relation to a cell, the term refers to a cell that was isolated and subsequently introduced into a cell population or to an organism by artificial or natural means.
- An exogenous nucleic acid may be from a different organism or cell, or it may be one or more additional copies of a nucleic acid that occurs naturally within the organism or cell.
- An exogenous cell may be from a different organism, or it may be from the same organism.
- an exogenous nucleic acid is one that is in a chromosomal location different from where it would be in natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
- operably linked means that these sequence elements (e.g., an intron fragment, a target sequence, a promoter, and a coding sequence) are functionally related to each other.
- a promoter is operatively linked to a coding sequence if it controls the transcription of the sequence; or a ribosome binding site is operatively linked to a coding sequence if it is positioned so as to permit translation.
- hybridization or “hybridized” when referring to nucleotide sequences is the association formed between and/or among sequences having complementarity.
- Cte refers to a group IIB intron C. te. I1, found in the human pathogen Clostridium tetani. (McNeil et al., RNA, 20 (6) : 855-866 (2014) ) .
- Pli refers to a group IIB intron Pli, found in the mitochondrial genome of a filamentous brown alga pathogen Pylaiella littoralis. (Zhao and Pyle, Trends in Biochem. Sci., 42.6 (2017) : 470-482) .
- Oi refers to a group IIC intron O. i., found in Oceanobacillus iheyensis. (Toor et al. (2010) , RNA 16, 57-69) .
- LtrB refers to a group IIA intron Ll. LtrB, found in Lactococcus lactis. (Qu, G. et al. (2016) Nat. Struct. Mol. Biol. 23, 549-557) .
- D5-like sequence refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D5, including the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ as depicted on FIGs. 2A-2E.
- catalytic triad refers to a highly conserved region (AGC) which forms base triples with other nucleotides to form a triple helix known as the “catalytic triplex” .
- D6-like sequence refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D6, including a bulged adenosine (or an atypical bulged A) that acts as the nucleophile for the first step of splicing (the branching pathway) as depicted on FIGs. 2A-2E.
- bulged adenosine also known as bulged A
- the conventional group II-type bulged adenosine is located on domain 6 (D6, or DVI) of a group II intron (FIGs. 1A and 2A) .
- the bulged adenosine is 7 or 8 nucleotides away from the 3' splicing site. The bulged adenosine is normally conserved and plays a central role in the splicing process.
- atypical bulged adenosine also known as atypical bulged A
- atypical bulged A refers to a region found within D6 of a group IIB intron C. te. I1 (Cte) , found in the human pathogen Clostridium tetani. D6 of Cte does not have a clearly bulged adenosine. Instead, it has a looped region, see FIGs. 2B-2C, which acts as the nucleophile for the first step of splicing (the branching pathway) . (McNeil et al., RNA, 20 (6) : 855-866 (2014) )
- D2-like sequence refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron and forms the stem-loop structure of the D2 of a naturally occurring group II intron.
- D2/D3-like sequence refers to an RNA sequence that is either a “D2-like sequence” or a “D3-like sequence. ”
- the term “3’ D4 stem-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the 3’ stem of the D4 of a naturally occurring group II intron.
- the term “5’ D4 stem-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the 5’ stem of the D4 of a naturally occurring group II intron.
- the 3’ and 5’ D4 stem-like sequences each have a region, and the two regions are at least 60% complementarily paired (a complementary region).
- scar refers to the non-target sequence region in the circRNA splicing product.
- scarless splicing refers to the self-splicing of the cRNAzymes which produce circRNAs that do not contain additional sequence elements beyond the target sequence. As such, a “scarless” circRNA contains no scar, meaning that it solely consists of the target sequence.
- near-scarless splicing refers to the self-splicing of the cRNAzymes which produce circRNAs that include no more than 20 nucleotides besides the target sequence.
- a “near-scarless circRNA” is a circRNA resulting from “near-scarless” self-splicing of a cRNAzyme, which includes no more than 20 nucleotides besides the target sequence.
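- The scar definitions above reduce to counting the nucleotides of the circular product that lie outside the target sequence. The sketch below is illustrative only: the function name, the label "scarred" for products with more than 20 non-target nucleotides, and the representation of a circRNA as a linear string opened at an arbitrary position are all assumptions of this example.

```python
def classify_scar(circ_rna: str, target: str, near_scarless_max: int = 20) -> str:
    """Classify a circular product by the number of non-target nucleotides.

    `circ_rna` is the circle written as a linear string opened at any position;
    `target` is the target sequence. The circle is doubled so that a target
    spanning the opening point is still found.
    """
    n, m = len(circ_rna), len(target)
    doubled = circ_rna + circ_rna
    if m == 0 or m > n or target not in doubled[: n + m - 1]:
        raise ValueError("target sequence not found in the circular product")
    scar_length = n - m  # nucleotides besides the target sequence
    if scar_length == 0:
        return "scarless"
    if scar_length <= near_scarless_max:
        return "near-scarless"
    return "scarred"  # label assumed for this sketch


print(classify_scar("AUGGCC", "GCCAUG"))             # same circle, different opening point
print(classify_scar("GG" + "AUGGCC" + "CC", "AUGGCC"))
```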
- EBS exon-binding sequences
- IBSs intron-binding sequences
- EBS, in the context of a group II intron, refers to an exon binding sequence in the intron, which interacts (e.g., complementarily pairs) with the intron binding sequences (“IBSs”) flanking the exon regions to trigger splicing.
- a group II intron can have multiple EBSs, such as EBS1, EBS2, and EBS3, which interact with IBS1, IBS2, and IBS3, respectively.
- EBS the single nucleotide located directly upstream of EBS1 in domain 1
- the “ ⁇ nucleotide, ” can also pair with IBS3, and the interaction between ⁇ and IBS3 is referred to as ⁇ -IBS3 pairing.
- EBS’ refers to an EBS modified to allow scarless splicing of a group II intron.
- the sequence elements within the target sequence that pair with the EBS’s are referred to herein as the IBS’s .
- EBS1’ , EBS2’ , and EBS3’ refer to the EBS1, EBS2, and EBS3 sequences that are modified to allow scarless splicing, respectively.
- IBS1’ , IBS2’ , and IBS3’ refer to the sequences in the target sequence that function as the IBS1, IBS2, and IBS3 in the native exon sequences flanking a group II intron to locate splicing site by interacting with EBS1’ , EBS2’ , and EBS3’ , respectively.
- the “ ⁇ ” nucleotide” refers to the nucleotide upstream of EBS1’ that pairs with IBS3’ , and the interaction between ⁇ ” and IBS3’ is referred to as the ⁇ ” -IBS3’ pairing.
- E1 and E2 refer to the exon fragments flanking the target sequence in the cRNAzymes, which remain with the target sequence after self-splicing of the cRNAzymes.
- E2 is linked to 5’ end of the target sequence and E1 is linked to the 3’ end of the target sequence.
- Both E1 and E2 comprise an IBS and facilitate the self-splicing of the cRNAzyme.
- the E1 and/or E2 can be the exon sequences flanking naturally existing group II intron.
- the E1 and/or E2 can be artificial sequences that are engineered into cRNAzymes to, e.g., enhance the accuracy and/or efficiency of self-splicing.
- in vitro transcription refers to a versatile method to produce RNA in vitro that uses an RNA polymerase, ribonucleotides, and appropriate buffer conditions to synthesize RNA from a DNA template.
- control elements refers collectively to promoter regions, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (IRES) , enhancers, splice junctions, and the like, which collectively provide for the replication, transcription, post-transcriptional processing, and translation of a coding sequence in a recipient cell.
- promoter refers to a nucleotide region comprising a DNA regulatory sequence, wherein the regulatory sequence is derived from a gene that is capable of binding to an RNA polymerase and allowing for the initiation of transcription of a downstream (3' direction) coding sequence. It may contain genetic elements at which regulatory proteins and molecules may bind, such as RNA polymerase and other transcription factors, to initiate the specific transcription of a nucleic acid sequence.
- a promoter that is “operatively positioned, ” “operatively linked” mean that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence to control transcriptional initiation and/or expression of that sequence, which is “under control” and “under transcriptional control” of the promoter.
- the term “enhancer” as used herein means a nucleic acid sequence that, when positioned proximate to a promoter, confers increased transcription activity relative to the transcription activity resulting from the promoter in the absence of the enhancer domain.
- internal ribosome entry site refers to cis elements of viral or human cellular RNAs (e.g., messenger RNA (mRNA) and/or circRNAs) that bypass the steps of canonical eukaryotic cap-dependent translation initiation.
- IRES-like sequence and “Internal Ribosome Entry Site-like sequence, ” as used interchangeably herein refer to non-naturally occurring nucleotide sequences that display a function of a naturally occurring IRES.
- vector or “construct” refers to a vehicle that is used to carry genetic material (e.g., a nucleotide sequence) , which can be introduced into a host cell, where it can be replicated and/or expressed.
- treat refers to executing a protocol or plan, which can include administering one or more drugs or active agents to a patient, in an effort to alleviate signs or symptoms of the disease or the recurrence of the disease. Desirable effects of treatment include decreasing the rate of disease progression, ameliorating or palliating the disease state, and remission, increased survival, improved quality of life or improved prognosis. Alleviation or prevention can occur prior to signs or symptoms of the disease or condition appearing, as well as after their appearance. As used herein, a “treatment” does not require complete alleviation of signs or symptoms, and does not require a cure.
- the term “therapeutically beneficial” or “therapeutically effective” when used in connection with a therapeutic refers to the property of the therapeutic that promotes or enhances the well-being of the subject. This includes, but is not limited to, a reduction in the frequency, severity, or rate of progression of the signs or symptoms of a disease.
- treatment of cancer may involve, for example, a reduction in the size of a tumor, a reduction in the invasiveness of a tumor, reduction in the growth rate of the cancer, or a reduction in the rate of metastasis or recurrence. Treatment of cancer can also refer to prolonging survival of a subject with cancer.
- the term “pharmaceutical or pharmacologically acceptable” refers to molecular entities and compositions that do not produce an adverse, allergic, or other untoward reaction when administered to an animal, such as a human, as appropriate.
- preparations should meet sterility, pyrogenicity, general safety, and purity standards as required, e.g., by the FDA Office of Biological Standards.
- the term “pharmaceutically acceptable carrier” includes any and all aqueous biocompatible solvents (e.g., saline solutions, phosphate buffered saline, parenteral vehicles, such as sodium chloride, Ringer's dextrose, etc. ) , antioxidants, preservatives (e.g., antibacterial or antifungal agents, anti-oxidants, chelating agents, and inert gases) , isotonic agents, such like materials and combinations thereof, as would be known to one of ordinary skill in the art.
- the nomenclature of nucleotides, nucleic acids, nucleosides, and amino acids used herein is consistent with International Union of Pure and Applied Chemistry (IUPAC) standards (see, e.g., bioinformatics.org/sms/iupac.html).
- Exemplary genes and polypeptides are described herein with reference to GenBank numbers, GI numbers and/or SEQ ID NOS. It is understood that one skilled in the art can readily identify homologous sequences by reference to sequence sources, including but not limited to Uniprot (https://www.uniprot.org/), GenBank (ncbi.nlm.nih.gov/genbank/) and EMBL (embl.org/).
- CL refers to Subdoligranulum variabile strain DSM 15176 chromosome, complete genome, which is from NZ_CP102293.1. (subdo. li. gra’ nu. lum. L. adj. subdolus deceptive, alludes to the somewhat deceptive and unusual coccoid form; L. neu. n. granulum a small grain; N. L. neu. n. Subdoligranulum, a deceptive grain; va. ri. a’ bi. le. L. neut. adj. variabile, because the cells are varied in shape) .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- The group II self-splicing activities of the RNAs (cRNAzymes) provided herein are provided by the 3’ and 5’ intron fragments.
- target sequences included in the RNAs (or cRNAzymes) provided herein are provided in detail in sections below.
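- The permuted element order described above (3’ intron fragment, then the 3’ and 5’ target sequence fragments, then the 5’ intron fragment) and the resulting circular product can be sketched as a simple data model. This is a minimal illustration and not a model of splicing chemistry; the class name, field names, and placeholder sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CRNAzyme:
    """Linear precursor with elements listed in 5'->3' order (names are illustrative)."""
    intron_3p: str   # 3' intron fragment, placed at the 5' end of the construct
    target_3p: str   # 3' target sequence fragment
    target_5p: str   # 5' target sequence fragment
    intron_5p: str   # 5' intron fragment, placed at the 3' end of the construct

    def linear_construct(self) -> str:
        return self.intron_3p + self.target_3p + self.target_5p + self.intron_5p

    def circular_product(self) -> str:
        # Self-splicing links the 3'-end of the 5' target fragment to the
        # 5'-end of the 3' target fragment; the circle is written here opened
        # at the 5' end of the 5' target fragment.
        return self.target_5p + self.target_3p


# Placeholder sequences; lowercase marks intron fragments for readability.
construct = CRNAzyme(intron_3p="ggg", target_3p="CCAUGG", target_5p="AUGGCU", intron_5p="aaa")
print(construct.linear_construct())
print(construct.circular_product())
```

Writing the circle opened at the 5’ target fragment restores the original order of the target sequence, which is why the target is split and its fragments swapped in the precursor.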
- the cation can be selected from the group consisting of Ba²⁺, Ca²⁺, Mg²⁺, Mn²⁺, Fe²⁺, Cu²⁺, Zn²⁺, Cd²⁺, Pb²⁺, Li⁺, Cs⁺, Na⁺, K⁺, Rb⁺, and NH₄⁺, or a combination thereof.
- the cation is a bivalent cation, such as Ba²⁺, Ca²⁺, Mg²⁺, Mn²⁺, Fe²⁺, Cu²⁺, Zn²⁺, Cd²⁺, and Pb²⁺.
- Stem-loop structure is a type of an RNA secondary structure, which can be determined by any suitable polynucleotide folding algorithm.
- the 6 stem-loop structures of naturally occurring group II introns are called domains 1 to 6 (D1 to D6) , and arranged sequentially from 5’ to 3’ .
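- As a rough illustration of how stem-loops can be read from a predicted secondary structure, the sketch below scans dot-bracket notation, a common output format of RNA folding tools, for hairpin loops. The dot-bracket input, the function name, and the toy structure are assumptions of this example; no particular folding algorithm is implied, and pseudoknot characters are not handled.

```python
def hairpin_loops(dot_bracket: str) -> list[tuple[int, int]]:
    """Return 0-based inclusive (start, end) spans of hairpin loops.

    A hairpin loop is a run of unpaired positions ('.') enclosed directly by
    one base pair, i.e. a '(' followed only by dots and then a ')'.
    """
    loops = []
    last_open = None  # index of the most recent '(' not yet followed by ')'
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            last_open = i
        elif ch == ")":
            if last_open is not None and i - last_open > 1:
                loops.append((last_open + 1, i - 1))
            last_open = None
    return loops


# Toy structure with two stem-loops separated by a two-nucleotide linker
print(hairpin_loops("((...))..((....))"))
```

Counting the returned spans gives the number of simple stem-loops in the structure; the unpaired spans are also natural candidates for places where a sequence could be opened without disrupting a stem.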
- Naturally occurring group II introns comprise multiple exon binding sequences (EBSs) , such as EBS1, EBS2, and EBS3, which interact, such as complementarily pair, with the intron binding sequences (IBSs) in exon regions, triggering splicing by virtue of their own hydroxyl groups within the EBS nucleic acid sequences (FIG. 1A) .
- group II introns also share a common tertiary structure, particularly within the catalytic core. Most of the domains can be transcribed as separate molecules that fold independently and which, when combined with other sections of the intron, retain the catalytic activity.
- group II intron structural elements and their role in reaction chemistry can be described by referring to regions within the intron secondary structure (FIGs. 2A-2B) .
- Intron Domain 1 (or “D1” ) is the largest domain. It provides the recognition sites for sequence-specific exon binding, and is essential for recognizing the exon in splicing reactions.
- D1 contains the active site constituents that form the molecular framework with which the other intronic domains associate.
- a set of intramolecular pairings are highly conserved and functionally important, including the B-B’ pair, the ⁇ - ⁇ ’ pair, the ⁇ - ⁇ ’ pair, the ⁇ - ⁇ ’ pair, the ⁇ - ⁇ ’ pair, the ⁇ - ⁇ ’ pair, the ⁇ - ⁇ ’ pair, and the ⁇ - ⁇ ’ pair.
- D1 also contains the key EBSs for exon binding.
- EBSs such as EBS1, EBS2, and EBS3 interact, such as complementarily pair, with the IBSs in exon regions (such as IBS1, IBS2, and IBS3) , whereby the hydroxyl groups within the EBS trigger splicing at the splicing site.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- both the 5’ homology arm and the 3’ homology arm are 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 nucleotides in length.
- a general principle of designing a group II cRNAzyme construct is to preserve maximum self-splicing activity with minimum size.
- group II introns minimally require D1 and D5 for self-splicing activity.
- the presence of D6 allows the group II intron to self-splice by branching instead of hydrolysis; and the presence of D2 and/or D3 may enhance the specificity and/or efficiency of the group II intron.
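The domain requirements stated above can be written as a simple rule table. The sketch below only encodes those statements (D1 and D5 as the minimum, D6 switching the pathway from hydrolysis to branching, D2 and D3 as optional accessory domains); it predicts nothing about real sequences, and the function name and return format are illustrative assumptions.

```python
def splicing_profile(domains: set) -> dict:
    """Summarize what a set of domain labels implies under the stated rules."""
    required = {"D1", "D5"}
    competent = required <= domains  # D1 and D5 are the stated minimum
    pathway = None
    if competent:
        # D6 is described as enabling branching instead of hydrolysis.
        pathway = "branching" if "D6" in domains else "hydrolysis"
    return {
        "self_splicing": competent,
        "pathway": pathway,
        # D2 and/or D3 may enhance specificity and/or efficiency.
        "accessory": sorted(domains & {"D2", "D3"}),
    }
```

For example, `splicing_profile({"D1", "D5"})` reports a hydrolysis-only minimal design, while adding `"D6"` changes the reported pathway to branching.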
- the 3’ intron fragment comprises a D5-like sequence
- the 5’ intron fragment comprises a D1-like sequence.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence (FIGs. 6A-6B and 8A-8B) .
- D1 includes essential sequence and structural elements of group II introns.
- D1-like sequence refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D1, including ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3, as depicted in FIGs. 2A-2E.
- the D1-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be at least 70%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be at least 80%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be at least 90%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be at least 95%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be at least 98%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
- the D1-like sequence can be 100%identical to the nucleotide sequence of the D1 of a naturally occurring group II intron.
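The identity thresholds listed above can be checked with a short calculation. The sketch below assumes the candidate and the reference domain have already been aligned to equal length with `-` as the gap character, and it counts identical columns over all alignment columns; other denominators are in common use, so this convention is an assumption rather than the definition used in the disclosure. The example sequences in the usage note are invented placeholders.

```python
# Identity tiers recited above, in percent.
TIERS = (60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same nucleotide in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    # Ignore columns that are gaps in both rows.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_tier(identity: float):
    """Return the highest recited tier met by `identity`, or None if below 60%."""
    met = [t for t in TIERS if identity >= t]
    return max(met) if met else None
```

With the placeholder pair `"GUGCGACAAG"` and `"GUGCGUCAAC"`, eight of ten columns match, so the candidate meets the 80% tier but not the 85% tier.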
- D5 contains the catalytic core of group II introns.
- D5-like sequence refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D5, including the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ , as depicted in FIGs. 2A-2E.
- the D5-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be at least 70%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be at least 80%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be at least 90%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be at least 95%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be at least 98%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the D5-like sequence can be 100%identical to the nucleotide sequence of the D5 of a naturally occurring group II intron.
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence (FIGs. 8A-8B) .
- the presence of D6 allows the intron to self-splice using the branching pathway instead of the hydrolysis pathway (FIG.
- the D6-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be at least 70%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be at least 80%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be at least 90%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be at least 95%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be at least 98%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- the D6-like sequence can be 100%identical to the nucleotide sequence of the D6 of a naturally occurring group II intron.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a D2/D3-like sequence, from 5’ to 3’ .
- the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, and a D3-like sequence, from 5’ to 3’ .
- the term “D2-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron and forms the stem-loop structure of the D2 of a naturally occurring group II intron.
- the D2-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be at least 70%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be at least 80%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be at least 90%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be at least 95%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be at least 98%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the D2-like sequence can be 100%identical to the nucleotide sequence of the D2 of a naturally occurring group II intron.
- the person of ordinary skill in the art would be able to determine whether an RNA can form the D2 of a naturally occurring group II intron using assays disclosed herein or otherwise known in the art. Sequences of exemplary group II introns are also included in sections below.
- the term “D3-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron and forms the stem-loop structure of the D3 of a naturally occurring group II intron.
- the D3-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%identical to the nucleotide sequence of the D3 of a naturally occurring group II intron.
- TS is the target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ;
- (3’ D4L) is 3’ D4 stem-like sequence;
- (5’ D4L) is 5’ D4 stem-like sequence;
- D1L is D1-like sequence;
- D2L is D2-like sequence;
- D3L is D3-like sequence;
- D2/D3L is D2/D3-like sequence;
- D5L is D5-like sequence; and D6L is D6-like sequence;
- RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (1) : 5’ -D5L-TS-D1L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (2) 5’ -D5L-TS-D1L-D2/D3L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (3) 5’ -D5L-TS-D1L-D2L-D3L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (7) 5’ -D5L-D6L-TS-D1L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ .
- the RNAs (or cRNAzymes) provided herein can have a structure of Formula (12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ .
- TS is the target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
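Formulas (1) to (12) follow a regular pattern: each combines an optional D6L, an optional pair of D4 stem-like sequences, and one of three options downstream of D1L. The enumeration below is a sketch that reproduces that numbering from three choices; the function names are hypothetical, and ASCII apostrophes stand in for the prime marks.

```python
from itertools import product

# Options downstream of D1L: none, a combined D2/D3L, or separate D2L and D3L.
DOWNSTREAM_VARIANTS = ([], ["D2/D3L"], ["D2L", "D3L"])


def formula(with_d6: bool, with_d4_stems: bool, downstream: list) -> str:
    """Assemble one construct layout, 5'->3', from the element labels."""
    parts = []
    if with_d4_stems:
        parts.append("(3'D4L)")
    parts.append("D5L")
    if with_d6:
        parts.append("D6L")
    parts += ["TS", "D1L"]
    parts += downstream
    if with_d4_stems:
        parts.append("(5'D4L)")
    return "5'-" + "-".join(parts) + "-3'"


def all_formulas() -> dict:
    """Map formula numbers 1..12 to layouts in the order listed above."""
    choices = product((False, True), (False, True), DOWNSTREAM_VARIANTS)
    return {
        number: formula(with_d6, with_d4, downstream)
        for number, (with_d6, with_d4, downstream) in enumerate(choices, start=1)
    }
```

Here `all_formulas()[1]` gives the minimal layout `5'-D5L-TS-D1L-3'` and `all_formulas()[12]` gives the layout with every optional element present.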
- the exons flanking a naturally occurring group II intron can play important roles for the self-splicing.
- the 5’ exon refers to the naturally occurring exon sequence on the 5’ end of the group II intron and the 3’ exon refers to the naturally occurring exon sequence on the 3’ end of the group II intron.
- the 5’ and 3’ flanking exons contain intron binding sequences (IBS) that interact, such as complementarily pair, with the EBS sequence within the intron, which allows the hydroxyl groups within the EBS to trigger splicing at the splicing site.
- IBS: intron binding sequences
- while the IBS1-EBS1 and IBS3-EBS3 ( ⁇ -IBS3) interactions are generally required for efficient self-splicing of group II introns, the EBS2/IBS2 interaction can be removed without significantly affecting the self-splicing activity.
- the EBS1 and EBS3 (or ⁇ ) of the RNAs (or cRNAzymes) provided herein need to interact, such as complementarily pair, with IBS1 and IBS3 in the exon elements.
- the exon elements can be either contained within the target sequence (typically at the terminal regions of the target sequence) or flanking the target sequence.
- the target sequence of the RNAs (cRNAzymes) provided herein is flanked by the exon elements E1 and E2, wherein E1 comprises IBS1 and E2 comprises IBS3, and wherein the 3’ end of E2 is linked to the 5’ end of the target sequence and the 5’ end of E1 is linked to the 3’ end of the target sequence.
- RNAzymes comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment, (2) E2; (3) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (4) E1; and (5) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment, flanked by E1 and E2.
- the exon elements are contained within the target sequence.
- certain sequence elements within the target sequence which, for example, can be present at the terminal regions of the target sequence, contain IBS1 and IBS3 and can serve as E1 and E2 in self-splicing.
- the self-splicing of these RNAs produces circRNAs that do not contain additional sequence elements beyond the target sequence and is therefore referred to herein as “scarless” splicing.
- a circRNA that consists solely of the target sequence is referred to herein as a “scarless” circRNA.
- the term “scar,” as used herein, refers to the non-target sequence region in the circRNA splicing product. As such, a “scarless” circRNA contains no scar.
- the naturally occurring EBSs of a group II intron are modified to be complementary to sequence elements of a corresponding length within the target sequence that serve as the IBSs.
- the RNAs (or cRNAzymes) disclosed herein are modified to have a modified EBS region which is complementary to a region of a corresponding length in a target sequence.
- an EBS modified to allow scarless splicing is referred to as an EBS’ .
- the sequence elements within the target sequence that pair with the EBS’s are referred to herein as the IBS’s .
- the region of the target sequence that is complementarily paired with the EBS’ can exist anywhere in the target sequence that allows it to pair with the EBS’ to form a secondary structure necessary for self-splicing. In general, sequences at both ends of the target sequence can be used, as they correspond to the location of the IBS sequences of E1 and E2 that naturally interact with the EBS.
- the EBS’ regions include modified EBS1 (or EBS1’ ) and modified EBS3 (EBS3’ ) regions.
- the modified EBS, e.g., EBS1’ , EBS3’ , or both, is (are) complementary to a stretch of sequence located at the 3’ and/or 5’ end of the target sequence.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- RNAs or cRNAzymes
- the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1 and the ⁇ nucleotide, wherein EBS1 is 60% complementarily paired with a region of a corresponding length flanking the target sequence and the ⁇ nucleotide is complementarily paired with a nucleotide within a sequence that flanks the target sequence.
- the sequence elements of the terminal region of the target sequence can be modified to pair with the EBS sequences in D1.
- the 3’ terminal region of the target sequence can be modified to contain IBS1’ , the sequence element to pair with EBS1’ in D1.
- the 5’ terminal region of the target sequence can be modified to contain IBS3’ , the sequence element to pair with EBS3’ in D1.
- the EBS1’ and EBS3’ are complementarily paired with IBS1’ and IBS3’ , respectively, on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% of the nucleotide positions.
- the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1’ and the ⁇ ” nucleotide, wherein EBS1’ is 60% complementarily paired with a region of a corresponding length in the target sequence and the ⁇ nucleotide is complementarily paired with a nucleotide within the target sequence.
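The per-position pairing requirement above can be estimated with a simple antiparallel comparison. The sketch below assumes equal-length EBS and IBS strings written 5' to 3', no bulges or gaps, and Watson-Crick pairs only by default; whether G-U wobble pairs should count is an assumption exposed as the `allow_wobble` flag. All names and the example sequences in the usage note are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def paired_fraction(ebs: str, ibs: str, allow_wobble: bool = False) -> float:
    """Fraction of positions paired when `ebs` and `ibs` are aligned antiparallel."""
    if not ebs or len(ebs) != len(ibs):
        raise ValueError("EBS and IBS must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    # Position i of the EBS faces position -(i + 1) of the IBS.
    pairs = zip(ebs.upper(), reversed(ibs.upper()))
    return sum(1 for pair in pairs if pair in allowed) / len(ebs)


def meets_threshold(ebs: str, ibs: str, minimum: float = 0.60, **kwargs) -> bool:
    """True if the paired fraction reaches `minimum` (60% by default)."""
    return paired_fraction(ebs, ibs, **kwargs) >= minimum
```

For the invented pair `"GACUUA"` and `"UAAGUC"`, every position pairs; changing the last IBS nucleotide to `A` leaves five of six positions paired, which still clears the 60% floor.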
- the complementarily paired regions can be located at one or both ends of the target sequence.
- the target sequence can contain a sequence element at its 3’ terminal region that can serve as E1 and another sequence element at its 5’ terminal region that can serve as E2, wherein E1 and E2 comprise IBS1’ and ⁇ ” , respectively, and wherein the EBS1’ -IBS1’ interaction and the ⁇ ” -IBS3’ interaction allow the self-splicing and the production of a scarless circRNA.
- the sequence elements of the terminal region of the target sequence can be modified to pair with the EBS sequences in D1.
- the 3’ terminal region of the target sequence can be modified to contain IBS1’ , the sequence element to pair with EBS1’ in D1.
- the 5’ terminal region of the target sequence can be modified to contain the ⁇ ” nucleotide (optionally with its upstream nucleotides) , the sequence element to pair with EBS3’ in D1.
- the 4-10 nucleotides on the immediate upstream of the ⁇ ” nucleotide are complementarily paired with IBS3’ on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions.
- the naturally occurring group II intron can be modified.
- the modified group II intron can include a substitution, a deletion and/or an addition of one or more nucleotides.
- the modification does not affect the self-splicing activity of the group II intron, especially the in vitro self-splicing activity.
- the 5’ fragment and the 3’ fragment of the naturally occurring group II intron are mutated, swapped and re-ligated with the target sequence inserted in between to form the RNAs (or cRNAzymes) disclosed herein.
- the mutation can comprise modification of one or more nucleotides, such as an addition, a deletion, and a substitution of one or more nucleotides, relative to their naturally occurring wild-type sequences.
- the modification promotes the accuracy and/or efficacy of the self-splicing of the resulting RNAs (or cRNAzymes) disclosed herein.
- the modification includes deletion of the intron encoded protein (IEP) sequence in D4.
- IEP: intron encoded protein
- the IEP sequence or similar structures in D4 are present in all group II introns and are known not to be required for in vitro transcription.
- the modification includes deletion of the IEP of D4, whereas RNAs (or cRNAzymes) disclosed herein still comprise the 5’ and 3’ stem sequences of D4.
- the complementarity of the 5’ and 3’ stem sequences of D4 can help shorten the spatial distance between the 5’ intron fragment and the 3’ intron fragment, thereby facilitating the circularization reaction.
- the modification includes deletion of the entire D4.
- the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
- an unpaired region is a linear region between two adjacent domains of the group II intron.
- the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1.
- the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D2.
- the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D3. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D4. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D5. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D6.
- the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D2 and D3. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D3 and D4. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D4 and D5. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D5 and D6.
- each of the four sets of splicing mechanism described above can also apply in RNAs (or cRNAzymes) disclosed herein of which the 5’ and 3’ intron fragments are generated by segmenting, swapping and re-ligating the 5’ and 3’ fragments of naturally occurring group II introns.
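The segment, swap, and re-ligate operation can be sketched for the case where the intron is cut in a linear region between two adjacent domains. The model below treats each domain as an opaque string and ignores linker nucleotides, loop-region cuts, and any mutations; `segment_and_swap` and the placeholder sequences in the usage note are hypothetical.

```python
def segment_and_swap(domains: list, cut_after: str, target: str) -> str:
    """Cut an intron after the named domain and place the fragments around `target`.

    `domains` is an ordered list of (name, sequence) pairs, 5'->3'.
    The result is 3' intron fragment + target + 5' intron fragment.
    """
    names = [name for name, _ in domains]
    # The cut must leave at least one domain on each side.
    if cut_after not in names[:-1]:
        raise ValueError("cut_after must name a domain other than the last one")
    cut = names.index(cut_after) + 1
    five_prime_fragment = "".join(seq for _, seq in domains[:cut])
    three_prime_fragment = "".join(seq for _, seq in domains[cut:])
    # Swap: the former 3' fragment now leads and the former 5' fragment trails.
    return three_prime_fragment + target + five_prime_fragment
```

With placeholder domains D1 to D6 and a cut after D4, the D5 and D6 sequences lead the construct and D1 to D4 follow the target, matching the 3' intron fragment, target sequence, 5' intron fragment order used throughout.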
- Group II introns are found in eubacteria, archaebacteria, and the organelles of plants, fungi, and various lower eukaryotes. While the RNAs (or cRNAzymes) exemplified below focus on specific group II introns, namely, Cte, Oi, Pli, LtrB, and Syn1, a person of ordinary skill in the art, guided by the teachings of the instant disclosure, would be able to prepare additional RNAs (or cRNAzymes) using sequence elements from other group II introns.
- RNAs (or cRNAzymes) disclosed herein can be derived from any naturally occurring group II intron.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from any naturally occurring group II intron disclosed herein or otherwise known in the art. Lists of naturally occurring group II introns expressly contemplated herein, and their GenBank ID numbers, are provided in Tables 25-33.
- group II introns can be derived from the microorganism kingdom.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from the microorganism kingdom.
- group II introns can be derived from the bacteria domain.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from the bacteria domain.
- the group II intron is derived from Clostridium (such as Clostridium tetani) , Bacillus (such as Bacillus thuringiensis) , Oceanobacillus (such as Oceanobacillus iheyensis) , Pylaiella (such as Pylaiella littoralis) , or Lactococcus (such as Lactococcus lactis) .
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Clostridium (such as Clostridium tetani) , Bacillus (such as Bacillus thuringiensis) , Oceanobacillus (such as Oceanobacillus iheyensis) , Pylaiella (such as Pylaiella littoralis) , or Lactococcus (such as Lactococcus lactis) . It is understood by those skilled in the art that the compositions and methods provided herein are not limited to specific group II introns.
- RNAs (or cRNAzymes) disclosed herein can be derived from Cte.
- the secondary structure of Cte is provided in FIG. 1B.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Cte.
- RNAs (or cRNAzymes) disclosed herein can be derived from Oi.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Oi.
- RNAs (or cRNAzymes) disclosed herein can be derived from Pli.
- RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Pli. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from LtrB. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from LtrB. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from Bth. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Bth.
- RNAs that contain sequence elements derived from the exemplary group II intron Cte (SEQ ID NO: 135) .
- RNAs (or cRNAzymes) provided contain sequence elements derived from a modified Cte (e.g., SEQ ID NO: 136 or 137) or a synthetic Cte (e.g., Cte-Syn1; SEQ ID NO: 143) .
- the secondary structures of Cte and Cte-Syn1 are provided in FIGs. 1B and 1C, respectively.
- the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 39. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO:40. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 41. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 135.
- the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 136. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 137. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 138.
- the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 139. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 140. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 141.
- the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 142. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 143. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 144. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 145.
- the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
- the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
- the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists essentially of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
- the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
- the group II intron has the nucleotide sequence of SEQ ID NO: 33.
- the group II intron has the nucleotide sequence of SEQ ID NO: 34.
- the group II intron has the nucleotide sequence of SEQ ID NO: 35.
- the group II intron has the nucleotide sequence of SEQ ID NO: 36.
- the group II intron has the nucleotide sequence of SEQ ID NO: 37. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 38. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 39. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 40. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 41. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 135. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 136.
- the group II intron has the nucleotide sequence of SEQ ID NO: 137. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 138. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 139. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 140. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 141. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 142. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 143. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 144. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 145.
- sequence elements form the essential structural elements required for the self-splicing activities of group II introns.
- a person of ordinary skill in the art would be able to identify such sequence elements with the aid of the sequence analysis tools for RNAs disclosed herein or otherwise known in the art.
- sequence elements for exemplary naturally occurring group II introns Cte, Oi, Pli, and LtrB, as well as synthetic cRNAzymes derived therefrom with group II intron activities (Cte-syn1, Oi-syn1, Pli-syn1, and LtrB-syn1) , are provided in FIGs. 2A-2K and summarized in Table 18.
8.2.3.2 cRNAzymes and sequences thereof
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
- the non-naturally occurring RNAs comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence.
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from Cte D1.
- the D1-like sequence is at least 60%identical to Cte D1 (SEQ ID NO: 147) and comprises the following sequence elements of Cte D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIGs. 2B, 2C and 2F and Table 18) .
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%identical to Cte D1 (SEQ ID NO: 147) .
- the D1-like sequence is at least 70%identical to Cte D1 (SEQ ID NO: 147) .
- the D1-like sequence is at least 80%identical to Cte D1 (SEQ ID NO: 147) .
- the D1-like sequence is at least 85%identical to Cte D1 (SEQ ID NO: 147) .
- the D1-like sequence is at least 90%identical to Cte D1 (SEQ ID NO: 147) . In some embodiments, the D1-like sequence is at least 95%identical to Cte D1 (SEQ ID NO: 147) . In some embodiments, the D1-like sequence is at least 98%identical to Cte D1 (SEQ ID NO: 147) .
- Cte can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
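The percent-identity thresholds recited above can be checked computationally. The following is a minimal sketch only: it assumes identity is computed from a global (Needleman-Wunsch) alignment as identical columns divided by alignment length, with illustrative scoring values. The disclosure may define percent identity with a different tool or convention, and the example sequences are hypothetical, not SEQ ID NO: 147.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Globally align a and b, then return identical columns / alignment length * 100."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal alignment, counting columns and identical columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1   # gap in b
        else:
            j -= 1   # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical toy sequences (illustration only).
print(percent_identity("ACGU", "ACGA"))       # one mismatch in four columns
print(meets_threshold("ACGU", "ACGA", 70.0))
```

Identity alone does not establish that the required sequence elements are retained; those would need a separate structural check.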
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence and a Cte D2/D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, and a Cte D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, and a Cte D3-like sequence, from 5’ to 3’ .
- the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment.
- both the 5’ and 3’ D4 stem-like sequences are derived from Cte D4.
- the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired.
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
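As an illustration of how the degree of complementary pairing between the two D4 stem-like sequences might be estimated, the sketch below makes simplifying assumptions that are not stated in the disclosure: the stems are of equal length, pair antiparallel with no bulges, and only Watson-Crick pairs count (G-U wobble pairs are excluded). The function name and toy sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_paired(stem_5p: str, stem_3p: str) -> float:
    """Percentage of positions forming Watson-Crick pairs between two stems.

    Both stems are written 5' to 3'. Because strands pair antiparallel,
    the first base of one stem faces the last base of the other.
    """
    if len(stem_5p) != len(stem_3p):
        raise ValueError("this sketch assumes equal-length stems with no bulges")
    if not stem_5p:
        return 0.0
    pairs = zip(stem_5p.upper(), reversed(stem_3p.upper()))
    paired = sum((x, y) in WATSON_CRICK for x, y in pairs)
    return 100.0 * paired / len(stem_5p)


# Hypothetical toy stems (illustration only).
print(percent_paired("GGCA", "UGCC"))  # fully complementary
print(percent_paired("GGCA", "UGCA"))  # one mismatched position
```

A structure-prediction tool would be the more realistic choice for real stems, since it can account for bulges, wobble pairs, and competing folds.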
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence and a Cte D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence and a 5’ Cte D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2/Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence and a Cte D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, a Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, a Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’ .
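The Cte-based embodiments above differ only in which optional domains are present. A minimal sketch of the 5’ to 3’ element order, assuming a simple boolean description of each layout (the `Layout` class and `element_order` function are hypothetical names introduced here for illustration), is:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layout:
    """Which optional domains a construct includes (D5 and D1 are always present)."""
    d4_stems: bool = False  # paired 3' and 5' D4 stem-like sequences
    d6: bool = False
    d2: bool = False
    d3: bool = False


def element_order(layout: Layout) -> List[str]:
    """List the construct elements from 5' to 3' for the given layout."""
    intron_3p = (["3' D4 stem"] if layout.d4_stems else []) + ["D5"] \
        + (["D6"] if layout.d6 else [])
    target = ["3' target fragment", "5' target fragment"]
    intron_5p = ["D1"] + (["D2"] if layout.d2 else []) \
        + (["D3"] if layout.d3 else []) \
        + (["5' D4 stem"] if layout.d4_stems else [])
    return intron_3p + target + intron_5p


# Minimal layout versus the fullest layout recited above.
print(element_order(Layout()))
print(element_order(Layout(d4_stems=True, d6=True, d2=True, d3=True)))
```

Modeling the D4 stems with a single flag reflects that they are described as a pair: the 3’ stem at the 5’ end of the 3’ intron fragment and the 5’ stem at the 3’ end of the 5’ intron fragment.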
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from Cte-Syn1 D1.
- the D1-like sequence is at least 60% identical to Cte-Syn1 D1 (SEQ ID NO: 194) and comprises the following sequence elements of Cte-Syn1 D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIGs. 2B, 2C and 2G and Table 18).
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 70% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 80% identical to Cte-Syn1 D1 (SEQ ID NO: 194).
- the D1-like sequence is at least 85% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 90% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 95% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 98% identical to Cte-Syn1 D1 (SEQ ID NO: 194). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
- the 3’ intron fragment of the RNAs comprises a D5-like sequence, wherein the D5-like sequence is derived from Cte-Syn1.
- the D5-like sequence is at least 60% identical to Cte-Syn1 D5 and comprises the following sequence elements of Cte-Syn1 D5: the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ (as depicted in FIGs. 2B, 2C and 2G and Table 18).
- the D5-like sequence is at least 85% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 90% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 95% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 98% identical to Cte-Syn1 D5 (SEQ ID NO: 197). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the non-naturally occurring RNAs comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence.
- the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 70% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 80% identical to Cte-Syn1 D6 (SEQ ID NO: 198).
- the D6-like sequence is at least 85% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 90% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 95% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 98% identical to Cte-Syn1 D6 (SEQ ID NO: 198). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, and a Cte-Syn1 D3-like sequence, from 5’ to 3’ .
- the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment.
- both the 5’ and 3’ D4 stem-like sequences are derived from Cte-Syn1 D4.
- the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired.
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence and a Cte-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence and a 5’ Cte-Syn1 D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence, a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence and a Cte-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, a Cte-Syn1 D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence, a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, a Cte-Syn1 D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from Oi D1.
- the D1-like sequence is at least 60% identical to Oi D1 (SEQ ID NO: 159) and comprises the following sequence elements of Oi D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIG. 2E and Table 18).
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D1 (SEQ ID NO: 159).
- the D1-like sequence is at least 70% identical to Oi D1 (SEQ ID NO: 159).
- the D1-like sequence is at least 80% identical to Oi D1 (SEQ ID NO: 159).
- the D1-like sequence is at least 85% identical to Oi D1 (SEQ ID NO: 159).
- the 3’ intron fragment of the RNAs comprises a D5-like sequence, wherein the D5-like sequence is derived from Oi.
- the D5-like sequence is at least 60% identical to Oi D5 and comprises the following sequence elements of Oi D5: the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ (as depicted in FIG. 2E and Table 18).
- the D2-like sequence is at least 60% identical to Oi D2 and forms the stem-loop structure of Oi D2 (as depicted in FIG. 2E and Table 18).
- the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D2 (SEQ ID NO: 162).
- the D2-like sequence is at least 70% identical to Oi D2 (SEQ ID NO: 162).
- the D2-like sequence is at least 80% identical to Oi D2 (SEQ ID NO: 162).
- the D2-like sequence is at least 85% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 90% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 95% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 98% identical to Oi D2 (SEQ ID NO: 162).
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D3-like sequence is at least 85% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 90% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 95% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 98% identical to Oi D3 (SEQ ID NO: 163). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNA comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence and an Oi D2/D3-like sequence, from 5’ to 3’.
- RNA comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence, an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’.
- RNA comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence, an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2-like sequence, an Oi D3-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’.
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D1 (SEQ ID NO: 170).
- the D1-like sequence is at least 70% identical to Pli D1 (SEQ ID NO: 170).
- the D1-like sequence is at least 80% identical to Pli D1 (SEQ ID NO: 170).
- the D1-like sequence is at least 85% identical to Pli D1 (SEQ ID NO: 170).
- the D1-like sequence is at least 90% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 95% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 98% identical to Pli D1 (SEQ ID NO: 170).
- a person of ordinary skill in the art would understand that Pli can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
- the 3’ intron fragment of the RNAs comprises a D5-like sequence, wherein the D5-like sequence is derived from Pli.
- the D5-like sequence is at least 60% identical to Pli D5 and comprises the following sequence elements of Pli D5: the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ (as depicted in FIGs. 2B, 2C and 2J and Table 18).
- the non-naturally occurring RNAs comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence.
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine.
- the D6-like sequence is at least 60% identical to Pli D6 and comprises the following sequence elements of Pli D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2J and Table 18).
- the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Pli D6.
- the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D6 (SEQ ID NO: 180).
- the D6-like sequence is at least 70% identical to Pli D6 (SEQ ID NO: 180).
- the D6-like sequence is at least 80% identical to Pli D6 (SEQ ID NO: 180).
- the D6-like sequence is at least 85% identical to Pli D6 (SEQ ID NO: 180).
- the D6-like sequence is at least 90% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 95% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 98% identical to Pli D6 (SEQ ID NO: 180).
- a person of ordinary skill in the art would understand that Pli can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D2-like sequence is at least 80% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 85% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 90% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 95% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 98% identical to Pli D2 (SEQ ID NO: 176).
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D3-like sequence is at least 60% identical to Pli D3 and forms the stem-loop structure of Pli D3 (as depicted in FIGs. 2B, 2C and 2J and Table 18).
- the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D3 (SEQ ID NO: 177).
- the D3-like sequence is at least 70% identical to Pli D3 (SEQ ID NO: 177).
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence and a Pli D2/D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence and a Pli D2/D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, and a Pli D3-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence and a Pli D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2/Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence, a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2/Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence and a Pli D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, a Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence from 5’ to 3’ .
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from Pli-Syn1 D1.
- the D1-like sequence is at least 60% identical to Pli-Syn1 D1 (SEQ ID NO: 199) and comprises the following sequence elements of Pli-Syn1 D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIGs. 2B, 2C and 2K and Table 18).
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 70% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 80% identical to Pli-Syn1 D1 (SEQ ID NO: 199).
- the D1-like sequence is at least 85% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 90% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 95% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 98% identical to Pli-Syn1 D1 (SEQ ID NO: 199). A person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
- the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D5 (SEQ ID NO: 203).
- the D5-like sequence is at least 70% identical to Pli-Syn1 D5 (SEQ ID NO: 203).
- the D5-like sequence is at least 80% identical to Pli-Syn1 D5 (SEQ ID NO: 203).
- the D5-like sequence is at least 85% identical to Pli-Syn1 D5 (SEQ ID NO: 203).
- the D5-like sequence is at least 90% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 95% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 98% identical to Pli-Syn1 D5 (SEQ ID NO: 203).
- a person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the non-naturally occurring RNAs comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence.
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine.
- the D6-like sequence is at least 60% identical to Pli-Syn1 D6 and comprises the following sequence elements of Pli-Syn1 D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2K and Table 18).
- the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Pli-Syn1 D6.
- the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D6 (SEQ ID NO: 204).
- the D6-like sequence is at least 70% identical to Pli-Syn1 D6 (SEQ ID NO: 204).
- the D6-like sequence is at least 80% identical to Pli-Syn1 D6 (SEQ ID NO: 204).
- the D6-like sequence is at least 85% identical to Pli-Syn1 D6 (SEQ ID NO: 204).
- the D6-like sequence is at least 90% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 95% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 98% identical to Pli-Syn1 D6 (SEQ ID NO: 204).
- a person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAzymes comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence.
- the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence.
- the D2/D3-like sequence is a D2-like sequence.
- the D2/D3-like sequence is a D3-like sequence.
- the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- the D2-like sequence is at least 80% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 85% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 90% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 95% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 98% identical to Pli-Syn1 D2 (SEQ ID NO: 200).
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D3-like sequence is at least 60% identical to Pli-Syn1 D3 and forms the stem-loop structure of Pli-Syn1 D3 (as depicted in FIGs. 2B, 2C and 2K and Table 18).
- the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D3 (SEQ ID NO: 201).
- the D3-like sequence is at least 70% identical to Pli-Syn1 D3 (SEQ ID NO: 201).
- the D3-like sequence is at least 80% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 85% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 90% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 95% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 98% identical to Pli-Syn1 D3 (SEQ ID NO: 201). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a Pli-Syn1 D2/D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a Pli-Syn1 D2/D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, and a Pli-Syn1 D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, and a Pli-Syn1 D3-like sequence, from 5’ to 3’ .
- the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment.
- both the 5’ and 3’ D4 stem-like sequences are derived from Pli-Syn1 D4.
- the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired.
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence and a Pli-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a 5’ Pli-Syn1 D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence, a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence, a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2/Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence and a Pli-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, a Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence from 5’ to 3’ .
- the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D1 (SEQ ID NO: 184).
- the D1-like sequence is at least 70% identical to LtrB D1 (SEQ ID NO: 184).
- the D1-like sequence is at least 80% identical to LtrB D1 (SEQ ID NO: 184).
- the D1-like sequence is at least 85% identical to LtrB D1 (SEQ ID NO: 184).
- the D1-like sequence is at least 90% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 95% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 98% identical to LtrB D1 (SEQ ID NO: 184).
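Percent-identity thresholds such as those above can be checked computationally. The sketch below is illustrative only: the source does not specify an alignment algorithm or scoring scheme, so the Needleman-Wunsch global alignment, the match/mismatch/gap scores, the definition of identity as matches divided by alignment columns, and the function names `percent_identity` and `meets_identity_threshold` are all assumptions.

```python
def percent_identity(query: str, reference: str,
                     match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment (assumed scoring)."""
    if not query or not reference:
        raise ValueError("both sequences must be non-empty")
    n, m = len(query), len(reference)
    # score[i][j] = best score for aligning query[:i] with reference[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if query[i - 1] == reference[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and total columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            same = query[i - 1] == reference[j - 1]
            sub = match if same else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                matches += 1 if same else 0
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in the reference
        else:
            j -= 1  # gap in the query
    return 100.0 * matches / columns


def meets_identity_threshold(query: str, reference: str, threshold: float) -> bool:
    """True if the query is at least `threshold` percent identical to the reference."""
    return percent_identity(query, reference) >= threshold
```

For example, `meets_identity_threshold(variant, reference_d1, 90.0)` would test a candidate D1-like sequence against a reference D1 sequence supplied by the user; established alignment tools may report slightly different values depending on their gap penalties and identity definition.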
- a person of ordinary skill in the art would understand that LtrB can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
- the 3’ intron fragment of the RNAs comprises a D5-like sequence, wherein the D5-like sequence is derived from LtrB.
- the D5-like sequence is at least 60%identical to LtrB D5 and comprises the following sequence elements of LtrB D5: the catalytic triad, ⁇ ’ , ⁇ ’ , and ⁇ ’ (as depicted in FIGs. 2D and 2H and Table 18) .
- the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D5 (SEQ ID NO: 191).
- the D5-like sequence is at least 70% identical to LtrB D5 (SEQ ID NO: 191).
- the D5-like sequence is at least 80% identical to LtrB D5 (SEQ ID NO: 191).
- the D5-like sequence is at least 85% identical to LtrB D5 (SEQ ID NO: 191).
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine.
- the D6-like sequence is at least 60%identical to LtrB D6 and comprises the following sequence elements of LtrB D6: the bulged adenosine (as depicted in FIGs. 2D and 2H and Table 18) .
- the D6-like sequence lacks the GNRA tetraloop in the naturally occurring LtrB D6.
- the D6-like sequence is at least 90%identical to LtrB D6 (SEQ ID NO: 192) . In some embodiments, the D6-like sequence is at least 95%identical to LtrB D6 (SEQ ID NO: 192) . In some embodiments, the D6-like sequence is at least 98%identical to LtrB D6 (SEQ ID NO: 192) .
- a person of ordinary skill in the art would understand that LtrB can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAzymes comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence.
- the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence.
- the D2/D3-like sequence is a D2-like sequence.
- the D2/D3-like sequence is a D3-like sequence.
- the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- the D2-like sequence is at least 60% identical to LtrB D2 and forms the stem-loop structure of LtrB D2 (as depicted in FIGs. 2D and 2H and Table 18).
- the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D2 (SEQ ID NO: 188).
- the D2-like sequence is at least 70% identical to LtrB D2 (SEQ ID NO: 188).
- the D2-like sequence is at least 80% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 85% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 90% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 95% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 98% identical to LtrB D2 (SEQ ID NO: 188).
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D3-like sequence is at least 60% identical to LtrB D3 and forms the stem-loop structure of LtrB D3 (as depicted in FIGs. 2D and 2H and Table 18).
- the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D3 (SEQ ID NO: 189).
- the D3-like sequence is at least 70% identical to LtrB D3 (SEQ ID NO: 189).
- the D3-like sequence is at least 80% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 85% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 90% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 95% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 98% identical to LtrB D3 (SEQ ID NO: 189). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence and a LtrB D2/D3-like sequence, from 5’ to 3’ .
- the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment.
- both the 5’ and 3’ D4 stem-like sequences are derived from LtrB D4.
- the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired.
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence, a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence, a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2/LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’ .
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence and a LtrB D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2-like sequence, a LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence from 5’ to 3’ .
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from LtrB-Syn1 D1.
- the D1-like sequence is at least 60%identical to LtrB-Syn1 D1 (SEQ ID NO: 205) and comprises the following sequence elements of LtrB-Syn1 D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIGs. 2D and 2I and Table 18) .
- the D5-like sequence is at least 85% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 90% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 95% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 98% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). A person of ordinary skill in the art would understand that LtrB-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the D6-like sequence is at least 85% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 90% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 95% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 98% identical to LtrB-Syn1 D6 (SEQ ID NO: 210).
- a person of ordinary skill in the art would understand that LtrB-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence.
- the D2/D3-like sequence is a D2-like sequence.
- the D2/D3-like sequence is a D3-like sequence.
- the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- the D2-like sequence is at least 80%identical to LtrB-Syn1 D2 (SEQ ID NO: 206) . In some embodiments, the D2-like sequence is at least 85%identical to LtrB-Syn1 D2 (SEQ ID NO: 206) . In some embodiments, the D2-like sequence is at least 90%identical to LtrB-Syn1 D2 (SEQ ID NO: 206) . In some embodiments, the D2-like sequence is at least 95%identical to LtrB-Syn1 D2 (SEQ ID NO: 206) .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, and a LtrB-Syn1 D3-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, and a LtrB-Syn1 D3-like sequence, from 5’ to 3’ .
- the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment.
- both the 5’ and 3’ D4 stem-like sequences are derived from LtrB-Syn1 D4.
- the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired.
- a person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence and a LtrB-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence and a 5’ LtrB-Syn1 D4 stem-like sequence from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence, a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence, a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, a LtrB-Syn1 D3-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence, from 5’ to 3’ .
- the 5’ intron fragment of the RNAs comprises a D1-like sequence, wherein the D1-like sequence is derived from a group II intron listed in Tables 25-33.
- the D1-like sequence is at least 60%identical to the D1 of the group II intron and comprises the following sequence elements of the D1: ⁇ , ⁇ , ⁇ ’ , ⁇ , ⁇ , ⁇ ’ , B’ , ⁇ , EBS1, Stem 2, ⁇ ’ and EBS3 (as depicted in FIGs. 2A-2E and Table 18) .
- a person of ordinary skill in the art would understand that this group II intron can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
- the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine.
- the D6-like sequence is at least 60%identical to the D6 of a group II intron listed in Tables 25-33 and comprises the following sequence elements of the D6: the bulged adenosine (as depicted in FIGs. 2A-2E and Table 18) .
- the D6-like sequence lacks the GNRA tetraloop in the naturally occurring D6.
- a person of ordinary skill in the art would understand that the group II intron can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
- RNAzymes comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, wherein the D1-like sequence, the D5-like sequence and the D6-like sequence are all derived from a group II intron listed in Tables 25-33.
- the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence.
- the D2/D3-like sequence is a D2-like sequence.
- the D2/D3-like sequence is a D3-like sequence.
- the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2/D3-like sequence, and a 5’ D4 stem-like sequence from 5’ to 3’ , wherein the D1-like sequence, the D2/D3-like sequence, the D5-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
- the 3’ intron fragment of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 42. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 43.
- the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 47. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 48. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 49.
- the 3’ intron fragment consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 42.
- the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 43. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 44. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 45. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 46. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 47. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 48. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 49.
- the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 50. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 51. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 52. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 228.
- the 5’ intron fragment of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
- the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 75.
- the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 76.
- the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 77. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 78. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 79.
- the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 80. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 81. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 82.
- the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 86. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 87. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 88. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 229.
- the 5’ intron fragment consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229. In some embodiments, the 5’ intron fragment consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 75.
- the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 76. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 77. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 78. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 79. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 80. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 81.
- IBS3 (or IBS3’ ) , the region of a corresponding length of EBS3 or EBS3’ , which either flanks a target sequence (IBS3) or is within the target sequence (IBS3’ ) , optionally with its down sequence is selected from the group consisting of: (a) SEQ ID NO: 131, (b) SEQ ID NO: 132, (c) SEQ ID NO: 133, and (d) SEQ ID NO: 134.
- the ⁇ nucleotide and the ⁇ upstream comprises a nucleotide sequence selected from the group consisting of: (a) SEQ ID NO: 127, (b) SEQ ID NO: 128, (c) SEQ ID NO: 129, and (d) SEQ ID NO: 130.
- the 5’ homology arm comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105.
- the target sequences can comprise fragments of any sequence desired to be prepared into a circRNA.
- the term “resulting target sequence” refers to the target sequence as it is formed in the circRNA upon self-splicing of the RNAs (or cRNAzymes) provided herein.
- the 3’ -end of the 5’ target sequence fragment is linked to the 5’ -end of the 3’ target sequence fragment (FIG. 9) .
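The relationship between the permuted order of the target sequence fragments in the linear RNA and the resulting target sequence in the circRNA can be sketched in code. This is an illustration under stated assumptions, not the disclosed method: the class name `Precursor`, its field names, the helper `is_rotation`, and the toy nucleotide strings used in the usage note are hypothetical, and the model represents only the order of elements, not the splicing chemistry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Linear RNA elements in 5'->3' order (hypothetical model for illustration)."""
    three_prime_intron: str           # 3' intron fragment, located at the 5' end
    three_prime_target_fragment: str  # 3' target sequence fragment
    five_prime_target_fragment: str   # 5' target sequence fragment
    five_prime_intron: str            # 5' intron fragment, located at the 3' end

    def linear_sequence(self) -> str:
        """Concatenate the elements in the order recited for the linear RNA."""
        return (self.three_prime_intron + self.three_prime_target_fragment
                + self.five_prime_target_fragment + self.five_prime_intron)

    def resulting_target_sequence(self) -> str:
        """Join the 3' end of the 5' target fragment to the 5' end of the 3' fragment."""
        return self.five_prime_target_fragment + self.three_prime_target_fragment


def is_rotation(a: str, b: str) -> bool:
    """True if two strings describe the same circular sequence with different start points."""
    return len(a) == len(b) and a in b + b
```

For instance, with toy fragments `Precursor("GAGC", "AUGGCC", "CCACC", "GUGCG")`, the resulting target sequence is `CCACCAUGGCC`, and the target portion of the linear RNA (`AUGGCCCCACC`) is a rotation of it, reflecting that the circle can be read from either junction.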
- the resulting target sequence can be a coding sequence, or a noncoding sequence, or a combination thereof.
- the resulting target sequence can comprise an expression construct, or an expression cassette.
- an “expression construct” or “expression cassette” means a nucleotide sequence that directs translation.
- An expression construct includes, at a minimum, one or more transcriptional control elements (such as a translation initiation site, an internal ribosome entry site (IRES), or a structure functionally equivalent thereof) that direct protein translation in one or more desired cell types, tissues or organs.
- An expression construct can also include coding sequence encoding the desired expression product. Additional elements, such as a transcription termination signal, can also be included.
- the resulting target sequence consists of an expression construct.
- the resulting target sequence comprises an expression cassette.
- the resulting target sequence comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 10A (a) - (b) , 10 (a) - (b) , 11A (a) - (b) and 11B (a) - (b) ) .
- the resulting circRNA comprises the resulting target sequence.
- the resulting circRNA consists of the resulting target sequence.
- the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI (FIGs. 10A (a) - (b) ) .
- the 3’ target sequence fragment further comprises one or two linkers flanking Z1 (FIGs. 10B (a) - (b) ) .
- circRNAs produced by the self-splicing of the RNAs provided herein comprise TI and Z1, wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 10A (a) - (b) ) .
- the Z1 is flanked by one or two linkers (FIGs. 10B (a) - (b) ) .
- the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1 (FIGs. 11A (a) - (b) ) .
- the 3’ target sequence fragment further comprises two linkers (L) flanking TI (FIGs. 11B (a) - (b) ) .
- circRNAs produced by the self-splicing of the RNAs provided herein comprise TI and Z1, wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 11A (a) - (b) ) .
- the TI is flanked by one or two linkers (FIGs. 11B (a) - (b) ) .
- the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
- the translation initiation sequence TI of the RNAs described herein can be segmented into a 5’ fragment (TI A ) and a 3’ fragment (TI B ) .
- the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TI B ) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TI A ) (FIGs. 12A (a) - (b) ) .
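The split-TI arrangement can be illustrated with a short sketch showing that TI is reconstituted across the splice junction and ends up immediately upstream of Z1. The function names `split_ti` and `arrange_split_ti`, the cut position, and the toy sequences in the usage note are hypothetical and chosen only for illustration; the source does not specify where TI is divided.

```python
from typing import Dict, Tuple


def split_ti(ti: str, cut: int) -> Tuple[str, str]:
    """Split TI into a 5' fragment (TI A) and a 3' fragment (TI B) at an assumed position."""
    if not 0 < cut < len(ti):
        raise ValueError("cut must fall strictly inside TI")
    return ti[:cut], ti[cut:]


def arrange_split_ti(ti: str, z1: str, cut: int) -> Dict[str, str]:
    """Place TI B + Z1 in the 3' target fragment and TI A in the 5' target fragment."""
    ti_a, ti_b = split_ti(ti, cut)
    three_prime_fragment = ti_b + z1  # 5'->3': TI B, then Z1
    five_prime_fragment = ti_a        # TI A
    return {
        # Order of the target sequence within the linear RNA.
        "linear_target": three_prime_fragment + five_prime_fragment,
        # After joining the 3' end of the 5' fragment to the 5' end of the 3' fragment.
        "resulting_target": five_prime_fragment + three_prime_fragment,
    }
```

With the toy inputs `arrange_split_ti("CCACCAUGG", "GCCUAA", 4)`, the resulting target sequence begins with the intact TI followed directly by Z1, even though the linear RNA carries TI B and TI A at opposite ends of the target sequence.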
- the infectious agent is associated with humans, non-human primates, or other animals, such as birds, pigs, horses, dogs, cats, rabbits, mice, rats, cows, sheep, goats, and deer.
- antigen-binding fragments include, but are not limited to, single-domain antibodies (variable domain of heavy chain antibodies (VHHs) or nanobodies), Fabs, F(ab’)2s, and scFvs (single-chain variable fragments).
- the resulting target sequence disclosed herein can be codon-optimized, for example, via any codon-optimization technique known to one of skill in the art (see, e.g., review by Quax et al., 2015, Mol. Cell 59: 149-161) .
- a codon optimized sequence can be one in which codons in a polynucleotide encoding a therapeutic product have been substituted in order to increase the expression, stability and/or activity of the therapeutic product.
- Factors that influence codon optimization include, but are not limited to one or more of: (i) variation of codon biases between two or more organisms or genes or synthetically constructed bias tables, (ii) variation in the degree of codon bias within an organism, gene, or set of genes, (iii) systematic variation of codons including context, (iv) variation of codons according to their decoding tRNAs, (v) variation of codons according to GC %, either overall or in one position of the triplet, (vi) variation in degree of similarity to a reference sequence for example a naturally occurring sequence, (vii) variation in the codon frequency cutoff, (viii) structural properties of mRNAs transcribed from the DNA sequence, (ix) prior knowledge about the function of the DNA sequences upon which design of the codon substitution set is to be based, and/or (x) systematic variation of codon sets for each amino acid.
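One of the listed factors, GC % either overall or at one position of the triplet, is straightforward to quantify. The sketch below is a minimal illustration; the function name `gc_percent` and its `position` parameter are assumptions and do not correspond to any tool named in the source.

```python
from typing import Optional


def gc_percent(coding_sequence: str, position: Optional[int] = None) -> float:
    """GC percentage of an in-frame coding sequence.

    position=None scores every base; position=1, 2 or 3 scores only that
    position of each codon (for example, 3 gives the third-position GC %).
    """
    seq = coding_sequence.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of three bases long")
    if position is None:
        bases = seq
    elif position in (1, 2, 3):
        bases = seq[position - 1::3]  # every third base, starting at that codon position
    else:
        raise ValueError("position must be None, 1, 2 or 3")
    return 100.0 * sum(base in "GC" for base in bases) / len(bases)
```

Comparing `gc_percent(seq)` with `gc_percent(seq, position=3)` before and after synonymous substitution gives a simple readout of how a candidate codon set shifts GC content; a full optimization would also weigh the other factors listed above.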
- a codon optimized polynucleotide can minimize ribozyme collisions and/or limit
- the resulting target sequence can be a noncoding sequence.
- the noncoding sequence is selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an IRES, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a decoy, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
- the resulting target sequence comprises an aptamer sequence.
- the resulting target sequence encodes a single-stranded DNA or RNA (ssDNA or ssRNA) molecule that can selectively bind to a specific target, including proteins, peptides, carbohydrates, small molecules, toxins, and even live cells.
- the resulting target sequence encodes a ribozyme.
- the resulting target sequence encodes an antisense oligonucleotide (ASO), which binds sequence-specifically to the target RNA and modulates protein expression through several different mechanisms.
- the resulting target sequence encodes a decoy, which is a short stretch of sequence that is identical or homologous to miRNA-binding sites or protein binding sites in endogenous targets.
- the resulting target sequence encodes an RNA scaffold, which is an RNA sequence designed to co-localize enzymes in engineered biological pathways in vivo through interactions between the scaffold’s protein docking domains and their affinity protein-enzyme fusions.
- control elements refers collectively to promoter regions, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (IRES) , enhancers, splice junctions, and the like, which collectively provide for the replication, transcription, post-transcriptional processing, and translation of a coding sequence in a recipient cell. Not all of these control elements need to be present so long as the selected coding sequence is capable of being replicated, transcribed, and translated in an appropriate host cell.
- promoter refers to a nucleotide region comprising a DNA regulatory sequence, wherein the regulatory sequence is derived from a gene that is capable of binding to an RNA polymerase and allowing for the initiation of transcription of a downstream (3' direction) coding sequence. It may contain genetic elements at which regulatory proteins and molecules may bind, such as RNA polymerase and other transcription factors, to initiate the specific transcription of a nucleic acid sequence.
- “operatively positioned” and “operatively linked” mean that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence to control transcriptional initiation and/or expression of that sequence, which is “under control” and “under transcriptional control” of the promoter.
- the term “enhancer” as used herein and understood in the art means a nucleic acid sequence that, when positioned proximate to a promoter, confers increased transcription activity relative to the transcription activity resulting from the promoter in the absence of the enhancer domain.
- the resulting target sequence comprises a translation initiation element, or TI.
- the translation initiation element (or TI) is an IRES, or an IRES-like sequence.
- IRES internal ribosome entry site
- IRES internal ribosome entry site sequence
- IRES sequence region refers to cis elements of viral or human cellular RNAs (e.g., messenger RNA (mRNA) and/or circRNAs) that bypass the steps of canonical eukaryotic cap-dependent translation initiation.
- IRESs attract a ribosome (e.g., a eukaryotic ribosome) to form a translation initiation complex and promote translation initiation.
- IRESs typically comprise a long and highly structured 5'-UTR which mediates binding of the translation initiation complex and catalyzes the formation of a functional ribosome.
- IRES sequences include sequences derived from or isolated from a wide variety of viruses, such as from leader sequences of picornaviruses such as the encephalomyocarditis virus (EMCV) UTR, the polio leader sequence, the hepatitis A virus leader sequence, the hepatitis C virus IRES, human rhinovirus type 2 IRES, an IRES element from the foot and mouth disease virus, a giardiavirus IRES, and the like.
- EMCV encephalomyocarditis virus
- the naturally occurring IRES sequence is isolated or derived from an IRES sequence of Taura syndrome virus, Triatoma virus, Theiler's encephalomyelitis virus, simian virus 40, Solenopsis invicta virus 1, Rhopalosiphum padi virus, reticuloendotheliosis virus, human poliovirus 1, Plautia stali intestine virus, Kashmir bee virus, human rhinovirus 2, Homalodisca coagulata virus-1, human immunodeficiency virus type 1, Himetobi P virus, hepatitis C virus, hepatitis A virus, hepatitis A virus HA 16, hepatitis GB virus, foot and mouth disease virus, human enterovirus 71, equine rhinitis virus, Ectropis obliqua picorna-like virus, encephalomyocarditis virus, Drosophila C virus, human coxsackievirus B3, crucifer tobamovirus, cricket paralysis
- an IRES sequence is isolated or derived from a cellular IRES element selected from AML1/RUNX1, Antp-D, Antp-DE, Antp-CDE, AT1R_var1, AT1R_var2, AT1R_var3, AT1R_var4, BAG1_p36delta236nt, BAG1_p36, BiP_-222_-3, c-IAP1_285-1399, c-IAP1_1313-1462, c-jun, Cat-1_224, CCND1, eIF4GI-ext, eIF4GII, eIF4GII-long, FGF1A, FMR1, Gtx-133-141, Gtx-1-166, Gtx-1-120, Gtx-1-196, HAP4, HIF1a, hSNM1, Hsp101, hsp70, hsp70, Hsp90, IGF2_leader2, L-myc, MNT
- the IRES sequence comprises a sequence isolated or derived from a natural IRES sequence.
- the term “IRES-like sequence” or “Internal Ribosome Entry Site-like sequence” refers to non-naturally occurring nucleotide sequences that display a function of a naturally occurring IRES.
- the IRES-like sequence can recruit ribosomal components to mediate cap-independent translation.
- An IRES-like sequence may be identified by methods known in the art, such as in PCT application No. PCT/CN2022/095949.
- the IRES-like sequence is greater than or equal to 3 nucleic acid residues in length. In some embodiments, the IRES-like sequence is 3-300 nucleic acid residues in length. In some embodiments, the IRES-like sequence is 3-200, 4-200, 5-200, 6-200, 7-200, 3-100, 4-100, 5-100, 6-100, 7-100, 3-50, 4-50, 5-50, 6-50, 7-50, 3-40, 4-40, 5-40, 6-40, 7-40, 3-30, 4-30, 5-30, 6-30, 7-30, 3-20, 4-20, 5-20, 6-20, 7-20 nucleic acid residues in length.
- RNAs (or cRNAzymes) provided herein comprise a linker sequence (L).
- the linker sequence is 3-300 nucleic acid residues in length. In some embodiments the linker sequence is about 3-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, 175-200, 200-225, 225-250, 250-275 or 275-300 nucleic acid residues in length. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is an integer selected from 1-100.
- Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 219. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 220. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 221.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1, TI, E1, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 232).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, TI, Z1, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 253).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc.
- provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 253.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, TIB, Z1, TIA, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 256).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, Z1, linker, TI, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 258).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, TI, linker, Z1, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 261).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc.
- RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1B, linker, TI, linker, Z1A, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 262).
- the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP.
- RNAs or cRNAzymes having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 262.
- the vectors are DNA vectors.
- vector or “construct” refers to a vehicle that is used to carry genetic material (e.g., a nucleotide sequence) , which can be introduced into a host cell, where it can be replicated and/or expressed.
- vectors can be used, including, for example, expression vectors, plasmids, phage vectors, viral vectors, episomes and artificial chromosomes, which can include selection sequences or markers operable for stable integration into a host cell’s chromosome.
- a “plasmid” is a common type of a vector, which is an extra-chromosomal DNA molecule separate from the chromosomal DNA that is capable of replicating independently of the chromosomal DNA. In certain cases, it is circular and double-stranded.
- Exemplary artificial chromosomes include yeast artificial chromosome (YAC), bacterial artificial chromosome (BAC), and P1-derived artificial chromosome (PAC).
- Exemplary bacteriophages include lambda phage and M13 phage.
- Examples of categories of animal viruses useful as vectors include, without limitation, retrovirus (including lentivirus) , adenovirus, adeno-associated virus (AAV) , herpesvirus (e.g., herpes simplex virus) , poxvirus, baculovirus, papillomavirus, and papovavirus (e.g., SV40) .
- expression vectors are pCIneo vectors (Promega) for expression in mammalian cells; pLenti4/V5-DEST™, pLenti6/V5-DEST™, and pLenti6.2/V5-GW/lacZ (Invitrogen) for lentivirus-mediated gene transfer and expression in mammalian cells.
- Exemplary AAV serotypes include AAV1, AAV2, AAV4, AAV5, AAV6, AAV8, and AAV9.
- the vector is engineered to harbor the sequence coding for the origin of DNA replication or “ori” from a lymphotrophic herpes virus or a gamma herpesvirus, an adenovirus, SV40, a bovine papilloma virus, or a yeast, specifically a replication origin of a lymphotrophic herpes virus or a gamma herpesvirus corresponding to oriP of EBV.
- the lymphotrophic herpes virus may be Epstein Barr virus (EBV) , Kaposi's sarcoma herpes virus (KSHV) , Herpes virus saimiri (HS) , or Marek's disease virus (MDV) .
- Epstein Barr virus (EBV) and Kaposi's sarcoma herpes virus (KSHV) are also examples of a gamma herpesvirus.
- the host cell comprises the viral replication transactivator protein that activates the replication.
- vectors can include one or more selectable marker genes and appropriate expression control sequences.
- Selectable marker genes that can be included, for example, provide resistance to antibiotics or toxins, complement auxotrophic deficiencies, or supply critical nutrients not in the culture media.
- “Expression control sequences,” “control elements,” or “regulatory sequences” present in an expression vector are those non-translated regions of the vector (origin of replication, selection cassettes, promoters, enhancers, translation initiation signals (Shine-Dalgarno sequence or Kozak sequence), introns, a polyadenylation sequence, and 5' and 3' untranslated regions) which interact with host cellular proteins to carry out transcription and translation. Such elements can vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including ubiquitous promoters and inducible promoters, can be used.
- inducible promoters/systems include, but are not limited to, steroid-inducible promoters such as promoters for genes encoding glucocorticoid or estrogen receptors (inducible by treatment with the corresponding hormone), metallothionein promoter (inducible by treatment with various heavy metals), MX-1 promoter (inducible by interferon), the “GeneSwitch” mifepristone-regulatable system (Sirin et al., 2003, Gene, 323: 67), the cumate inducible gene switch (WO 2002/088346), tetracycline-dependent regulatory systems, etc.
- the vectors provided herein can be made using standard techniques of molecular biology.
- the various elements of the vectors provided herein can be obtained using recombinant methods, such as by screening cDNA and genomic libraries from cells, or by deriving the polynucleotides from a vector known to include the same.
- the various elements of the vectors provided herein can also be produced synthetically, rather than cloned, based on the known sequences.
- the complete sequence can be assembled from overlapping oligonucleotides prepared by standard methods. See, e.g., Edge, Nature (1981) 292: 756; Nambair et al., Science (1984) 223: 1299; and Jay et al., J. Biol. Chem. (1984) 259: 6311.
- nucleotide sequences can be obtained from vectors harboring the desired sequences or synthesized completely, or in part, using various oligonucleotide synthesis techniques known in the art, such as site-directed mutagenesis and polymerase chain reaction (PCR) techniques where appropriate.
- PCR polymerase chain reaction
- One method of obtaining nucleotide sequences encoding the desired vector elements is by annealing complementary sets of overlapping synthetic oligonucleotides produced in a conventional, automated polynucleotide synthesizer, followed by ligation with an appropriate DNA ligase and amplification of the ligated nucleotide sequence via PCR. See, e.g., Jayaraman et al., Proc. Natl. Acad. Sci.
- oligonucleotide-directed synthesis Jones et al., Nature (1986) 54: 75-82
- oligonucleotide directed mutagenesis of preexisting nucleotide regions Riechmann et al., Nature (1988) 332: 323-327 and Verhoeyen et al., Science (1988) 239: 1534-1536
- enzymatic filling-in of gapped oligonucleotides using T4 DNA polymerase Queen et al., Proc. Natl. Acad. Sci. USA (1989) 86: 10029-10033
- RNAs (or cRNAzymes) provided herein can be generated by incubating a vector provided herein under conditions permissive of transcription of the RNAs encoded by the vector.
- RNAs (or cRNAzymes) provided herein can be synthesized by incubating a vector provided herein that comprises an RNA polymerase promoter upstream of its 5’ duplex forming region and/or expression sequence with a compatible RNA polymerase enzyme under conditions permissive of in vitro transcription.
- the vector is incubated inside of a cell by a bacteriophage RNA polymerase or in the nucleus of a cell by host RNA polymerase II.
- RNAs or cRNAzymes provided herein by performing in vitro transcription using a vector provided herein as a template (e.g., a vector provided herein with an RNA polymerase promoter positioned upstream of the 5’ homology region) .
- the resulting RNAs (or cRNAzymes) can be used to generate circular RNA.
- circRNAs prepared by the self-splicing of the RNAs (cRNAzymes) disclosed herein.
- the circRNAs provided herein have higher functional stability than mRNA comprising the same expression sequence.
- the circRNAs provided herein have higher functional stability than mRNA comprising the same expression sequence, 5moU modifications, an optimized UTR, a cap, and/or a polyA tail.
- the circRNAs provided herein have a functional half-life of at least 5 hours, 10 hours, 15 hours, 20 hours, 30 hours, 40 hours, 50 hours, 60 hours, 70 hours or 80 hours. In some embodiments, the circRNAs provided herein have a functional half-life of 5-80, 10-70, 15-60, and/or 20-50 hours. In some embodiments, the circRNAs provided herein have a functional half-life greater than (e.g., at least 1.5-fold greater than, at least 2-fold greater than) that of an equivalent linear RNA encoding the same protein. In some embodiments, functional half-life can be assessed through the detection of functional protein synthesis.
- the circRNAs provided herein comprise one or more expression sequences and are configured for persistent expression in a cell of a subject in vivo.
- the circRNA is configured such that expression of the one or more expression sequences in the cell at a later time point is equal to or higher than at an earlier time point.
- the expression of the one or more expression sequences can be either maintained at a relatively stable level or can increase over time. The expression of the expression sequences can be relatively stable for an extended period of time.
- the expression of the one or more expression sequences in the cell over a time period of at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days does not decrease by 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5%.
- the expression of the one or more expression sequences in the cell is maintained at a level that does not vary by more than 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5% for at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days.
- the circRNAs provided herein have a higher magnitude of expression than equivalent linear mRNA, e.g., a higher magnitude of expression 24 hours after administration of RNA to cells.
- the circRNAs provided herein have a higher magnitude of expression than mRNA comprising the same expression sequence, 5moU modifications, an optimized UTR, a cap, and/or a polyA tail.
- the circRNAs provided herein have higher stability than an equivalent linear mRNA. In some embodiments, this can be shown by measuring receptor presence and density in vitro or in vivo post electroporation, with time points measured over 1 week. In some embodiments, this can be shown by measuring RNA presence via qPCR or ISH.
- the circRNAs disclosed herein can be of any length or size. In some embodiments the circRNA is between 300 and 10000, 400 and 9000, 500 and 8000, 600 and 7000, 700 and 6000, 800 and 5000, 900 and 5000, 1000 and 5000, 1100 and 5000, 1200 and 5000, 1300 and 5000, 1400 and 5000, and/or 1500 and 5000 nucleotides in length.
- the circRNAs disclosed herein can be at least 300 nt, 400 nt, 500 nt, 600 nt, 700 nt, 800 nt, 900 nt, 1000 nt, 1100 nt, 1200 nt, 1300 nt, 1400 nt, 1500 nt, 2000 nt, 2500 nt, 3000 nt, 3500 nt, 4000 nt, 4500 nt, or 5000 nt in length.
- the circRNA is no more than 3000 nt, 3500 nt, 4000 nt, 4500 nt, 5000 nt, 6000 nt, 7000 nt, 8000 nt, 9000 nt, or 10000 nt in length.
- circRNAs disclosed herein can be about 300 nt, 400 nt, 500 nt, 600 nt, 700 nt, 800 nt, 900 nt, 1000 nt, 1100 nt, 1200 nt, 1300 nt, 1400 nt, 1500 nt, 2000 nt, 2500 nt, 3000 nt, 3500 nt, 4000 nt, 4500 nt, 5000 nt, 6000 nt, 7000 nt, 8000 nt, 9000 nt, or 10000 nt in length.
- circRNAs disclosed herein can be at least 500 nucleotides in length, at least 1000 nucleotides in length, or at least 1500 nucleotides in length.
- the circRNAs are scarless. In some embodiments, the circRNAs are near-scarless. As understood in the art, near-scarless circRNAs, and especially scarless circRNAs, are less immunogenic than their counterparts that have a larger scar (i.e., extra sequence besides the target sequence), which makes them better suited for therapeutic uses.
- the circRNAs provided herein have modified RNA nucleotides and/or modified nucleosides.
- the modified nucleoside is m⁵C (5-methylcytidine).
- the modified nucleoside is m⁵U (5-methyluridine).
- the modified nucleoside is m⁶A (N⁶-methyladenosine).
- the modified nucleoside is s²U (2-thiouridine).
- the modified nucleoside is Ψ (pseudouridine).
- the modified nucleoside is Um (2'-O-methyluridine).
- the modified nucleoside is m¹A (1-methyladenosine); m²A (2-methyladenosine); Am (2'-O-methyladenosine); ms²m⁶A (2-methylthio-N⁶-methyladenosine); i⁶A (N⁶-isopentenyladenosine); ms²i⁶A (2-methylthio-N⁶-isopentenyladenosine); io⁶A (N⁶-(cis-hydroxyisopentenyl)adenosine); ms²io⁶A (2-methylthio-N⁶-(cis-hydroxyisopentenyl)adenosine); g⁶A (N⁶-glycinylcarbamoyladenosine); t⁶A (N⁶-threonylcarbamoyladenosine); ms
- G (1-methylguanosine); m²G (N²-methylguanosine); m⁷G (7-methylguanosine); Gm (2'-O-methylguanosine); m²₂G (N²,N²-dimethylguanosine); m²Gm (N²,2'-O-dimethylguanosine); m²₂Gm (N²,N²,2'-O-trimethylguanosine); Gr(p) (2'-O-ribosylguanosine (phosphate)); yW (wybutosine); o2yW (peroxywybutosine); OHyW (hydroxywybutosine); OHyW* (undermodified hydroxywybutosine); imG (wyosine); mimG (methylwyosine); Q
- the modified nucleoside can include a compound selected from the group of: pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pse
- the circRNAs provided herein are produced by self-splicing of the RNAs (or cRNAzymes) provided herein, which have group II intron activity.
- methods of making circRNAs comprising incubating the RNAs (or cRNAzymes) provided herein under conditions suitable for circularization (self-splicing) .
- RNAs (or cRNAzymes) provided herein are produced by transcription using a vector provided herein as a template.
- RNAs (or cRNAzymes) provided herein are produced by run-off transcription.
- RNAs (or cRNAzymes) provided herein are produced by in vitro transcription.
- the self-splicing buffer can also comprise 10 mM to 100 mM, such as 10 mM, 20 mM, 30 mM, 40 mM, 50 mM, 60 mM, 70 mM, 80 mM, 90 mM, and 100 mM NaCl.
- the self-splicing reaction is performed in vitro for about 5 min to about 1 h, such as about 5 min, about 10 min, about 15 min, about 20 min, about 25 min, about 30 min, about 35 min, about 40 min, about 45 min, about 50 min, about 55 min, and about 1 h.
- the self-splicing reaction is performed at a temperature between 20 and 60 °C, between 20 and 50 °C, between 20 and 40 °C, between 20 and 30 °C, between 30 and 40 °C, between 40 and 50 °C, or between 50 and 60 °C.
- the precursor RNAs disclosed herein are capable of achieving a circularization rate of at least 30%, such as a circularization rate of at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, and at least 95%.
- circRNAs prepared by self-splicing the RNAs (or cRNAzymes) disclosed herein.
- the circRNAs disclosed herein are purified. Purification includes, but is not limited to, the removal of non-circularized linear RNAs, dsRNAs, and other unwanted components.
- the circRNAs disclosed herein are purified before being transfected into cells. The phosphate groups at both ends of a linear RNA and some dsRNAs might activate the RIG-I signaling pathway, and the immune response resulting from RIG-I signaling can lead to the degradation of exogenous RNAs, thus affecting the function of circular RNAs.
- the methods of purification can include any of the following: enzymatic treatment; chromatography, including but not limited to affinity column chromatography, reversed-phase silica gel column liquid chromatography, gel filtration chromatography, and gel exclusion liquid chromatography; and electrophoresis, including but not limited to gel electrophoresis such as agarose gel electrophoresis, and capillary electrophoresis; and any combination thereof.
- Methods for removing linear RNAs include, for example, enzymatic treatment, such as treatment with RNase R; and chromatography, such as high performance liquid chromatography (HPLC).
- Methods for removing terminal phosphate groups include, for example, treatment with alkaline phosphatases, such as calf intestinal alkaline phosphatase (CIP).
- purification comprises one or more of the following steps: phosphatase treatment, HPLC size exclusion purification, and RNase R digestion. In some embodiments, purification comprises the following steps in order: RNase R digestion, phosphatase treatment, and HPLC size exclusion purification. In some embodiments, purification comprises reverse phase HPLC. In some embodiments, a purified composition contains less double stranded RNA, DNA splints, triphosphorylated RNA, phosphatase proteins, protein ligases, capping enzymes and/or nicked RNA than unpurified RNA.
- the RNAs (or cRNAzymes) provided herein comprise a purification tag that is removed during self-splicing, which can be used for negative selection of the circRNAs.
- a sample containing the circRNAs such as the product of splicing reaction starting from the tagged linear precursors, can be mixed with a probe that is immobilized on a solid surface, wherein the tag-containing precursors or any other tag-containing impurities can bind to the probe and be removed from the solution, resulting in a circRNA solution substantially free from the precursor, intron and any other tag-containing impurities.
- the purification tag is a 15-40 nt polynucleotide.
- a purification matrix includes, for example, magnetic resin or beads, silicone resin, Sephadex resin, affinity resin, nanoparticles, and nanomaterial surface or coated surfaces.
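The functional half-life comparison mentioned in the list above (for example, a circRNA half-life at least 1.5-fold that of an equivalent linear RNA) can be illustrated with a small calculation. The following is only a sketch under stated assumptions: it assumes that a reporter signal reflecting functional protein synthesis decays as a single exponential, and it fits ln(signal) against time by ordinary least squares. The function names and the example measurements are hypothetical and are not taken from the application.

```python
import math
from typing import Sequence


def functional_half_life(times_h: Sequence[float], signals: Sequence[float]) -> float:
    """Estimate a half-life in hours from reporter signals (hypothetical helper).

    Assumes single-exponential decay, so ln(signal) is linear in time and
    half-life = ln(2) / decay rate.
    """
    if len(times_h) != len(signals) or len(times_h) < 2:
        raise ValueError("need at least two matching time/signal points")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take logarithms")
    logs = [math.log(s) for s in signals]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    # Least-squares slope of ln(signal) versus time.
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs)) / sxx
    if slope >= 0:
        raise ValueError("signal does not decay over the measured window")
    return math.log(2) / -slope


def fold_over_linear(circ_half_life_h: float, linear_half_life_h: float) -> float:
    """Ratio of circRNA half-life to linear RNA half-life."""
    return circ_half_life_h / linear_half_life_h
```

With made-up readings such as `functional_half_life([0, 20, 40], [100, 50, 25])`, the estimate is approximately 20 hours; real expression data may not follow a single exponential, in which case a different model would be needed.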
Abstract
Provided herein are constructs and methods for preparing circular RNAs and uses thereof. In particular, provided herein are modified group II introns and novel constructs with group II intron self-splicing activity, as well as their uses in the preparation of circular RNAs. Uses of the resulting circRNAs are also provided. Related compositions and systems are also provided herein.
Description
1. Related applications
The application claims priority to, and the benefit of PCT Application No. PCT/CN2023/123553, filed on October 9, 2023, the contents of which are incorporated herein by reference in their entirety.
2. Incorporation by reference of sequence listing
The contents of the electronic sequence listing (TFG00911PCT-Sequence listing.xml; Size: 341,593 bytes; and Date of Creation: October 8, 2024) are herein incorporated by reference in their entirety.
The present invention relates to the field of molecular biology, in particular to constructs and methods for preparing circular RNAs and uses of the circular RNAs in, for example, expressing a protein of interest in a eukaryotic cell or functioning as noncoding RNA.
Circular RNAs (circRNAs) are a category of RNA molecules formed by head-to-tail ligation, which were demonstrated to have multiple biological functions in recent years. (Yang et al., Cell Research, 27 (5) : 626-641 (2017) ; Abe et al., Scientific Reports, 5: 16435 (2015) ; Gao et al., Nature Cell Biology, 23 (3) : 278-291 (2021) ; Pamudurti et al., Molecular Cell, 66 (1) : 9-21 (2017) ) . Compared with linear RNAs, circRNAs have better stability and therefore provide a promising new platform for RNA drugs.
Despite the great therapeutic potential and recent progress in the field, efficient and reliable methods for preparing circular RNAs are still lacking. The compositions, methods and systems provided herein address this need and provide related advantages.
Provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment. In some embodiments, the circRNA consists of the target sequence.
In some embodiments, the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1), wherein the 3’-end of TI is operatively linked to the 5’-end of Z1. In some embodiments, Z1 encodes a therapeutic product. In some embodiments, Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259. In some embodiments, the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118. In some embodiments, TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES), an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO), a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site. In some embodiments, TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof. In some embodiments, TI is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
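The identity thresholds recited above (at least 95%, 98%, 99%, or 100%) can be checked computationally once two sequences have been aligned. The following is a minimal sketch, not the method used in the application: it assumes the two sequences are already aligned to equal length with `-` marking gaps, and it counts a column as identical only when both residues match and neither is a gap. Identity definitions differ between alignment tools, so the denominator chosen here (all columns that are not gap-versus-gap) is an assumption, as are the function names.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (hypothetical helper).

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For real sequences the alignment itself would come from a dedicated alignment tool; this sketch covers only the counting step.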
In some embodiments of the RNAs provided herein, the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI. In some embodiments, the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
In some embodiments of the RNAs provided herein, the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1. In some embodiments, the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
In some embodiments of the RNAs provided herein, the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TIB) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA) . In some embodiments, the 3’ target sequence further comprises two linkers (L) flanking Z1.
In some embodiments of the RNAs provided herein, the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1A) . In some embodiments, the 3’ target sequence further comprises two linkers (L) flanking TI.
In some embodiments, the RNAs provided herein have a structure selected from the group consisting of Formulae (I) - (IV) :
(I) 5’ - (3’ IF) - (L) n-Z1- (L) n-TI- (5’ IF) -3’ ;
(II) 5’ - (3’ IF) - (L) n-TI- (L) n-Z1- (5’ IF) -3’ ;
(III) 5’ - (3’ IF) -TIB- (L) n-Z1- (L) n-TIA- (5’ IF) -3’ ;
(IV) 5’ - (3’ IF) -Z1B- (L) n-TI- (L) n-Z1A- (5’ IF) -3’ ; and
wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB) ; Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B) ; and each L is independently a linker sequence, and n=0, 1 or 2.
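The element order in Formulae (I) - (IV), and the order that results in the circRNA once the intron fragments are removed and the ends of the target sequence are joined, can be sketched as follows. This is an illustrative model only: elements are represented as labels rather than sequences, the same n is assumed at both linker positions, and the function names are hypothetical. The circular product is written starting from the translation initiation element purely for readability, since a circle has no fixed start.

```python
from typing import Dict, List

# Element order of each precursor, 5' to 3', following Formulae (I)-(IV).
# "L" marks a position that is expanded to n linker copies (n = 0, 1 or 2).
FORMULAE: Dict[str, List[str]] = {
    "I": ["3'IF", "L", "Z1", "L", "TI", "5'IF"],
    "II": ["3'IF", "L", "TI", "L", "Z1", "5'IF"],
    "III": ["3'IF", "TIB", "L", "Z1", "L", "TIA", "5'IF"],
    "IV": ["3'IF", "Z1B", "L", "TI", "L", "Z1A", "5'IF"],
}


def precursor(formula: str, n: int = 1) -> List[str]:
    """Return the 5'-to-3' element labels of the linear precursor."""
    if n not in (0, 1, 2):
        raise ValueError("n must be 0, 1 or 2")
    elements: List[str] = []
    for label in FORMULAE[formula]:
        # Assumption: the same n applies to both linker positions.
        elements.extend(["L"] * n if label == "L" else [label])
    return elements


def circular_order(elements: List[str]) -> List[str]:
    """Drop the intron fragments and rotate the circle to start at TI (or TIA)."""
    target = [e for e in elements if e not in ("3'IF", "5'IF")]
    start_label = "TIA" if "TIA" in target else "TI"
    start = target.index(start_label)
    return target[start:] + target[:start]
```

For example, `circular_order(precursor("III"))` gives `['TIA', 'TIB', 'L', 'Z1', 'L']`, showing how the two TI fragments, which sit at opposite ends of the target sequence in the precursor, become adjacent in the circle.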
In some embodiments, the RNAs provided herein further comprise (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence, (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) . In some embodiments, E2 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63. In some embodiments, E1 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74. In some embodiments, E1 and E2 together are 20 or fewer nucleotides in length.
In some embodiments of the RNAs provided herein, the 3’ intron fragment comprises a D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence. In some embodiments, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine or an atypical bulged adenosine. In some embodiments, the 5’ intron fragment further comprises a D2-like sequence or a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the RNAs provided herein further comprise a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment, wherein the pair of D4 stem-like sequences each has a region that is 10-200 or 30-60 nucleotides in length and the regions are at least 60% complementarily paired (a complementary region) . In some embodiments, the 3’ and 5’ D4 stem-like sequences have two or more complementary regions.
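The "at least 60% complementarily paired" criterion for a pair of D4 stem-like regions can be illustrated with a simple count. The sketch below makes several assumptions that the text does not state: the two regions have equal length and pair position by position in antiparallel orientation with no bulges or loops, only Watson-Crick pairs are counted by default, and G-U wobble pairs are counted only if explicitly enabled. The function names and default thresholds are hypothetical, chosen to mirror the 10-200 nucleotide and 60% figures above.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def paired_fraction(region_5to3: str, partner_5to3: str, allow_wobble: bool = False) -> float:
    """Fraction of positions that pair when two RNA regions align antiparallel.

    Both inputs are written 5' to 3'; the partner is reversed so that its
    3' end faces the 5' end of the first region.
    """
    a = region_5to3.upper()
    b = partner_5to3.upper()[::-1]
    if not a or len(a) != len(b):
        raise ValueError("regions must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    paired = sum(1 for x, y in zip(a, b) if (x, y) in allowed)
    return paired / len(a)


def is_complementary_region(
    region_5to3: str,
    partner_5to3: str,
    min_fraction: float = 0.60,
    min_len: int = 10,
    max_len: int = 200,
) -> bool:
    """Check the length window and the minimum paired fraction."""
    if not (min_len <= len(region_5to3) <= max_len):
        return False
    return paired_fraction(region_5to3, partner_5to3) >= min_fraction
```

A structure-prediction tool would be needed to account for bulges, internal loops, or competing folds; this count only captures the simplest ungapped case.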
In some embodiments, the D1-like sequence comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence. In some embodiments, the D1-like sequence comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence. In some embodiments, the D1-like sequence comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence. In some embodiments, the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence. In some embodiments, the complementarily paired regions are located at one or both ends of the target sequence.
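As an illustration of the "at least 60% complementarily paired" criterion, the following minimal Python sketch scores an EBS-like sequence against a region of the same length. It is only one possible reading: it assumes position-wise antiparallel pairing with no gaps, counts Watson-Crick pairs by default (G-U wobble pairs are an optional flag), and uses made-up sequences. The function names and the scoring rule are hypothetical and are not taken from the disclosure.

```python
# Hypothetical helper, not part of the disclosure: fraction of paired
# positions between two equal-length RNA segments aligned antiparallel.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def paired_fraction(ebs, region, allow_wobble=False):
    """Both sequences are written 5'->3'; `region` is reversed so that
    position i of `ebs` faces its antiparallel partner."""
    if not ebs or len(ebs) != len(region):
        raise ValueError("sequences must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    pairs = zip(ebs.upper(), reversed(region.upper()))
    return sum((a, b) in allowed for a, b in pairs) / len(ebs)


def meets_threshold(ebs, region, threshold=0.60, allow_wobble=False):
    """True if the paired fraction reaches the assumed 60% cutoff."""
    return paired_fraction(ebs, region, allow_wobble) >= threshold


# Made-up 6-nt sequences, for illustration only.
ebs_like = "GUGUCG"
flank_region = "CGACAU"
print(paired_fraction(ebs_like, flank_region))
print(meets_threshold(ebs_like, flank_region))
```

Whether wobble pairs, gaps, or bulges should count toward the percentage is a design choice the sketch leaves open through the `allow_wobble` flag.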
In some embodiments, the RNAs provided herein comprise a structure selected from the group consisting of Formulae (1) - (12) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D2/D3L-3’ ;
(3) 5’ -D5L-TS-D1L-D2L-D3L-3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ ;
(6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
(7) 5’ -D5L-D6L-TS-D1L-3’ ;
(8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ ;
(9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ ;
(10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ;
(11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ ; and
(12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
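Formulae (1) - (12) can be read as the combinations of three independent choices: whether a D6-like sequence follows the D5-like sequence, whether the pair of D4 stem-like sequences flanks the construct, and which elements (none, D2/D3L, or D2L-D3L) follow the D1-like sequence. The Python sketch below enumerates them under that reading. The decomposition into three choices is an interpretation offered for illustration, and the function name is hypothetical.

```python
from itertools import product

# Elements that may follow D1L, in the order used by Formulae (1)-(3).
EXTENSIONS = ((), ("D2/D3L",), ("D2L", "D3L"))


def formula(with_d6, with_d4, extension):
    """Build one formula string from the three assumed choices."""
    parts = ["D5L"]
    if with_d6:
        parts.append("D6L")
    parts += ["TS", "D1L", *extension]
    if with_d4:
        # The D4 stem-like sequences sit at the outer ends of the construct.
        parts = ["(3' D4L)"] + parts + ["(5' D4L)"]
    return "5'-" + "-".join(parts) + "-3'"


# Outer loop: D6L absent/present; middle loop: D4 stems absent/present;
# inner loop: extension after D1L. This reproduces the listed order.
formulae = [
    formula(d6, d4, ext)
    for d6, d4, ext in product((False, True), (False, True), EXTENSIONS)
]

for number, text in enumerate(formulae, start=1):
    print(f"({number}) {text}")
```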
In some embodiments, the RNAs provided herein comprise a structure selected from the group consisting of Formulae (1) - (8) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D3L-3’ ;
(3) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L-D3L- (5’ D4L) -3’ ;
(5) 5’ -D5L-D6L-TS-D1L-3’ ;
(6) 5’ -D5L-D6L-TS-D1L-D3L-3’ ;
(7) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ; and
(8) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D3L is D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
In some embodiments of the RNAs provided herein, the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, and 205.
In some embodiments of the RNAs provided herein, the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, and 209.
In some embodiments of the RNAs provided herein, the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, and 210.
In some embodiments of the RNAs provided herein, the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, and 206.
In some embodiments of the RNAs provided herein, the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, and 207.
In some embodiments, RNAs provided herein further comprise a 5’ homology arm operatively linked to the 5’-end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’-end of the 5’ intron fragment. In some embodiments, the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length. In some embodiments, the 5’ and the 3’ homology arms have up to 10% base mismatches. In some embodiments, (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2).
In some embodiments, the RNAs provided herein have group IIB intron activity.
In some embodiments, the 5’ intron fragment and the 3’ intron fragment of the RNAs provided herein are obtained by segmenting a group II intron at an unpaired region into two fragments. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
In some embodiments, the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition. In some embodiments, the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4. In some embodiments, the modification comprises a deletion of an open reading frame (ORF).
In some embodiments, the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence. In some embodiments, the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence. In some embodiments, the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence. In some embodiments, the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence. In some embodiments, the complementarily paired regions are located at one or both ends of the target sequence.
In some embodiments, RNAs provided herein further comprise a 5’ homology arm operatively linked to the 5’-end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’-end of the 5’ intron fragment. In some embodiments, the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length. In some embodiments, the 5’ and the 3’ homology arms have up to 10% base mismatches. In some embodiments, (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2).
In some embodiments, the group II intron is a group II intron derived from a microorganism. In some embodiments, the group II intron is a group IIB intron. In some embodiments, the group II intron is Cte 1. In some embodiments, the group II intron has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
In some embodiments of the RNAs provided herein, the 3’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments of the RNAs provided herein, the 5’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
In some embodiments, RNAs provided herein have a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 230-263.
In some embodiments, RNAs provided herein comprise a modified RNA nucleotide and/or modified nucleoside. In some embodiments, RNAs provided herein comprise at least 10% modified RNA nucleotides and/or modified nucleosides. In some embodiments, at least one of the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine), m5U (5-methyluridine), m6A (N6-methyladenosine), Ψ (pseudouridine), or m1A (1-methyladenosine). In some embodiments, at least one of the modified RNA nucleotide and/or modified nucleoside is introduced during in vitro transcription (IVT).
Provided herein are also circRNAs produced by the self-splicing of the RNAs disclosed herein. Provided herein are also vectors encoding the RNAs disclosed herein.
Provided herein are also cells comprising the RNAs disclosed herein, the circRNAs disclosed herein, or the vectors disclosed herein.
Provided herein are also methods of making a circRNA comprising subjecting the RNA disclosed herein to conditions sufficient for it to self-splice.
Provided herein are also methods of expressing a protein in a cell comprising transfecting the cell with the circRNAs disclosed herein. In some embodiments, the cell is a hepatocyte, epithelial cell, hematopoietic cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron), photoreceptor cell (e.g., rod and cone), retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver stellate cell, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC, immortalized cell line, primary cell, yeast cell (e.g., Saccharomyces cerevisiae and Pichia pastoris), bacterial cell (e.g., Escherichia coli), insect cell (e.g., Spodoptera frugiperda Sf9, Mimic Sf9 and Sf21), or Drosophila S2.
Provided herein are also methods of expressing a protein in vivo comprising administering to a subject the circRNA disclosed herein or the vector disclosed herein. Provided herein are also methods of expressing an RNA in vivo comprising administering to a subject the vector disclosed herein. In some embodiments, the subject is a human.
6. Illustrative Embodiments
Embodiment 1: A non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment;
wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
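The permuted arrangement in Embodiment 1 can be made concrete with a small string model. The Python sketch below treats the precursor as 3’ intron fragment + 3’ target sequence fragment + 5’ target sequence fragment + 5’ intron fragment, and represents the resulting circle as a linear string opened at the start of the 5’ target sequence fragment, so that the newly formed junction (3’-end of the 5’ fragment joined to the 5’-end of the 3’ fragment) sits in the middle of the string. It assumes no exon fragments are retained and uses made-up sequences; the function names are hypothetical.

```python
def build_precursor(three_if, three_tsf, five_tsf, five_if):
    """Linear precursor, 5'->3', in the element order of Embodiment 1."""
    return three_if + three_tsf + five_tsf + five_if


def circularize(three_tsf, five_tsf):
    """Circle written as a linear string opened at the 5' end of the
    5' target sequence fragment; the new junction lies between the two parts."""
    return five_tsf + three_tsf


def same_circle(a, b):
    """Two linear strings describe the same circle if one is a rotation of the other."""
    return len(a) == len(b) and a in (b + b)


# Made-up sequences, for illustration only.
three_tsf, five_tsf = "GGGAAA", "CCCUUU"
precursor = build_precursor("aaaa", three_tsf, five_tsf, "uuuu")
circ = circularize(three_tsf, five_tsf)

print(precursor)
print(circ)
# The target sequence as written in the precursor is a rotation of the circle.
print(same_circle(three_tsf + five_tsf, circ))
```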
Embodiment 2: The RNA of embodiment 1, wherein the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1.
Embodiment 3: The RNA of embodiment 2, wherein Z1 encodes a therapeutic product.
Embodiment 4: The RNA of embodiment 2, wherein Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
Embodiment 5: The RNA of embodiment 3, wherein the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
Embodiment 6: The RNA of any one of embodiments 2 to 5, wherein TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
Embodiment 7: The RNA of embodiment 6, wherein TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof.
Embodiment 8: The RNA of embodiment 6, wherein TI is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
Embodiment 9: The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI.
Embodiment 10: The RNA of embodiment 8, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
Embodiment 11: The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1.
Embodiment 12: The RNA of embodiment 11, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
Embodiment 13: The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TIB) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA) .
Embodiment 14: The RNA of embodiment 13, wherein the 3’ target sequence further comprises two linkers (L) flanking Z1.
Embodiment 15: The RNA of any one of embodiments 2 to 7, wherein the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1A) .
Embodiment 16: The RNA of embodiment 15, wherein the 3’ target sequence further comprises two linkers (L) flanking TI.
Embodiment 17: The RNA of embodiment 1, having a structure selected from the group consisting of Formulae (I) - (IV) :
(I) 5’ - (3’ IF) - (L) n-Z1- (L) n-TI- (5’ IF) -3’ ;
(II) 5’ - (3’ IF) - (L) n-TI- (L) n-Z1- (5’ IF) -3’ ;
(III) 5’ - (3’ IF) -TIB- (L) n-Z1- (L) n-TIA- (5’ IF) -3’ ; and
(IV) 5’ - (3’ IF) -Z1B- (L) n-TI- (L) n-Z1A- (5’ IF) -3’ ;
wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB); Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B); each L is independently a linker sequence; and n=0, 1 or 2.
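One way to see how Formulae (I) - (IV) relate to one another is to follow the element order around the circle once the intron fragments are removed. The Python sketch below does this symbolically with n=0 (no linkers): it rotates each element list to start at TI (or TIA) and rejoins split fragments that become adjacent. Under these assumptions every formula gives the same order, TI followed by Z1, which is consistent with Embodiment 2. This is an interpretive sketch with hypothetical names, not a description of the splicing mechanism.

```python
# Element order between the intron fragments, taken from Formulae (I)-(IV)
# with n=0 (linkers omitted for simplicity).
FORMULAE = {
    "I": ["Z1", "TI"],
    "II": ["TI", "Z1"],
    "III": ["TIB", "Z1", "TIA"],
    "IV": ["Z1B", "TI", "Z1A"],
}


def circ_order(elements):
    """Read the circularized elements starting at TI (or its 5' fragment TIA)
    and rejoin split fragments that are adjacent in the circle."""
    start = "TIA" if "TIA" in elements else "TI"
    i = elements.index(start)
    rotated = elements[i:] + elements[:i]  # the circle has no fixed start point
    joined = "-".join(rotated)
    return joined.replace("TIA-TIB", "TI").replace("Z1A-Z1B", "Z1")


for name, elements in FORMULAE.items():
    print(name, "->", circ_order(elements))
```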
Embodiment 18: The RNA of any one of embodiments 2 to 14, or of embodiment 17 having a structure selected from the group consisting of Formulae (I) - (III) , further comprising (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence, (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) .
Embodiment 19: The RNA of embodiment 18, wherein E2 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63.
Embodiment 20: The RNA of embodiment 18 or 19, wherein E1 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74.
Embodiment 21: The RNA of any one of embodiments 18 to 20, wherein E1 and E2 together are 20 or fewer nucleotides in length.
Embodiment 22: The RNA of any one of embodiments 1 to 17, wherein the circRNA consists of the target sequence.
Embodiment 23: The RNA of any one of embodiments 1 to 22, wherein the 3’ intron fragment comprises a D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence.
Embodiment 24: The RNA of embodiment 23, wherein the D1-like sequence comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
Embodiment 25: The RNA of embodiment 23, wherein the D1-like sequence comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
Embodiment 26: The RNA of embodiment 23, wherein the D1-like sequence comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
Embodiment 27: The RNA of embodiment 23, wherein the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
Embodiment 28: The RNA of embodiment 26 or 27, wherein the complementarily paired regions are located at one or both ends of the target sequence.
Embodiment 29: The RNA of any one of embodiments 23 to 28, wherein the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, and 205.
Embodiment 30: The RNA of any one of embodiments 23 to 29, wherein the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, and 209.
Embodiment 31: The RNA of any one of embodiments 23 to 30, wherein the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine or an atypical bulged adenosine.
Embodiment 32: The RNA of embodiment 31, wherein the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, and 210.
Embodiment 33: The RNA of any one of embodiments 23 to 32, wherein the 5’ intron fragment further comprises a D2-like sequence or a D3-like sequence at the 3’ end of the D1-like sequence.
Embodiment 34: The RNA of embodiment 33, wherein the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
Embodiment 35: The RNA of embodiment 33 or 34, wherein the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, and 206, and the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, and 207.
Embodiment 36: The RNA of any one of embodiments 23 to 35, further comprising a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment, wherein the pair of D4 stem-like sequences each has a region that is 10-200 or 30-60 nucleotides in length, and the regions are at least 60% complementarily paired (a complementary region).
Embodiment 37: The RNA of embodiment 36, wherein the 3’ and 5’ D4 stem-like sequences have two or more complementary regions.
Embodiment 38: The RNA of any one of embodiments 1 to 8, comprising a structure selected from the group consisting of Formulae (1) - (12) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D2/D3L-3’ ;
(3) 5’ -D5L-TS-D1L-D2L-D3L-3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ ;
(6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
(7) 5’ -D5L-D6L-TS-D1L-3’ ;
(8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ ;
(9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ ;
(10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ;
(11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ ; and
(12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
Embodiment 39: The RNA of embodiment 38, wherein: (1) the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, and 205; (2) the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, and 209; (3) the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, and 210; or (4) (a) the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, and 206; or (b) the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, and 207; or both (a) and (b); or any combination of (1) - (4).
Embodiment 40: The RNA of any one of embodiments 1 to 39, further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
Embodiment 41: The RNA of embodiment 40, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
Embodiment 42: The RNA of embodiment 40 or 41, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
Embodiment 43: The RNA of embodiment 42, wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
Embodiment 44: The RNA of any one of embodiments 1 to 43, wherein the RNA has group IIB intron activity.
Embodiment 45: The RNA of any one of embodiments 1 to 22, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
Embodiment 46: The RNA of embodiment 45, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6.
Embodiment 47: The RNA of embodiment 45, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
Embodiment 48: The RNA of any one of embodiments 45 to 47, wherein the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
Embodiment 49: The RNA of embodiment 48, wherein the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
Embodiment 50: The RNA of embodiment 48, wherein the modification comprises a deletion of an open reading frame (ORF) .
Embodiment 51: The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
Embodiment 52: The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
Embodiment 53: The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
Embodiment 54: The RNA of any one of embodiments 48 to 50, wherein the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
Embodiment 55: The RNA of embodiment 53 or 54, wherein the complementarily paired regions are located at one or both ends of the target sequence.
Embodiment 56: The RNA of any one of embodiments 45 to 55, further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
Embodiment 57: The RNA of embodiment 56, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
Embodiment 58: The RNA of embodiment 56 or 57, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
Embodiment 59: The RNA of embodiment 56, wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
Embodiment 60: The RNA of any one of embodiments 45 to 59, wherein the group II intron is a group II intron derived from a microorganism.
Embodiment 61: The RNA of any one of embodiments 45 to 60, wherein the group II intron is a group IIB intron.
Embodiment 62: The RNA of embodiment 61, wherein the group II intron is Cte 1.
Embodiment 63: The RNA of any one of embodiments 45 to 59, wherein the group II intron has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145.
Embodiment 64: The RNA of any one of embodiments 45 to 59, wherein the 3’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228.
Embodiment 65: The RNA of any one of embodiments 45 to 59, wherein the 5’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
Embodiment 66: The RNA of embodiment 1, wherein the RNA has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 230-263.
Embodiment 67: The RNA of any one of embodiments 1 to 66, comprising a modified RNA nucleotide and/or modified nucleoside.
Embodiment 68: The RNA of embodiment 67, comprising at least 10% modified RNA nucleotides and/or modified nucleosides.
Embodiment 69: The RNA of embodiment 67, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine), m5U (5-methyluridine), m6A (N6-methyladenosine), Ψ (pseudouridine), or m1A (1-methyladenosine).
Embodiment 70: The RNA of any one of embodiments 67 to 69, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is introduced during in vitro transcription (IVT).
Embodiment 71: A circRNA produced by the self-splicing of the RNA of any one of embodiments 1 to 70.
Embodiment 72: A vector encoding the RNA of any one of embodiments 1 to 70.
Embodiment 73: A cell comprising the RNA of any one of embodiments 1 to 70, the circRNA of embodiment 71, or the vector of embodiment 72.
Embodiment 74: A method of making a circRNA comprising subjecting the RNA of any one of embodiments 1 to 70 to conditions sufficient for it to self-splice.
Embodiment 75: A method of expressing a protein in a cell comprising transfecting the cell with the circRNA of embodiment 71.
Embodiment 76: The method of embodiment 75, wherein the cell is a hepatocyte, epithelial cell, hematopoietic cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron), photoreceptor cell (e.g., rod and cone), retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver stellate cell, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC, immortalized cell line, primary cell, yeast cell (e.g., Saccharomyces cerevisiae and Pichia pastoris), bacterial cell (e.g., Escherichia coli), insect cell (e.g., Spodoptera frugiperda Sf9, Mimic Sf9 and Sf21), or Drosophila S2.
Embodiment 77: A method of expressing a protein in vivo comprising administering to a subject the circRNA of embodiment 71 or the vector of embodiment 72.
Embodiment 78: A method of expressing an RNA in vivo comprising administering to a subject the vector of embodiment 72.
Embodiment 79: The method of embodiment 77 or 78 wherein the subject is a human.
Embodiment 80: A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence of SEQ ID NO. 263.
Embodiment 81: A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence comprising the following elements: 5’ HBB UTR, CDS, 3’ HBB UTR and polyA. In some embodiments, the nucleotide sequence of CDS is selected from SEQ ID NOs. 258-259. In some embodiments, the nucleotide sequence of 5’ HBB UTR is selected from SEQ ID NO. 260. In some embodiments, the nucleotide sequence of 3’ HBB UTR is selected from SEQ ID NO. 261. In some embodiments, the nucleotide sequence of polyA is selected from SEQ ID NO. 262.
A non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ :
(1) a 3’ intron fragment;
(2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and
(3) a 5’ intron fragment;
wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
The RNA of paragraph [00120] , wherein the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1.
The RNA of paragraph [00121] , wherein Z1 encodes a therapeutic product.
The RNA of paragraph [00121], wherein Z1 is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
The RNA of paragraph [00122] , wherein the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
The RNA of any one of paragraphs [00121] to [00124] , wherein TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
The RNA of paragraph [00125] , wherein TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof.
The RNA of paragraph [00125], wherein TI is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
The RNA of any one of paragraphs [00121] to [00126] , wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI.
The RNA of paragraph [00127] , wherein the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
The RNA of any one of paragraphs [00121] to [00126] , wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1.
The RNA of paragraph [00130] , wherein the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
The RNA of any one of paragraphs [00121] to [00126] , wherein the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TIB) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA) .
The RNA of paragraph [00132], wherein the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
The RNA of any one of paragraphs [00121] to [00126] , wherein the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1A) .
The RNA of paragraph [00134], wherein the 5’ target sequence fragment further comprises two linkers (L) flanking TI.
The RNA of paragraph [00120] , having a structure selected from the group consisting of Formulae (I) - (IV) :
(I) 5’ - (3’ IF) - (L) n-Z1- (L) n-TI- (5’ IF) -3’ ;
(II) 5’ - (3’ IF) - (L) n-TI- (L) n-Z1- (5’ IF) -3’ ;
(III) 5’ - (3’ IF) -TIB- (L) n-Z1- (L) n-TIA- (5’ IF) -3’ ;
(IV) 5’ - (3’ IF) -Z1B- (L) n-TI- (L) n-Z1A- (5’ IF) -3’ ; and
wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB); Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B); and each L is independently a linker sequence, and n=0, 1 or 2.
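The element order in Formulae (I) - (IV) can be illustrated with a minimal Python sketch. The function name `build_layout`, the `FORMULAE` table, and the use of one shared value of n for both linker positions are assumptions made for this illustration only; the formulae themselves allow each linker position to be chosen independently.

```python
# Illustrative sketch: element labels are taken from Formulae (I)-(IV).
# The function name, the table layout, and the use of a single n for both
# linker positions are assumptions made for this example.

FORMULAE = {
    # formula: (elements before first linker, middle element, elements after second linker)
    "I":   ([],      "Z1", ["TI"]),
    "II":  ([],      "TI", ["Z1"]),
    "III": (["TIB"], "Z1", ["TIA"]),
    "IV":  (["Z1B"], "TI", ["Z1A"]),
}


def build_layout(formula, n=1):
    """Return the 5'-to-3' order of elements in the linear precursor RNA."""
    if formula not in FORMULAE:
        raise ValueError("formula must be one of I, II, III, IV")
    if n not in (0, 1, 2):
        raise ValueError("n must be 0, 1 or 2")
    upstream, middle, downstream = FORMULAE[formula]
    linkers = ["L"] * n
    return ["3'IF"] + upstream + linkers + [middle] + linkers + downstream + ["5'IF"]


if __name__ == "__main__":
    for name in FORMULAE:
        print(name, "-".join(build_layout(name, n=1)))
```

In every layout the 3’ intron fragment is the first element and the 5’ intron fragment is the last, which is the swapped arrangement that allows the target sequence between them to be circularized.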
The RNA of any one of paragraphs [00121] to [00133] , or of paragraph [00136] having a structure selected from the group consisting of Formulae (I) - (III) , further comprising (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence, (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) .
The RNA of paragraph [00137] , wherein E2 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63.
The RNA of paragraph [00137] or [00138] , wherein E1 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74.
The RNA of any one of paragraphs [00137] to [00139] , wherein E1 and E2 together are 20 or less nucleotides in length.
The RNA of any one of paragraphs [00120] to [00136] , wherein the circRNA consists of the target sequence.
The RNA of any one of paragraphs [00120] to [00141], wherein the 3’ intron fragment comprises a D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence.
The RNA of paragraph [00142], wherein the D1-like sequence comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
The RNA of paragraph [00142], wherein the D1-like sequence comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
The RNA of paragraph [00142], wherein the D1-like sequence comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
The RNA of paragraph [00142], wherein the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
The RNA of paragraph [00145] or [00146] , wherein the complementarily paired regions are located at one or both ends of the target sequence.
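As an illustration of the "at least 60% complementarily paired" criterion, the following minimal Python sketch scores the antiparallel pairing between two RNA segments of equal length. The function names, the restriction to equal-length ungapped segments, and the optional treatment of G-U wobble pairs as paired are assumptions of this example and are not definitions taken from the disclosure.

```python
# Illustrative sketch: scoring of complementary pairing between an exon
# binding sequence (EBS) and an intron binding sequence (IBS).
# Assumptions: both segments are written 5'->3', have equal length, and
# pair without gaps; counting G-U wobble pairs is optional and off by default.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def percent_paired(ebs, ibs, allow_wobble=False):
    """Percentage of positions at which ebs pairs antiparallel with ibs."""
    if not ebs or len(ebs) != len(ibs):
        raise ValueError("segments must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    # Antiparallel pairing: the 5' end of one strand faces the 3' end of the other.
    paired = sum((a, b) in allowed for a, b in zip(ebs.upper(), reversed(ibs.upper())))
    return 100.0 * paired / len(ebs)


def meets_threshold(ebs, ibs, threshold=60.0, allow_wobble=False):
    """True if the pairing percentage reaches the threshold (60% by default)."""
    return percent_paired(ebs, ibs, allow_wobble) >= threshold


if __name__ == "__main__":
    print(percent_paired("GGAUCC", "GGAUCC"))   # self-complementary segment
    print(meets_threshold("AAAAA", "UUUAA"))    # 3 of 5 positions paired
    print(meets_threshold("AAAAA", "UUAAA"))    # 2 of 5 positions paired
```

The sequences used in the example are arbitrary and are not taken from the sequence listing.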
The RNA of any one of paragraphs [00142] to [00147], wherein the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265.
The RNA of any one of paragraphs [00142] to [00148], wherein the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269.
The RNA of any one of paragraphs [00142] to [00149] , wherein the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine or an atypical bulged adenosine.
The RNA of paragraph [00150], wherein the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270.
The RNA of any one of paragraphs [00142] to [00151] , wherein the 5’ intron fragment further comprises a D2-like sequence or a D3-like sequence at the 3’ end of the D1-like sequence.
The RNA of paragraph [00152], wherein the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
The RNA of paragraph [00152] or [00153], wherein the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, 206, and 266, and the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, 207, and 267.
The RNA of any one of paragraphs [00142] to [00154], further comprising a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment, wherein the pair of D4 stem-like sequences each has a region that is 10-200 or 30-60 nucleotides in length, and the regions are at least 60% complementarily paired (a complementary region).
The RNA of paragraph [00155] , wherein the 3’ and 5’ D4 stem-like sequences have two or more complementary regions.
The RNA of any one of paragraphs [00120] to [00127] comprising a structure selected from the group consisting of Formulae (1) - (12) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D2/D3L-3’ ;
(3) 5’ -D5L-TS-D1L-D2L-D3L-3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ ;
(6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
(7) 5’ -D5L-D6L-TS-D1L-3’ ;
(8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ ;
(9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ ;
(10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ;
(11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ ; and
(12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
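Formulae (1) - (12) can be read as the combinations of three independent choices: whether a D6-like sequence follows the D5-like sequence, whether the pair of D4 stem-like sequences is present, and which extension (none, D2/D3L, or D2L-D3L) follows the D1-like sequence. The following minimal Python sketch enumerates them in the listed order. The function name and the normalized string format (straight apostrophes, no spaces) are assumptions of this illustration.

```python
# Illustrative sketch: regenerate Formulae (1)-(12) from three design choices.
# The function name and the normalized spelling of each formula are
# assumptions made for this example.
from itertools import product


def enumerate_formulae():
    """Return the twelve domain layouts in the order of Formulae (1)-(12)."""
    extensions = ["", "-D2/D3L", "-D2L-D3L"]  # what follows D1L in the 5' intron fragment
    formulae = []
    # product() varies its last argument fastest, which matches the listed order:
    # D6L absent/present (slowest), D4 stems absent/present, then the extension.
    for has_d6, has_d4, ext in product([False, True], [False, True], extensions):
        three_prime_fragment = "D5L-D6L" if has_d6 else "D5L"
        core = f"{three_prime_fragment}-TS-D1L{ext}"
        if has_d4:
            core = f"(3'D4L)-{core}-(5'D4L)"
        formulae.append(f"5'-{core}-3'")
    return formulae


if __name__ == "__main__":
    for number, layout in enumerate(enumerate_formulae(), start=1):
        print(f"({number}) {layout}")
```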
The RNA of paragraph [00157] comprising a structure selected from the group consisting of Formulae (1) - (8) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D3L-3’ ;
(3) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L-D3L- (5’ D4L) -3’ ;
(5) 5’ -D5L-D6L-TS-D1L-3’ ;
(6) 5’ -D5L-D6L-TS-D1L-D3L-3’ ;
(7) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ; and
(8) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D3L is D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
The RNA of paragraph [00157] or [00158] , wherein:
(1) the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265;
(2) the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269;
(3) the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270; or
(4) (a) the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, 206, and 266; or (b) the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, 207 and 267; or both (a) and (b); or any combination of (1) - (4).
The RNA of any one of paragraphs [00120] to [00159] , further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
The RNA of paragraph [00160] , wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
The RNA of paragraph [00160] or [00161], wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
The RNA of paragraph [00162] , wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
The RNA of any one of paragraphs [00120] to [00163] , wherein the RNA has group IIB intron activity.
The RNA of any one of paragraphs [00120] to [00141] , wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
The RNA of paragraph [00165] , wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6.
The RNA of paragraph [00165] , wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
The RNA of any one of paragraphs [00165] to [00167] , wherein the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
The RNA of paragraph [00168] , wherein the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
The RNA of paragraph [00168] , wherein the modification comprises a deletion of an open reading frame (ORF) .
The RNA of any one of paragraphs [00168] to [00170], wherein the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
The RNA of any one of paragraphs [00168] to [00170], wherein the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
The RNA of any one of paragraphs [00168] to [00170], wherein the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
The RNA of any one of paragraphs [00168] to [00170], wherein the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide together with the 4-10 nucleotides immediately upstream of it, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
The RNA of paragraph [00173] or [00174] , wherein the complementarily paired regions are located at one or both ends of the target sequence.
The RNA of any one of paragraphs [00165] to [00175] , further comprising a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment.
The RNA of paragraph [00176] , wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
The RNA of paragraph [00176] or [00177], wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
The RNA of paragraph [00176] , wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2) .
The RNA of any one of paragraphs [00165] to [00179] , wherein the group II intron is a group II intron derived from a microorganism.
The RNA of any one of paragraphs [00165] to [00180] , wherein the group II intron is a group IIB intron.
The RNA of paragraph [00181] , wherein the group II intron is Cte 1.
The RNA of paragraph [00181] , wherein the group II intron is CL.
The RNA of any one of paragraphs [00165] to [00179] , wherein the group II intron has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41, 135-145, and 264.
The RNA of any one of paragraphs [00165] to [00179], wherein the 3’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228.
The RNA of any one of paragraphs [00165] to [00179], wherein the 5’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
The RNA of paragraph [00120], wherein the RNA has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 230-264.
The RNA of any one of paragraphs [00120] to [00186] , comprising a modified RNA nucleotide and/or modified nucleoside.
The RNA of paragraph [00187], comprising at least 10% modified RNA nucleotides and/or modified nucleosides.
The RNA of paragraph [00187], wherein at least one of the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine), m5U (5-methyluridine), m6A (N6-methyladenosine), Ψ (pseudouridine), or m1A (1-methyladenosine).
The RNA of any one of paragraphs [00187] to [00189], wherein at least one of the modified RNA nucleotide and/or modified nucleoside is introduced during in vitro transcription (IVT).
A circRNA produced by the self-splicing of the RNA of any one of paragraphs [00120] to [00191] .
A vector encoding the RNA of any one of paragraphs [00120] to [00191] .
A cell comprising the RNA of any one of paragraphs [00120] to [00191] , the circRNA of paragraph [00192] , or the vector of paragraph [00193] .
A method of making a circRNA comprising subjecting the RNA of any one of paragraphs [00120] to [00191] to conditions sufficient for it to self-splice.
A method of expressing a protein in a cell comprising transfecting the cell with the circRNA of paragraph [00192] .
The method of paragraph [00196], wherein the cell is a hepatocyte, epithelial cell, hematopoietic cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron), photoreceptor cell (e.g., rod and cone), retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver stellate cell, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC, immortalized cell line, primary cell, yeast cell (e.g., Saccharomyces cerevisiae and Pichia pastoris), bacterial cell (e.g., Escherichia coli), insect cell (e.g., Spodoptera frugiperda Sf9, Mimic Sf9 and Sf21), or Drosophila S2.
A method of expressing a protein in vivo comprising administering to a subject the circRNA of paragraph [00192] or the vector of paragraph [00193] .
A method of expressing an RNA in vivo comprising administering to a subject the vector of paragraph [00193] .
The method of paragraph [00198] or [00199] wherein the subject is a human.
A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence of SEQ ID NO. 263.
A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence comprising the following elements: 5’ HBB UTR, CDS, 3’ HBB UTR and polyA, wherein the CDS encodes a protein that is not HBB.
FIGs. 1A-1C depict the secondary structure of group II introns. FIG. 1A provides a schematic diagram of the general structure of a group II intron. As shown, a typical group II intron can have six stem-loop structures, referred to as Domains 1-6, or D1-D6. D4 contains an open reading frame. The six domains are sequentially arranged and comprise multiple exon binding sequences (EBSs), such as EBS1, EBS2, and EBS3. These EBSs interact, for example by complementary pairing, with the intron binding sequences (IBSs) in exon regions (such as IBS1, IBS2, and IBS3) to trigger self-splicing. Note that a single nucleotide, the δ nucleotide, which is located directly upstream of EBS1 in domain 1, can also pair with IBS3, and the interaction between δ and IBS3 is referred to as δ-IBS3 pairing. FIG. 1B provides the secondary structure of the exemplary group II intron Cte. FIG. 1C provides the secondary structure of an exemplary synthetic group II intron based on sequence elements of Cte: Cte-Syn1.
FIGs. 2A-2K depict the structure of group II introns in which the sequence elements that provide the long-range interactions essential for their tertiary structure and function are denoted with Greek letters. FIG. 2A depicts a general group II intron. FIG. 2B and FIG. 2C depict the exemplary group IIB intron Cte. FIG. 2B marks the sequence elements participating in self-splicing. FIG. 2C also marks sequence elements that contribute to the tertiary structure of the intron. FIG. 2D depicts the structure of a group IIA intron, and FIG. 2E depicts the structure of a group IIC intron. FIG. 2F identifies the relevant nucleotides in the exemplary group II intron Cte. FIG. 2G identifies the relevant nucleotides in a synthetic group II intron derived from Cte: Cte-syn1, another exemplary group II intron. FIG. 2H identifies the relevant nucleotides in LtrB, another exemplary group II intron. FIG. 2I identifies the relevant nucleotides in a synthetic group II intron derived from LtrB: LtrB-syn1. FIG. 2J identifies the relevant nucleotides in Pli, another exemplary group II intron. FIG. 2K identifies the relevant nucleotides in a synthetic group II intron derived from Pli: Pli-syn1.
FIGs. 3A-3B depict two mechanisms for group II intron self-splicing. As shown, group II introns catalyze self-splicing via two consecutive transesterification reactions. FIG. 3A depicts the hydrolysis pathway, which uses an external water molecule as the first-step nucleophile, resulting in liberation of a linear intron molecule. FIG. 3B depicts the branching pathway, which uses the 2’-OH group of an adenosine (branch site) in D6 as the first-step nucleophile, resulting in liberation of a lariat intron molecule.
FIGs. 4A-4D provide schematic diagrams identifying the exon-intron interactions essential for the near-scarless or scarless splicing of the cRNAzymes disclosed herein.
FIG. 4A depicts near-scarless splicing based on the interactions between IBS1 and EBS1; and IBS3 and EBS3; optionally also between IBS2 and EBS2. A group II intron with flanking exon sequences E1 and E2 is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments. Arrows indicate the interactions between IBS1 and EBS1; IBS2 and EBS2; and IBS3 and EBS3. As shown, the self-splicing of the construct produces a circRNA consisting of the target sequence, E1 and E2.
FIG. 4B depicts near-scarless splicing based on the interactions between IBS1 and EBS1; and the δ nucleotide and IBS3; optionally also between IBS2 and EBS2. A group II intron with flanking exon sequences E1 and E2 is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments. Arrows indicate the interactions between IBS1 and EBS1; IBS2 and EBS2; and IBS3 and δ. As shown, the self-splicing of the construct produces a circRNA consisting of the target sequence, E1 and E2.
FIG. 4C depicts scarless splicing based on the interactions between IBS1’ and EBS1’; and IBS3’ and EBS3’. A group II intron is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments. Arrows indicate the interactions between IBS1’ and EBS1’; and IBS3’ and EBS3’. As shown, the self-splicing of the construct produces a circRNA consisting of the target sequence.
FIG. 4D depicts scarless splicing based on the interactions between IBS1’ and EBS1’; and the δ” nucleotide and IBS3’. A group II intron is split into two fragments at the D4 domain, with the 5’ intron fragment and the 3’ intron fragment swapped, and a target sequence inserted between the two fragments. Arrows indicate the interactions between IBS1’ and EBS1’; and IBS3’ and δ”. As shown, the self-splicing of the construct produces a circRNA consisting of the target sequence.
FIGs. 5A (a) -5D (b) provide schematic diagrams of four systems designed for a cRNAzyme having group II intron activity to self-splice via the hydrolysis pathway. The cRNAzymes have a 3’ intron fragment comprising a D5-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. FIGs. 5A (a) - (b) and FIGs. 5B (a) - (b) depict the near-scarless splicing based on the interactions between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and EBS3 (FIG. 5A (a)); and between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and δ (FIG. 5B (a)). For group IIB introns, the IBS2 and EBS2 interaction can be absent. As such, the near-scarless splicing of group IIB introns can be based on the interactions between IBS1 and EBS1, and IBS3 and EBS3 (FIG. 5A (b)); and between IBS1 and EBS1, and IBS3 and δ (FIG. 5B (b)). IBSs are contained within customized exons (E1 and E2), which are inserted between the intron fragments and the target sequence, and are maintained as the scar in the resulting circRNA. FIGs. 5C (a) and 5D (a) depict the scarless splicing based on the interactions between IBS1’ and EBS1’, and IBS3’ and EBS3’ (FIG. 5C (a)); and between IBS1’ and EBS1’, and IBS3’ and δ” (FIG. 5D (a)). For group IIB introns, the IBS2’ and EBS2’ interaction can be absent. As such, the scarless splicing of group IIB introns can be based on the interactions between IBS1’ and EBS1’, and IBS3’ and EBS3’ (FIG. 5C (b)); and between IBS1’ and EBS1’, and IBS3’ and δ” (FIG. 5D (b)). In scarless splicing, IBSs are contained within the target sequence. Arrows indicate the interactions between EBSs (or the δ or δ” nucleotide) and IBSs.
FIGs. 6A-6B provide schematic diagrams for exemplary cRNAzymes using the hydrolysis pathway for self-splicing. As shown, cRNAzymes provided herein have a 3’ intron fragment comprising a D5-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. The 5’ intron fragment can further comprise a D2/D3-like domain, or a D2-like sequence and a D3-like sequence. The cRNAzymes can either omit (FIG. 6A) or include (FIG. 6B) D4 stem-like sequences at both ends.
FIGs. 6C-6D provide schematic diagrams for exemplary cRNAzymes using the hydrolysis pathway for self-splicing. As shown, cRNAzymes provided herein have a 3’ intron fragment comprising a D5-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. The 5’ intron fragment can further comprise a D3-like sequence. The cRNAzymes can either omit (FIG. 6C) or include (FIG. 6D) D4 stem-like sequences at both ends.
FIGs. 6E-6H provide schematic diagrams for exemplary cRNAzymes using the hydrolysis pathway for self-splicing. As shown, cRNAzymes provided herein have a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. The 5’ intron fragment can further comprise a D3-like domain. The cRNAzymes can either omit (FIGs. 6E and 6G) or include (FIGs. 6F and 6H) D4 stem-like sequences at both ends.
FIGs. 7A (a) -7D (b) provide schematic diagrams of four systems designed for a cRNAzyme having group II intron activity to self-splice via the branching pathway. The cRNAzymes have a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. FIG. 7A (a) and FIG. 7B (a) depict the near-scarless splicing based on the interactions between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and EBS3 (FIG. 7A (a)); and between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and δ (FIG. 7B (a)). For group IIB introns, the IBS2 and EBS2 interaction can be absent. As such, the near-scarless splicing of group IIB introns can be based on the interactions between IBS1 and EBS1, and IBS3 and EBS3 (FIG. 7A (b)); and between IBS1 and EBS1, and IBS3 and δ (FIG. 7B (b)). IBSs are contained within customized exons (E1 and E2), which are inserted between the intron fragments and the target sequence, and are maintained as the scar in the resulting circRNA. FIG. 7C (a) and 7D (a) depict the scarless splicing based on the interactions between IBS1’ and EBS1’, and IBS3’ and EBS3’ (FIG. 7C (a)); and between IBS1’ and EBS1’, and IBS3’ and δ” (FIG. 7D (a)). For group IIB introns, the IBS2’ and EBS2’ interaction can be absent. As such, the scarless splicing of group IIB introns can be based on the interactions between IBS1’ and EBS1’, and IBS3’ and EBS3’ (FIG. 7C (b)); and between IBS1’ and EBS1’, and IBS3’ and δ” (FIG. 7D (b)). In scarless splicing, IBSs are contained within the target sequence. Arrows indicate the interactions between EBSs (or the δ or δ” nucleotide) and IBSs.
FIGs. 7E (a) and 7E (b) provide schematic diagrams of two systems designed for a cRNAzyme having group II intron activity to self-splice via the branching pathway. The cRNAzymes have a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence and a D3-like sequence. FIGs. 7E (a) and 7E (b) depict the near-scarless splicing based on the interactions between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and EBS3 (FIG. 7E (a)); and between IBS1 and EBS1, IBS2 and EBS2, and IBS3 and δ (FIG. 7E (b)).
FIGs. 8A-8B provide schematic diagrams for exemplary cRNAzymes using the branching pathway for self-splicing. As shown, cRNAzymes provided herein have a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, a target sequence, and a 5’ intron fragment comprising a D1-like sequence. The 5’ intron fragment can further comprise a D2/D3-like sequence, or a D2-like sequence and a D3-like sequence. The cRNAzymes can either omit (FIG. 8A) or include (FIG. 8B) D4 stem-like sequences at both ends.
FIG. 9 provides schematic diagrams showing the re-ligation of the target sequence fragments upon self-splicing of the cRNAzymes provided herein. As shown, the target sequence can be segmented into a 5’ fragment and a 3’ fragment, which can be swapped and cloned into the cRNAzyme. Upon self-splicing of the cRNAzyme, circularization links the 3’-end of the 5’ target sequence fragment to the 5’-end of the 3’ target sequence fragment.
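The segment-and-swap arrangement can be illustrated with a minimal Python sketch operating on plain strings. The function names, the example sequence, and the cut position are hypothetical and chosen only for illustration; the sketch models sequence order only and not the splicing chemistry.

```python
# Illustrative sketch: a target sequence is cut into a 5' and a 3' fragment,
# the fragments are swapped in the linear precursor, and circularization
# restores the original junction. All names and sequences are hypothetical.


def permute_target(target, cut):
    """Return the insert as arranged in the precursor: 3' fragment, then 5' fragment."""
    if not 0 < cut < len(target):
        raise ValueError("cut must fall strictly inside the target sequence")
    five_fragment, three_fragment = target[:cut], target[cut:]
    return three_fragment + five_fragment


def same_circle(a, b):
    """True if a and b are the same circular sequence read from different start points."""
    return len(a) == len(b) and a in (b + b)


def restored_junction(target, cut, flank=3):
    """Bases spanning the link between the 3'-end of the 5' fragment and the 5'-end of the 3' fragment."""
    return target[max(0, cut - flank):cut] + target[cut:cut + flank]


if __name__ == "__main__":
    target = "AUGGCCUAA"          # hypothetical target sequence
    insert = permute_target(target, cut=4)
    junction = restored_junction(target, cut=4)
    print(insert)                          # order of the fragments in the precursor
    print(same_circle(insert, target))     # circle carries the same sequence as the target
    print(junction in insert)              # junction is split in the linear precursor
    print(junction in insert + insert)     # junction is present once the ends are joined
```

Because the circular product is a rotation of the original target sequence, the position of the cut does not change the sequence of the circRNA, only the location of the splice junction within it.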
FIGs. 10A (a) -10B (b) provide schematic diagrams of cRNAzymes in which the 3’ target sequence fragment comprises a protein-coding sequence (Z1) and the 5’ target sequence fragment comprises a translation initiation sequence (TI). The cRNAzymes can either omit linkers (FIGs. 10A (a) - (b)) or include linkers flanking Z1 (FIGs. 10B (a) - (b)). Scarless splicing is depicted in FIGs. 10A (a) and 10B (a). Near-scarless splicing is depicted in FIGs. 10A (b) and 10B (b).
FIGs. 11A (a) -11B (b) provide schematic diagrams of cRNAzymes in which the 3’ target sequence fragment comprises a translation initiation sequence (TI) and the 5’ target sequence fragment comprises a protein-coding sequence (Z1) . The cRNAzymes can either omit linkers (FIGs. 11A (a) - (b) ) or include linkers flanking TI (FIGs. 11B (a) - (b) ) . Scarless splicing is depicted in FIGs. 11A (a) and 11B (a) . Near-scarless splicing is depicted in FIGs. 11A (b) and 11B (b) .
FIGs. 12A (a) -12B (b) provide schematic diagrams of cRNAzymes in which the 3’ target sequence fragment comprises a 3’ fragment of TI (TIB) and a protein-coding sequence (Z1) and the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA). The cRNAzymes can either omit linkers (FIGs. 12A (a) - (b)) or include linkers flanking Z1 (FIGs. 12B (a) - (b)). Scarless splicing is depicted in FIGs. 12A (a) and 12B (a). Near-scarless splicing is depicted in FIGs. 12A (b) and 12B (b).
FIGs. 13A-13B provide schematic diagrams of cRNAzymes in which the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) and the 5’ target sequence fragment comprises a translation initiation sequence (TI) and a 5’ fragment of Z1 (Z1A). The cRNAzymes can either omit linkers (FIG. 13A) or include linkers flanking TI (FIG. 13B).
Note that FIGs. 10A (a), 10B (a), 11A (a), 11B (a), 12A (a), 12B (a), 13A, and 13B depict scarless splicing, wherein the sequence elements within the target sequence serve as E1 and E2. In some embodiments, the 5’ terminal region of the target sequence (e.g., part of TI, Z1, or linker) can serve as E2. In some embodiments, the 3’ terminal region of the target sequence (e.g., part of TI, Z1, or linker) can serve as E1. Optionally, as depicted in FIGs. 10A (b), 10B (b), 11A (b), 11B (b), 12A (b), and 12B (b), (1) an extra exon sequence E2 can be included between the 3’ intron fragments and the target sequence; (2) an extra exon sequence E1 can be included between the target sequence and the 5’ intron fragments; or both (1) and (2). As such, E1 and/or E2 remain with the target sequence in the circRNA after self-splicing.
FIG. 14A shows the SDS-PAGE results for CL-56_13. In CL-56_13, the D2 and D4 domains of the CL intron are knocked out to investigate whether the construct circularizes under such conditions; in Cte-56_123, the D4 domain of the Cte intron is knocked out. The SDS-PAGE analysis indicated circularization of CL-56_13. RT-PCR was used to sequence the circular RNA.
FIG. 14B shows the SDS results for the A00A-1 primers and the A00A-2 primers. Two pairs of junction primers were designed and sequenced. The A00A-1 primers have short sequences, and the amplified bands are approximately 300 bp. The A00A-2 primers have long sequences, and the amplified bands are approximately 1000 bp. According to the RT-PCR SDS analysis, the sizes of the truncated fragments are the same as those in the control groups. CL-56_13 and Cte-56_123 are the primers used in this experiment. CL-56_13 means that the D2 and D4 domains of the CL intron are knocked out, and Cte-56_123 means that the D4 domain of the Cte intron is knocked out.
FIG. 14C reveals the PCR sequencing result of CL-56_13. The circularization point at CL-56_13 was validated using PCR sequencing.
FIG. 15 demonstrates the full pattern of CL intron, which is separated into 6 domains, I, II, III, IV, V and VI.
FIG. 16 demonstrates the CL-56_13 pattern.
Provided herein are non-naturally occurring RNAs having group II intron self-splicing activity which, upon self-splicing, can form circular RNAs. Circular RNAs (circRNAs) are single-stranded RNAs that are joined head to tail. As known in the art, circRNAs can be produced in vitro using chemical means or via enzymatic activities from a precursor RNA, which refers to the linear RNA molecule from which a circRNA is directly generated, regardless of the method of circularization. For example, the 5’ end and 3’ end of a linear nucleic acid can be chemically linked by the catalysis of cyanogen bromide and a morpholinyl derivative, or ligated in a head-to-tail manner by the activity of a nucleic acid ligase. CircRNAs can also be produced by splicing. When a linear precursor undergoes splicing, a portion of the molecule is excised, resulting in a circRNA that has fewer total nucleotides than the precursor RNA.
The circRNAs can also be produced by ribozyme-catalyzed RNA splicing. As used herein and understood in the art, the term “ribozyme” refers to an RNA molecule with an enzymatic activity. Some ribozymes can catalyze self-splicing independent of the spliceosome, and are referred to as “ribozymes with self-splicing activity,” “self-splicing ribozymes,” or “self-splicing introns.” Naturally occurring self-splicing ribozymes can be divided into group I and group II introns. Although the splicing products of the two categories of ribozymes are similar, the structures and splicing mechanisms of the ribozymes themselves are quite different. The group I intron has a 9-helix structure, requires an external hydroxyl group in guanosine monophosphate (pG-OH) to trigger the reaction during catalytic splicing, and is highly dependent on the sequences of the exons located at both ends of the group I intron. The group II intron relies on its own hydroxyl groups within the nucleotide sequence to trigger splicing. This splicing mechanism is closer to the splicing reaction mediated by a spliceosome and better simulates splicing in higher organisms. The term “group I intron self-splicing activity” or “group I intron activity” refers to the self-splicing activity derived from a group I intron; and the term “group II intron self-splicing activity” or “group II intron activity” refers to the self-splicing activity derived from a group II intron. Methods for preparing circRNAs based on group II intron self-splicing activities have at least the following advantages: reduced use of biological and chemical reagents (such as ligase and associated reagents), ease of operation, and simple design.
RNAs that are engineered ribozymes with self-splicing activity which, upon self-splicing, form circRNAs are also referred to herein as “cRNAzymes.” In some embodiments, the cRNAzymes can have in vitro self-splicing activity. Disclosed herein are novel non-naturally occurring RNAs that have group II intron self-splicing activity and that, upon self-splicing, form circRNAs, or “group II cRNAzymes.” Further provided herein are vectors comprising polynucleotides encoding these group II cRNAzymes, methods of preparing the group II cRNAzymes disclosed herein by transcribing these vectors, and uses of these group II cRNAzymes in making circRNAs.
Before the present disclosure is further described, it is to be understood that the disclosure is not limited to the particular embodiments set forth herein, and it is also to be understood that the terminology used herein is for the purpose of describing particular embodiments, and is not intended to be limiting.
8.1 Definitions
Unless otherwise defined herein, scientific and technical terms used in the present disclosure shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. Generally, nomenclatures used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well-known and commonly used in the art.
As used herein in the specification, “a” or “an” may mean one or more. As used herein in the claim (s) , when used in conjunction with the word “comprising, ” the words “a” or “an” may mean one or more than one.
As used herein, the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or. ” As used herein “another” or “additional” may mean at least a second or more.
As used herein, the term “about” is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects. The term “about” encompasses the exact number recited. In some embodiments, “about” means within plus or minus 10% of a given value or range. In certain embodiments, “about” means that the variation is ±5%, ±4%, ±3%, ±2%, ±1%, ±0.5%, ±0.2%, or ±0.1% of the value to which “about” refers. In some embodiments, “about” means that the variation is ±1%, ±0.5%, ±0.2%, or ±0.1% of the value to which “about” refers.
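By way of a non-limiting illustration, the ±10% reading of “about” can be sketched as a simple tolerance check. The function name `is_about` and its parameters are hypothetical and are introduced here only for illustration; the narrower tolerances recited above (e.g., ±5% or ±1%) correspond to passing a smaller `tolerance_pct`.

```python
def is_about(value: float, reference: float, tolerance_pct: float = 10.0) -> bool:
    """Return True if `value` lies within +/- tolerance_pct percent of `reference`.

    Illustrative helper only (hypothetical name); the exact reference value
    itself always qualifies, matching "encompasses the exact number recited".
    """
    allowed_deviation = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed_deviation


if __name__ == "__main__":
    print(is_about(108, 100))        # within +/-10% of 100
    print(is_about(108, 100, 5.0))   # outside +/-5% of 100
```

Under these assumptions, a length of 108 nucleotides would be “about 100” at ±10% but not at ±5%.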
As used herein, “essentially free, ” in terms of a specified component, is used herein to mean that none of the specified component has been purposefully formulated into a composition and/or is present only as a contaminant or in trace amounts. The total amount of the specified component resulting from any unintended contamination of a composition is therefore well below 0.1%, preferably below 0.05%, and more preferably below 0.01%. Most preferred is a composition in which no amount of the specified component can be detected with standard analytical methods.
The terms “peptide,” “polypeptide” and “protein” are used interchangeably herein, and refer to a polymeric form of amino acids comprising at least two contiguous amino acids, which can include chemically or biochemically modified or derivatized amino acids. The term “peptide” as used herein refers to a class of short polypeptides. The term peptide may refer to a polymer of amino acids (natural or non-naturally occurring) having a length of up to about 100 amino acids. For example, peptides may be about 1 to about 10, about 10 to about 25, about 25 to about 50, about 50 to about 75, or about 75 to about 100 amino acid residues in length. In some embodiments, the peptides may be about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1000, about 1250, about 1500, about 1750, about 2000, about 2250, about 2500, about 2750, about 3000, about 3250, about 3500, about 3750, about 4000, about 4250, about 4500, about 4750, or about 5000 amino acid residues in length.
The terms “nucleic acid,” “polynucleotide,” and “oligonucleotide” are used interchangeably herein and refer to a polymer or oligomer of nucleotides of any length. The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases (such as methylated, hydroxymethylated, or glycosylated), non-natural nucleotides, non-nucleotide building blocks that exhibit similar structure and/or function as natural nucleotides (i.e., “nucleotide analogs”), and/or any substrate that can be incorporated into a polymer by DNA or RNA polymerase. The nucleic acids or polynucleotides can be heterogenous or homogenous in composition, can be isolated from naturally occurring sources, or can be artificially or synthetically produced. In addition, the nucleic acids may be DNA or RNA, or a mixture thereof, and can exist permanently or transitionally in single-stranded or double-stranded form, including homoduplex, heteroduplex, and hybrid states. Nucleic acid structures also include, for instance, a DNA/RNA helix, peptide nucleic acid (PNA), morpholino nucleic acid (see, e.g., Braasch and Corey, Biochemistry, 41 (14): 4503-4510 (2002) and U.S. Patent 5,034,506), locked nucleic acid (LNA; see Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A., 97: 5633-5638 (2000)), cyclohexenyl nucleic acids (see Wang, J. Am. Chem. Soc., 122: 8595-8602 (2000)), and/or a ribozyme.
As is understood in the art, a nucleic acid strand is inherently directional, as the carbon atoms in the sugar ring are numbered from 1’ to 5’ and the “5’-end” has a free hydroxyl (or phosphate) on a 5’ carbon and the “3’-end” has a free hydroxyl (or phosphate) on a 3’ carbon. As used herein and understood in the art, a nucleic acid having certain sequence elements “from 5’ to 3’” means that these sequence elements are arranged linearly from the 5’ end to the 3’ end of the nucleic acid.
When referring to a nucleotide sequence or protein sequence, the term “identity” is used to denote similarity between two sequences. Sequence similarity or identity may be determined using standard techniques known in the art, including, but not limited to, the local sequence identity algorithm of Smith & Waterman, Adv. Appl. Math. 2, 482 (1981), the sequence identity alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48, 443 (1970), the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85, 2444 (1988), computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive, Madison, WI), the Best Fit sequence program described by Devereux et al., Nucl. Acid Res. 12, 387-395 (1984), or inspection. Another algorithm is the BLAST algorithm, described in Altschul et al., J. Mol. Biol. 215, 403-410 (1990) and Karlin et al., Proc. Natl. Acad. Sci. USA 90, 5873-5877 (1993). A particularly useful BLAST program is the WU-BLAST-2 program, which was obtained from Altschul et al., Methods in Enzymology, 266, 460-480 (1996); blast.wustl.edu/blast/README.html. WU-BLAST-2 uses several search parameters, which are optionally set to the default values. The parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and the composition of the particular database against which the sequence of interest is being searched; however, the values may be adjusted to increase sensitivity. Further, an additional useful algorithm is gapped BLAST as reported by Altschul et al., (1997) Nucleic Acids Res. 25, 3389-3402. Unless otherwise indicated, percent identity is determined herein using the algorithm available at the internet address: blast.ncbi.nlm.nih.gov/Blast.cgi.
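For orientation only, the arithmetic behind a percent identity value can be sketched as follows. This is a minimal sketch that assumes the two sequences have already been aligned (gaps written as `-`); it is not the BLAST algorithm referred to above, and the function name `percent_identity` and the choice to count every non-empty alignment column in the denominator are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned and of equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored (illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # 7 identical columns out of 9 aligned columns
    print(round(percent_identity("ACGU-ACGU", "ACGUAACGA"), 1))
```

Different tools normalize by different lengths (for example, alignment length versus the shorter sequence), so values from this sketch need not match those reported by a given BLAST run.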
As used herein, the terms “complementary” and “complementarity” refer to the relationship between two nucleic acid molecules having the capacity to form hydrogen bond(s) with one another by either traditional Watson-Crick base-pairing or other non-traditional types of pairing. The two DNA/RNA strands with complementary sequences bind to form a duplex that follows the Watson-Crick base-pairing rules: A binds to T (U) with two hydrogen bonds; G binds to C with three hydrogen bonds. The degree of complementarity between two nucleotide sequences can be indicated by the percentage of nucleotides in a nucleotide sequence which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleotide sequence (e.g., about 50%, about 60%, about 70%, about 80%, about 90%, and 100% complementary). Two nucleotide sequences are “perfectly complementary” or “100% complementary” if all the contiguous nucleotides of a nucleotide sequence will hydrogen bond with the same number of contiguous nucleotides in a second nucleotide sequence. Two nucleotide sequences are “substantially complementary” if the degree of complementarity between the two nucleotide sequences is at least 60% (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%) over a region of at least 8 nucleotides (e.g., at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more nucleotides), or if the two nucleotide sequences hybridize under at least moderate, or, in some embodiments, high stringency conditions. Exemplary moderate stringency conditions include overnight incubation at 37℃ in a solution comprising 20% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt’s solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1x SSC at about 37-50℃, or substantially similar conditions, e.g., the moderately stringent conditions described in Sambrook, J., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press; 4th edition (June 15, 2012). High stringency conditions are conditions that, for example, (1) use low ionic strength and high temperature for washing, such as 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate (SDS) at 50℃, (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin (BSA)/0.1% Ficoll/0.1% polyvinylpyrrolidone (PVP)/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride and 75 mM sodium citrate at 42℃, or (3) employ 50% formamide, 5x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5x Denhardt’s solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42℃, with washes at (i) 42℃ in 0.2x SSC, (ii) 55℃ in 50% formamide, and (iii) 55℃ in 0.1x SSC (optionally in combination with EDTA). Additional details and an explanation of stringency of hybridization reactions are provided in, e.g., Sambrook, supra, and Ausubel et al., eds., SHORT PROTOCOLS IN MOLECULAR BIOLOGY, 5th ed., John Wiley & Sons, Inc., Hoboken, N.J. (2002).
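As a non-limiting illustration of the percentage-based criterion above, the following sketch scores Watson-Crick complementarity between two equal-length strands written 5’ to 3’ and paired antiparallel. The function names are hypothetical. The sketch is a simplification: it ignores non-traditional pairing (e.g., G-U wobble), does not align or gap the sequences, and does not model the alternative hybridization-based criterion.

```python
# Canonical Watson-Crick pairs for RNA and DNA (non-canonical pairs are ignored here).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions in `strand` that pair with `other` in antiparallel orientation.

    Both inputs are written 5'->3' and must have the same, non-zero length.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        1
        for x, y in zip(strand.upper(), reversed(other.upper()))
        if (x, y) in WATSON_CRICK
    )
    return 100.0 * paired / len(strand)


def substantially_complementary(strand: str, other: str,
                                min_pct: float = 60.0, min_len: int = 8) -> bool:
    """Apply the 'at least 60% over at least 8 nucleotides' threshold (illustrative)."""
    return len(strand) >= min_len and percent_complementarity(strand, other) >= min_pct


if __name__ == "__main__":
    print(percent_complementarity("GGAUCCAU", "AUGGAUCC"))      # perfect duplex
    print(substantially_complementary("GGAUCCAU", "AUGGAUCA"))  # one mismatch in 8
```

Raising `min_pct` or `min_len` in this sketch corresponds to the stricter alternatives listed in the parentheticals above.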
The term “exogenous, ” as used herein and understood in the art in relation to a protein, gene, nucleic acid, or polynucleotide in a cell or organism refers to a protein, gene, nucleic acid, or polynucleotide that has been introduced into the cell or organism by artificial or natural means; or in relation to a cell, the term refers to a cell that was isolated and subsequently introduced into a cell population or to an organism by artificial or natural means. An exogenous nucleic acid may be from a different organism or cell, or it may be one or more additional copies of a nucleic acid that occurs naturally within the organism or cell. An exogenous cell may be from a different organism, or it may be from the same organism. By way of a non-limiting example, an exogenous nucleic acid is one that is in a chromosomal location different from where it would be in natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature.
The term “operably linked” as used herein and understood in the art with reference to sequence elements in nucleic acid molecules means that these sequence elements (e.g., an intron fragment, a target sequence, a promoter, and a coding sequence) are functionally related to each other. For example, a promoter is operatively linked to a coding sequence if it controls the transcription of the sequence; or a ribosome binding site is operatively linked to a coding sequence if it is positioned so as to permit translation.
The term “hybridization” or “hybridized” when referring to nucleotide sequences is the association formed between and/or among sequences having complementarity.
The term “homology” refers to the percent of identity between the nucleic acid residues of two polynucleotides or the amino acid residues of two polypeptides. The correspondence between one sequence and another can be determined by techniques known in the art. For example, homology can be determined by a direct comparison of the sequence information between two polypeptides by aligning the sequence information and using readily available computer programs. Two polynucleotide (e.g., DNA) or two polypeptide sequences are “substantially homologous” to each other when at least about 80%, preferably at least about 90%, and most preferably at least about 95% of the nucleotides or amino acids, respectively, match over a defined length of the molecules, as determined using the methods above.
The term “Cte” as used herein refers to the group IIB intron C. te. I1, found in the human pathogen Clostridium tetani (McNeil et al., RNA, 20 (6): 855-866 (2014)).
The term “Pli” as used herein refers to the group IIB intron Pli, found in the mitochondrial genome of the filamentous brown alga Pylaiella littoralis (Zhao and Pyle, Trends in Biochem. Sci., 42 (6): 470-482 (2017)).
The term “Oi” as used herein refers to the group IIC intron O. i., found in Oceanobacillus iheyensis (Toor et al., RNA, 16: 57-69 (2010)).
The term “LtrB” as used herein refers to the group IIA intron Ll. LtrB, found in Lactococcus lactis (Qu, G. et al., Nat. Struct. Mol. Biol., 23: 549-557 (2016)).
The term “group I intron self-splicing activity” or “group I intron activity” refers to the self-splicing activity derived from a group I intron. The term “group II intron self-splicing activity” or “group II intron activity” refers to the self-splicing activity derived from a group II intron. The term “group IIA intron self-splicing activity” or “group IIA intron activity” refers to the self-splicing activity derived from a group IIA intron. The term “group IIB intron self-splicing activity” or “group IIB intron activity” refers to the self-splicing activity derived from a group IIB intron. The term “group IIC intron self-splicing activity” or “group IIC intron activity” refers to the self-splicing activity derived from a group IIC intron.
The term “cRNAzymes” as used herein refers to RNAs that are engineered ribozymes with self-splicing activity which, upon self-splicing, form circRNAs.
The term “D1-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D1, including λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 as depicted in FIGs. 2A-2E.
The term “D5-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D5, including the catalytic triad, ζ’, λ’, and κ’ as depicted in FIGs. 2A-2E.
The term “catalytic triad” as used herein refers to a highly conserved region (AGC) which forms base triples with other nucleotides to form a triple helix known as the “catalytic triplex.” This catalytic triplex forms the binding pocket for the two active site magnesium ions (Chan et al., Nature Comm. 9.1 (2018): 1-10).
The term “D6-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D6, including a bulged adenosine (or an atypical bulged A) that acts as the nucleophile for the first step of splicing (the branching pathway) as depicted in FIGs. 2A-2E.
The term “bulged adenosine” (also known as bulged A) as used herein refers to a residue within the intron sequence that serves as the initiating nucleophile. The conventional group II-type bulged adenosine is located in domain 6 (D6, or DVI) of a group II intron (FIGs. 1A and 2A). In some group II introns, the bulged adenosine is 7 or 8 nucleotides away from the 3' splice site. The bulged adenosine is normally conserved and plays a central role in the splicing process. During this process, the 2' hydroxyl of the bulged adenosine attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. This results in a branched intron lariat connected by a 2'-5' phosphodiester linkage at the bulged adenosine (Van der Veen et al., The EMBO Journal, 6 (12): 3827-3831 (1987); Jacquier et al., Journal of Molecular Biology, 219 (3): 415-428 (1991); Daniels et al., Journal of Molecular Biology, 256 (1): 31-49 (1996)).
The term “atypical bulged adenosine” (also known as atypical bulged A) as used herein refers to a region found within D6 of the group IIB intron C. te. I1 (Cte), found in the human pathogen Clostridium tetani. D6 of Cte does not have a clearly bulged adenosine. Instead, it has a looped region (see FIGs. 2B-2C), which acts as the nucleophile for the first step of splicing (the branching pathway) (McNeil et al., RNA, 20 (6): 855-866 (2014)).
The term “D2-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron and forms the stem-loop structure of the D2 of a naturally occurring group II intron.
The term “D3-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron and forms the stem-loop structure of the D3 of a naturally occurring group II intron.
The term “D2/D3-like sequence” as used herein refers to an RNA sequence that is either a “D2-like sequence” or a “D3-like sequence.”
The term “3’ D4 stem-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the 3’ stem of the D4 of a naturally occurring group II intron. Similarly, the term “5’ D4 stem-like sequence” as used herein refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the 5’ stem of the D4 of a naturally occurring group II intron. Additionally, the 3’ and 5’ D4 stem-like sequences each have a region that is at least 60% complementarily paired with the corresponding region of the other (a complementary region).
The term “scar,” as used herein, refers to the non-target sequence region in the circRNA splicing product. The term “scarless splicing” as used herein refers to the self-splicing of the cRNAzymes which produces circRNAs that do not contain additional sequence elements beyond the target sequence. As such, a “scarless” circRNA contains no scar, meaning that it consists solely of the target sequence. The term “near-scarless splicing” as used herein refers to the self-splicing of the cRNAzymes which produces circRNAs that include no more than 20 nucleotides besides the target sequence. A “near-scarless circRNA” is a circRNA resulting from “near-scarless” self-splicing of a cRNAzyme, which includes no more than 20 nucleotides besides the target sequence.
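By way of illustration only, the scar definitions above can be expressed as a length comparison between a circRNA and its target sequence. The sketch below treats the circRNA as a circular string, so the target may span the arbitrary position at which the circle is written out; the function names and the label `"scarred"` for products with more than 20 extra nucleotides are hypothetical and are not terms used in this disclosure.

```python
def scar_length(circ_rna: str, target: str) -> int:
    """Number of nucleotides in a circular RNA that are not part of the target sequence.

    `circ_rna` is the circle written out linearly from an arbitrary start point.
    Raises ValueError if the target does not occur contiguously in the circle.
    """
    if not target or len(target) > len(circ_rna):
        raise ValueError("target must be non-empty and no longer than the circRNA")
    # Append the first len(target) - 1 bases so matches spanning the written
    # start point of the circle are also found.
    unrolled = circ_rna + circ_rna[: len(target) - 1]
    if target not in unrolled:
        raise ValueError("target sequence not found in the circRNA")
    return len(circ_rna) - len(target)


def classify_splicing(circ_rna: str, target: str, near_scarless_max: int = 20) -> str:
    """Return 'scarless', 'near-scarless', or the illustrative label 'scarred'."""
    scar = scar_length(circ_rna, target)
    if scar == 0:
        return "scarless"
    if scar <= near_scarless_max:
        return "near-scarless"
    return "scarred"


if __name__ == "__main__":
    target = "AUGGCCUAA"
    print(classify_splicing("GCCUAAAUG", target))    # rotation of the target only
    print(classify_splicing("CCUAAGGAUGG", target))  # target plus a 2-nt scar
```

The sketch returns the most specific label, so a circRNA with no scar is reported as scarless even though it also satisfies the “no more than 20 nucleotides” condition.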
The pairing between the exon-binding sequences (or “EBSs”) in the intron and the intron-binding sequences (or “IBSs”) in the flanking exons is critical during splicing. The term “EBS” as used herein in connection with a group II intron refers to an exon-binding sequence in the intron, which interacts (e.g., complementarily pairs) with the intron-binding sequences (“IBSs”) in the flanking exon regions to trigger splicing. A group II intron can have multiple EBSs, such as EBS1, EBS2, and EBS3, which interact with IBS1, IBS2, and IBS3, respectively. In addition to EBS1, EBS2 and EBS3, the single nucleotide located directly upstream of EBS1 in domain 1, the “δ nucleotide,” can also pair with IBS3, and the interaction between δ and IBS3 is referred to as δ-IBS3 pairing. As used herein, the term “EBS’” refers to an EBS modified to allow scarless splicing of a group II intron. Correspondingly, the sequence elements within the target sequence that pair with the EBS’s are referred to herein as the IBS’s. Accordingly, EBS1’, EBS2’, and EBS3’ refer to the EBS1, EBS2, and EBS3 sequences that are modified to allow scarless splicing, respectively. IBS1’, IBS2’, and IBS3’ refer to the sequences in the target sequence that function as the IBS1, IBS2, and IBS3 in the native exon sequences flanking a group II intron to locate the splicing site by interacting with EBS1’, EBS2’, and EBS3’, respectively. Additionally, the “δ” nucleotide” refers to the nucleotide upstream of EBS1’ that pairs with IBS3’, and the interaction between δ” and IBS3’ is referred to as the δ”-IBS3’ pairing.
As used herein, the terms “E1” and “E2” refer to the exon fragments flanking the target sequence in the cRNAzymes, which remain with the target sequence after self-splicing of the cRNAzymes. E2 is linked to the 5’ end of the target sequence and E1 is linked to the 3’ end of the target sequence. Both E1 and E2 comprise an IBS and facilitate the self-splicing of the cRNAzyme. In some embodiments, the E1 and/or E2 can be the exon sequences flanking a naturally existing group II intron. In some embodiments, the E1 and/or E2 can be artificial sequences that are engineered into cRNAzymes to, e.g., enhance the accuracy and/or efficiency of self-splicing. In some embodiments, E1, E2, or both can be absent, and part of the target sequence (e.g., linker, TI, or Z1, or a combination thereof) can comprise an IBS and serve as E1, E2, or both. In scarless splicing, both E1 and E2 are absent, and part of the target sequence (e.g., linker, TI, or Z1, or a combination thereof) can comprise an IBS and serve as E1 and/or E2.
The term “in vitro transcription,” or “IVT,” refers to a versatile method to produce RNA in vitro that uses an RNA polymerase, ribonucleotides, and appropriate buffer conditions to synthesize RNA from a DNA template.
The term “resulting target sequence, ” as used herein, refers to the target sequence as it is formed in the circRNA upon self-splicing of the RNAs (or cRNAzymes) provided herein.
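The relationship between the target sequence as carried in the precursor and the resulting target sequence in the circRNA can be illustrated with a short sketch. It assumes, as one reading of the fragment swap described for FIG. 9, that the precursor carries the 3’ target fragment upstream of the 5’ target fragment, and that circularization joins the 3’-end of the 5’ fragment to the 5’-end of the 3’ fragment. Intron fragments, E1, E2, and linkers are deliberately not modeled, and all function names are hypothetical.

```python
def swap_fragments(target: str, cut: int) -> str:
    """Split the target at `cut` and return the swapped arrangement (3' fragment, then 5' fragment)."""
    if not 0 < cut < len(target):
        raise ValueError("cut must fall strictly inside the target sequence")
    five_prime_fragment, three_prime_fragment = target[:cut], target[cut:]
    return three_prime_fragment + five_prime_fragment


def same_circle(seq_a: str, seq_b: str) -> bool:
    """True if the two linear strings describe the same circular sequence (same strand)."""
    return len(seq_a) == len(seq_b) and seq_a in (seq_b + seq_b)


if __name__ == "__main__":
    target = "AUGGCCUAA"
    swapped = swap_fragments(target, 3)
    print(swapped)                       # GCCUAAAUG
    # Closing the swapped arrangement into a circle rejoins the two fragments
    # in their original order across the junction.
    print(same_circle(swapped, target))  # True
    print(target in swapped + swapped)   # True
```

In this simplified picture, the choice of cut point changes where the junction falls but not the circular sequence that results.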
The term “expression construct” or “expression cassette, ” as used herein, means a nucleotide sequence that directs translation.
The terms “coding sequence, ” “coding sequence region, ” “coding region, ” and “CDS, ” as used interchangeably here refer to the portion of a nucleic acid (e.g., a DNA or an RNA) that is or can be translated to protein.
The terms “reading frame, ” “open reading frame, ” and “ORF” as used interchangeably herein refer to a nucleotide sequence that begins with an initiation codon (e.g., ATG) and, in some embodiments, ends with a termination codon (e.g., TAA, TAG, or TGA) .
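As a minimal, non-limiting sketch of the definition above, the following function returns the first reading frame that begins with ATG and ends with an in-frame TAA, TAG, or TGA. The function name is hypothetical. For simplicity the sketch returns nothing when no in-frame termination codon is present, although the definition above requires a termination codon only in some embodiments; RNA input is accepted by converting U to T.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(sequence: str) -> Optional[str]:
    """Return the first ATG-initiated frame that ends with an in-frame stop codon.

    The returned string includes both the initiation codon and the stop codon.
    Returns None if no such frame exists on the given strand.
    """
    seq = sequence.upper().replace("U", "T")
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon downstream of the initiation codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(first_orf("CCATGGCCTAAGG"))  # ATGGCCTAA
    print(first_orf("ATGAAA"))         # None
```

Only the given strand and linear coordinates are scanned; a frame that would read across the junction of a circRNA is not considered in this sketch.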
The term “control elements” as used herein refers collectively to promoter regions, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (IRES), enhancers, splice junctions, and the like, which collectively provide for the replication, transcription, post-transcriptional processing, and translation of a coding sequence in a recipient cell.
The term “promoter” as used herein refers to a nucleotide region comprising a DNA regulatory sequence, wherein the regulatory sequence is derived from a gene that is capable of binding to an RNA polymerase and allowing for the initiation of transcription of a downstream (3' direction) coding sequence. It may contain genetic elements at which regulatory proteins and molecules may bind, such as RNA polymerase and other transcription factors, to initiate the specific transcription of a nucleic acid sequence. The phrases “operatively positioned” and “operatively linked” mean that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence to control transcriptional initiation and/or expression of that sequence, which is “under control” and “under transcriptional control” of the promoter.
The term “enhancer” as used herein means a nucleic acid sequence that, when positioned proximate to a promoter, confers increased transcription activity relative to the transcription activity resulting from the promoter in the absence of the enhancer domain.
The terms “internal ribosome entry site, ” “internal ribosome entry site sequence, ” “IRES” and “IRES sequence region” as used interchangeably herein refer to cis elements of viral or human cellular RNAs (e.g., messenger RNA (mRNA) and/or circRNAs) that bypass the steps of canonical eukaryotic cap-dependent translation initiation.
The terms “IRES-like sequence” and “Internal Ribosome Entry Site-like sequence,” as used interchangeably herein, refer to non-naturally occurring nucleotide sequences that display a function of a naturally occurring IRES.
The term “vector” or “construct” (sometimes referred to as a gene delivery system or gene transfer “vehicle” ) refers to a vehicle that is used to carry genetic material (e.g., a nucleotide sequence) , which can be introduced into a host cell, where it can be replicated and/or expressed.
The term “treat” as used herein refers to executing a protocol or plan, which can include administering one or more drugs or active agents to a patient, in an effort to alleviate signs or symptoms of the disease or the recurrence of the disease. Desirable effects of treatment include decreasing the rate of disease progression, ameliorating or palliating the disease state, and remission, increased survival, improved quality of life or improved prognosis. Alleviation or prevention can occur prior to signs or symptoms of the disease or condition appearing, as well as after their appearance. As used herein, a “treatment” does not require complete alleviation of signs or symptoms, and does not require a cure.
As used herein, the term “therapeutic beneficial” or “therapeutically effective” when used in connection with a therapeutic refers to the property of the therapeutic that promotes or enhances the well-being of the subject. This includes, but is not limited to, a reduction in the frequency, severity, or rate of progression of the signs or symptoms of a disease. For example, treatment of cancer may involve, for example, a reduction in the size of a tumor, a reduction in the invasiveness of a tumor, reduction in the growth rate of the cancer, or a reduction in the rate of metastasis or recurrence. Treatment of cancer can also refer to prolonging survival of a subject with cancer.
As used herein, the term “pharmaceutical or pharmacologically acceptable” refers to molecular entities and compositions that do not produce an adverse, allergic, or other untoward reaction when administered to an animal, such as a human, as appropriate. For animal (e.g., human) administration, it will be understood that preparations should meet sterility, pyrogenicity, general safety, and purity standards as required, e.g., by the FDA Office of Biological Standards.
As used herein, the term “pharmaceutically acceptable carrier” includes any and all aqueous biocompatible solvents (e.g., saline solutions, phosphate buffered saline, parenteral vehicles, such as sodium chloride, Ringer's dextrose, etc. ) , antioxidants, preservatives (e.g., antibacterial or antifungal agents, anti-oxidants, chelating agents, and inert gases) , isotonic agents, such like materials and combinations thereof, as would be known to one of ordinary skill in the art. The pH and exact concentration of the various components in a pharmaceutical composition are adjusted according to well-known parameters.
As used herein, the term “target cell” refers to the cell or type of cells to which the RNAs (or cRNAzymes) or circRNAs disclosed herein are intended to be delivered.
The terms “transfection, ” “transformation, ” and “transduction” are used interchangeably herein and refer to the introduction of one or more exogenous polynucleotides into a host cell by using physical or chemical methods.
As used herein, the term “subject” refers to any animal (e.g., a mammal) , including, but not limited to, humans, non-human primates, canines, felines, rodents, and the like, which is to be the recipient of a particular treatment. A subject can be a human. A subject can have a particular disease or condition.
Nomenclature for nucleotides, nucleic acids, nucleosides, and amino acids used herein is consistent with International Union of Pure and Applied Chemistry (IUPAC) standards (see, e.g., bioinformatics.org/sms/iupac.html) . Exemplary genes and polypeptides are described herein with reference to GenBank numbers, GI numbers and/or SEQ ID NOS. It is understood that one skilled in the art can readily identify homologous sequences by reference to sequence sources, including but not limited to Uniprot (https://www.uniprot.org/) , GenBank (ncbi.nlm.nih.gov/genbank/) and EMBL (embl.org/) .
Ranges: throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range.
The term “CL” refers to Subdoligranulum variabile strain DSM 15176 chromosome, complete genome, which is from NZ_CP102293.1. (subdo. li. gra’ nu. lum. L. adj. subdolus deceptive, alludes to the somewhat deceptive and unusual coccoid form; L. neu. n. granulum a small grain; N. L. neu. n. Subdoligranulum, a deceptive grain; va. ri. a’ bi. le. L. neut. adj. variabile, because the cells are varied in shape) .
8.2 Intron fragments
Provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment. The group II self-splicing activities of the RNAs (cRNAzymes) provided herein are conferred by the 3’ and 5’ intron fragments. The target sequences included in the RNAs (or cRNAzymes) provided herein are described in detail in sections below.
8.2.1 Structures of naturally occurring group II introns
The structure and catalytic mechanism of group II introns have recently been elucidated through a combination of genetics, chemical biology, solution biochemistry, and crystallography (e.g., Koch et al. Mol. and Cel. Bio. 12.5 (1992) : 1950-1958; Qin and Pyle, Curr. Opin. Struc. Bio. 8.3 (1998) : 301-308; Toor et al., Science 320.5872 (2008) : 77-82; McNeil et al., Nucleic acids research 42.3 (2014) : 1959-1969; Zhao and Pyle, Trends in Biochem. Sci., 42.6 (2017) : 470-482; Chan et al. Nature Comm. 9.1 (2018) : 1-10. ) .
Group II introns catalyze self-splicing through an autocatalytic two-step reaction in which the introns excise themselves from surrounding RNA (exons) and stitch the resulting pieces back together (FIGs. 3 and 6) . As shown, self-splicing requires only two essential components (intron RNA and a cation) , and it can proceed using purified components in vitro. The cation can be selected from the group consisting of Ba2+, Ca2+, Mg2+, Mn2+, Fe2+, Cu2+, Zn2+, Cd2+, Pb2+, Li+, Cs+, Na+, K+, Rb+, and NH4+, or a combination thereof. In some embodiments, the cation is a divalent cation, such as Ba2+, Ca2+, Mg2+, Mn2+, Fe2+, Cu2+, Zn2+, Cd2+, and Pb2+. In some embodiments, the cation is a monovalent cation, such as Li+, Na+, K+, Rb+ and Cs+. In some embodiments, the self-splicing requires only the intron RNA and Mg2+.
The self-splicing reaction is a multistep process that can occur through one of two pathways, and for many introns (such as the yeast mitochondrial intron ai5γ) , both of these pathways are operative. In the branching pathway (FIG. 3B) , the nucleophile for the first step of splicing is a specific bulged adenosine within intron Domain 6 (D6) , whereas in the hydrolysis pathway, the nucleophile during the first step is a water molecule (FIG. 3A) . Both of these reactions lead to productive splicing, and their mechanisms have been extensively investigated.
All group II introns share a conserved secondary structure that is based on a common set of six (6) radiating domains connected by linker nucleotides, each having a stem-loop structure (FIG. 1A) . A stem-loop structure is a type of RNA secondary structure, which can be determined by any suitable polynucleotide folding algorithm. The 6 stem-loop structures of naturally occurring group II introns are called domains 1 to 6 (D1 to D6) , and are arranged sequentially from 5’ to 3’ . Naturally occurring group II introns comprise multiple exon binding sequences (EBSs) , such as EBS1, EBS2, and EBS3, which interact, such as complementarily pair, with the intron binding sequences (IBSs) in exon regions, triggering splicing by virtue of their own hydroxyl groups within the EBS nucleic acid sequences (FIG. 1A) . Additionally, group II introns also share a common tertiary structure, particularly within the catalytic core. Most of the domains can be transcribed as separate molecules that fold independently and which, when combined with other sections of the intron, retain the catalytic activity. Typically, group II intron structural elements and their role in reaction chemistry can be described by referring to regions within the intron secondary structure (FIGs. 2A-2B) .
Intron Domain 1 (or “D1” ) is the largest domain. It provides the recognition sites for sequence-specific exon binding, and is essential for recognizing the exon in splicing reactions. In addition, D1 contains the active site constituents that form the molecular framework with which the other intronic domains associate. As provided in FIG. 2A, a set of intramolecular pairings are highly conserved and functionally important, including the β-β’ pair, the ε-ε’ pair, the λ-λ’ pair, the α-α’ pair, the ζ-ζ’ pair, the κ-κ’ pair, and the δ-δ’ pair.
The pairing between the exon-binding sequences (or “EBSs” ) in the intron and the intron-binding sequences (or “IBSs” ) in the flanking exons is critical during splicing. D1 also contains the key EBSs for exon binding. As shown in FIGs. 5A-5D and 7A-7D, EBSs, such as EBS1, EBS2, and EBS3, interact, such as complementarily pair, with the IBSs in exon regions (such as IBS1, IBS2, and IBS3) , whereby the hydroxyl groups within the EBS trigger splicing at the splicing site. In addition to EBS1, EBS2 and EBS3, the single nucleotide located directly upstream of EBS1 in domain 1, the δ nucleotide, can also pair with IBS3, and the interaction between δ and IBS3 is referred to as δ-IBS3 pairing. The EBS1-IBS1 interaction, optionally combined with the EBS2-IBS2 interaction, is important for specifying the 5’ -splice site. The EBS3/δ -IBS3 interaction is important for specifying the 3’ -splice site.
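For illustration only, the complementary pairing between an EBS and its IBS described above can be sketched in Python as follows. The function name and the example sequences are hypothetical and are not taken from any intron disclosed herein, and the decision to accept G-U wobble pairs in addition to Watson-Crick pairs is an assumption of this sketch.

```python
# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
# Accepting wobble pairs is an assumption made for this illustration.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(ebs: str, ibs: str) -> bool:
    """Return True if ebs and ibs (both written 5'->3') pair antiparallel.

    Only ungapped, full-length pairing of equal-length sequences is
    considered; bulges and partial pairing are not modeled.
    """
    ebs, ibs = ebs.upper(), ibs.upper()
    if len(ebs) != len(ibs):
        return False
    # Antiparallel pairing: first base of ebs faces the last base of ibs.
    return all((e, i) in PAIRS for e, i in zip(ebs, reversed(ibs)))


# Hypothetical EBS1 / IBS1 sequences used only to exercise the function.
hypothetical_ebs1 = "GGAUCU"
hypothetical_ibs1 = "AGAUCC"
print(can_pair(hypothetical_ebs1, hypothetical_ibs1))
```

A check of this kind could be used, for example, to screen a candidate target sequence fragment for a region able to serve as an IBS for a given EBS; it does not predict splicing activity.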
Domain 2 (or “D2” ) can promote the assembly of the active intron structure, forming multiple interactions that control the position of D6 and the branch site. D2 is not phylogenetically conserved and the deletion of D2 was shown to have little effect on the efficiency of self-splicing.
Domain 3 (or “D3” ) can stimulate reaction chemistry by forming a network of important interactions with D5. Like D2, D3 is also not required for catalysis. D2 and D3 serve to orient their conserved, intervening junction (J2/3) within the active site, where it forms a part of the core. Like D1 and D5, D3 can be transcribed as a separate molecule and added to splicing reactions in trans. The lower stem-loop structure of D3 is phylogenetically conserved.
Domain 4 (or “D4” ) is the least conserved region of the intron, which does not appear to affect the splicing efficiency. In many group II introns, D4 is found to contain an open reading frame from which a maturase is translated, which can bind to stem-loop structures near the basal stem of D4.
Domain 5 (or “D5” ) is the heart of the active site, and it contains the most highly conserved nucleotides within the intron. D5 is characterized by a terminal loop and stem regions that form critical tertiary interactions with D1, and by a dynamic, asymmetric bulge that is essential for binding of catalytic metal ions. D5 is a small hairpin-loop structure, containing a two-nucleotide bulge and it is capped by a conserved GNRA tetraloop (where N is any nucleotide and R represents a purine) . Studies have found that eight 2’ -hydroxyl groups on D5 have a strong effect specifically on either binding or chemical catalysis, while four pro-Rp phosphate oxygens of D5 affect the overall rate of self-splicing. A variety of nucleotides are important for its function, including the most conserved AGC triad in the first helix (the catalytic triad) , in which the guanine of the triad is invariant and critical for self-splicing in vitro and in vivo.
Domain 6 (or “D6” ) usually takes the form of a hairpin loop that presents the highly conserved branch-site adenosine, and the rest of the domain facilitates presentation of the branch-site nucleophile during the first step of splicing. The 2'-hydroxyl of this adenosine acts as the nucleophile in the first step of splicing by transesterification, resulting in a 2'-5' linkage between the branch point adenosine and the first nucleotide of the intron (FIG. 3B) . This lariat molecule is uniquely characteristic of group II and nuclear spliceosomal introns. Splicing in vivo and in vitro can also occur without lariat formation, through a pathway in which water is the nucleophile during the first step of splicing (FIG. 3A) .
Of all six domains, D1 and D5 are the only two required for catalysis. The presence of D2, D3, and D6 can improve splicing accuracy and/or efficiency in some instances.
Group II introns share an almost identical catalytic core, and they utilize the same basic mechanism for chemical catalysis. However, they can be divided into several families, including Group IIA, IIB, IIC, and others that display distinct structural and functional differences. The various classes have different 5’ -exon recognition strategies, and they display diversification of architectural scaffolding, protein interaction networks, and some aspects of chemical reactivity. The most prominent difference among IIA, IIB, and IIC ribozymes is the mechanism of exon recognition, because each class uses a distinct combination of pairing interactions to recognize the 5′ and 3′ exons (that is, different combinations of IBS1-EBS1, IBS2-EBS2, IBS3-EBS3, and δ-δ′ pairings) .
The Group IIB Class introns are generally believed to be a highly evolved, modern form of the group II intron. The network of hydrogen bonds and metal ion interactions within the catalytic core of IIB introns (involving D5, J2/3, and D1) is almost identical to that visualized in group IIC introns. As demonstrated in the crystal structure of the P. littoralis intron, including the lariat form of the intron, the second exon recognition element EBS2 is not required for catalysis. Consistent with previous cross-linking studies, the β-β’ interaction is found to be present in most group IIA and IIB introns; it is proximal to the exon-recognition motifs (EBS1 and EBS2) and appears to facilitate exon orientation within the core. In the three-dimensional structure, the β-β’ kissing loop forms a brace that joins upstream and downstream halves of D1, and it is interesting because it appears to have evolved sequentially over time as group II intron families developed.
The Group IIA Class introns share almost all major structural features with IIB introns, although the two classes use different exon recognition strategies. The pairing interactions that are important for exon recognition in group IIA introns are IBS1-EBS1, IBS2-EBS2, and δ-δ′, but not IBS3-EBS3.
The Group IIC Class introns are the smallest and most streamlined class of group II introns. Group IIC introns contain only a single, short EBS1 and, in vitro, self-splice through hydrolysis of the 5’ -splice site rather than by branching. The most conserved features that are shared by all group II introns can be found in group IIC introns. As demonstrated by the crystal structure of a group IIC intron from the bacterium Oceanobacillus iheyensis, D1 provides a supportive exoskeleton for the active-site domains, and D5 is docked at the center of the D1 shell, where it is secured through an elaborate network of highly conserved interactions, which include the κ-κ’ , λ-λ’ , and ζ-ζ’ interactions that are also present in IIB introns. Nucleotides within the D5 bulge are twisted in a manner that brings their backbone phosphates into extremely close proximity, resulting in a highly specific binding site for the two divalent metal ions that are critical for chemical catalysis. The strained conformation of the D5 bulge is made possible by simultaneous interactions with an adjacent triple helix, which is formed by the major groove edge of a D5 stem and nucleotides at the junction between D2 and D3 (J2/3) .
The Group IIE and IIF Classes are known additional group II families. The two families were originally distinguished from the other classes by sequence divergence within the protein maturase domains. Albeit smaller than the IIB class, the IIE and IIF introns appear to have much higher target sequence specificity than IIC introns.
8.2.2 Structures of cRNAzymes
Provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment. The “3’ intron fragment” and “5’ intron fragment” are referred to herein as such because the linkage of the 5’ -end of the 3’ intron fragment to the 3’ -end of the 5’ intron fragment would form a linear cRNAzyme, wherein the cRNAzyme has the sequence elements that form the essential structural elements of a naturally occurring group II intron and that are sequentially arranged as they would be in the naturally occurring group II intron. For illustrative purposes, the 3’ intron fragment of the RNAs provided herein can comprise a 3’ fragment of Cte (a group IIB intron that is disclosed in greater detail below) comprising, e.g., D5 and D6 of Cte, or sequence elements that form the essential structural elements of D5 and D6, and a 5’ intron fragment can comprise a 5’ fragment of Cte comprising, e.g., D1 of Cte, optionally in combination with D2 and/or D3 of Cte.
In some embodiments, the RNAs (or cRNAzyme) provided herein further comprise two homology arms, including a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. The homology arms can help shorten the spatial distance between the 5’ intron fragment and the 3’ intron fragment, thereby facilitating the self-splicing (circularization) reaction. In some embodiments, the presence of the homology arms can enhance the self-splicing efficiency of the RNAs provided herein.
Accordingly, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 5’ homology arm, (2) a 3’ intron fragment; (3) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (4) a 5’ intron fragment; and (5) a 3’ homology arm; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment (FIG. 9) .
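The 5’ to 3’ arrangement of elements recited above, and the junction formed on circularization, can be illustrated with the following minimal Python sketch. The class name, field names, and placeholder sequences are hypothetical and serve only to show the ordering; the sketch does not model the splicing chemistry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Elements of the precursor RNA, listed in 5'->3' order (hypothetical names)."""
    homology_arm_5: str
    intron_3_fragment: str
    target_3_fragment: str
    target_5_fragment: str
    intron_5_fragment: str
    homology_arm_3: str

    def linear_sequence(self) -> str:
        """Concatenate all elements in the 5'->3' order recited in the text."""
        return "".join([
            self.homology_arm_5, self.intron_3_fragment,
            self.target_3_fragment, self.target_5_fragment,
            self.intron_5_fragment, self.homology_arm_3,
        ])

    def circular_product(self) -> str:
        """circRNA written starting at the 3' target fragment.

        The last nucleotide of the returned string is understood to be
        joined to its first nucleotide to close the circle.
        """
        return self.target_3_fragment + self.target_5_fragment

    def junction_view(self) -> str:
        """Same circle rotated so the newly formed junction is in the middle:
        the 3'-end of the 5' target fragment precedes the 5'-end of the
        3' target fragment."""
        return self.target_5_fragment + self.target_3_fragment


# Placeholder sequences only; real elements are far longer.
p = Precursor("AAAA", "GGCC", "CAUG", "UUGA", "CCGG", "UUUU")
print(p.linear_sequence())
print(p.circular_product())
print(p.junction_view())
```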
In some embodiments, the two homology arms can be 100% complementary to each other. In some embodiments, the two homology arms can have up to 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, or 15% base mismatches. In some embodiments, the two homology arms are at least 85% complementary. In some embodiments, the two homology arms are 90% complementary. In some embodiments, the two homology arms are 95% complementary. In some embodiments, the two homology arms are 98% complementary. In some embodiments, the two homology arms are 99% complementary.
In some embodiments, the 5’ homology arm or 3’ homology arm is 15 to 60 nucleotides in length. In some embodiments, the 5’ homology arm or 3’ homology arm is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 nucleotides in length. In some embodiments, both the 5’ homology arm and the 3’ homology arm are 15 to 60 nucleotides in length. In some embodiments, both the 5’ homology arm and the 3’ homology arm are 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 nucleotides in length.
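By way of a non-limiting sketch, the complementarity and length ranges recited above for the homology arms can be checked as follows. The function names are hypothetical, and the sketch assumes equal-length arms, ungapped antiparallel pairing, and Watson-Crick pairs only; other ways of computing percent complementarity are possible.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(arm5: str, arm3: str) -> float:
    """Percent of positions at which arm5 pairs with arm3 (antiparallel).

    Both arms are written 5'->3'. Equal length and ungapped pairing are
    assumptions of this sketch.
    """
    arm5, arm3 = arm5.upper(), arm3.upper()
    if not arm5 or len(arm5) != len(arm3):
        raise ValueError("arms must be non-empty and of equal length")
    paired = sum(COMPLEMENT.get(a) == b for a, b in zip(arm5, reversed(arm3)))
    return 100.0 * paired / len(arm5)


def arms_acceptable(arm5: str, arm3: str, min_len: int = 15,
                    max_len: int = 60, min_pct: float = 85.0) -> bool:
    """Check both arms against the 15-60 nt and >=85% ranges from the text."""
    lengths_ok = all(min_len <= len(a) <= max_len for a in (arm5, arm3))
    return lengths_ok and percent_complementarity(arm5, arm3) >= min_pct


# Hypothetical 15-nt arm and its exact reverse complement.
arm5 = "GGGAAACCCUUUGGG"
arm3 = "".join(COMPLEMENT[b] for b in reversed(arm5))
print(percent_complementarity(arm5, arm3), arms_acceptable(arm5, arm3))
```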
8.2.2.1 Sequence elements
A general principle of designing a group II cRNAzyme construct is to preserve maximum self-splicing activity with minimum size. As described above, group II introns minimally require D1 and D5 for self-splicing activity. Additionally, the presence of D6 allows the group II intron to self-splice by branching instead of hydrolysis; and the presence of D2 and/or D3 may enhance the specificity and/or efficiency of the group II intron. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment comprises a D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence. As such, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence (FIGs. 6A-6B and 8A-8B) .
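The domain requirements summarized above can be expressed as a simple rule, sketched below in Python. The function name and return labels are hypothetical, and the rule is a deliberate simplification: it encodes only that D5 (in the 3’ intron fragment) and D1 (in the 5’ intron fragment) are minimally required and that D6 permits branching, and it says nothing about actual splicing efficiency.

```python
def expected_splicing_mode(three_prime_fragment: set[str],
                           five_prime_fragment: set[str]) -> str:
    """Classify a construct by the domain-like sequences in each fragment.

    Returns "inactive" if the minimal D5/D1 pair is missing, "branching"
    if a D6-like sequence accompanies D5, and "hydrolysis" otherwise.
    """
    if "D5" not in three_prime_fragment or "D1" not in five_prime_fragment:
        return "inactive"
    return "branching" if "D6" in three_prime_fragment else "hydrolysis"


# Minimal construct, then a construct with D6 and the optional D2/D3.
print(expected_splicing_mode({"D5"}, {"D1"}))
print(expected_splicing_mode({"D5", "D6"}, {"D1", "D2", "D3"}))
```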
D1 includes essential sequence and structural elements of group II introns. As used herein, the term “D1-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D1, including λ, α, ε’ , ζ, κ, δ’ , β’ , δ, EBS1, Stem 2, α’ and EBS3 as depicted on FIGs. 2A-2E. In some embodiments, the D1-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be at least 70% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be at least 80% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be at least 90% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be at least 95% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be at least 98% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. The D1-like sequence can be 100% identical to the nucleotide sequence of the D1 of a naturally occurring group II intron. As a person of ordinary skill in the art would understand, the presence of the sequence elements required to form the essential structural elements would allow the D1-like sequence to function as the naturally occurring D1, despite the variation in sequence outside these elements. The person of ordinary skill in the art would be able to determine the sequence elements required to form the above-mentioned essential structural elements using assays disclosed herein or otherwise known in the art. Tables of such sequence elements for exemplary group II introns are also included in sections below.
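The two-part test in the definition above (a percent identity threshold together with the presence of all required sequence elements) can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the two sequences are assumed to be supplied already aligned with “-” as the gap character, percent identity is computed as identical columns divided by alignment columns, and the element names in the usage example are an abbreviated, illustrative subset rather than the full list.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a gap
    opposite a nucleotide counts as a non-identical column. This
    denominator choice is an assumption of the sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    columns = [(q, r)
               for q, r in zip(aligned_query.upper(), aligned_reference.upper())
               if not (q == "-" and r == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(q == r and q != "-" for q, r in columns)
    return 100.0 * identical / len(columns)


def is_domain_like(aligned_query: str, aligned_reference: str,
                   elements_present: set[str], elements_required: set[str],
                   threshold: float = 60.0) -> bool:
    """Apply both criteria: identity >= threshold and all elements present."""
    identity_ok = percent_identity(aligned_query, aligned_reference) >= threshold
    return identity_ok and elements_required <= elements_present


# Hypothetical toy alignment and an abbreviated element list.
required = {"EBS1", "EBS3", "kappa", "zeta"}
print(percent_identity("ACGU-A", "ACGUGA"))
print(is_domain_like("ACGU-A", "ACGUGA", {"EBS1", "EBS3", "kappa", "zeta"}, required))
```

The same helper applies unchanged to other domain-like sequences by supplying the corresponding reference domain and required element set.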
D5 contains the catalytic core of group II introns. As used herein, the term “D5-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D5, including the catalytic triad, ζ’ , λ’ , and κ’ , as depicted on FIGs. 2A-2E. In some embodiments, the D5-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be at least 70% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be at least 80% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be at least 90% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be at least 95% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be at least 98% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. The D5-like sequence can be 100% identical to the nucleotide sequence of the D5 of a naturally occurring group II intron. As a person of ordinary skill in the art would understand, the presence of the sequence elements required to form the essential structural elements would allow the D5-like sequence to function as the naturally occurring D5, despite the variation in sequence outside these elements. The person of ordinary skill in the art would be able to determine the sequence elements required to form the above-mentioned essential structural elements using assays disclosed herein or otherwise known in the art. Tables of such sequence elements for exemplary group II introns are also included in sections below.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. As such, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence (FIGs. 8A-8B) . As described above, the presence of D6 allows the intron to self-splice using the branching pathway instead of the hydrolysis pathway (FIG. 3B) , which, in some embodiments, can promote the accuracy and efficiency of self-splicing. As used herein, the term “D6-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron, and that contains all sequence elements required to form the essential structural elements of the naturally occurring D6, including a bulged adenosine (or an atypical bulged A) as depicted on FIGs. 2A-2E. In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring group II intron. In some embodiments, the D6-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be at least 70% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be at least 80% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be at least 90% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be at least 95% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be at least 98% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. The D6-like sequence can be 100% identical to the nucleotide sequence of the D6 of a naturally occurring group II intron. As a person of ordinary skill in the art would understand, the presence of the sequence elements required to form the essential structural elements would allow the D6-like sequence to function as the naturally occurring D6, despite the variation in sequence outside these elements. The person of ordinary skill in the art would be able to determine the sequence elements required to form the above-mentioned essential structural elements using assays disclosed herein or otherwise known in the art. Tables of such sequence elements for exemplary group II introns are also included in sections below.
In some embodiments, the presence of D2 and/or D3 can promote the accuracy and/or efficacy of self-splicing. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. As such, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a D2/D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a D2/D3-like sequence, from 5’ to 3’ .
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’ , a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. As such, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, and a D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, and a D3-like sequence, from 5’ to 3’ .
As used herein, the term “D2-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron and forms the stem-loop structure of the D2 of a naturally occurring group II intron. In some embodiments, the D2-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be at least 70% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be at least 80% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be at least 90% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be at least 95% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be at least 98% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The D2-like sequence can be 100% identical to the nucleotide sequence of the D2 of a naturally occurring group II intron. The person of ordinary skill in the art would be able to determine whether an RNA can form the D2 of a naturally occurring group II intron using assays disclosed herein or otherwise known in the art. Sequences of exemplary group II introns are also included in sections below.
As used herein, the term “D3-like sequence” refers to an RNA sequence that is at least 60% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron and forms the stem-loop structure of the D3 of a naturally occurring group II intron. In some embodiments, the D3-like sequence is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be at least 70% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be at least 80% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be at least 90% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be at least 95% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be at least 98% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The D3-like sequence can be 100% identical to the nucleotide sequence of the D3 of a naturally occurring group II intron. The person of ordinary skill in the art would be able to determine whether an RNA can form the D3 of a naturally occurring group II intron using assays disclosed herein or otherwise known in the art. Sequences of exemplary group II introns are also included in sections below.
As used herein, the term “D2/D3-like sequence” refers to an RNA sequence that is either a “D2-like sequence” or a “D3-like sequence. ”
D4 of naturally occurring group II introns contains a pair of stem-like sequences and a loop region, arranged as a 5’ D4 stem-like sequence, a loop region, and a 3’ D4 stem-like sequence, from 5’ to 3’. The loop region of D4 of naturally occurring group II introns commonly contains the ORF for the maturase protein. In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. As such, in some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a D1-like sequence and a 5’ D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a D1-like sequence and a 5’ D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2/D3-like sequence, and a 5’ D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2/D3-like sequence, and a 5’ D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, a D3-like sequence, and a 5’ D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, a D3-like sequence, and a 5’ D4 stem-like sequence, from 5’ to 3’ .
As used herein, the terms “3’ D4 stem-like sequence” and “5’ D4 stem-like sequence” refer to a pair of RNA sequences having a region that is at least 60% complementarily paired (a complementary region). In some embodiments, the 3’ D4 stem-like sequence is at least 60% identical to the nucleotide sequence of the 3’ stem of the D4 of a naturally occurring group II intron. In some embodiments, the 5’ D4 stem-like sequence is at least 60% identical to the nucleotide sequence of the 5’ stem of the D4 of a naturally occurring group II intron. In some embodiments, the 3’ and 5’ D4 stem-like sequences are at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be at least 70% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be at least 80% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be at least 90% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be at least 95% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be at least 98% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively. The 3’ and 5’ D4 stem-like sequences can be 100% identical to the nucleotide sequences of the 3’ and 5’ stems of the D4 of a naturally occurring group II intron, respectively.
In some embodiments, the 3’ and 5’ D4 stem-like sequences have regions that are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% complementarily paired. In some embodiments, the 3’ and 5’ D4 stem-like sequences have regions that are at least 70% complementarily paired. In some embodiments, the 3’ and 5’ D4 stem-like sequences have regions that are at least 80% complementarily paired. In some embodiments, the 3’ and 5’ D4 stem-like sequences have regions that are at least 90% complementarily paired. In some embodiments, the 3’ and 5’ D4 stem-like sequences have regions that are at least 95% complementarily paired. In some embodiments, the complementary region is 10-200 nucleotides in length. In some embodiments, the complementary region is 10-100 nucleotides in length. In some embodiments, the complementary region is 20-80 nucleotides in length. In some embodiments, the complementary region is 30-60 nucleotides in length. In some embodiments, the 3’ and 5’ D4 stem-like sequences have two or more complementary regions, connected by linkers.
For further illustration, in some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure selected from the group consisting of Formulae (1) - (12) :
(1) 5’ -D5L-TS-D1L-3’ ;
(2) 5’ -D5L-TS-D1L-D2/D3L-3’ ;
(3) 5’ -D5L-TS-D1L-D2L-D3L-3’ ;
(4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ ;
(5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ ;
(6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
(7) 5’ -D5L-D6L-TS-D1L-3’ ;
(8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ ;
(9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ ;
(10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ ;
(11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ ; and
(12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ ;
wherein TS is the target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence;
wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
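For illustration only, the twelve formulae above can be viewed as the combinations of three independent choices: whether a D6-like sequence follows the D5-like sequence, whether the pair of D4 stem-like sequences is present, and what follows the D1-like sequence (nothing, a D2/D3-like sequence, or a D2-like sequence and a D3-like sequence). The following Python sketch enumerates them under that reading. The function name `build_formulae` and the normalized spacing of the output strings are assumptions made for this example and are not part of the disclosure.

```python
from itertools import product


def build_formulae():
    """Enumerate the domain orders of Formulae (1)-(12) as strings.

    Hypothetical helper: the element labels (D5L, TS, D1L, ...) follow the
    abbreviations defined in the text; everything else is illustrative.
    """
    d23_options = ((), ("D2/D3L",), ("D2L", "D3L"))
    formulae = []
    # product() varies the last argument fastest, which reproduces the
    # numbering (1)-(12): D6 absent/present, then D4 stems, then D2/D3.
    for has_d6, has_d4, d23 in product((False, True), (False, True), d23_options):
        three_prime_fragment = (["(3'D4L)"] if has_d4 else []) + ["D5L"]
        if has_d6:
            three_prime_fragment.append("D6L")
        five_prime_fragment = ["D1L", *d23] + (["(5'D4L)"] if has_d4 else [])
        elements = three_prime_fragment + ["TS"] + five_prime_fragment
        formulae.append("5'-" + "-".join(elements) + "-3'")
    return formulae


if __name__ == "__main__":
    for number, formula in enumerate(build_formulae(), start=1):
        print(f"({number}) {formula}")
```

In every formula the 3’ intron fragment elements precede TS and the 5’ intron fragment elements follow it, which is the permuted arrangement described throughout this section.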
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (1) : 5’ -D5L-TS-D1L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (2) 5’ -D5L-TS-D1L-D2/D3L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (3) 5’ -D5L-TS-D1L-D2L-D3L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (4) 5’ - (3’ D4L) -D5L-TS-D1L- (5’ D4L) -3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (5) 5’ - (3’ D4L) -D5L-TS-D1L-D2/D3L- (5’ D4L) -3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (6) 5’ - (3’ D4L) -D5L-TS-D1L-D2L-D3L- (5’ D4L) -3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (7) 5’ -D5L-D6L-TS-D1L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (8) 5’ -D5L-D6L-TS-D1L-D2/D3L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (9) 5’ -D5L-D6L-TS-D1L-D2L-D3L-3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (10) 5’ - (3’ D4L) -D5L-D6L-TS-D1L- (5’ D4L) -3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (11) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2/D3L- (5’ D4L) -3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein can have a structure of Formula (12) 5’ - (3’ D4L) -D5L-D6L-TS-D1L-D2L-D3L- (5’ D4L) -3’ .
In the embodiments described above, TS is the target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment.
8.2.2.2 “Scarless” splicing & “near-scarless” splicing
The exons flanking a naturally occurring group II intron can play important roles in self-splicing. The 5’ exon refers to the naturally occurring exon sequence on the 5’ end of the group II intron and the 3’ exon refers to the naturally occurring exon sequence on the 3’ end of the group II intron. The 5’ and 3’ flanking exons contain intron binding sequences (IBS) that interact, such as complementarily pair, with the EBS sequence within the intron, which allows the hydroxyl groups within the EBS to trigger splicing at the splicing site. As described above, naturally occurring group II introns commonly contain EBS1, EBS2, and EBS3 within D1 which interact, such as complementarily pair, with IBS1, IBS2, and IBS3 that are present in the flanking exons, respectively. In addition to the pairing between EBS1 and IBS1, between EBS2 and IBS2, and between EBS3 and IBS3, a single nucleotide, the δ nucleotide, which is located directly upstream of EBS1, can also pair with IBS3, and the interaction between δ and IBS3 is referred to as δ-IBS3 pairing. While the IBS1-EBS1 and IBS3-EBS3 (δ-IBS3) interactions are generally required for efficient self-splicing of the group II introns, the EBS2/IBS2 interaction can be removed without significantly affecting the self-splicing activity.
As such, to effect self-splicing, the EBS1 and EBS3 (or δ) of the RNAs (or cRNAzymes) provided herein need to interact, such as complementarily pair, with IBS1 and IBS3 in the exon elements. The exon elements can be either contained within the target sequence (typically at the terminal regions of the target sequence) or flanking the target sequence. In some embodiments, the target sequence of the RNAs (cRNAzymes) provided herein is flanked by the exon elements E1 and E2, wherein E1 comprises IBS1 and E2 comprises IBS3, and wherein the 3’ end of E2 is linked to 5’ end of the target sequence and the 5’ end of E1 is linked to the 3’ end of the target sequence. Accordingly, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment, (2) E2; (3) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (4) E1; and (5) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment, flanked by E1 and E2.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the exon elements are contained within the target sequence. In other words, certain sequence elements within the target sequence which, for example, can be present at the terminal regions of the target sequence, contain IBS1 and IBS3 and can serve as E1 and E2 in self-splicing. As such, the self-splicing of these RNAs produces circRNAs that do not contain additional sequence elements beyond the target sequence and is therefore referred to herein as “scarless” splicing. A circRNA that consists solely of the target sequence is referred to herein as a “scarless” circRNA. The term “scar,” as used herein, refers to the non-target sequence region in the circRNA splicing product. As such, a “scarless” circRNA contains no scar.
In some embodiments of the precursor RNAs (or cRNAzymes) provided herein that include E1 and E2, the self-splicing produces circRNAs that contain E1 and E2 in addition to the target sequence. In some embodiments, E1 is 0 to 20 nucleotides in length. In some embodiments, E2 is 0 to 20 nucleotides in length. In some embodiments, E1 is 0 to 10 nucleotides in length. In some embodiments, E2 is 0 to 10 nucleotides in length. In some embodiments, E1 is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. In some embodiments, E2 is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. In some embodiments, the E1 and E2 of the RNAs (or cRNAzymes) provided herein have a combined length of no more than 20 nucleotides, such as no more than 10 nucleotides, such as 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides, and the circRNAs resulting from self-splicing of the RNA (or cRNAzyme) include no more than 20 nucleotides besides the target sequence; such circRNAs are referred to herein as “near-scarless” circRNAs. The self-splicing of the precursor RNAs (or cRNAzymes) provided herein that produces a “near-scarless” circRNA is referred to as “near-scarless” splicing. In some embodiments, the near-scarless circRNA has a scar region equal to or less than 1 nucleotide, 2 nucleotides, 3 nucleotides, 4 nucleotides, 5 nucleotides, 6 nucleotides, 7 nucleotides, 8 nucleotides, 9 nucleotides, 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, or 20 nucleotides in length.
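As a minimal sketch of the definitions above, the scar can be treated as the combined length of E1 and E2 retained in the circRNA, with a scar of zero nucleotides corresponding to a “scarless” circRNA and a scar of 1 to 20 nucleotides corresponding to a “near-scarless” circRNA. The function name `classify_circrna` and the label returned for scars longer than 20 nucleotides are assumptions made for this example; the text does not name that case.

```python
def classify_circrna(e1_length: int, e2_length: int) -> str:
    """Classify a circRNA by its scar length (E1 + E2, in nucleotides).

    Hypothetical helper illustrating the "scarless" and "near-scarless"
    definitions; the 20-nucleotide bound is taken from the text.
    """
    if e1_length < 0 or e2_length < 0:
        raise ValueError("exon element lengths cannot be negative")
    scar_length = e1_length + e2_length
    if scar_length == 0:
        return "scarless"
    if scar_length <= 20:
        return "near-scarless"
    # Not a term used in the text; placeholder for scars above 20 nt.
    return "outside both definitions"


if __name__ == "__main__":
    print(classify_circrna(0, 0))    # exon elements lie within the target sequence
    print(classify_circrna(6, 4))    # 10-nt scar
    print(classify_circrna(15, 15))  # 30-nt scar
```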
In some embodiments, to ensure scarless splicing, the naturally occurring EBSs of a group II intron are modified to be complementary to sequence elements of a corresponding length within the target sequence that serve as the IBSs. Accordingly, in some embodiments, the RNAs (or cRNAzymes) disclosed herein are modified to have a modified EBS region which is complementary to a region of a corresponding length in a target sequence. As used herein, an EBS modified to allow scarless splicing is referred to as an EBS’. Correspondingly, the sequence elements within the target sequence that pair with the EBS’s are referred to herein as the IBS’s. Accordingly, EBS1’, EBS2’, and EBS3’ refer to the EBS1, EBS2, and EBS3 sequences that are modified to allow scarless splicing, respectively. IBS1’, IBS2’, and IBS3’ refer to the sequences in the target sequence that function as the IBS1, IBS2, and IBS3 in the native exon sequences flanking a group II intron to locate the splicing site by interacting with EBS1’, EBS2’, and EBS3’, respectively. Similarly, δ” refers to the nucleotide upstream of EBS1’ that pairs with IBS3’, and the interaction between δ” and IBS3’ is referred to as the δ”-IBS3’ pairing. In some embodiments, the EBS or the EBS’ can be 3 to 20 nucleotides in length, preferably 5 to 15 nucleotides, more preferably 6 to 10 nucleotides, such as 6, 7, 8, 9 or 10 nucleotides.
The region of the target sequence that is complementary paired with the EBS’ , can exist anywhere in the target sequence that allows it to pair with the EBS’ to form a secondary structure necessary for self-splicing. In general, sequences at both ends of the target sequence can be used as they correspond to the location of the IBS sequences of E1 and E2 that naturally interact with EBS. In some embodiments, the EBS’ regions include modified EBS1 (or EBS1’ ) and modified EBS3 (EBS3’ ) regions. In some embodiments, the modified EBS, e.g., EBS1’ , EBS3’ , or both, is (are) complementary to a stretch of sequence located at the 3’ and/or 5’ end of the target sequence.
In some embodiments, the EBS (or EBS’) can be modified so that it is complementarily paired with a stretch of sequence in the target sequence (the IBS or IBS’), thereby allowing interaction. In some embodiments, the modification includes substitution of one or more nucleotides. Certain degrees of mismatch can be tolerated as long as sufficient interaction between the EBS and the IBS exists. In some embodiments, the modified EBS (or EBS’) is complementarily paired with a region of a corresponding length in the target sequence on at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the nucleotide positions, or is at least 60% identical, at least 70%, at least 80%, at least 90%, at least 95%, or 100% identical to a complementarily paired sequence of a region of a corresponding length in the target sequence.
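The percentage of complementarily paired nucleotide positions between an EBS (or EBS’) and a region of corresponding length can be computed, in a simplified way, as sketched below. This is an illustration under stated assumptions: both sequences are written 5’ to 3’, pairing is antiparallel and ungapped, and only Watson-Crick pairs (A-U, G-C) are counted, so G-U wobble pairs are treated as mismatches. The function name `percent_paired` and the example sequences are hypothetical and do not correspond to any sequence of the disclosure.

```python
# Watson-Crick pairs only; counting G-U wobble pairs as mismatches is a
# simplifying assumption of this sketch.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_paired(ebs: str, ibs: str) -> float:
    """Return the percentage of EBS positions paired with the IBS.

    Both sequences are given 5' to 3' and must have the same length.
    Position i of the EBS is compared with position i of the reversed IBS,
    reflecting antiparallel pairing.
    """
    if not ebs or len(ebs) != len(ibs):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (a, b) in WATSON_CRICK
        for a, b in zip(ebs.upper(), reversed(ibs.upper()))
    )
    return 100.0 * paired / len(ebs)


if __name__ == "__main__":
    # Made-up 6-nt example: the second IBS carries one mismatch.
    print(percent_paired("GUGUCG", "CGACAC"))
    print(percent_paired("GUGUCG", "CGACAA"))
```

Under these assumptions, a result of 60 or more corresponds to the lowest pairing threshold recited above.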
As such, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment. Based on the detailed splicing mechanism, four sets of RNAs (or cRNAzymes) are expressly contemplated in the present disclosure.
Set 1: Near-scarless splicing based on the EBS1-IBS1 pairing and the EBS3-IBS3 pairing (FIGs. 4A, 5A (a) - (b) and 7A (a) - (b) ) . In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1 and EBS3 that are each at least 60%complementarily paired with a region of a corresponding length flanking the target sequence. The target sequence can be flanked by E1 on its 3’ end and E2 on its 5’ end, wherein E1 and E2 comprise IBS1 and IBS3, respectively, and wherein the EBS1-IBS1 interaction and the EBS3-IBS3 interaction allow the self-splicing and the production of a near-scarless circRNA.
In some embodiments, EBS1 and EBS3 are complementarily paired with IBS1 and IBS3, respectively on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions. In some embodiments, the EBS1 and/or EBS3 can be modified from their naturally occurring counterpart (s) . The modification can include a substitution, a deletion and/or an addition of one or more nucleotides.
In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein further comprises EBS2 that is at least 60% complementarily paired with a region of a corresponding length within the target sequence, namely, the IBS2, and the EBS2-IBS2 interaction can further promote the efficiency and accuracy of the self-splicing. In some embodiments, EBS2 is complementarily paired with IBS2 on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% of the nucleotide positions. In some embodiments, EBS2 can be modified from its naturally occurring counterpart. The modification can include a substitution, a deletion and/or an addition of one or more nucleotides.
Set 2: Near-scarless splicing based on the EBS1-IBS1 pairing and the δ-IBS3 pairing (FIGs. 4B, 5B (a) - (b) and 7B (a) - (b) ) . In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1 and the δ nucleotide, wherein EBS1 is at least 60% complementarily paired with a region of a corresponding length flanking the target sequence and the δ nucleotide is complementarily paired with a nucleotide within a sequence that flanks the target sequence. The target sequence can be flanked by E1 on its 3’ end and E2 on its 5’ end, wherein E1 and E2 comprise IBS1 and the δ nucleotide, respectively, and wherein the EBS1-IBS1 interaction and the δ-IBS3 interaction allow the self-splicing and the production of a near-scarless circRNA.
In some embodiments, the EBS1 is complementarily paired with IBS1 on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions. In some embodiments, the 4-10 nucleotides immediate upstream of the δ nucleotide (referred to herein as the “δ upstream” ) are complementarily paired with IBS3 on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions. In some embodiments, the EBS1 and/or the δ upstream can be modified from their naturally occurring counterpart (s) . The modification can include a substitution, a deletion and/or an addition of one or more nucleotides.
In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein further comprises EBS2 that is at least 60%complementarily paired with a region of a corresponding length within the target sequence, namely, the IBS2, and the EBS2-IBS2 interaction can further promote the efficiency and accuracy of the self-splicing. In some embodiments, EBS2 is complementarily paired with IBS2 on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions. In some embodiments, EBS2 can be modified from their naturally occurring counterpart (s) . The modification can include a substitution, a deletion and/or an addition of one or more nucleotides.
Set 3: Scarless splicing based on the EBS1’ -IBS1’ pairing and the EBS3’ -IBS3’ pairing (FIGs. 4C, 5C (a) - (b) and 7C (a) - (b) ) . In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1’ and EBS3’ that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence. The complementarily paired regions can be located at one or both ends of the target sequence. As such, the target sequence can contain a sequence element at its 3’ terminal region that can serve as E1 and another sequence element at its 5’ terminal region that can serve as E2, wherein E1 and E2 comprise IBS1’ and IBS3’, respectively, and wherein the EBS1’ -IBS1’ interaction and the EBS3’ -IBS3’ interaction allow the self-splicing and the production of a scarless circRNA.
To achieve EBS1’ -IBS1’ pairing and the EBS3’ -IBS3’ pairing, in some embodiments, the intron sequence elements EBS1’ and EBS3’ are modified from their naturally occurring counterparts to pair with the corresponding regions in the target sequence (e.g., the terminal regions) . In some embodiments, EBS1’ is modified to pair with IBS1’ at the 3’ terminal region of the target sequence. As such, the 3’ terminal region of the target sequence serves as E1. In some embodiments, EBS3’ is modified to pair with IBS3’ at the 5’ terminal region of the target sequence. As such, the 5’ terminal region of the target sequence serves as E2. In some embodiments, the sequence elements of the terminal region of the target sequence can be modified to pair with the EBS sequences in D1. For example, serving as E1, the 3’ terminal region of the target sequence can be modified to contain IBS1’, the sequence element to pair with EBS1’ in D1. Similarly, serving as E2, the 5’ terminal region of the target sequence can be modified to contain IBS3’, the sequence element to pair with EBS3’ in D1.
In some embodiments, the EBS1’ and EBS3’ are complementarily paired with IBS1’ and IBS3’, respectively, on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% of the nucleotide positions.
Set 4: Scarless splicing based on the EBS1’ -IBS1’ pairing and the δ” -IBS3’ pairing (FIGs. 4D, 5D (a) - (b) and 7D (a) - (b) ) . In some embodiments, the D1-like domain of the RNAs (or cRNAzymes) provided herein comprises EBS1’ and the δ” nucleotide, wherein EBS1’ is at least 60% complementarily paired with a region of a corresponding length in the target sequence and the δ” nucleotide is complementarily paired with a nucleotide within the target sequence. The complementarily paired regions can be located at one or both ends of the target sequence. As such, the target sequence can contain a sequence element at its 3’ terminal region that can serve as E1 and another sequence element at its 5’ terminal region that can serve as E2, wherein E1 and E2 comprise IBS1’ and δ” , respectively, and wherein the EBS1’ -IBS1’ interaction and the δ” -IBS3’ interaction allow the self-splicing and the production of a scarless circRNA.
To achieve EBS1’ -IBS1’ pairing and the δ” -IBS3’ pairing, in some embodiments, the intron sequence elements EBS1’ and δ” are modified from their naturally occurring counterparts to pair with the corresponding regions in the target sequence (e.g., the terminal regions) . In some embodiments, EBS1’ is modified to pair with IBS1’ at the 3’ terminal region of the target sequence. As such, the 3’ terminal region of the target sequence can serve as E1. In some embodiments, the δ” nucleotide (optionally with its upstream) is modified to pair with IBS3’ at the 5’ terminal region of the target sequence. As such, the 5’ terminal region of the target sequence can serve as E2. In some embodiments, the sequence elements of the terminal region of the target sequence can be modified to pair with the EBS sequences in D1. For example, serving as E1, the 3’ terminal region of the target sequence can be modified to contain IBS1’, the sequence element to pair with EBS1’ in D1. Similarly, serving as E2, the 5’ terminal region of the target sequence can be modified to contain the δ” nucleotide (optionally with its upstream) , the sequence element to pair with EBS3’ in D1.
In some embodiments, the EBS1’ is complementarily paired with IBS1’ on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions. In some embodiments, the 4-10 nucleotides on the immediate upstream of the δ” nucleotide (referred to herein as the “δ” upstream” ) are complementarily paired with IBS3’ on at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100%of the nucleotide positions.
8.2.2.3 Segmenting
The 3’ and 5’ intron fragments of RNAs (or cRNAzymes) provided herein can be generated by segmenting a naturally occurring group II intron at an unpaired region into two fragments, the 5’ fragment and the 3’ fragment, which serve as the 5’ and 3’ intron fragments of the RNAs (or cRNAzymes) provided herein, respectively. In other words, in the RNAs (or cRNAzymes) provided herein, the 5’ fragment and the 3’ fragment of the naturally occurring group II intron are swapped and re-ligated with the target sequence inserted in between, resulting in the following construct from 5’ to 3’: 3’ intron fragment-target sequence-5’ intron fragment.
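The swap-and-re-ligate arrangement described above can be sketched as a simple string operation. In this minimal illustration, the intron is split at a chosen position (assumed to fall in an unpaired region, which this sketch does not verify), and the resulting fragments are placed on either side of the target sequence in permuted order. The function name `permute_intron`, the parameter `split_at`, and the placeholder sequences are assumptions made for the example.

```python
def permute_intron(intron: str, split_at: int, target: str) -> str:
    """Build the construct 3' intron fragment - target - 5' intron fragment.

    `intron` and `target` are written 5' to 3'. `split_at` is the number of
    nucleotides in the 5' fragment; choosing a position inside an unpaired
    region is left to the caller (assumption of this sketch).
    """
    if not 0 < split_at < len(intron):
        raise ValueError("split_at must leave two non-empty fragments")
    five_prime_fragment = intron[:split_at]   # e.g. contains D1
    three_prime_fragment = intron[split_at:]  # e.g. contains D5 (and D6)
    # Swapped order: the 3' fragment now precedes the target sequence.
    return three_prime_fragment + target + five_prime_fragment


if __name__ == "__main__":
    # Placeholder sequences, not real intron or target sequences.
    print(permute_intron("AAAACCCC", 4, "GG"))
```

The total length of the construct equals the length of the intron plus the length of the target sequence, since segmenting only reorders the intron.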
In some embodiments, the naturally occurring group II intron can be modified. The modified group II intron can include a substitution, a deletion and/or an addition of one or more nucleotides. In some embodiments, the modification does not affect the self-splicing activity of the group II intron, especially the in vitro self-splicing activity.
In some embodiments, the 5’ fragment and the 3’ fragment of the naturally occurring group II intron are mutated, swapped and re-ligated with the target sequence inserted in between to form the RNAs (or cRNAzymes) disclosed herein. The mutation can comprise modification of one or more nucleotides, such as an addition, a deletion, and a substitution of one or more nucleotides, relative to their naturally occurring wild-type sequences. In some embodiments, the modification promotes the accuracy and/or efficacy of the self-splicing of the resulting RNAs (or cRNAzymes) disclosed herein.
In some embodiments, the modification includes deletion of the intron encoded protein (IEP) sequence in D4. The IEP sequence or similar structures in D4 are present in all group II introns, and are known not to be required for in vitro transcription. In some embodiments, the modification includes deletion of the IEP of D4, whereas RNAs (or cRNAzymes) disclosed herein still comprise the 5’ and 3’ stem sequences of D4. The complementarity of the 5’ and 3’ stem sequences of D4 can help shorten the spatial distance between the 5’ intron fragment and the 3’ intron fragment, thereby facilitating the circularization reaction. In some embodiments, the modification includes deletion of the entire D4.
In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments. In some embodiments, an unpaired region is a linear region between two adjacent domains of the group II intron. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D2. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D3. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D4. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D5. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D6.
In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D2 and D3. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D3 and D4. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D4 and D5. In some embodiments, the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D5 and D6.
As a person of ordinary skill in the art would understand, each of the four splicing mechanisms described above can also apply to RNAs (or cRNAzymes) disclosed herein whose 5’ and 3’ intron fragments are generated by segmenting, swapping and re-ligating the 5’ and 3’ fragments of naturally occurring group II introns. Accordingly, expressly contemplated herein are RNAs (or cRNAzymes) described in the instant section that are capable of (1) near-scarless splicing based on the EBS1-IBS1 pairing and the EBS3-IBS3 pairing; (2) near-scarless splicing based on the EBS1-IBS1 pairing and the δ-IBS3 pairing; (3) scarless splicing based on the EBS1’-IBS1’ pairing and the EBS3’-IBS3’ pairing; or (4) scarless splicing based on the EBS1’-IBS1’ pairing and the δ”-IBS3’ pairing.
8.2.3 Exemplary group II introns and fragments thereof
Group II introns are found in eubacteria, archaebacteria, and the organelles of plants, fungi, and various lower eukaryotes. While the RNAs (or cRNAzymes) exemplified below focus on specific group II introns, namely, Cte, Oi, Pli, LtrB, and Syn1, a person of ordinary skill in the art, guided by the teachings of the instant disclosure, would be able to prepare additional RNAs (or cRNAzymes) using sequence elements from other group II introns.
8.2.3.1 Naturally occurring group II introns and sequences thereof
RNAs (or cRNAzymes) disclosed herein can be derived from any naturally occurring group II intron. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from any naturally occurring group II intron disclosed herein or otherwise known in the art. Lists of naturally occurring group II introns expressly contemplated herein, and their GenBank ID numbers, are provided in Tables 25-33. In some embodiments, group II introns can be derived from the microorganism kingdom. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from the microorganism kingdom. In some embodiments, group II introns can be derived from the bacteria domain. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from the bacteria domain. In some embodiments, the group II intron is derived from Clostridium (such as Clostridium tetani), Bacillus (such as Bacillus thuringiensis), Oceanobacillus (such as Oceanobacillus iheyensis), Pylaiella (such as Pylaiella littoralis), or Lactococcus (such as Lactococcus lactis). In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Clostridium (such as Clostridium tetani), Bacillus (such as Bacillus thuringiensis), Oceanobacillus (such as Oceanobacillus iheyensis), Pylaiella (such as Pylaiella littoralis), or Lactococcus (such as Lactococcus lactis). It is understood by those skilled in the art that the compositions and methods provided herein are not limited to specific group II introns.
A list of group II introns and their nucleotide sequences are provided in Tables 2 and 16. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from Cte. The secondary structure of Cte is provided in FIG. 1B. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Cte. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from Oi. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Oi. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from Pli. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Pli. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from LtrB. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from LtrB. In some embodiments, RNAs (or cRNAzymes) disclosed herein can be derived from Bth. In some embodiments, RNAs (or cRNAzymes) disclosed herein can contain sequence elements from Bth.
Expressly contemplated herein are RNAs (or cRNAzymes) that contain sequence elements that form the secondary structure and tertiary structure of group II introns and have group II intron self-splicing activity. In some embodiments, RNAs (or cRNAzymes) provided herein contain sequence elements derived from exemplary group II introns such as Cte, Oi, Pli, LtrB, or Bth. In some embodiments, RNAs (or cRNAzymes) provided herein contain sequence elements derived from one or more group II introns provided in Tables 25-33.
In some embodiments, for example, provided herein are RNAs (or cRNAzymes) that contain sequence elements derived from exemplary group II introns Cte (SEQ ID NO: 135) . In some embodiments, RNAs (or cRNAzymes) provided contain sequence elements derived from a modified Cte (e.g., SEQ ID NO: 136 or 137) or a synthetic Cte (e.g., Cte-Syn1; SEQ ID NO: 143) . The secondary structures of Cte and Cte-Syn1 are provided in FIGs. 1B and 1C, respectively.
In some embodiments, for example, provided herein are RNAs (or cRNAzymes) that contain sequence elements derived from exemplary group II introns Oi (SEQ ID NO: 138) . In some embodiments, RNAs (or cRNAzymes) provided contain sequence elements derived from a modified Oi (e.g., SEQ ID NO: 139) or a synthetic Oi. In some embodiments, for example, provided herein are RNAs (or cRNAzymes) that contain sequence elements derived from exemplary group II introns Pli (SEQ ID NO: 140) . In some embodiments, RNAs (or cRNAzymes) provided contain sequence elements derived from a modified Pli or a synthetic Pli (e.g., Pli-Syn1; SEQ ID NO: 144) . In some embodiments, for example, provided herein are RNAs (or cRNAzymes) that contain sequence elements derived from exemplary group II introns LtrB (SEQ ID NO: 140 or 141) . In some embodiments, RNAs (or cRNAzymes) provided contain sequence elements derived from a modified LtrB or a synthetic LtrB (e.g., LtrB-Syn1; SEQ ID NO: 145) .
In some embodiments, the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 33. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 35. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 36. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 37. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 38. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 39. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 40. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 41. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 135. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 136. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 137. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 138. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 139. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 140. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 141. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 142. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 143. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 144. In some embodiments, the group II intron has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 145.
In some embodiments, the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145. In some embodiments, the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145. In some embodiments, the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists essentially of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145. In some embodiments, the nucleotide sequence of the group II intron from which the RNAs (or cRNAzymes) provided herein can be derived consists of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41 and 135-145. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 33. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 34. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 35. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 36. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 37. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 38. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 39. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 40. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 41. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 135. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 136. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 137. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 138. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 139. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 140. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 141. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 142. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 143. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 144. In some embodiments, the group II intron has the nucleotide sequence of SEQ ID NO: 145.
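As a non-limiting illustration, the identity thresholds recited above can be checked with a short Python sketch. The sketch assumes that the candidate and reference sequences are already aligned and of equal length, and it counts identical positions without gaps; this is a simplification, since sequence identity is ordinarily determined with a sequence alignment tool. The function names and toy sequences are hypothetical.

```python
THRESHOLDS = (95.0, 98.0, 99.0, 100.0)  # percentages recited in the text


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped, position-wise percent identity (simplifying assumption:
    both sequences are pre-aligned and have the same length)."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def thresholds_met(candidate: str, reference: str) -> list:
    """Return every recited threshold that the candidate satisfies."""
    pid = percent_identity(candidate, reference)
    return [t for t in THRESHOLDS if pid >= t]


# Toy example: 100-nt placeholder reference with three substitutions.
reference = "A" * 100
candidate = "A" * 97 + "C" * 3
print(percent_identity(candidate, reference), thresholds_met(candidate, reference))
```

A candidate with three substitutions over 100 positions satisfies only the "at least 95%" threshold in this simplified measure.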
As disclosed above and understood in the art, certain sequence elements form the essential structural elements required for the self-splicing activities of group II introns. A person of ordinary skill in the art would be able to identify such sequence elements with the aid of the sequence analysis tools for RNAs disclosed herein or otherwise known in the art. For illustrative purposes, the sequence elements for exemplary naturally occurring group II introns Cte, Oi, Pli, and LtrB, as well as synthetic cRNAzymes derived therefrom with group II intron activities (Cte-Syn1, Oi-Syn1, Pli-Syn1, and LtrB-Syn1), are provided in FIGs. 2A-2K and summarized in Table 18.
8.2.3.2 cRNAzymes and sequences thereof
Provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’-end of the 5’ target sequence fragment linked to the 5’-end of the 3’ target sequence fragment. In some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a D1-like sequence.
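By way of a hedged illustration, the element order recited above and the resulting circular junction can be modeled in Python. A circle is written here as a linear string whose last nucleotide is understood to be joined to its first, so two strings denote the same circRNA when one is a rotation of the other. All names and sequences are hypothetical toy values; the sketch models topology only and not splicing chemistry.

```python
def linear_precursor(three_intron: str, three_tf: str,
                     five_tf: str, five_intron: str) -> str:
    """Element order of the precursor, 5' to 3':
    3' intron fragment, 3' target fragment, 5' target fragment, 5' intron fragment."""
    return three_intron + three_tf + five_tf + five_intron


def circular_product(three_tf: str, five_tf: str) -> str:
    """Linear representation of the circle formed after the intron fragments
    are removed: the 3' end of the 5' target fragment (end of the string)
    is understood to be joined to the 5' end of the 3' target fragment (start)."""
    return three_tf + five_tf


def same_circle(a: str, b: str) -> bool:
    """Two linear representations describe the same circle if one is a
    rotation of the other."""
    return len(a) == len(b) and a in b + b


# Toy target split into a 5' fragment and a 3' fragment (placeholders).
five_tf, three_tf = "AAUU", "CCGG"
circle = circular_product(three_tf, five_tf)
print(circle, same_circle(circle, five_tf + three_tf))
```

Reading the circle starting from the 5’ target sequence fragment gives the fragments in their original target order, which is why the rotation check succeeds for `five_tf + three_tf`.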
cRNAzymes derived from Cte
In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from Cte D1. In some embodiments, the D1-like sequence is at least 60% identical to Cte D1 (SEQ ID NO: 147) and comprises the following sequence elements of Cte D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2B, 2C and 2F and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 70% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 80% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 85% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 90% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 95% identical to Cte D1 (SEQ ID NO: 147). In some embodiments, the D1-like sequence is at least 98% identical to Cte D1 (SEQ ID NO: 147). A person of ordinary skill in the art would understand that Cte can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from Cte. In some embodiments, the D5-like sequence is at least 60% identical to Cte D5 and comprises the following sequence elements of Cte D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2B, 2C and 2F and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 70% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 80% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 85% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 90% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 95% identical to Cte D5 (SEQ ID NO: 154). In some embodiments, the D5-like sequence is at least 98% identical to Cte D5 (SEQ ID NO: 154). A person of ordinary skill in the art would understand that Cte can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to Cte D6 and comprises the following sequence element of Cte D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2F and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Cte D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 70% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 80% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 85% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 90% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 95% identical to Cte D6 (SEQ ID NO: 155). In some embodiments, the D6-like sequence is at least 98% identical to Cte D6 (SEQ ID NO: 155). A person of ordinary skill in the art would understand that Cte can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to Cte D2 and forms the stem-loop structure of Cte D2 (as depicted in FIGs. 2B, 2C and 2F and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 70% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 80% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 85% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 90% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 95% identical to Cte D2 (SEQ ID NO: 151). In some embodiments, the D2-like sequence is at least 98% identical to Cte D2 (SEQ ID NO: 151). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to Cte D3 and forms the stem-loop structure of Cte D3 (as depicted in FIGs. 2B, 2C and 2F and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 70% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 80% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 85% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 90% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 95% identical to Cte D3 (SEQ ID NO: 152). In some embodiments, the D3-like sequence is at least 98% identical to Cte D3 (SEQ ID NO: 152). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte D1-like sequence and a Cte D2/D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte D1-like sequence and a Cte D2/D3-like sequence, from 5’ to 3’.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, and a Cte D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, and a Cte D3-like sequence, from 5’ to 3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from Cte D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
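One possible way to estimate the degree of complementary pairing between a pair of D4 stem-like sequences is sketched below in Python, for illustration only. The sketch assumes equal-length stems paired antiparallel without bulges and counts only Watson-Crick pairs; whether G-U wobble pairs count toward the recited percentages is not stated above, so excluding them is an assumption. The function name and sequences are hypothetical.

```python
# Watson-Crick pairs only; G-U wobble pairs are excluded by assumption.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_paired(five_stem: str, three_stem: str) -> float:
    """Percentage of positions that form Watson-Crick pairs when the 5' stem
    (read 5' to 3') is paired against the 3' stem read 3' to 5'."""
    if not five_stem or len(five_stem) != len(three_stem):
        raise ValueError("stems must be non-empty and of equal length")
    pairs = zip(five_stem.upper(), reversed(three_stem.upper()))
    paired = sum((a, b) in WATSON_CRICK for a, b in pairs)
    return 100.0 * paired / len(five_stem)


# Toy stems (placeholders, not Cte D4 sequences).
print(percent_paired("GGCAU", "AUGCC"))  # fully complementary toy pair
print(percent_paired("GGCAU", "AUGCA"))  # one mismatched position
```

A real assessment would normally rely on an RNA secondary-structure prediction tool rather than a position-wise count.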
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence and a Cte D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence and a 5’ Cte D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence and a Cte D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2/Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2/Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence and a Cte D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, a Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Cte D4 stem-like sequence, a Cte D5-like sequence and a Cte D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Cte D1-like sequence, a Cte D2-like sequence, a Cte D3-like sequence, and a 5’ Cte D4 stem-like sequence, from 5’ to 3’ .
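The combinations of Cte-derived elements enumerated above follow a common pattern, which can be summarized, purely as an illustrative sketch, by a small Python data structure. The class name `CteLayout` and its flags are hypothetical; the sketch lists element labels in 5’ to 3’ order and does not encode any sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CteLayout:
    """Optional elements of a Cte-derived construct (hypothetical flags)."""
    with_d6: bool = False        # D6-like sequence 3' of the D5-like sequence
    with_d2: bool = False        # D2-like sequence 3' of the D1-like sequence
    with_d3: bool = False        # D3-like sequence 3' of D1-like (and D2-like)
    with_d4_stems: bool = False  # paired 3' and 5' D4 stem-like sequences

    def elements(self) -> list:
        """Return element labels in 5' to 3' order."""
        three_intron = (["3' D4 stem-like"] if self.with_d4_stems else [])
        three_intron += ["D5-like"] + (["D6-like"] if self.with_d6 else [])
        target = ["3' target fragment", "5' target fragment"]
        five_intron = ["D1-like"]
        five_intron += ["D2-like"] if self.with_d2 else []
        five_intron += ["D3-like"] if self.with_d3 else []
        five_intron += ["5' D4 stem-like"] if self.with_d4_stems else []
        return three_intron + target + five_intron


# Minimal layout and the fullest layout described in the text.
minimal = CteLayout().elements()
full = CteLayout(with_d6=True, with_d2=True, with_d3=True,
                 with_d4_stems=True).elements()
print(" - ".join(full))
```

Setting only one of `with_d2` or `with_d3` corresponds to the embodiments in which the D2/D3-like sequence is a D2-like sequence or a D3-like sequence, respectively.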
cRNAzymes derived from Cte-Syn1
In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from Cte-Syn1 D1. In some embodiments, the D1-like sequence is at least 60% identical to Cte-Syn1 D1 (SEQ ID NO: 194) and comprises the following sequence elements of Cte-Syn1 D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2B, 2C and 2G and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 70% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 80% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 85% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 90% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 95% identical to Cte-Syn1 D1 (SEQ ID NO: 194). In some embodiments, the D1-like sequence is at least 98% identical to Cte-Syn1 D1 (SEQ ID NO: 194). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from Cte-Syn1. In some embodiments, the D5-like sequence is at least 60% identical to Cte-Syn1 D5 and comprises the following sequence elements of Cte-Syn1 D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2B, 2C and 2G and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 70% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 80% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 85% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 90% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 95% identical to Cte-Syn1 D5 (SEQ ID NO: 197). In some embodiments, the D5-like sequence is at least 98% identical to Cte-Syn1 D5 (SEQ ID NO: 197). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to Cte-Syn1 D6 and comprises the following sequence element of Cte-Syn1 D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2G and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Cte-Syn1 D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 70% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 80% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 85% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 90% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 95% identical to Cte-Syn1 D6 (SEQ ID NO: 198). In some embodiments, the D6-like sequence is at least 98% identical to Cte-Syn1 D6 (SEQ ID NO: 198). A person of ordinary skill in the art would understand that Cte-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to Cte-Syn1 D2 and forms the stem-loop structure of Cte-Syn1 D2 (as depicted in FIGs. 2B, 2C and 2G and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 70% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 80% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 85% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 90% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 95% identical to Cte-Syn1 D2 (SEQ ID NO: 195). In some embodiments, the D2-like sequence is at least 98% identical to Cte-Syn1 D2 (SEQ ID NO: 195). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to Cte-Syn1 D3 and forms the stem-loop structure of Cte-Syn1 D3 (as depicted in FIGs. 2B, 2C and 2G and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 70% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 80% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 85% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 90% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 95% identical to Cte-Syn1 D3 (SEQ ID NO: 196). In some embodiments, the D3-like sequence is at least 98% identical to Cte-Syn1 D3 (SEQ ID NO: 196). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence and a Cte-Syn1 D2/D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence and a Cte-Syn1 D2/D3-like sequence, from 5’ to 3’.
Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, and a Cte-Syn1 D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, and a Cte-Syn1 D3-like sequence, from 5’ to 3’.
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from Cte-Syn1 D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
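As an illustration of the complementarity percentages recited above, the following Python sketch counts paired positions between two stem strands. It assumes that both strands are written 5’ to 3’, are of equal length, pair in antiparallel register without bulges, and that only Watson-Crick pairs count as paired; G-U wobble pairs are therefore scored as unpaired. These are simplifying assumptions for illustration, not a definition from this disclosure, and the sequences shown are hypothetical.

```python
# Illustrative only: percent of positions that pair between two stem strands.
# Assumption: Watson-Crick pairs only; wobble (G-U) pairs are not counted.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_paired(stem_5p: str, stem_3p: str) -> float:
    """Return the percentage of positions forming Watson-Crick pairs.

    Both strands are given 5' to 3'. The second strand is reversed so that
    position i of the first strand faces its antiparallel partner.
    """
    first = stem_5p.upper()
    second_reversed = stem_3p.upper()[::-1]
    if not first or len(first) != len(second_reversed):
        raise ValueError("stems must be non-empty and of equal length")
    paired = sum(
        1 for x, y in zip(first, second_reversed) if (x, y) in WATSON_CRICK
    )
    return 100.0 * paired / len(first)


if __name__ == "__main__":
    # Hypothetical 10-nt stems: fully paired, then with one G-U position.
    print(percent_paired("GGCAUCGACU", "AGUCGAUGCC"))  # 100.0
    print(percent_paired("GGCAUCGACU", "GGUCGAUGCC"))  # 90.0
```

A structure-prediction tool would be the more usual way to assess whether a designed pair of stems actually forms; the sketch only makes the percentage arithmetic concrete.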
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence and a Cte-Syn1 D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence, a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence and a Cte-Syn1 D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2/D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence, a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2/D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence and a Cte-Syn1 D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, a Cte-Syn1 D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Cte-Syn1 D4 stem-like sequence, a Cte-Syn1 D5-like sequence and a Cte-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Cte-Syn1 D1-like sequence, a Cte-Syn1 D2-like sequence, a Cte-Syn1 D3-like sequence, and a 5’ Cte-Syn1 D4 stem-like sequence, from 5’ to 3’.
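The element orderings enumerated above follow one pattern, which the following Python sketch summarizes: the 3’ intron fragment comes first, then the permuted target sequence, then the 5’ intron fragment, with the optional D4 stem-like, D6-like, D2-like and D3-like sequences inserted at fixed positions. The function name, its parameters and the label strings are hypothetical conveniences for illustration; the sketch lists element labels only and does not assemble nucleotide sequences.

```python
# Illustrative only: 5'-to-3' order of elements in the permuted constructs.
# All names here are hypothetical; labels mirror the wording of the text.
from typing import List


def element_order(
    source: str,
    d2: bool = False,
    d3: bool = False,
    d4_stems: bool = False,
    d6: bool = False,
) -> List[str]:
    """Return element labels from 5' to 3' for the selected optional parts."""
    # (1) 3' intron fragment: [3' D4 stem-like] D5-like [D6-like]
    three_prime_intron = []
    if d4_stems:
        three_prime_intron.append(f"3' {source} D4 stem-like")
    three_prime_intron.append(f"{source} D5-like")
    if d6:
        three_prime_intron.append(f"{source} D6-like")

    # (2) target sequence: 3' fragment precedes 5' fragment
    target = ["3' target sequence fragment", "5' target sequence fragment"]

    # (3) 5' intron fragment: D1-like [D2-like] [D3-like] [5' D4 stem-like]
    five_prime_intron = [f"{source} D1-like"]
    if d2:
        five_prime_intron.append(f"{source} D2-like")
    if d3:
        five_prime_intron.append(f"{source} D3-like")
    if d4_stems:
        five_prime_intron.append(f"5' {source} D4 stem-like")

    return three_prime_intron + target + five_prime_intron


if __name__ == "__main__":
    for label in element_order("Cte-Syn1", d2=True, d3=True,
                               d4_stems=True, d6=True):
        print(label)
```

Because the source name is a parameter, the same listing applies unchanged to the other intron sources described in this section.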
cRNAzymes derived from Oi
In some embodiments, the 5’ intron fragment of the RNAs (or RNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from Oi D1. In some embodiments, the D1-like sequence is at least 60% identical to Oi D1 (SEQ ID NO: 159) and comprises the following sequence elements of Oi D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIG. 2E and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 70% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 80% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 85% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 90% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 95% identical to Oi D1 (SEQ ID NO: 159). In some embodiments, the D1-like sequence is at least 98% identical to Oi D1 (SEQ ID NO: 159). A person of ordinary skill in the art would understand that Oi can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from Oi. In some embodiments, the D5-like sequence is at least 60% identical to Oi D5 and comprises the following sequence elements of Oi D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIG. 2E and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 70% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 80% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 85% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 90% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 95% identical to Oi D5 (SEQ ID NO: 165 or 168). In some embodiments, the D5-like sequence is at least 98% identical to Oi D5 (SEQ ID NO: 165 or 168). A person of ordinary skill in the art would understand that Oi can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to Oi D6 and comprises the following sequence element of Oi D6: the bulged adenosine (as depicted in FIG. 2E and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Oi D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 70% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 80% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 85% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 90% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 95% identical to Oi D6 (SEQ ID NO: 166). In some embodiments, the D6-like sequence is at least 98% identical to Oi D6 (SEQ ID NO: 166). A person of ordinary skill in the art would understand that Oi can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to Oi D2 and forms the stem-loop structure of Oi D2 (as depicted in FIG. 2E and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 70% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 80% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 85% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 90% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 95% identical to Oi D2 (SEQ ID NO: 162). In some embodiments, the D2-like sequence is at least 98% identical to Oi D2 (SEQ ID NO: 162). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to Oi D3 and forms the stem-loop structure of Oi D3 (as depicted in FIG. 2E and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 70% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 80% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 85% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 90% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 95% identical to Oi D3 (SEQ ID NO: 163). In some embodiments, the D3-like sequence is at least 98% identical to Oi D3 (SEQ ID NO: 163). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence and an Oi D2/D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence and an Oi D2/D3-like sequence, from 5’ to 3’.
Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2-like sequence, and an Oi D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2-like sequence, and an Oi D3-like sequence, from 5’ to 3’.
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from Oi D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence and an Oi D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence and a 5’ Oi D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence, an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence and a 5’ Oi D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence and an Oi D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2/D3-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence, an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2/D3-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence and an Oi D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2-like sequence, an Oi D3-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Oi D4 stem-like sequence, an Oi D5-like sequence and an Oi D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising an Oi D1-like sequence, an Oi D2-like sequence, an Oi D3-like sequence, and a 5’ Oi D4 stem-like sequence, from 5’ to 3’.
cRNAzymes derived from Pli
In some embodiments, the 5’ intron fragment of the RNAs (or RNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from Pli D1. In some embodiments, the D1-like sequence is at least 60% identical to Pli D1 (SEQ ID NO: 170) and comprises the following sequence elements of Pli D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2B, 2C and 2J and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 70% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 80% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 85% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 90% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 95% identical to Pli D1 (SEQ ID NO: 170). In some embodiments, the D1-like sequence is at least 98% identical to Pli D1 (SEQ ID NO: 170). A person of ordinary skill in the art would understand that Pli can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or RNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from Pli. In some embodiments, the D5-like sequence is at least 60% identical to Pli D5 and comprises the following sequence elements of Pli D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2B, 2C and 2J and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 70% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 80% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 85% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 90% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 95% identical to Pli D5 (SEQ ID NO: 179). In some embodiments, the D5-like sequence is at least 98% identical to Pli D5 (SEQ ID NO: 179). A person of ordinary skill in the art would understand that Pli can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Pli D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to Pli D6 and comprises the following sequence element of Pli D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2J and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Pli D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 70% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 80% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 85% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 90% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 95% identical to Pli D6 (SEQ ID NO: 180). In some embodiments, the D6-like sequence is at least 98% identical to Pli D6 (SEQ ID NO: 180). A person of ordinary skill in the art would understand that Pli can tolerate sequence modification that does not affect the above-mentioned sequence elements in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to Pli D2 and forms the stem-loop structure of Pli D2 (as depicted in FIGs. 2B, 2C and 2J and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 70% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 80% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 85% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 90% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 95% identical to Pli D2 (SEQ ID NO: 176). In some embodiments, the D2-like sequence is at least 98% identical to Pli D2 (SEQ ID NO: 176). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to Pli D3 and forms the stem-loop structure of Pli D3 (as depicted in FIGs. 2B, 2C and 2J and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 70% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 80% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 85% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 90% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 95% identical to Pli D3 (SEQ ID NO: 177). In some embodiments, the D3-like sequence is at least 98% identical to Pli D3 (SEQ ID NO: 177). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
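By way of non-limiting illustration, the percent identity thresholds recited above can be checked computationally. The following Python sketch assumes one common convention, namely identity computed as the number of matching positions divided by the number of columns in a global (Needleman-Wunsch) alignment. The function names, scoring parameters, and example sequences are hypothetical and are not taken from this disclosure; any definition of percent identity given elsewhere herein controls.

```python
def percent_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a global alignment of a and b.

    Assumption: identity = identical columns / total alignment columns.
    Scoring parameters are illustrative defaults, not values from the disclosure.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back through the matrix, counting identical columns and all columns
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * matches / columns


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical 10-nt sequences differing at one position (not SEQ ID NO: 176 or 177)
reference = "ACGUACGUAC"
candidate = "ACGUACGUAG"
print(percent_identity(candidate, reference))
print(meets_threshold(candidate, reference, 95.0))
```

A sequence that satisfies an identity threshold in such a check would still need to be assessed for the recited stem-loop structure and for activity, which this sketch does not address.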
As such, provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence and a Pli D2/D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence and a Pli D2/D3-like sequence, from 5’ to 3’ .
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, and a Pli D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, and a Pli D3-like sequence, from 5’ to 3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from Pli D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
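By way of non-limiting illustration, the degree to which a 5’ D4 stem-like sequence and a 3’ D4 stem-like sequence are complementarily paired can be estimated as follows. This Python sketch rests on assumptions that are not stated in this passage: the two strands are paired antiparallel without gaps, the percentage is the number of paired positions divided by the length of the longer strand, and G-U wobble pairs are counted only when explicitly enabled. The function name and the example stems are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def percent_paired(five_stem: str, three_stem: str, allow_wobble: bool = False) -> float:
    """Percent of positions that base-pair when the two stems are aligned antiparallel.

    Assumption: ungapped pairing, normalized by the longer strand.
    """
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    five = five_stem.upper()
    three_reversed = three_stem.upper()[::-1]  # read the partner strand 3'->5'
    longest = max(len(five), len(three_reversed))
    if longest == 0:
        raise ValueError("stem sequences must be non-empty")
    paired = sum((x, y) in allowed for x, y in zip(five, three_reversed))
    return 100.0 * paired / longest


# Hypothetical 6-nt stems, both written 5'->3'
print(percent_paired("GGCAUC", "GAUGCC"))                     # fully Watson-Crick paired
print(percent_paired("GGCAUC", "GAUGCU"))                     # one G-U position unpaired
print(percent_paired("GGCAUC", "GAUGCU", allow_wobble=True))  # G-U counted as paired
```

Whether wobble pairs count toward the recited percentages is a design choice left open here; a secondary-structure prediction tool would give a more faithful picture than this position-by-position comparison.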
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence and a Pli D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence and a 5’ Pli D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence, a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, and a 5’ Pli D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence and a Pli D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2/Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence, a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2/Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence and a Pli D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, a Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli D4 stem-like sequence, a Pli D5-like sequence and a Pli D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli D1-like sequence, a Pli D2-like sequence, a Pli D3-like sequence, and a 5’ Pli D4 stem-like sequence, from 5’ to 3’ .
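By way of non-limiting illustration, the 5’ to 3’ ordering of elements recited in the foregoing embodiments can be summarized programmatically. The following Python sketch simply lists the element labels in order for a chosen combination of optional domains; the function name, flag names, and labels are hypothetical conveniences, and the sketch treats the D4 stem-like sequences as an inseparable pair, consistent with the paragraphs above.

```python
from typing import List


def element_order(d6: bool = False, d2: bool = False, d3: bool = False,
                  d4_stems: bool = False) -> List[str]:
    """Return element labels from 5' to 3' for one construct variant.

    Layout: 3' intron fragment, then target sequence, then 5' intron fragment.
    """
    three_intron = []
    if d4_stems:
        three_intron.append("3' D4 stem-like")  # at the 5' end of the 3' intron fragment
    three_intron.append("D5-like")
    if d6:
        three_intron.append("D6-like")          # at the 3' end of the D5-like sequence

    target = ["3' target fragment", "5' target fragment"]

    five_intron = ["D1-like"]
    if d2:
        five_intron.append("D2-like")
    if d3:
        five_intron.append("D3-like")
    if d4_stems:
        five_intron.append("5' D4 stem-like")   # at the 3' end of the 5' intron fragment

    return three_intron + target + five_intron


# Minimal variant and the variant with every optional element present
print(" | ".join(element_order()))
print(" | ".join(element_order(d6=True, d2=True, d3=True, d4_stems=True)))
```

Setting only one of `d2` or `d3` corresponds to the embodiments reciting a D2/D3-like sequence that is either a D2-like sequence or a D3-like sequence.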
cRNAzymes derived from Pli-Syn1
In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from Pli-Syn1 D1. In some embodiments, the D1-like sequence is at least 60% identical to Pli-Syn1 D1 (SEQ ID NO: 199) and comprises the following sequence elements of Pli-Syn1 D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2B, 2C and 2K and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 70% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 80% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 85% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 90% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 95% identical to Pli-Syn1 D1 (SEQ ID NO: 199). In some embodiments, the D1-like sequence is at least 98% identical to Pli-Syn1 D1 (SEQ ID NO: 199). A person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from Pli-Syn1. In some embodiments, the D5-like sequence is at least 60% identical to Pli-Syn1 D5 and comprises the following sequence elements of Pli-Syn1 D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2B, 2C and 2K and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 70% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 80% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 85% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 90% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 95% identical to Pli-Syn1 D5 (SEQ ID NO: 203). In some embodiments, the D5-like sequence is at least 98% identical to Pli-Syn1 D5 (SEQ ID NO: 203). A person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to Pli-Syn1 D6 and comprises the following sequence element of Pli-Syn1 D6: the bulged adenosine (as depicted in FIGs. 2B, 2C and 2K and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring Pli-Syn1 D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 70% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 80% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 85% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 90% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 95% identical to Pli-Syn1 D6 (SEQ ID NO: 204). In some embodiments, the D6-like sequence is at least 98% identical to Pli-Syn1 D6 (SEQ ID NO: 204). A person of ordinary skill in the art would understand that Pli-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence element in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to Pli-Syn1 D2 and forms the stem-loop structure of Pli-Syn1 D2 (as depicted in FIGs. 2B, 2C and 2K and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 70% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 80% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 85% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 90% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 95% identical to Pli-Syn1 D2 (SEQ ID NO: 200). In some embodiments, the D2-like sequence is at least 98% identical to Pli-Syn1 D2 (SEQ ID NO: 200). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to Pli-Syn1 D3 and forms the stem-loop structure of Pli-Syn1 D3 (as depicted in FIGs. 2B, 2C and 2K and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 70% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 80% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 85% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 90% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 95% identical to Pli-Syn1 D3 (SEQ ID NO: 201). In some embodiments, the D3-like sequence is at least 98% identical to Pli-Syn1 D3 (SEQ ID NO: 201). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a Pli-Syn1 D2/D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a Pli-Syn1 D2/D3-like sequence, from 5’ to 3’.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, and a Pli-Syn1 D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, and a Pli-Syn1 D3-like sequence, from 5’ to 3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from Pli-Syn1 D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence and a Pli-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence and a 5’ Pli-Syn1 D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence, a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence and a Pli-Syn1 D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2/Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence, a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2/Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence and a Pli-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, a Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ Pli-Syn1 D4 stem-like sequence, a Pli-Syn1 D5-like sequence and a Pli-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Pli-Syn1 D1-like sequence, a Pli-Syn1 D2-like sequence, a Pli-Syn1 D3-like sequence, and a 5’ Pli-Syn1 D4 stem-like sequence, from 5’ to 3’ .
cRNAzymes derived from LtrB
In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from LtrB D1. In some embodiments, the D1-like sequence is at least 60% identical to LtrB D1 (SEQ ID NO: 184) and comprises the following sequence elements of LtrB D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2D and 2H and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 70% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 80% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 85% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 90% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 95% identical to LtrB D1 (SEQ ID NO: 184). In some embodiments, the D1-like sequence is at least 98% identical to LtrB D1 (SEQ ID NO: 184). A person of ordinary skill in the art would understand that LtrB can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from LtrB. In some embodiments, the D5-like sequence is at least 60% identical to LtrB D5 and comprises the following sequence elements of LtrB D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2D and 2H and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 70% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 80% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 85% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 90% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 95% identical to LtrB D5 (SEQ ID NO: 191). In some embodiments, the D5-like sequence is at least 98% identical to LtrB D5 (SEQ ID NO: 191). A person of ordinary skill in the art would understand that LtrB can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to LtrB D6 and comprises the following sequence element of LtrB D6: the bulged adenosine (as depicted in FIGs. 2D and 2H and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring LtrB D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 70% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 80% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 85% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 90% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 95% identical to LtrB D6 (SEQ ID NO: 192). In some embodiments, the D6-like sequence is at least 98% identical to LtrB D6 (SEQ ID NO: 192). A person of ordinary skill in the art would understand that LtrB can tolerate sequence modification that does not affect the above-mentioned sequence element in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to LtrB D2 and forms the stem-loop structure of LtrB D2 (as depicted in FIGs. 2D and 2H and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 70% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 80% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 85% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 90% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 95% identical to LtrB D2 (SEQ ID NO: 188). In some embodiments, the D2-like sequence is at least 98% identical to LtrB D2 (SEQ ID NO: 188). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to LtrB D3 and forms the stem-loop structure of LtrB D3 (as depicted in FIGs. 2D and 2H and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 70% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 80% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 85% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 90% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 95% identical to LtrB D3 (SEQ ID NO: 189). In some embodiments, the D3-like sequence is at least 98% identical to LtrB D3 (SEQ ID NO: 189). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence and a LtrB D2/D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence and a LtrB D2/D3-like sequence, from 5’ to 3’ .
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2-like sequence, and a LtrB D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2-like sequence, and a LtrB D3-like sequence, from 5’ to 3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from LtrB D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence and a LtrB D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence and a 5’ LtrB D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence, a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence and a LtrB D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2/LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence, a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2/LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence and a LtrB D5-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2-like sequence, a LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a 3’ LtrB D4 stem-like sequence, a LtrB D5-like sequence and a LtrB D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a LtrB D1-like sequence, a LtrB D2-like sequence, a LtrB D3-like sequence, and a 5’ LtrB D4 stem-like sequence, from 5’ to 3’.
cRNAzymes derived from LtrB-Syn1
In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from LtrB-Syn1 D1. In some embodiments, the D1-like sequence is at least 60% identical to LtrB-Syn1 D1 (SEQ ID NO: 205) and comprises the following sequence elements of LtrB-Syn1 D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2D and 2I and Table 18). In some embodiments, the D1-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 70% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 80% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 85% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 90% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 95% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). In some embodiments, the D1-like sequence is at least 98% identical to LtrB-Syn1 D1 (SEQ ID NO: 205). A person of ordinary skill in the art would understand that LtrB-Syn1 can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
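As an illustration only, the percent-identity thresholds recited above could be screened computationally before a variant is tested in an assay. The following is a minimal sketch, assuming that percent identity is taken as the number of identical aligned positions divided by the length of a global (Needleman-Wunsch) alignment; this definition, the scoring values, the function name `global_identity`, and the toy sequences are assumptions made for the example and are not taken from the disclosure. It does not check for the presence of the recited sequence elements (λ, α, EBS1, etc.), which would require positional annotation of the reference.

```python
def global_identity(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Percent identity over a Needleman-Wunsch global alignment.

    Assumed definition: identical aligned positions / alignment columns * 100.
    Scoring parameters are illustrative, not prescribed by the disclosure.
    """
    a, b = seq_a.upper(), seq_b.upper()
    n, m = len(a), len(b)

    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back one optimal alignment, counting matches and columns.
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * matches / columns if columns else 0.0


def meets_threshold(candidate, reference, threshold_percent):
    """True if the candidate is at least threshold_percent identical."""
    return global_identity(candidate, reference) >= threshold_percent


# Toy sequences (hypothetical; not SEQ ID NO: 205).
reference = "ACGU"
variant = "ACGA"
print(global_identity(variant, reference))      # one mismatch in four columns
print(meets_threshold(variant, reference, 70))
```

In practice an established alignment tool would normally be used; the sketch only makes the arithmetic behind "at least N% identical" explicit.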
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from LtrB-Syn1. In some embodiments, the D5-like sequence is at least 60% identical to LtrB-Syn1 D5 and comprises the following sequence elements of LtrB-Syn1 D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2D and 2I and Table 18). In some embodiments, the D5-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 70% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 80% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 85% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 90% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 95% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). In some embodiments, the D5-like sequence is at least 98% identical to LtrB-Syn1 D5 (SEQ ID NO: 209). A person of ordinary skill in the art would understand that LtrB-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to LtrB-Syn1 D6 and comprises the following sequence element of LtrB-Syn1 D6: the bulged adenosine (as depicted in FIGs. 2D and 2I and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring LtrB-Syn1 D6. In some embodiments, the D6-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 70% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 80% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 85% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 90% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 95% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). In some embodiments, the D6-like sequence is at least 98% identical to LtrB-Syn1 D6 (SEQ ID NO: 210). A person of ordinary skill in the art would understand that LtrB-Syn1 can tolerate sequence modification that does not affect the above-mentioned sequence element in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to LtrB-Syn1 D2 and forms the stem-loop structure of LtrB-Syn1 D2 (as depicted in FIGs. 2D and 2I and Table 18). In some embodiments, the D2-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 70% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 80% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 85% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 90% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 95% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). In some embodiments, the D2-like sequence is at least 98% identical to LtrB-Syn1 D2 (SEQ ID NO: 206). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, the D3-like sequence is at least 60% identical to LtrB-Syn1 D3 and forms the stem-loop structure of LtrB-Syn1 D3 (as depicted in FIG. 2D and Table 18). In some embodiments, the D3-like sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 70% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 80% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 85% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 90% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 95% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). In some embodiments, the D3-like sequence is at least 98% identical to LtrB-Syn1 D3 (SEQ ID NO: 207). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence and a LtrB-Syn1 D2/D3-like sequence, from 5’ to 3’. Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence and a LtrB-Syn1 D2/D3-like sequence, from 5’ to 3’.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, and a LtrB-Syn1 D3-like sequence, from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, and a LtrB-Syn1 D3-like sequence, from 5’ to 3’ .
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from LtrB-Syn1 D4. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
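By way of illustration, the degree to which a pair of D4 stem-like sequences is complementarily paired could be estimated as follows. This is a minimal sketch under stated assumptions: the two stem sequences are of equal length, they pair antiparallel without bulges or internal loops, and the percentage is the fraction of positions forming Watson-Crick pairs (optionally counting G-U wobble pairs). The function name `percent_paired`, the wobble option, and the toy sequences are hypothetical and are not taken from the disclosure.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def percent_paired(five_prime_stem, three_prime_stem, allow_wobble=False):
    """Percentage of positions that pair when the two stems anneal antiparallel.

    Both sequences are written 5' to 3'. Assumes equal length and no bulges,
    which is a simplification made for this example.
    """
    a = five_prime_stem.upper().replace("T", "U")
    # Reverse the partner strand so position k of one faces position k of the other.
    b = three_prime_stem.upper().replace("T", "U")[::-1]
    if not a or len(a) != len(b):
        raise ValueError("stems must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    paired = sum((x, y) in allowed for x, y in zip(a, b))
    return 100.0 * paired / len(a)


# Toy stems (hypothetical; not sequences from the disclosure).
print(percent_paired("GGCAUC", "GAUGCC"))  # fully complementary pair
print(percent_paired("GGCAUC", "GAUGCA"))  # one unpaired position out of six
```

A structure-prediction tool would be needed to account for bulges, loops, or stems of unequal length; the sketch covers only the simplest case.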
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence and a LtrB-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence and a 5’ LtrB-Syn1 D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence, a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence and a LtrB-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2/LtrB-Syn1 D3-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence, a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2/LtrB-Syn1 D3-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence, from 5’ to 3’ .
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence and a LtrB-Syn1 D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, a LtrB-Syn1 D3-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence from 5’ to 3’ . Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ LtrB-Syn1 D4 stem-like sequence, a LtrB-Syn1 D5-like sequence and a LtrB-Syn1 D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a LtrB-Syn1 D1-like sequence, a LtrB-Syn1 D2-like sequence, a LtrB-Syn1 D3-like sequence, and a 5’ LtrB-Syn1 D4 stem-like sequence, from 5’ to 3’ .
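The element orders recited in the embodiments above follow one pattern with optional members. The following sketch enumerates that 5’ to 3’ order for a chosen combination of optional elements; the function name `construct_layout`, its flags, and the element labels are illustrative conveniences and are not terms defined in the disclosure.

```python
def construct_layout(d6=False, d2=False, d3=False, d4_stems=False):
    """Return the 5'-to-3' order of elements for one embodiment.

    The 3' intron fragment comes first, then the permuted target sequence,
    then the 5' intron fragment, as recited in the embodiments above.
    """
    three_prime_intron = []
    if d4_stems:
        three_prime_intron.append("3' D4 stem-like")
    three_prime_intron.append("D5-like")
    if d6:
        three_prime_intron.append("D6-like")

    target = ["3' target fragment", "5' target fragment"]

    five_prime_intron = ["D1-like"]
    if d2:
        five_prime_intron.append("D2-like")
    if d3:
        five_prime_intron.append("D3-like")
    if d4_stems:
        five_prime_intron.append("5' D4 stem-like")

    return three_prime_intron + target + five_prime_intron


# Minimal embodiment: D5-like, target sequence, D1-like.
print(" | ".join(construct_layout()))
# Embodiment with every optional element present.
print(" | ".join(construct_layout(d6=True, d2=True, d3=True, d4_stems=True)))
```

Setting only one of `d2` or `d3` corresponds to the embodiments that recite a single D2/D3-like sequence.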
cRNAzymes derived from additional group II introns
Lists of naturally occurring group II introns expressly contemplated herein, and their GenBank ID numbers, are provided in Tables 25-33. A person of ordinary skill in the art guided by the teachings of the instant disclosure would be able to design additional cRNAzymes containing sequence elements from one or more of these naturally occurring group II introns. In some embodiments, the 5’ intron fragment of the RNAs (or cRNAzymes) comprises a D1-like sequence, wherein the D1-like sequence is derived from a group II intron listed in Tables 25-33. In some embodiments, the D1-like sequence is at least 60% identical to the D1 of the group II intron and comprises the following sequence elements of the D1: λ, α, ε’, ζ, κ, δ’, B’, δ, EBS1, Stem 2, α’ and EBS3 (as depicted in FIGs. 2A-2E and Table 18). A person of ordinary skill in the art would understand that this group II intron can tolerate sequence modification that does not affect these sequence elements in D1, and can predict and confirm the activity of a variant using assays disclosed herein or otherwise known in the art.
In some embodiments, the 3’ intron fragment of the RNAs (or cRNAzymes) comprises a D5-like sequence, wherein the D5-like sequence is derived from a group II intron listed in Tables 25-33. In some embodiments, the D5-like sequence is at least 60% identical to the D5 of the group II intron and comprises the following sequence elements of the D5: the catalytic triad, ζ’, λ’, and κ’ (as depicted in FIGs. 2A-2E and Table 18). A person of ordinary skill in the art would understand that the group II intron can tolerate sequence modification that does not affect the above-mentioned sequence elements in D5, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
Accordingly, in some embodiments, the non-naturally occurring RNAs (or cRNAzymes) provided herein comprise the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a D1-like sequence; wherein the D1-like sequence and the D5-like sequence are both derived from a group II intron listed in Tables 25-33.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine. In some embodiments, the D6-like sequence is at least 60% identical to the D6 of a group II intron listed in Tables 25-33 and comprises the following sequence element of the D6: the bulged adenosine (as depicted in FIGs. 2A-2E and Table 18). In some embodiments, the D6-like sequence lacks the GNRA tetraloop in the naturally occurring D6. A person of ordinary skill in the art would understand that the group II intron can tolerate sequence modification that does not affect the above-mentioned sequence element in D6, and can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNAs (cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, wherein the D1-like sequence, the D5-like sequence and the D6-like sequence are all derived from a group II intron listed in Tables 25-33.
In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises a D2/D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2/D3-like sequence is a D2-like sequence. In some embodiments, the D2/D3-like sequence is a D3-like sequence. In some embodiments of the RNAs (or cRNAzymes) provided herein, the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence. In some embodiments, the D2-like sequence is at least 60% identical to the D2 of a group II intron listed in Tables 25-33 and forms the stem-loop structure of the D2 (as depicted in FIGs. 2A-2E and Table 18). In some embodiments, the D3-like sequence is at least 60% identical to the D3 of a group II intron listed in Tables 25-33 and forms the stem-loop structure of the D3 (as depicted in FIGs. 2A-2E and Table 18). A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
As such, provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a D2/D3-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D2/D3-like sequence, and the D5-like sequence are all derived from a group II intron listed in Tables 25-33. Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a D2/D3-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D2/D3-like sequence, the D5-like sequence, and the D6-like sequence are all derived from a group II intron listed in Tables 25-33.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, and a D3-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D2-like sequence, the D3-like sequence, and the D5-like sequence are all derived from a group II intron listed in Tables 25-33.
Also provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment comprising a D5-like sequence and a D6-like sequence, from 5’ to 3’; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, and a D3-like sequence, from 5’ to 3’, wherein the D1-like sequence, the D2-like sequence, the D3-like sequence, the D5-like sequence, and the D6-like sequence are all derived from a group II intron listed in Tables 25-33.
In some embodiments, the RNAs (or cRNAzymes) provided herein further comprise a pair of D4 stem-like sequences consisting of a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment. In some embodiments, both the 5’ and 3’ D4 stem-like sequences are derived from the D4 of a group II intron listed in Tables 25-33. In some embodiments, the 5’ and 3’ D4 stem-like sequences are at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% complementarily paired. A person of ordinary skill in the art can predict and confirm the activity of a variant using methods disclosed herein or otherwise known in the art.
In some embodiments, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence and a 5’ D4 stem-like sequence from 5’ to 3’ , wherein the D1-like sequence, the D5-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, and a 5’ D4 stem-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D5-like sequence, the D6-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2/D3-like sequence, and a 5’ D4 stem-like sequence from 5’ to 3’ , wherein the D1-like sequence, the D2/D3-like sequence, the D5-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2/D3-like sequence, and a 5’ D4 stem-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D2/D3-like sequence, the D5-like sequence, the D6-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence and a D5-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, a D3-like sequence, and a 5’ D4 stem-like sequence from 5’ to 3’ , wherein the D1-like sequence, the D2-like sequence, the D3-like sequence, the D5-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
Also provided herein are non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a 3’ D4 stem-like sequence, a D5-like sequence and a D6-like sequence, from 5’ to 3’ ; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a D1-like sequence, a D2-like sequence, a D3-like sequence, and a 5’ D4 stem-like sequence, from 5’ to 3’ , wherein the D1-like sequence, the D2-like sequence, the D3-like sequence, the D5-like sequence, the D6-like sequence, and the 5’ and 3’ D4 stem-like sequences are all derived from a group II intron listed in Tables 25-33.
Exemplary 3’ intron fragment:
In some embodiments, the 3’ intron fragment of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 42. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 43. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 44. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 45. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 46. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 47. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 48. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 49. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 50. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 51. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 52. In some embodiments, the 3’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 228.
In some embodiments, the 3’ intron fragment consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 42. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 43. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 44. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 45. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 46. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 47. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 48. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 49. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 50. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 51. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 52. In some embodiments, the 3’ intron fragment has the nucleotide sequence of SEQ ID NO: 228.
Exemplary 5’ intron fragment:
In some embodiments, the 5’ intron fragment of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 75. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 76. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 77. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 78. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 79. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 80. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 81. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 82. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 83. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 84. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 85. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 86. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 87. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 88. In some embodiments, the 5’ intron fragment comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 229.
In some embodiments, the 5’ intron fragment consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229. In some embodiments, the 5’ intron fragment consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 75. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 76. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 77. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 78. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 79. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 80. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 81. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 82. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 83. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 84. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 85. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 86. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 87. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 88. In some embodiments, the 5’ intron fragment has the nucleotide sequence of SEQ ID NO: 229.
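The percent-identity thresholds recited above (at least 95%, 98%, 99%, or 100%) can be illustrated with a minimal sketch. It assumes an ungapped, position-by-position comparison of two equal-length sequences; in practice, identity is usually determined with a sequence alignment tool that handles gaps and length differences. The function names and the example sequences are hypothetical placeholders and do not correspond to any SEQ ID NO.

```python
def percent_identity(query: str, reference: str) -> float:
    """Ungapped percent identity between two equal-length RNA sequences.

    Assumption: no alignment step is performed; T is treated as U so that
    DNA-style and RNA-style spellings compare equal.
    """
    query = query.upper().replace("T", "U")
    reference = reference.upper().replace("T", "U")
    if not reference:
        raise ValueError("sequences must be non-empty")
    if len(query) != len(reference):
        raise ValueError("this sketch compares equal-length sequences only")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(query: str, reference: str, threshold: float = 95.0) -> bool:
    """True if the query is at least `threshold` percent identical to the reference."""
    return percent_identity(query, reference) >= threshold


# Hypothetical 20-nt placeholder sequences (not taken from the sequence listing).
reference_fragment = "ACGUACGUACGUACGUACGU"
variant_fragment = "ACGUACGUACGUACGUACGA"  # differs at the last position
identity = percent_identity(variant_fragment, reference_fragment)
```

A single mismatch in a 20-nucleotide comparison gives 19 matching positions out of 20, which satisfies the 95% threshold but not the 98% threshold.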
Exemplary E2 and E1:
In some embodiments, the E2 of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63. In some embodiments, the E2 of the precursor RNA provided herein consists essentially of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63. In some embodiments, the E2 consists of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 53. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 54. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 55. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 56. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 57. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 58. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 59. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 60. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 61. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 62. In some embodiments, the E2 has the nucleotide sequence of SEQ ID NO: 63.
In some embodiments, the E1 of the RNAs (cRNAzymes) provided herein comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74. In some embodiments, the E1 consists essentially of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74. In some embodiments, the E1 consists of a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 64. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 65. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 66. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 67. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 68. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 69. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 70. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 71. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 72. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 73. In some embodiments, the E1 has the nucleotide sequence of SEQ ID NO: 74.
In some embodiments, IBS3 (or IBS3’ ) , the region of a corresponding length of EBS3 or EBS3’ , which either flanks a target sequence (IBS3) or is within the target sequence (IBS3’ ) , optionally with its downstream sequence, is selected from the group consisting of: (a) SEQ ID NO: 131, (b) SEQ ID NO: 132, (c) SEQ ID NO: 133, and (d) SEQ ID NO: 134. In some embodiments, the δ nucleotide and the δ upstream (or the δ” nucleotide and the δ” upstream) comprise a nucleotide sequence selected from the group consisting of: (a) SEQ ID NO: 127, (b) SEQ ID NO: 128, (c) SEQ ID NO: 129, and (d) SEQ ID NO: 130.
Homology arms: In some embodiments, the RNAs provided herein further comprise two homology arms, including a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. The homology arms can help shorten the spatial distance between the 5’ intron fragment and the 3’ intron fragment, thereby facilitating the self-splicing (circularization) reaction. In some embodiments, the presence of the homology arms can enhance the self-splicing efficiency of the RNAs provided herein.
Accordingly, provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 5’ homology arm, (2) a 3’ intron fragment; (3) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; (4) a 5’ intron fragment; and (5) a 3’ homology arm; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment (FIG. 9) .
In some embodiments, the two homology arms can be 100% complementary to each other. In some embodiments, the two homology arms can have up to 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, or 15% base mismatches. In some embodiments, the two homology arms are at least 85% complementary. In some embodiments, the two homology arms are 90% complementary. In some embodiments, the two homology arms are 95% complementary. In some embodiments, the two homology arms are 98% complementary. In some embodiments, the two homology arms are 99% complementary.
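The complementarity percentages above can be illustrated with a minimal sketch. It assumes that the two arms are the same length, that they pair in antiparallel orientation, and that only Watson-Crick pairs (A-U, G-C) count as matches; G-U wobble pairs are counted as mismatches here, which is a simplification. The function names and the example arm sequence are hypothetical and are not SEQ ID NO: 105 or 106.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(arm5: str, arm3: str) -> float:
    """Percent of positions at which the 5' arm pairs with the 3' arm.

    Assumptions: equal-length arms, antiparallel pairing, Watson-Crick
    pairs only. Real duplexes may tolerate wobble pairs and bulges.
    """
    if not arm5 or len(arm5) != len(arm3):
        raise ValueError("arms must be non-empty and of equal length")
    # A perfectly complementary 3' arm is the reverse complement of the 5' arm.
    ideal_5_arm = reverse_complement(arm3)
    matches = sum(a == b for a, b in zip(arm5.upper(), ideal_5_arm))
    return 100.0 * matches / len(arm5)


# Hypothetical 20-nt 5' homology arm and its perfect partner.
five_prime_arm = "GGCAUCGAUCGGAUCCGAUA"
three_prime_arm = reverse_complement(five_prime_arm)
# Introduce one mismatch at the first position of the 3' arm.
mismatched_arm = "A" + three_prime_arm[1:]
```

With these placeholder arms, the perfect partner is 100% complementary, and the single-mismatch partner has 19 paired positions out of 20, corresponding to 5% base mismatches.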
In some embodiments, the 5’ homology arm or 3’ homology arm is 15 to 60 nucleotides in length. In some embodiments, the 5’ homology arm or 3’ homology arm is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 nucleotides in length. In some embodiments, both the 5’ homology arm and the 3’ homology arm are 15 to 60 nucleotides in length. In some embodiments, both the 5’ homology arm and the 3’ homology arm are 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 nucleotides in length.
In some embodiments, the 5’ homology arm comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105.
In some embodiments, the 3’ homology arm comprises a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 106. In some embodiments, the 3’ homology arm consists essentially of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 106. In some embodiments, the 3’ homology arm consists of a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 106. In some embodiments, the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106.
Additional embodiments
As a person of ordinary skill in the art would understand in view of the teachings of the instant disclosure, while the RNAs (or cRNAzymes) exemplified above contain sequence elements derived from one naturally occurring group II intron, namely, Cte, Oi, Pli, LtrB, or Syn1, RNAs (or cRNAzymes) provided herein can also comprise domains derived from more than one group II intron. For example, in some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment comprising a Cte D5-like sequence; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment comprising a Syn1 D1-like sequence; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment. Expressly contemplated herein are RNAs (or cRNAzymes) comprising all combinations and permutations of the exemplified sequence elements disclosed herein.
Modified nucleotides/nucleosides
In some embodiments, the RNAs (or cRNAzymes) provided herein can comprise modified nucleotides/nucleosides. The modified RNA nucleotide and/or modified nucleoside can be introduced, for example, during in vitro transcription (IVT). As used herein, in vitro transcription, or “IVT,” refers to a versatile method to produce RNA in vitro that uses an RNA polymerase, ribonucleotides, and appropriate buffer conditions to synthesize RNA from a DNA template.
In some embodiments, the RNAs (or cRNAzymes) provided herein comprise 10% to 100% modified RNA nucleotide and/or modified nucleoside. In some embodiments, the RNAs (or cRNAzymes) provided herein comprise 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% modified RNA nucleotide and/or modified nucleoside.
In some embodiments, the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine). In some embodiments, at least one of the modified RNA nucleotide and/or modified nucleoside is m5U (5-methyluridine). In some embodiments, the modified RNA nucleotide and/or modified nucleoside is m6A (N6-methyladenosine). In some embodiments, the modified RNA nucleotide and/or modified nucleoside is Y (pseudouridine). In some embodiments, the modified RNA nucleotide and/or modified nucleoside is m1A (1-methyladenosine). In some embodiments, the modified nucleoside is selected from the group consisting of: m5C (5-methylcytidine), m5U (5-methyluridine), m6A (N6-methyladenosine), s2U (2-thiouridine), Y (pseudouridine), Um (2’-O-methyluridine), m1A (1-methyladenosine), m2A (2-methyladenosine), Am (2’-O-methyladenosine), ms2m6A (2-methylthio-N6-methyladenosine), i6A (N6-isopentenyladenosine), ms2i6A (2-methylthio-N6-isopentenyladenosine), io6A (N6-(cis-hydroxyisopentenyl)adenosine), ms2io6A (2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine), g6A (N6-glycinylcarbamoyladenosine), t6A (N6-threonylcarbamoyladenosine), ms2t6A (2-methylthio-N6-threonylcarbamoyladenosine), m6t6A (N6-methyl-N6-threonylcarbamoyladenosine), hn6A (N6-hydroxynorvalylcarbamoyladenosine), ms2hn6A (2-methylthio-N6-hydroxynorvalylcarbamoyladenosine), Ar(p) (2’-O-ribosyladenosine (phosphate)),
I (inosine), m1I (1-methylinosine), m1Im (1,2’-O-dimethylinosine), m3C (3-methylcytidine), Cm (2’-O-methylcytidine), s2C (2-thiocytidine), ac4C (N4-acetylcytidine), f5C (5-formylcytidine), m5Cm (5,2’-O-dimethylcytidine), ac4Cm (N4-acetyl-2’-O-methylcytidine), k2C (lysidine), m1G (1-methylguanosine), m2G (N2-methylguanosine), m7G (7-methylguanosine), Gm (2’-O-methylguanosine), m22G (N2,N2-dimethylguanosine), m2Gm (N2,2’-O-dimethylguanosine), m22Gm (N2,N2,2’-O-trimethylguanosine), Gr(p) (2’-O-ribosylguanosine (phosphate)), yW (wybutosine), o2yW (peroxywybutosine), OHyW (hydroxywybutosine), OHyW* (undermodified hydroxywybutosine), imG (wyosine), mimG (methylwyosine), Q (queuosine), oQ (epoxyqueuosine), galQ (galactosyl-queuosine), manQ (mannosyl-queuosine), preQ0 (7-cyano-7-deazaguanosine), preQ1 (7-aminomethyl-7-deazaguanosine), G+ (archaeosine), D (dihydrouridine), m5Um (5,2’-O-dimethyluridine), s4U (4-thiouridine), m5s2U (5-methyl-2-thiouridine), s2Um (2-thio-2’-O-methyluridine), acp3U (3-(3-amino-3-carboxypropyl)uridine), ho5U (5-hydroxyuridine), mo5U (5-methoxyuridine), cmo5U (uridine 5-oxyacetic acid), mcmo5U (uridine 5-oxyacetic acid methyl ester), chm5U (5-(carboxyhydroxymethyl)uridine), mchm5U (5-(carboxyhydroxymethyl)uridine methyl ester), mcm5U (5-methoxycarbonylmethyluridine), mcm5Um (5-methoxycarbonylmethyl-2’-O-methyluridine), mcm5s2U (5-methoxycarbonylmethyl-2-thiouridine), nm5s2U (5-aminomethyl-2-thiouridine), mnm5U (5-methylaminomethyluridine), mnm5s2U (5-methylaminomethyl-2-thiouridine), mnm5se2U (5-methylaminomethyl-2-selenouridine), ncm5U (5-carbamoylmethyluridine), ncm5Um (5-carbamoylmethyl-2’-O-methyluridine), cmnm5U (5-carboxymethylaminomethyluridine), cmnm5Um (5-carboxymethylaminomethyl-2’-O-methyluridine), cmnm5s2U (5-carboxymethylaminomethyl-2-thiouridine), m62A (N6,N6-dimethyladenosine), Im (2’-O-methylinosine), m4C (N4-methylcytidine), m4Cm (N4,2’-O-dimethylcytidine), hm5C (5-hydroxymethylcytidine), m3U (3-methyluridine), cm5U (5-carboxymethyluridine), m6Am (N6,2’-O-dimethyladenosine), m62Am (N6,N6,O-2’-trimethyladenosine), m2,7G (N2,7-dimethylguanosine), m2,2,7G (N2,N2,7-trimethylguanosine), m3Um (3,2’-O-dimethyluridine), m5D (5-methyldihydrouridine), f5Cm (5-formyl-2’-O-methylcytidine), m1Gm (1,2’-O-dimethylguanosine), m1Am (1,2’-O-dimethyladenosine), τm5U (5-taurinomethyluridine), τm5s2U (5-taurinomethyl-2-thiouridine), imG-14 (4-demethylwyosine), imG2 (isowyosine), ac6A (N6-acetyladenosine), pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine, dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-dihydropseudouridine, 2-methoxyuridine, 2-methoxy-4-thio-uridine, 4-methoxy-pseudouridine, 4-methoxy-2-thio-pseudouridine, 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, 4-methoxy-1-methyl-pseudoisocytidine, 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine, N6,N6-dimethyladenosine, 7-methyladenine, 2-methylthio-adenine, 2-methoxy-adenine, inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methylinosine, 6-methoxy-guanosine, 1-methylguanosine, N2-methylguanosine, N2,N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, N2,N2-dimethyl-6-thio-guanosine, 5-methylcytosine, pseudouridine, and 1-methylpseudouridine.
8.3 Target sequence
Provided herein are non-naturally occurring RNAs (or cRNAzymes) comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’ -end of the 5’ target sequence fragment linked to the 5’ -end of the 3’ target sequence fragment (FIG. 9) . As the RNAs provided herein have group II intron activity and form circRNAs upon self-splicing, they are cRNAzymes.
The target sequences can comprise fragments of any sequence desired to be prepared into a circRNA. As used herein, the term “resulting target sequence” refers to the target sequence as it is formed in the circRNA upon self-splicing of the RNAs (or cRNAzymes) provided herein.
In a resulting target sequence, the 3’ -end of the 5’ target sequence fragment is linked to the 5’ -end of the 3’ target sequence fragment (FIG. 9) .
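The relationship between the precursor's target sequence and the resulting target sequence can be sketched as a simple permutation. The example below assumes that sequences are plain strings written 5’ to 3’ and that the designer chooses a split position within the desired resulting target sequence; the function names, the split position, and the toy sequence are hypothetical illustrations, not part of the disclosure.

```python
from typing import Tuple


def split_target(resulting: str, split_index: int) -> Tuple[str, str]:
    """Split a desired resulting target sequence into its two fragments.

    Returns (three_prime_fragment, five_prime_fragment). The 5' target
    sequence fragment is the part before the split; the 3' target sequence
    fragment is the part after it.
    """
    if not 0 < split_index < len(resulting):
        raise ValueError("split_index must fall inside the sequence")
    five_prime_fragment = resulting[:split_index]
    three_prime_fragment = resulting[split_index:]
    return three_prime_fragment, five_prime_fragment


def precursor_order(three_prime_fragment: str, five_prime_fragment: str) -> str:
    """Target sequence as placed in the precursor, 5' to 3':
    the 3' target sequence fragment first, then the 5' target sequence fragment."""
    return three_prime_fragment + five_prime_fragment


def rejoin(three_prime_fragment: str, five_prime_fragment: str) -> str:
    """Linear reading of the circularized product that starts at the 5' end of
    the 5' fragment and crosses the newly formed junction, where the 3' end of
    the 5' fragment is linked to the 5' end of the 3' fragment."""
    return five_prime_fragment + three_prime_fragment


# Toy sequence, split after the third nucleotide.
desired = "AUGGCCUAA"
frag3, frag5 = split_target(desired, 3)
in_precursor = precursor_order(frag3, frag5)
recovered = rejoin(frag3, frag5)
```

In this sketch the precursor carries the two fragments in permuted order, and reading the circle across the new junction recovers the desired sequence.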
The resulting target sequence can be a coding sequence, or a noncoding sequence, or a combination thereof. In some embodiments, the resulting target sequence can comprise an expression construct, or an expression cassette. As used herein and understood in the art, an “expression construct” or “expression cassette” means a nucleotide sequence that directs translation. An expression construct includes, at a minimum, one or more transcriptional control elements (such as a translation initiation site, an internal ribosome entry site (IRES) , or a structure functionally equivalent thereof) that direct protein translation in one or more desired cell types, tissues or organs. An expression construct can also include a coding sequence encoding the desired expression product. Additional elements, such as a transcription termination signal, can also be included. In some embodiments, the resulting target sequence consists of an expression construct.
The coding sequence can encode any protein, e.g., a protein selected from a functional protein, an antigenic protein, a signal peptide, a tag protein, and the like. In some embodiments, the resulting target sequence can comprise noncoding sequence that is a spacer sequence, such as an AT-rich sequence, which can modulate the flexibility of the sequence. Such spacer sequences can be located anywhere in the target sequence, e.g., at one end of the target sequence.
In some embodiments, the resulting target sequence comprises an expression cassette. In some embodiments, the resulting target sequence comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 10A (a) - (b) , 10B (a) - (b) , 11A (a) - (b) and 11B (a) - (b) ) . In some embodiments, the resulting circRNA comprises the resulting target sequence. In some embodiments, the resulting circRNA consists of the resulting target sequence.
Set 1: In some embodiments of the RNAs provided herein, the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI (FIGs. 10A (a) - (b) ) . In some embodiments, the 3’ target sequence fragment further comprises one or two linkers flanking Z1 (FIGs. 10B (a) - (b) ) . Accordingly, in some embodiments, circRNAs produced by the self-splicing of the RNAs provided herein comprise TI and Z1, wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 10A (a) - (b) ) . In some embodiments of the circRNAs, the Z1 is flanked by one or two linkers (FIGs. 10B (a) - (b) ) .
In some embodiments, RNAs provided herein can have a structure of Formula (I) : 5’ - (3’ IF) - (L) n-Z1- (L) n-TI- (5’ IF) -3’ ; wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, Z1 is a protein-coding sequence, and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 10A (a) - (b) and 10B (a) - (b) , upper) .
In some embodiments, the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. Accordingly, in some embodiments, RNAs provided herein can have a structure of Formula (I) ’ : 5’ - (5’ HA) - (3’ IF) - (L) n-Z1- (L) n-TI- (5’ IF) - (3’ HA) -3’ ; wherein 5’ HA is the 5’ homology arm; 3’ HA is the 3’ homology arm; 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, Z1 is a protein-coding sequence, and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 10A (a) - (b) and 10B (a) - (b) , upper) .
FIGs. 10A (a) and 10B (a) show scarless splicing wherein the sequence elements within the target sequence serve as E1 and E2. As shown, in some embodiments, the 5’ terminal region of the target sequence (e.g., part of Z1 or linker) can serve as E2. In some embodiments, the 3’ terminal region of the target sequence (e.g., part of TI) can serve as E1. Optionally, as depicted in FIGs. 10A (b) and 10B (b) , (1) extra exon sequence E2 can be included between the 3’ intron fragment and the target sequence; (2) extra exon sequence E1 can be included between the target sequence and the 5’ intron fragment; or both (1) and (2) . As such, E1 and/or E2 remain with the target sequence in the circRNA after self-splicing.
Set 2: In some embodiments of the RNAs provided herein, the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1 (FIGs. 11A (a) - (b) ) . In some embodiments, the 3’ target sequence fragment further comprises two linkers (L) flanking TI (FIGs. 11B (a) - (b) ) . Accordingly, in some embodiments, circRNAs produced by the self-splicing of the RNAs provided herein comprise TI and Z1, wherein the 3’ -end of TI is operatively linked to the 5’ -end of Z1 (FIGs. 11A (a) - (b) ) . In some embodiments of the circRNAs, the TI is flanked by one or two linkers (FIGs. 11B (a) - (b) ) .
Accordingly, in some embodiments, RNAs provided herein can have a structure of Formula (II) : 5’ - (3’ IF) - (L) n-TI- (L) n-Z1- (5’ IF) -3’ ; wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence; Z1 is a protein-coding sequence, and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 11A (a) - (b) and 11B(a) - (b) , upper) .
In some embodiments, the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. Accordingly, in some embodiments, RNAs provided herein can have a structure of Formula (II) ’ : 5’ - (5’ HA) - (3’ IF) -(L) n-TI- (L) n-Z1- (5’ IF) - (3’ HA) -3’ ; wherein 5’ HA is the 5’ homology arm; 3’ HA is the 3’ homology arm; 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation
initiation sequence; Z1 is a protein-coding sequence, and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 11A (a) - (b) and 11B (a) - (b) , lower) .
FIGs. 11A (a) and 11B (a) show scarless splicing wherein the sequence elements within the target sequence serve as E1 and E2. As shown, in some embodiments, the 5’ terminal region of the target sequence (e.g., part of TI or linker) can serve as E2. In some embodiments, the 3’ terminal region of the target sequence (e.g., part of Z1) can serve as E1. Optionally, as depicted in FIGs. 11A (b) and 11B (b) , (1) extra exon sequence E2 can be included between the 3’ intron fragment and the target sequence; (2) extra exon sequence E1 can be included between the target sequence and the 5’ intron fragment; or both (1) and (2) . As such, E1 and/or E2 remain with the target sequence in the circRNA after self-splicing.
Set 3: The translation initiation sequence TI of the RNAs described herein can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB) . In some embodiments of the RNAs provided herein, the target sequence consists of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’ ; wherein the 3’ target sequence fragment comprises, from 5’ to 3’ , a 3’ fragment of TI (TIB) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA) (FIGs. 12A (a) - (b) ) . In some embodiments, the 3’ target sequence further comprises two linkers (L) flanking Z1 (FIGs. 12B (a) - (b) ) . Accordingly, in some embodiments, circRNAs produced by the self-splicing of the RNAs provided herein comprise TIA, TIB and Z1, wherein the 3’ -end of TIA is operatively linked to the 5’ -end of TIB (FIGs. 12A (a) - (b) ) . In some embodiments of the circRNAs, Z1 is flanked by one or two linkers (FIGs. 12B (a) - (b) ) .
Accordingly, in some embodiments, the RNAs provided herein can have a structure of Formula (III) : 5’ - (3’ IF) -TIB- (L) n-Z1- (L) n-TIA- (5’ IF) -3’ ; wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB) ; Z1 is a protein-coding sequence; and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 12A (a) - (b) and 12B (a) - (b) , upper) .
In some embodiments, the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. Accordingly, in some embodiments, RNAs provided herein can have a structure of Formula (III) ’ : 5’ - (5’ HA) - (3’ IF) -TIB- (L) n-Z1- (L) n-TIA- (5’ IF) - (3’ HA) -3’ ; wherein 5’ HA is the 5’ homology arm; 3’ HA is the 3’ homology arm; 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB) ; Z1 is a protein-coding sequence; and each L is independently a linker sequence, wherein n=0, 1 or 2 (FIGs. 12A (a) - (b) and 12B (a) - (b) , lower) .
FIGs. 12A (a) and 12B (a) show scarless splicing wherein the sequence elements within the target sequence serve as E1 and E2. As shown, in some embodiments, the 5’ terminal region of the target sequence (e.g., part of TIB) can serve as E2. In some embodiments, the 3’ terminal region of the target sequence (e.g., part of TIA) can serve as E1. Optionally, as depicted in FIGs. 12A (b) and 12B (b) , (1) extra exon sequence E2 can be included between the 3’ intron fragment and the target sequence; (2) extra exon sequence E1 can be included between the target sequence and the 5’ intron fragment; or both (1) and (2) . As such, E1 and/or E2 remain with the target sequence in the circRNA after self-splicing.
Set 4: The protein-coding sequence Z1 of the RNAs described herein can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B) . In some embodiments of the RNAs provided herein, the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) , and the 5’ target sequence fragment comprises, from 5’ to 3’ , TI and a 5’ fragment of Z1 (Z1A) (FIG. 13A) . In some embodiments, the 3’ target sequence further comprises two linkers (L) flanking TI (FIG. 13B) . Accordingly, in some embodiments, circRNAs produced by the self-splicing of the RNAs provided herein comprise TI, Z1A, and Z1B, wherein the 3’ -end of Z1A is operatively linked to the 5’ -end of Z1B (FIG. 13A) . In some embodiments of the circRNAs, TI is flanked by one or two linkers (FIG. 13B) .
Accordingly, in some embodiments, the RNAs provided herein can have a structure of Formula (IV) : 5’ - (3’ IF) -Z1B- (L) n-TI- (L) n-Z1A- (5’ IF) -3’ ; wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence; Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B) ; and each L is independently a linker sequence, and n=0, 1 or 2 (FIGs. 13A and 13B, upper) .
In some embodiments, the RNAs provided herein further comprise two homology arms, a 5’ homology arm operatively linked to the 5’ -end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’ -end of the 5’ intron fragment. Accordingly, in some embodiments, RNAs provided herein can have a structure of Formula (IV) ’ : 5’ - (5’ HA) - (3’ IF) -Z1B- (L) n-TI- (L) n-Z1A- (5’ IF) - (3’ HA) -3’ ; wherein 5’ HA is the 5’ homology arm; 3’ HA is the 3’ homology arm; 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence; Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B) ; and each L is independently a linker sequence, and n=0, 1 or 2 (FIGs. 13A and 13B, lower) .
FIGs. 13A and 13B show scarless splicing wherein the sequence elements within the target sequence serve as E1 and E2. As shown, in some embodiments, the 5’ terminal region of the target sequence (e.g., part of Z1B) can serve as E2. In some embodiments, the 3’ terminal region of the target sequence (e.g., part of Z1A) can serve as E1.
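The element order of Formula (IV) and Formula (IV) ’, and the order of the elements retained in the circRNA, can be sketched in a few lines of Python. This is an illustrative model only: the function names and the string labels are hypothetical, the two (L) n positions are assumed to be chosen independently, and self-splicing is modeled simply as removal of everything outside the intron fragments followed by joining of the 3’ -end of Z1A to the 5’ -end of Z1B.

```python
def formula_iv_elements(n_left=0, n_right=0, homology_arms=False):
    """Return the 5'->3' element order of Formula (IV), or (IV)' with homology arms.

    n_left and n_right are the number of linkers (0, 1 or 2) on each side of TI.
    All labels are placeholders, not sequences from the disclosure.
    """
    for n in (n_left, n_right):
        if n not in (0, 1, 2):
            raise ValueError("n must be 0, 1 or 2")
    elements = (["3'IF", "Z1B"] + ["L"] * n_left + ["TI"]
                + ["L"] * n_right + ["Z1A", "5'IF"])
    if homology_arms:
        elements = ["5'HA"] + elements + ["3'HA"]
    return elements


def spliced_circle(elements):
    """Model self-splicing: keep the elements between the intron fragments.

    The circle has no true start, so the result is rotated to begin at TI
    purely for readability; the last element is joined back to the first.
    """
    start = elements.index("3'IF") + 1
    end = elements.index("5'IF")
    retained = elements[start:end]
    ti = retained.index("TI")
    return retained[ti:] + retained[:ti]


if __name__ == "__main__":
    precursor = formula_iv_elements(n_left=1, n_right=1, homology_arms=True)
    print(" - ".join(precursor))
    print(" - ".join(spliced_circle(precursor)))
```

In the rotated output, Z1A is immediately followed by Z1B, which reflects the statement that the 3’ -end of Z1A is operatively linked to the 5’ -end of Z1B in the circRNA.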
E1 and E2: In some embodiments, the RNAs provided herein can further comprise one or two exon (s) flanking the target sequence that remains present in the resulting circRNA. Without being bound by theory, the presence of the exon (s) may improve the self-splicing efficiency. As such, in some embodiments, provided herein are RNAs comprising (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence; (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) . Accordingly, in some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) E2; (3) a target sequence; (4) E1; and (5) a 5’ intron fragment. In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) E2; (3) a target sequence; and (4) a 5’ intron fragment. In some embodiments, provided herein are non-naturally occurring RNAs comprising the following operably linked elements from 5’ to 3’ : (1) a 3’ intron fragment; (2) a target sequence; (3) E1; and (4) a 5’ intron fragment.
Scarless splicing generates circRNAs consisting of the resulting target sequences. Near scarless splicing generates circRNAs that contain E1 and/or E2 in addition to the resulting target sequences.
8.3.1 Coding sequence
As used interchangeably herein and understood in the art, the terms “coding sequence, ” “coding sequence region, ” “coding region, ” and “CDS” refer to the portion of a nucleic acid (e.g., a DNA or an RNA) that is or can be translated into protein. As used interchangeably herein and understood in the art, the terms “reading frame, ” “open reading frame, ” and “ORF” refer to a nucleotide sequence that begins with an initiation codon (e.g., ATG) and, in some embodiments, ends with a termination codon (e.g., TAA, TAG, or TGA) . Open reading frames can contain introns and exons, and as such, all CDSs are ORFs, but not all ORFs are CDSs.
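The ORF definition above can be illustrated with a minimal Python sketch. The function name is hypothetical, and the sketch makes simplifying assumptions: it scans a DNA-alphabet string on one strand only, uses ATG as the sole initiation codon, and requires an in-frame termination codon, although the definition above makes the termination codon optional in some embodiments.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna):
    """Return the first ATG...stop open reading frame in `dna`, or None.

    Scans left to right for an ATG, then reads in-frame codons until a
    termination codon is found. If none is found, the next ATG is tried.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through complete codons downstream of the initiation codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(find_first_orf("ccATGAAATAGgg"))
```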
The coding sequence can encode any protein, e.g., a protein selected from a functional protein, an antigenic protein, a signal peptide, a tag protein, and the like. In some embodiments, the resulting target sequence encodes a protein that is a therapeutic product.
In some embodiments, the therapeutic product is a polypeptide, a protein, an enzyme or an antibody. In some embodiments, the therapeutic product comprises one or more polypeptide, protein, enzyme, antibody, or a combination thereof. In some embodiments, the protein or enzyme is associated with a genetic disease (e.g., a disease in which a genetic alteration (e.g., mutation) and/or protein dysregulation plays a role in the initiation, development, and/or manifestation of the disease) .
In some embodiments, the therapeutic product is an antigen or agent which can induce vaccine-induced memory and/or enable the immune system to act quickly to protect the body from any of these agents in later encounters. In some embodiments, the therapeutic product is an antigen or agent which can stimulate the body's immune system to recognize the agent as a foreign invader, generate antibodies against it, destroy it and develop a memory of it. In some embodiments, the polypeptide or protein resembles a weakened or non-viable form of a disease-causing agent (e.g., an infectious agent such as pathogen) , which can be selected from a microorganism, such as a bacterium, virus, fungus, parasite, or one or more components of such microorganism, such as toxins, proteins (e.g., surface proteins) , and/or cell walls. In some embodiments, the therapeutic product is an antigen or agent which can stimulate the body's immune system to recognize the antigen or agent, generate antibodies against the antigen or agent, destroy the antigen or agent, and/or develop an immunological memory of the antigen or agent. In some embodiments, the therapeutic product is an antigen or agent which can induce and/or strengthen vaccine-induced memory and/or enable the immune system to respond rapidly and effectively to the antigen or agent in later encounters.
In some embodiments, the therapeutic product is derived from an infectious agent. In some embodiments, the infectious agent is selected from a member of the group consisting of strains of viruses and strains of bacteria.
In any of the embodiments provided herein, the infectious agent is a strain of virus selected from the group consisting of adenovirus; herpes simplex, type 1; herpes simplex, type 2; encephalitis virus, papillomavirus, varicella-zoster virus; Epstein-Barr virus; human cytomegalovirus; human herpes virus, type 8; human papillomavirus; BK virus; JC virus; smallpox; hepatitis B virus; human bocavirus; parvovirus B19; human astrovirus; Norwalk virus; coxsackievirus; hepatitis A virus; poliovirus; rhinovirus; severe acute respiratory syndrome virus; hepatitis C virus; yellow fever virus; dengue virus; West Nile virus; rubella virus; hepatitis E virus; human immunodeficiency virus (HIV) ; Guanarito virus; Junin virus; Lassa virus; Machupo virus; Sabiá virus; Crimean-Congo hemorrhagic fever virus; Ebola virus; Marburg virus; measles virus; mumps virus; parainfluenza virus; respiratory syncytial virus (RSV) ; human metapneumovirus; Hendra virus; Nipah virus; rabies virus; hepatitis D; rotavirus; orbivirus; coltivirus; Banna virus; human enterovirus; hantavirus; corona virus, severe acute respiratory syndrome (SARS) -associated coronavirus (SARS-CoV) , SARS-CoV-2 virus (COVID-19 associated) ; Middle East respiratory syndrome corona virus; Japanese encephalitis virus; vesicular exanthema virus; Eastern equine encephalitis; and influenza virus. In some embodiments, the infectious agent is a strain of bacteria selected from Tuberculosis (Mycobacterium tuberculosis) , clindamycin-resistant Clostridium difficile, fluoroquinolone-resistant Clostridium difficile, methicillin-resistant Staphylococcus aureus (MRSA) , multidrug-resistant Enterococcus faecalis, multidrug-resistant Enterococcus faecium, multidrug-resistant Pseudomonas aeruginosa, multidrug-resistant Acinetobacter baumannii, and vancomycin-resistant Staphylococcus aureus (VRSA) .
In some embodiments, the infectious agent is associated with humans, non-human primates, or other animals, such as birds, pigs, horses, dogs, cats, rabbits, mice, rats, cows, sheep, goats, and deer.
In some embodiments, the resulting target sequence encodes an antibody. Antibodies encoded by the resulting target sequence include, but are not limited to, monoclonal antibodies, polyclonal antibodies, recombinantly produced antibodies, human antibodies, humanized antibodies, chimeric antibodies, synthetic antibodies, tetrameric antibodies comprising two heavy chain and two light chain molecules, antibody light chain monomers, antibody heavy chain monomers, antibody light chain dimers, antibody heavy chain, antibody heavy chain dimers, antibody light chain-heavy chain pairs, intrabodies, heteroconjugate antibodies, monovalent antibodies, antigen-binding fragments of full-length antibodies, and fusion proteins of the above. Such antigen-binding fragments include, but are not limited to, single-domain antibodies (variable domain of heavy chain antibodies (VHHs) or nanobodies) , Fabs, F(ab’)2s, and scFvs (single-chain variable fragments) .
In some embodiments, the resulting target sequence disclosed herein can be codon-optimized, for example, via any codon-optimization technique known to one of skill in the art (see, e.g., review by Quax et al., 2015, Mol. Cell 59: 149-161) . A codon optimized sequence can be one in which codons in a polynucleotide encoding a therapeutic product have been substituted in order to increase the expression, stability and/or activity of the therapeutic product. Factors that influence codon optimization include, but are not limited to one or more of: (i) variation of codon biases between two or more organisms or genes or synthetically constructed bias tables, (ii) variation in the degree of codon bias within an organism, gene, or set of genes, (iii) systematic variation of codons including context, (iv) variation of codons according to their decoding tRNAs, (v) variation of codons according to GC %, either overall or in one position of the triplet, (vi) variation in degree of similarity to a reference sequence for example a naturally occurring sequence, (vii) variation in the codon frequency cutoff, (viii) structural properties of mRNAs transcribed from the DNA sequence, (ix) prior knowledge about the function of the DNA sequences upon which design of the codon substitution set is to be based, and/or (x) systematic variation of codon sets for each amino acid. In some embodiments, a codon optimized polynucleotide can minimize ribozyme collisions and/or limit structural interference between the expression sequence and the IRES.
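Factor (v) above, GC % either overall or in one position of the triplet, can be made concrete with a short Python sketch. The function names are hypothetical and the example is not taken from this disclosure; it only shows how the two quantities could be computed for a candidate coding sequence when comparing codon variants.

```python
def gc_fraction(seq):
    """Fraction of G or C residues in a nucleotide sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """GC fraction at the third position of each codon of a coding sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    # Every third residue, starting from index 2, is a codon third position.
    return gc_fraction(cds[2::3])


if __name__ == "__main__":
    candidate = "ATGGCC"  # placeholder two-codon sequence
    print(gc_fraction(candidate), gc3_fraction(candidate))
```

Comparing these two values across synonymous variants of the same coding sequence is one simple way to see how a codon substitution set shifts composition without changing the encoded protein.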
8.3.2 Noncoding sequence
In some embodiments, the resulting target sequence can be a noncoding sequence. In some embodiments, the noncoding sequence is selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an IRES, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a decoy, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
In some embodiments, the resulting target sequence comprises an aptamer sequence. In some embodiments, the resulting target sequence encodes a single-stranded DNA or RNA (ssDNA or ssRNA) molecule that can selectively bind to a specific target, including proteins, peptides, carbohydrates, small molecules, toxins, and even live cells.
In some embodiments, the resulting target sequence encodes a ribozyme.
In some embodiments, the resulting target sequence encodes an antisense oligonucleotide (ASO) , which binds sequence-specifically to the target RNA and modulates protein expression through several different mechanisms.
In some embodiments, the resulting target sequence encodes a decoy, which is a short stretch of sequence that is identical or homologous to miRNA-binding sites or protein binding sites in endogenous targets.
In some embodiments, the resulting target sequence encodes an RNA scaffold, which is an RNA sequence designed to co-localize enzymes in engineered biological pathways in vivo through interactions between the scaffold’s protein docking domains and their affinity protein-enzyme fusions.
In some embodiments, the resulting target sequence comprises a control element. The term “control elements” refers collectively to promoter regions, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites (IRES) , enhancers, splice junctions, and the like, which collectively provide for the replication, transcription, post-transcriptional processing, and translation of a coding sequence in a recipient cell. Not all of these control elements need to be present so long as the selected coding sequence is capable of being replicated, transcribed, and translated in an appropriate host cell.
The term “promoter” as used herein and understood in the art refers to a nucleotide region comprising a DNA regulatory sequence, wherein the regulatory sequence is derived from a gene that is capable of binding to an RNA polymerase and allowing for the initiation of transcription of a downstream (3' direction) coding sequence. It may contain genetic elements at which regulatory proteins and molecules may bind, such as RNA polymerase and other transcription factors, to initiate the specific transcription of a nucleic acid sequence. The phrases “operatively positioned” and “operatively linked” mean that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence to control transcriptional initiation and/or expression of that sequence, which is “under control” and “under transcriptional control” of the promoter. The term “enhancer” as used herein and understood in the art means a nucleic acid sequence that, when positioned proximate to a promoter, confers increased transcription activity relative to the transcription activity resulting from the promoter in the absence of the enhancer domain.
In some embodiments, the resulting target sequence comprises a translation initiation element, or a TI. In some embodiments, the translation initiation element (or TI) is an IRES, or an IRES-like sequence. As used interchangeably herein and understood in the art, the terms “internal ribosome entry site, ” “internal ribosome entry site sequence, ” “IRES” and “IRES sequence region” refer to cis elements of viral or human cellular RNAs (e.g., messenger RNA (mRNA) and/or circRNAs) that bypass the steps of canonical eukaryotic cap-dependent translation initiation. The canonical cap-dependent mechanism used by the vast majority of eukaryotic mRNAs requires an m7G cap at the 5’ end of the mRNA, initiator Met-tRNAmet, more than a dozen initiation factor proteins, directional scanning, and GTP hydrolysis to place a translationally competent ribosome at the start codon. An IRES attracts a ribosome (e.g., a eukaryotic ribosome) to form a translation initiation complex and promotes translation initiation. IRESs typically comprise a long and highly structured 5’ -UTR which mediates the translation initiation complex binding and catalyzes the formation of a functional ribosome.
A multitude of naturally occurring IRES sequences are available and include sequences derived from or isolated from a wide variety of viruses, such as from leader sequences of picornaviruses such as the encephalomyocarditis virus (EMCV) UTR, the polio leader sequence, the hepatitis A virus leader sequence, the hepatitis C virus IRES, human rhinovirus type 2 IRES, an IRES element from the foot and mouth disease virus, a giardiavirus IRES, and the like.
In some embodiments, the naturally occurring IRES sequence is isolated or derived from an IRES sequence of Taura syndrome virus, Triatoma virus, Theiler's encephalomyelitis virus, simian virus 40, Solenopsis invicta virus 1, Rhopalosiphum padi virus, reticuloendotheliosis virus, human poliovirus 1, Plautia stali intestine virus, Kashmir bee virus, human rhinovirus 2, Homalodisca coagulata virus-1, human immunodeficiency virus type 1, Himetobi P virus, hepatitis C virus, hepatitis A virus, hepatitis A virus HA 16, hepatitis GB virus, foot and mouth disease virus, human enterovirus 71, equine rhinitis virus, Ectropis obliqua picorna-like virus, encephalomyocarditis virus, drosophila C virus, human coxsackievirus B3, crucifer tobamovirus, cricket paralysis virus, bovine viral diarrhea virus 1, black queen cell virus, aphid lethal paralysis virus, avian encephalomyelitis virus, acute bee paralysis virus, Hibiscus chlorotic ringspot virus, classical swine fever virus, tobacco etch virus, turnip crinkle virus, EMCV-A, EMCV-B, EMCV-Bf, EMCV-Cf, EMCV pEC9, picobirnavirus, HCV QC64, human cosavirus E/D, human cosavirus F, human cosavirus JMY, rhinovirus NAT001, HRV14, HRV89, HRVC-02, HRV-A21, salivirus A SHI, salivirus FHB, salivirus NG-J1, human parechovirus 1, crohivirus B, Yc-3, rosavirus M-7, shanbavirus A, pasivirus A, pasivirus A 2, echovirus E14, human parechovirus 5, Aichi virus, phopivirus, CVA10, enterovirus C, enterovirus D, enterovirus J, human pegivirus 2, GBV-C GT110, GBV-C K1737, GBV-C Iowa, pegivirus A 1220, pasivirus A 3, sapelovirus, rosavirus B, bakunsa virus, tremovirus A, swine pasivirus 1, PLV-CHN, pasivirus A, sicinivirus, hepacivirus K, hepacivirus A, BVDV1, border disease virus, BVDV2, CSFV-PK15C, SF573 dicistrovirus, Hubei picorna-like virus, CRPV, salivirus A BN5, salivirus A BN2, salivirus A 02394, salivirus A GUT, salivirus A CH, salivirus A SZ1, salivirus FHB, coxsackievirus (e.g., CVA3, CVA12, CVB1, CVB3, CVB5) , echovirus 7, enterovirus A71, and/or EV24.
In some embodiments, a natural IRES sequence is isolated or derived from a eukaryotic IRES element selected from human FGF2, human SFTPA1, human AML1/RUNX1, drosophila antennapedia, human AQP4, human AT1R, human BAG-1, human BCL2, human BiP, human c-IAP1, human c-myc, human eIF4G, mouse NDST4L, human LEF1, mouse HIF1 alpha, human n.myc, mouse Gtx, human p27kip1, human PDGF2/c-sis, human p53, human Pim-1, mouse Rbm3, drosophila reaper, canine Scamper, drosophila Ubx, human UNR, mouse UtrA, human VEGF-A, human XIAP, drosophila hairless, S. cerevisiae TFIID, and S. cerevisiae YAP1.
In some embodiments, a naturally occurring sequence is an endogenous IRES sequence, which is derived from or isolated from homo sapiens. In some embodiments, a natural IRES sequence is an endogenous IRES sequence, which is derived from or isolated from human tissue or human sample.
In some embodiments, an IRES sequence is isolated or derived from a cellular IRES element selected from AML1/RUNX1, Antp-D, Antp-DE, Antp-CDE, AT1R_var1, AT1R_var2, AT1R_var3, AT1R_var4, BAG1_p36delta236nt, BAG1_p36, BiP_-222_-3, C-IAP1 285-1399, c-IAP1 1313-1462, c-jun, Cat-1_224, CCND1, eIF4GI-ext, eIF4GII, eIF4GII-long, FGF1A, FMR1, Gtx-133-141, Gtx-1-166, Gtx-1-120, Gtx-1-196, HAP4, HIF1a, hSNM1, Hsp101, hsp70, hsp70, Hsp90, IGF2_leader2, L-myc, MNT 75-267, MNT 36-160, MTG8a, MYB, MYT2 997-1152, NRF_-653_-17, NtHSF1, ODC1, p27kip1, p53_128-269, PDGF2/c-sis, PITSLRE_p58, Rbm3, reaper, Scamper, TFIID, TIF4631, Ubx_1-966, Ubx_373-961, UNR, Ure2, XIAP 5-464, XIAP 305-466, YAP1, (GAAA) 16, (PPT19) 4 and XI.
In some embodiments, an IRES sequence is isolated or derived from a viral IRES element selected from ABPV IGRpred, AEV, ALPV IGRpred, BQCV IGRpred, BVDV1 1-385, BVDV1 29-391, CrPV 5NCR, CrPV IGR, crTMV_IRESmp228, CSFV, DCV IGR, EoPV_5NTR, ERBV_162-920, EV71_1-748, FMDV type C, GBV-A, GBV-C, HAV HM175, HiPV_IGRpred, HIV-1, HoCV1_IGRpred, IAPV_IGRpred, idefix, KBV IGRpred, PSIV IGR, PV type1 Mahoney, PV_type3_Leon, REV-A, RhPV 5NCR, RhPV IGR, SINV1 IGRpred, SV40 661-830, TMEV, TMV_UI_IRESmp228, TRV 5NTR, TrV IGR, TSV, and IGR.
In some embodiments, the IRES sequence comprises a sequence isolated or derived from a natural IRES sequence. The term “IRES-like sequence” or “Internal Ribosome Entry Site-like sequence” refers to a non-naturally occurring nucleotide sequence that displays a function of a naturally occurring IRES. In some embodiments, the IRES-like sequence can recruit ribosomal components to mediate cap-independent translation. An IRES-like sequence may be identified by methods known in the art, such as those described in PCT application No. PCT/CN2022/095949.
In some embodiments, the IRES-like sequence is greater than or equal to 3 nucleic acid residues in length. In some embodiments, the IRES-like sequence is 3-300 nucleic acid residues in length. In some embodiments, the IRES-like sequence is 3-200, 4-200, 5-200, 6-200, 7-200, 3-100, 4-100, 5-100, 6-100, 7-100, 3-50, 4-50, 5-50, 6-50, 7-50, 3-40, 4-40, 5-40, 6-40, 7-40, 3-30, 4-30, 5-30, 6-30, 7-30, 3-20, 4-20, 5-20, 6-20, 7-20 nucleic acid residues in length. In some embodiments, IRES-like sequence is 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 nucleic acid residues in length. In some embodiments, the IRES-like sequence is 5-12 nucleic acid residues in length.
In some embodiments, the IRES-like sequence described herein or a fragment thereof can be combined with a naturally occurring IRES sequence described herein or a fragment thereof, to form additional IRES-like sequences. In some embodiments, a complete natural IRES sequence is combined with a complete IRES-like sequence. In some embodiments, a fragment of a natural IRES sequence is combined with a complete IRES-like sequence. In some embodiments, a complete natural IRES sequence is combined with a fragment of an IRES-like sequence. In some embodiments, a fragment of a natural IRES sequence is combined with a fragment of an IRES-like sequence.
8.3.3 Linkers
In some embodiments, RNAs (or cRNAzymes) provided herein comprise a linker sequence (L) . In some embodiments, the linker sequence is 3-300 nucleic acid residues in length. In some embodiments, the linker sequence is about 3-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, 175-200, 200-225, 225-250, 250-275 or 275-300 nucleic acid residues in length. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is an integer selected from 1-100. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is an integer selected from 1-50. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is an integer selected from 1-20. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is an integer selected from 1-10. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is 1, 2, 3, 4 or 5. In some embodiments, the linker sequence is about 3N nucleic acid residues in length, wherein N is 1, 2 or 3. In some embodiments, the linker sequence is about 3 nucleic acid residues in length.
In some embodiments, the linker sequence comprises the nucleic acid sequence of RCC, wherein R is a guanine or an adenine. In some embodiments, the linker comprises the nucleic acid sequence of RCCRCC, wherein R is a guanine or an adenine. In some embodiments, the linker comprises the nucleic acid sequence of RCCRCCRCC, wherein R is a guanine or an adenine. In some embodiments, the linker has the polynucleotide sequence of SEQ ID NO: 226. In some embodiments, the linker has the polynucleotide sequence of SEQ ID NO: 227. In some embodiments, the linker has the polynucleotide sequence of SEQ ID NO: 248.
In some embodiments, the linker comprises a nucleic acid sequence that encodes a 5’ UTR, 3’ UTR, poly-A sequence, polyA-C sequence, poly-C sequence, poly-U sequence, poly-G sequence, ribosome binding site, aptamer, riboswitch, ribozyme, small RNA binding site, translation regulation elements (e.g., a Kozak sequence) , a protein binding site (e.g., PTBP1 or HUR) , a non-natural nucleotide, or a non-nucleotide chemical-linker sequence.
In some embodiments, the resulting target sequences comprise a 3’ UTR. In some embodiments, the 3’ UTR can be the 3’ UTR from human beta globin, human alpha globin, Xenopus beta globin, Xenopus alpha globin, human prolactin, human GAP-43, human eEF1a1, human Tau, human TNFa, dengue virus, hantavirus small mRNA, bunyavirus small mRNA, turnip yellow mosaic virus, hepatitis C virus, rubella virus, tobacco mosaic virus, human IL-8, human actin, human GAPDH, human tubulin, hibiscus chlorotic ringspot virus, woodchuck hepatitis virus post translationally regulated element, sindbis virus, turnip crinkle virus, tobacco etch virus, or Venezuelan equine encephalitis virus.
In some embodiments, the resulting target sequences comprise a 5’ UTR. In some embodiments, the 5’ UTR can be the 5’ UTR from human beta globin, Xenopus laevis beta globin, human alpha globin, Xenopus laevis alpha globin, rubella virus, tobacco mosaic virus, mouse Gtx, dengue virus, heat shock protein 70 kDa protein 1A, tobacco alcohol dehydrogenase, tobacco etch virus, turnip crinkle virus, or the adenovirus tripartite leader.
In some embodiments, the resulting target sequences comprise a polyA region. In some embodiments the polyA region is at least 30 nucleotides or at least 60 nucleotides in length.
8.3.4 Exemplary target sequence
In some embodiments, the resulting target sequences of the RNAs (or cRNAzymes) provided herein comprise a TI and a Z1, wherein TI is an IRES-like sequence or an IRES sequence, and Z1 is an expression sequence encoding a therapeutic product. In some embodiments, TI has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225. In some embodiments, TI has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 222. In some embodiments, TI has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 223. In some embodiments, TI has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 224. In some embodiments, TI has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 225.
In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 107. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 108. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 109. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 110. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 111. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 112. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 214. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 215. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 216. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 217. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 218. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 219. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 220. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 221. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 258. In some embodiments, Z1 has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 259.
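The identity thresholds recited above (at least 95%, 98%, 99%, or 100%) can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the method by which identity is determined in this disclosure: the function names are hypothetical, the two sequences are assumed to have been aligned beforehand (with “-” marking gaps), and identity is computed as matching positions divided by alignment length, which is only one of several conventions in use.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percent identity of two pre-aligned sequences of equal length.

    A position counts as a match only when both residues are equal and
    neither is a gap character ('-').
    """
    if not aligned_query or len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(q == r and q != "-"
                  for q, r in zip(aligned_query, aligned_reference))
    return 100.0 * matches / len(aligned_query)


def meets_identity(aligned_query, aligned_reference, threshold=95.0):
    """True if the percent identity is at least `threshold`."""
    return percent_identity(aligned_query, aligned_reference) >= threshold


if __name__ == "__main__":
    reference = "ACGUACGUACGUACGUACGU"  # placeholder, not a SEQ ID NO
    variant = "ACGUACGUACGUACGUACGA"    # one substitution in 20 residues
    print(percent_identity(variant, reference), meets_identity(variant, reference))
```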
In some embodiments, Z1 comprises a nucleic acid sequence encoding the amino acid sequence selected from the group consisting of: (a) SEQ ID NO: 113; (b) SEQ ID NO: 114; (c) SEQ ID NO: 115; (d) SEQ ID NO: 116; (e) SEQ ID NO: 117; and (f) SEQ ID NO: 118.
In some embodiments, the target sequence comprises a 5’ arm sequence selected from the group consisting of SEQ ID NOs: 89-96. In some embodiments, the target sequence comprises a 3’ arm sequence selected from the group consisting of SEQ ID NOs: 97-104.
In some embodiments, the resulting target sequences of the RNAs (or cRNAzymes) provided herein comprise one expression sequence. In some embodiments, the resulting target sequences of the RNAs (or cRNAzymes) provided herein comprise more than one expression sequence, e.g., 2, 3, 4, or 5 expression sequences.
In some embodiments, the resulting target sequences of the RNAs (or cRNAzymes) provided herein encode a protein that is made up of subunits that are encoded by more than one gene. For example, the protein can be a heterodimer, wherein each chain or subunit of the protein is encoded by a separate gene. It is possible that more than one circRNA is delivered in the transfer vehicle and each contains a resulting target sequence encoding a separate subunit of the protein. In some embodiments, separate circRNAs encoding the individual subunits can be administered in separate transfer vehicles. Alternatively, the resulting target sequences of the RNAs (or cRNAzymes) provided herein can contain more than one expression cassette and encode more than one subunit.
8.3.5 Exemplary cRNAzymes with target sequence
In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 230-263.
In some embodiments, RNAs (or cRNAzymes) provided herein comprises, from 5’ to 3’ : a 5’ homology arm, a 3’ intron fragment, E2, 3’ target sequence fragment, 5’ target sequence fragment, E1, 5’ intron fragment, and 3’ homology arm (See e.g., SEQ ID NO: 230) . Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%identical to SEQ ID NO: 230.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, 3’ target sequence fragment, 5’ target sequence fragment, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 231). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 231.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1, TI, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 232). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 232.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1, TI, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 233). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 233.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TI, Z1, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 234). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 234.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TI, Z1, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 235). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 235.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, Z1, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 236). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 236.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, Z1, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 237). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 237.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, Z1, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 238). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 238.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, Z1, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 239). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 239.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1, linker, TI, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 240). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 240.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1, linker, TI, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 241). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 241.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, linker, TI, linker, Z1, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 242). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 242.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, linker, TI, linker, Z1, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 243). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 243.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1B, linker, TI, linker, Z1A, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 244). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 244.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, Z1B, linker, TI, linker, Z1A, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 245). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 245.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, linker, Z1, linker, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 246). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 246.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, E2, TIB, linker, Z1, linker, TIA, E1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 247). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 247.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, 3’ target sequence fragment, 5’ target sequence fragment, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 248). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 248.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, 3’ target sequence fragment, 5’ target sequence fragment, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 249). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 249.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1, TI, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 250). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 250.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1, TI, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 251). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 251.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, TI, Z1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 252). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 252.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, TI, Z1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 253). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 253.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1B, TI, Z1A, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 254). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 254.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1B, TI, Z1A, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 255). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 255.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, TIB, Z1, TIA, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 256). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 256.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, Z1, linker, TI, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 258). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 258.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, Z1, linker, TI, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 259). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 259.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, TI, linker, Z1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 260). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 260.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, linker, TI, linker, Z1, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 261). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding Gluc. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 261.
In some embodiments, RNAs (or cRNAzymes) provided herein comprise, from 5’ to 3’: 5’ homology arm, 3’ intron fragment, Z1B, linker, TI, linker, Z1A, 5’ intron fragment, and 3’ homology arm (see, e.g., SEQ ID NO: 262). Upon self-splicing, the RNAs (or cRNAzymes) provided herein form circRNAs encoding GFP. In some embodiments, provided herein are RNAs (or cRNAzymes) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 262.
In some embodiments, provided herein are RNAs (or circRNAs) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 257.
In some embodiments, provided herein are RNAs (or mRNAs) having a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 263.
8.4 Vectors and methods of production
Provided herein are also vectors that encode the non-naturally occurring RNAs (or cRNAzymes) provided herein comprising the following operably linked elements from 5’ to 3’: (1) a 3’ intron fragment; (2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and (3) a 5’ intron fragment; wherein the RNA has group II intron activity and, upon self-splicing, can form a circRNA that comprises both the 5’ and 3’ target sequence fragments with the 3’-end of the 5’ target sequence fragment linked to the 5’-end of the 3’ target sequence fragment. In some embodiments, the vectors are DNA vectors.
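The permuted arrangement of the target sequence described above can be illustrated with a minimal string-level sketch. It models only the order of the fragments, not the ribozyme chemistry: intron fragments, homology arms and any exon sequences (E1, E2) are omitted, so it corresponds to the scarless case. The split position, the toy target sequence and all function names are hypothetical.

```python
def split_target(target: str, split_at: int):
    """Split a target sequence into (5' fragment, 3' fragment) at a hypothetical position."""
    if not 0 < split_at < len(target):
        raise ValueError("split_at must fall strictly inside the target")
    return target[:split_at], target[split_at:]


def precursor_insert(five_frag: str, three_frag: str) -> str:
    """Order in the linear precursor: 3' target fragment first, then 5' target fragment."""
    return three_frag + five_frag


def is_rotation(candidate: str, reference: str) -> bool:
    """True if the two strings describe the same circle read from different start points."""
    return len(candidate) == len(reference) and candidate in (reference + reference)


def splice_junction(five_frag: str, three_frag: str, flank: int = 3) -> str:
    """Nucleotides flanking the join of the 5' fragment's 3'-end to the 3' fragment's 5'-end."""
    return five_frag[-flank:] + three_frag[:flank]


if __name__ == "__main__":
    target = "AUGGCUAGCAAGUAA"  # toy sequence, not from the sequence listing
    five, three = split_target(target, 6)
    insert = precursor_insert(five, three)
    print(insert)                        # permuted order carried by the precursor
    print(is_rotation(insert, target))   # the circle restores the original order
    print(splice_junction(five, three))  # junction as it reads in the original target
```

Closing the permuted insert into a circle joins the end of the 5’ fragment to the start of the 3’ fragment, so the circular product reads as the original target sequence regardless of where the split was placed.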
As used herein and understood in the art, the term “vector” or “construct” (sometimes referred to as a gene delivery system or gene transfer “vehicle” ) refers to a vehicle that is used to carry genetic material (e.g., a nucleotide sequence) , which can be introduced into a host cell, where it can be replicated and/or expressed.
Many vectors can be used, including, for example, expression vectors, plasmids, phage vectors, viral vectors, episomes and artificial chromosomes, which can include selection sequences or markers operable for stable integration into a host cell’s chromosome. A “plasmid” is a common type of vector: an extra-chromosomal DNA molecule, separate from the chromosomal DNA, that is capable of replicating independently of the chromosomal DNA. In certain cases, it is circular and double-stranded. Exemplary artificial chromosomes include yeast artificial chromosome (YAC), bacterial artificial chromosome (BAC), and P1-derived artificial chromosome (PAC). Exemplary bacteriophages include lambda phage and M13 phage. Examples of categories of animal viruses useful as vectors include, without limitation, retrovirus (including lentivirus), adenovirus, adeno-associated virus (AAV), herpesvirus (e.g., herpes simplex virus), poxvirus, baculovirus, papillomavirus, and papovavirus (e.g., SV40). Examples of expression vectors are pCI-neo vectors (Promega) for expression in mammalian cells; and pLenti4/V5-DEST™, pLenti6/V5-DEST™, and pLenti6.2/V5-GW/lacZ (Invitrogen) for lentivirus-mediated gene transfer and expression in mammalian cells. Exemplary AAV serotypes include AAV1, AAV2, AAV4, AAV5, AAV6, AAV8, and AAV9. In some embodiments, provided herein are viral vectors encoding the RNAs (cRNAzymes) provided herein. In some embodiments, provided herein are AAVs encoding the RNAs (cRNAzymes) provided herein.
In some embodiments, the vector is an episomal vector or a vector that is maintained extrachromosomally. As used herein, the term “episomal” refers to a vector that is able to replicate without integration into the host’s chromosomal DNA and without gradual loss from a dividing host cell, meaning that the vector replicates extrachromosomally or episomally. The vector is engineered to harbor the sequence coding for the origin of DNA replication or “ori” from a lymphotrophic herpes virus or a gamma herpesvirus, an adenovirus, SV40, a bovine papilloma virus, or a yeast, specifically a replication origin of a lymphotrophic herpes virus or a gamma herpesvirus corresponding to oriP of EBV. In some embodiments, the lymphotrophic herpes virus may be Epstein Barr virus (EBV), Kaposi's sarcoma herpes virus (KSHV), Herpes virus saimiri (HS), or Marek's disease virus (MDV). Epstein Barr virus (EBV) and Kaposi's sarcoma herpes virus (KSHV) are also examples of a gamma herpesvirus. Typically, the host cell comprises the viral replication transactivator protein that activates the replication.
Additionally, vectors can include one or more selectable marker genes and appropriate expression control sequences. Selectable marker genes that can be included, for example, provide resistance to antibiotics or toxins, complement auxotrophic deficiencies, or supply critical nutrients not in the culture media. “Expression control sequences,” “control elements,” or “regulatory sequences” present in an expression vector are those non-translated regions of the vector (origin of replication, selection cassettes, promoters, enhancers, translation initiation signals (Shine Dalgarno sequence or Kozak sequence), introns, a polyadenylation sequence, and 5' and 3' untranslated regions) which interact with host cellular proteins to carry out transcription and translation. Such elements can vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including ubiquitous promoters and inducible promoters, can be used.
Illustrative ubiquitous expression control sequences that can be used in the present disclosure include, but are not limited to, a cytomegalovirus (CMV) immediate early promoter, a viral simian virus 40 (SV40) promoter (e.g., early or late), a Moloney murine leukemia virus (MoMLV) LTR promoter, a Rous sarcoma virus (RSV) LTR, a herpes simplex virus (HSV) (thymidine kinase) promoter, H5, P7.5, and P11 promoters from vaccinia virus, an elongation factor 1-alpha (EF1a) promoter, early growth response 1 (EGR1), ferritin H (FerH), ferritin L (FerL), Glyceraldehyde 3-phosphate dehydrogenase (GAPDH), eukaryotic translation initiation factor 4A1 (EIF4A1), heat shock 70kDa protein 5 (HSPA5), heat shock protein 90kDa beta, member 1 (HSP90B1), heat shock protein 70kDa (HSP70), β-kinesin (β-KIN), the human ROSA 26 locus (Irions et al., Nature Biotechnology 25, 1477-1482 (2007)), a Ubiquitin C promoter (UBC), a phosphoglycerate kinase-1 (PGK) promoter, a cytomegalovirus enhancer/chicken β-actin (CAG) promoter, and a β-actin promoter.
Illustrative examples of inducible promoters/systems include, but are not limited to, steroid-inducible promoters such as promoters for genes encoding glucocorticoid or estrogen receptors (inducible by treatment with the corresponding hormone), metallothionein promoter (inducible by treatment with various heavy metals), MX-1 promoter (inducible by interferon), the “GeneSwitch” mifepristone-regulatable system (Sirin et al., 2003, Gene, 323: 67), the cumate inducible gene switch (WO 2002/088346), tetracycline-dependent regulatory systems, etc.
The vectors provided herein can be made using standard techniques of molecular biology. For example, the various elements of the vectors provided herein can be obtained using recombinant methods, such as by screening cDNA and genomic libraries from cells, or by deriving the polynucleotides from a vector known to include the same. The various elements of the vectors provided herein can also be produced synthetically, rather than cloned, based on the known sequences. The complete sequence can be assembled from overlapping oligonucleotides prepared by standard methods. See, e.g., Edge, Nature (1981) 292: 756; Nambiar et al., Science (1984) 223: 1299; and Jay et al., J. Biol. Chem. (1984) 259: 6311.
Thus, particular nucleotide sequences can be obtained from vectors harboring the desired sequences or synthesized completely, or in part, using various oligonucleotide synthesis techniques known in the art, such as site-directed mutagenesis and polymerase chain reaction (PCR) techniques where appropriate. One method of obtaining nucleotide sequences encoding the desired vector elements is by annealing complementary sets of overlapping synthetic oligonucleotides produced in a conventional, automated polynucleotide synthesizer, followed by ligation with an appropriate DNA ligase and amplification of the ligated nucleotide sequence via PCR. See, e.g., Jayaraman et al., Proc. Natl. Acad. Sci. USA (1991) 88: 4084-4088. Additionally, oligonucleotide-directed synthesis (Jones et al., Nature (1986) 54: 75-82) , oligonucleotide directed mutagenesis of preexisting nucleotide regions (Riechmann et al., Nature (1988) 332: 323-327 and Verhoeyen et al., Science (1988) 239: 1534-1536) , and enzymatic filling-in of gapped oligonucleotides using T4 DNA polymerase (Queen et al., Proc. Natl. Acad. Sci. USA (1989) 86: 10029-10033) can be used.
The RNAs (or cRNAzymes) provided herein can be generated by incubating a vector provided herein under conditions permissive of transcription of the RNAs encoded by the vector. For example, in some embodiments, RNAs (or cRNAzymes) provided herein can be synthesized by incubating a vector provided herein that comprises an RNA polymerase promoter upstream of its 5’ duplex forming region and/or expression sequence with a compatible RNA polymerase enzyme under conditions permissive of in vitro transcription. In some embodiments, the vector is transcribed inside of a cell by a bacteriophage RNA polymerase or in the nucleus of a cell by host RNA polymerase II.
In some embodiments, provided herein are methods of generating RNAs (or cRNAzymes) provided herein by performing in vitro transcription using a vector provided herein as a template (e.g., a vector provided herein with an RNA polymerase promoter positioned upstream of the 5’ homology region). In some embodiments, the resulting RNAs (or cRNAzymes) can be used to generate circular RNA.
The practice of the invention employs, unless otherwise indicated, conventional techniques in molecular biology, microbiology, genetic analysis, recombinant DNA, organic chemistry, biochemistry, PCR, oligonucleotide synthesis and modification, nucleic acid hybridization, and related fields within the skill of the art. These techniques are described in the references cited herein and are fully explained in the literature. See, e.g., Maniatis et al. (1982) MOLECULAR CLONING: A LABORATORY MANUAL, Cold Spring Harbor Laboratory Press; Sambrook et al. (1989), MOLECULAR CLONING: A LABORATORY MANUAL, Second Edition, Cold Spring Harbor Laboratory Press; Sambrook et al. (2001) MOLECULAR CLONING: A LABORATORY MANUAL, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons (1987 and annual updates); CURRENT PROTOCOLS IN IMMUNOLOGY, John Wiley & Sons (1987 and annual updates); Gait (ed.) (1984) OLIGONUCLEOTIDE SYNTHESIS: A PRACTICAL APPROACH, IRL Press; Eckstein (ed.) (1991) OLIGONUCLEOTIDES AND ANALOGUES: A PRACTICAL APPROACH, IRL Press; Birren et al. (eds.) (1999) GENOME ANALYSIS: A LABORATORY MANUAL, Cold Spring Harbor Laboratory Press; Borrebaeck (ed.) (1995); each of which is incorporated herein by reference in its entirety.
8.5 Circularization and circRNAs
Provided herein are also circRNAs prepared by the self-splicing of the RNAs (cRNAzymes) disclosed herein. In some embodiments, the circRNAs provided herein have higher functional stability than mRNA comprising the same expression sequence. In some embodiments, the circRNAs provided herein have higher functional stability than mRNA comprising the same expression sequence, 5moU modifications, an optimized UTR, a cap, and/or a polyA tail.
In some embodiments, the circRNAs provided herein have a functional half-life of at least 5 hours, 10 hours, 15 hours, 20 hours, 30 hours, 40 hours, 50 hours, 60 hours, 70 hours or 80 hours. In some embodiments, the circRNAs provided herein have a functional half-life of 5-80, 10-70, 15-60, and/or 20-50 hours. In some embodiments, the circRNAs provided herein have a functional half-life greater than (e.g., at least 1.5-fold greater than, at least 2-fold greater than) that of an equivalent linear RNA encoding the same protein. In some embodiments, functional half-life can be assessed through the detection of functional protein synthesis.
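As an illustration of how a functional half-life and a fold difference might be estimated from protein-expression readouts, the following is a minimal sketch. The text does not specify a decay model, so the sketch assumes first-order (exponential) decay of the functional readout and fits a straight line to the logarithm of the signal by ordinary least squares. The function names and the data values are hypothetical.

```python
import math


def functional_half_life(t_hours, signal):
    """Estimate half-life (hours) assuming first-order decay: ln(s) = ln(s0) - k * t."""
    if len(t_hours) != len(signal) or len(t_hours) < 2:
        raise ValueError("need at least two paired time points")
    if any(s <= 0 for s in signal):
        raise ValueError("signal values must be positive")
    logs = [math.log(s) for s in signal]
    n = len(t_hours)
    mean_t = sum(t_hours) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in t_hours)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(t_hours, logs))
    slope = sxy / sxx  # equals -k under the assumed model
    if slope >= 0:
        raise ValueError("signal does not decay over the measured window")
    return math.log(2) / -slope


def fold_greater(half_life_circ, half_life_linear):
    """Ratio of circRNA half-life to linear RNA half-life."""
    return half_life_circ / half_life_linear


if __name__ == "__main__":
    # Hypothetical luminescence readouts (arbitrary units) at 0, 20, 40 and 60 hours.
    hours = [0, 20, 40, 60]
    circ = functional_half_life(hours, [100.0, 70.7, 50.0, 35.4])
    linear = functional_half_life(hours, [100.0, 50.0, 25.0, 12.5])
    print(round(circ, 1), round(linear, 1), round(fold_greater(circ, linear), 2))
```

Real expression time courses often rise before they decay, so in practice the fit would be restricted to the post-peak portion of the curve; that choice is outside the scope of this sketch.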
In some embodiments, the circRNAs provided herein comprise one or more expression sequences and are configured for persistent expression in a cell of a subject in vivo. In some embodiments, the circRNA is configured such that expression of the one or more expression sequences in the cell at a later time point is equal to or higher than at an earlier time point. In such embodiments, the expression of the one or more expression sequences can be either maintained at a relatively stable level or can increase over time. The expression of the expression sequences can be relatively stable for an extended period of time. For instance, in some cases, the expression of the one or more expression sequences in the cell over a time period of at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days does not decrease by 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5%. In some cases, the expression of the one or more expression sequences in the cell is maintained at a level that does not vary by more than 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5% for at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days.
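The persistence criterion above (expression that does not decrease by a stated percentage over a stated number of days) can be expressed as a simple check on a time series. The following is a minimal sketch under stated assumptions: the first measurement is taken as the baseline, the decrease is measured relative to that baseline, and the function name, parameters and data are hypothetical.

```python
def is_persistent(days, expression, min_days=7, max_drop_pct=50.0):
    """True if the series spans at least min_days and never drops by max_drop_pct or more.

    Assumption: the first measurement is the baseline against which drops are measured.
    """
    if len(days) != len(expression) or len(days) < 2:
        raise ValueError("need at least two paired measurements")
    baseline = expression[0]
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    # The observation window must be long enough to support the claim.
    if days[-1] - days[0] < min_days:
        return False
    # Every time point must stay above the allowed drop from baseline.
    for value in expression:
        drop_pct = 100.0 * (baseline - value) / baseline
        if drop_pct >= max_drop_pct:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical readouts (arbitrary units) on days 0, 3, 7 and 14.
    print(is_persistent([0, 3, 7, 14], [100, 95, 90, 80], min_days=14, max_drop_pct=25))
    print(is_persistent([0, 7], [100, 40]))
```

A symmetric variant ("does not vary by more than") would compare the absolute deviation from baseline instead of the signed drop.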
In some embodiments, the circRNAs provided herein have a higher magnitude of expression than equivalent linear mRNA, e.g., a higher magnitude of expression 24 hours after administration of RNA to cells. In some embodiments, the circRNAs provided herein have a higher magnitude of expression than mRNA comprising the same expression sequence, 5moU modifications, an optimized UTR, a cap, and/or a polyA tail. In some embodiments, the circRNAs provided herein have higher stability than an equivalent linear mRNA. In some embodiments, this can be shown by measuring receptor presence and density in vitro or in vivo post electroporation, with time points measured over 1 week. In some embodiments, this can be shown by measuring RNA presence via qPCR or ISH.
In some embodiments, the circRNAs disclosed herein can be of any length or size. In some embodiments the circRNA is between 300 and 10000, 400 and 9000, 500 and 8000, 600 and 7000, 700 and 6000, 800 and 5000, 900 and 5000, 1000 and 5000, 1100 and 5000, 1200 and 5000, 1300 and 5000, 1400 and 5000, and/or 1500 and 5000 nucleotides in length.
In some embodiments, the circRNAs disclosed herein can be at least 300 nt, 400 nt, 500 nt, 600 nt, 700 nt, 800 nt, 900 nt, 1000 nt, 1100 nt, 1200 nt, 1300 nt, 1400 nt, 1500 nt, 2000 nt, 2500 nt, 3000 nt, 3500 nt, 4000 nt, 4500 nt, or 5000 nt in length. In some embodiments, the circRNA is no more than 3000 nt, 3500 nt, 4000 nt, 4500 nt, 5000 nt, 6000 nt, 7000 nt, 8000 nt, 9000 nt, or 10000 nt in length.
In some embodiments, circRNAs disclosed herein can be about 300 nt, 400 nt, 500 nt, 600 nt, 700 nt, 800 nt, 900 nt, 1000 nt, 1100 nt, 1200 nt, 1300 nt, 1400 nt, 1500 nt, 2000 nt, 2500 nt, 3000 nt, 3500 nt, 4000 nt, 4500 nt, 5000 nt, 6000 nt, 7000 nt, 8000 nt, 9000 nt, or 10000 nt in length.
In some embodiments, circRNAs disclosed herein can be at least 500 nucleotides in length, at least 1000 nucleotides in length, or at least 1500 nucleotides in length.
In some embodiments, the circRNAs are scarless. In some embodiments, the circRNAs are near-scarless. As understood in the art, near-scarless circRNAs, and especially scarless circRNAs, are less immunogenic than their counterparts that have a larger scar (i.e., extra sequence besides the target sequence), which makes them better suited for therapeutic uses.
In some embodiments, the circRNAs provided herein have modified RNA nucleotides and/or modified nucleosides. In some embodiments, the modified nucleoside is m5C (5-methylcytidine). In another embodiment, the modified nucleoside is m5U (5-methyluridine). In another embodiment, the modified nucleoside is m6A (N6-methyladenosine). In another embodiment, the modified nucleoside is s2U (2-thiouridine). In another embodiment, the modified nucleoside is Ψ (pseudouridine). In another embodiment, the modified nucleoside is Um (2'-O-methyluridine). In other embodiments, the modified nucleoside is m1A (1-methyladenosine); m2A (2-methyladenosine); Am (2'-O-methyladenosine); ms2m6A (2-methylthio-N6-methyladenosine); i6A (N6-isopentenyladenosine); ms2i6A (2-methylthio-N6-isopentenyladenosine); io6A (N6-(cis-hydroxyisopentenyl)adenosine); ms2io6A (2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine); g6A (N6-glycinylcarbamoyladenosine); t6A (N6-threonylcarbamoyladenosine); ms2t6A (2-methylthio-N6-threonylcarbamoyladenosine); m6t6A (N6-methyl-N6-threonylcarbamoyladenosine); hn6A (N6-hydroxynorvalylcarbamoyladenosine); ms2hn6A (2-methylthio-N6-hydroxynorvalylcarbamoyladenosine); Ar(p) (2'-O-ribosyladenosine (phosphate)); I (inosine); m1I (1-methylinosine); m1Im (1,2'-O-dimethylinosine); m3C (3-methylcytidine); Cm (2'-O-methylcytidine); s2C (2-thiocytidine); ac4C (N4-acetylcytidine); f5C (5-formylcytidine); m5Cm (5,2'-O-dimethylcytidine); ac4Cm (N4-acetyl-2'-O-methylcytidine); k2C (lysidine); m1G (1-methylguanosine); m2G (N2-methylguanosine); m7G (7-methylguanosine); Gm (2'-O-methylguanosine); m22G (N2,N2-dimethylguanosine); m2Gm (N2,2'-O-dimethylguanosine); m22Gm (N2,N2,2'-O-trimethylguanosine); Gr(p) (2'-O-ribosylguanosine (phosphate)); yW (wybutosine); o2yW (peroxywybutosine); OHyW (hydroxywybutosine); OHyW* (undermodified hydroxywybutosine); imG (wyosine); mimG (methylwyosine); Q (queuosine); oQ (epoxyqueuosine); galQ (galactosyl-queuosine); manQ (mannosyl-queuosine); preQ0 (7-cyano-7-deazaguanosine); preQ1 (7-aminomethyl-7-deazaguanosine); G+ (archaeosine); D (dihydrouridine); m5Um (5,2'-O-dimethyluridine); s4U (4-thiouridine); m5s2U (5-methyl-2-thiouridine); s2Um (2-thio-2'-O-methyluridine); acp3U (3-(3-amino-3-carboxypropyl)uridine); ho5U (5-hydroxyuridine); mo5U (5-methoxyuridine); cmo5U (uridine 5-oxyacetic acid); mcmo5U (uridine 5-oxyacetic acid methyl ester); chm5U (5-(carboxyhydroxymethyl)uridine); mchm5U (5-(carboxyhydroxymethyl)uridine methyl ester); mcm5U (5-methoxycarbonylmethyluridine); mcm5Um (5-methoxycarbonylmethyl-2'-O-methyluridine); mcm5s2U (5-methoxycarbonylmethyl-2-thiouridine); nm5s2U (5-aminomethyl-2-thiouridine); mnm5U (5-methylaminomethyluridine); mnm5s2U (5-methylaminomethyl-2-thiouridine); mnm5se2U (5-methylaminomethyl-2-selenouridine); ncm5U (5-carbamoylmethyluridine); ncm5Um (5-carbamoylmethyl-2'-O-methyluridine); cmnm5U (5-carboxymethylaminomethyluridine); cmnm5Um (5-carboxymethylaminomethyl-2'-O-methyluridine); cmnm5s2U (5-carboxymethylaminomethyl-2-thiouridine); m62A (N6,N6-dimethyladenosine); Im (2'-O-methylinosine); m4C (N4-methylcytidine); m4Cm (N4,2'-O-dimethylcytidine); hm5C (5-hydroxymethylcytidine); m3U (3-methyluridine); cm5U (5-carboxymethyluridine); m6Am (N6,2'-O-dimethyladenosine); m62Am (N6,N6,2'-O-trimethyladenosine); m2,7G (N2,7-dimethylguanosine); m2,2,7G (N2,N2,7-trimethylguanosine); m3Um (3,2'-O-dimethyluridine); m5D (5-methyldihydrouridine); f5Cm (5-formyl-2'-O-methylcytidine); m1Gm (1,2'-O-dimethylguanosine); m1Am (1,2'-O-dimethyladenosine); τm5U (5-taurinomethyluridine); τm5s2U (5-taurinomethyl-2-thiouridine); imG-14 (4-demethylwyosine); imG2 (isowyosine); or ac6A (N6-acetyladenosine).
In some embodiments, the modified nucleoside can include a compound selected from the group of: pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine, dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-dihydropseudouridine, 2-methoxyuridine, 2-methoxy-4-thio-uridine, 4-methoxy-pseudouridine, 4-methoxy-2-thio-pseudouridine, 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, 4-methoxy-1-methyl-pseudoisocytidine, 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine, N6,N6-dimethyladenosine, 7-methyladenine, 2-methylthio-adenine, 2-methoxy-adenine, inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methylinosine, 6-methoxy-guanosine, 1-methylguanosine, N2-methylguanosine, N2,N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, and N2,N2-dimethyl-6-thio-guanosine. In another embodiment, the modifications are independently selected from the group consisting of 5-methylcytosine, pseudouridine and 1-methylpseudouridine.
8.5.1 Circularization
The circRNAs provided herein are produced by self-splicing of the RNAs (or cRNAzymes) provided herein, which have group II intron activity. Thus, also provided herein are methods of making circRNAs comprising incubating the RNAs (or cRNAzymes) provided herein under conditions suitable for circularization (self-splicing). In some embodiments, RNAs (or cRNAzymes) provided herein are produced by transcription using a vector provided herein as a template. In some embodiments, RNAs (or cRNAzymes) provided herein are produced by run-off transcription. In some embodiments, RNAs (or cRNAzymes) provided herein are produced by in vitro transcription.
Self-splicing of group II introns requires high-salt conditions and does not require the addition of GTP. As such, in some embodiments, the buffer used in the self-splicing reaction comprises 10 mM to 100 mM, such as 10 mM, 20 mM, 30 mM, 40 mM, 50 mM, 60 mM, 70 mM, 80 mM, 90 mM, or 100 mM, divalent magnesium ions, for example supplied as MgCl2. The self-splicing buffer can also comprise 10 mM to 100 mM, such as 10 mM, 20 mM, 30 mM, 40 mM, 50 mM, 60 mM, 70 mM, 80 mM, 90 mM, or 100 mM, NaCl.
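For illustration only, the concentration ranges above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The following minimal sketch assumes hypothetical stock concentrations (1 M MgCl2 and 5 M NaCl) and a hypothetical 50 µL reaction volume, none of which are specified in this disclosure; the function names are likewise illustrative.

```python
def stock_volume_ul(final_mM: float, stock_mM: float, total_ul: float) -> float:
    """Volume of stock needed so that the final concentration is final_mM (C1*V1 = C2*V2)."""
    if not 0 < final_mM <= stock_mM:
        raise ValueError("final concentration must be positive and not exceed the stock")
    return final_mM * total_ul / stock_mM


def splicing_buffer_recipe(mg_mM: float, nacl_mM: float, total_ul: float = 50.0,
                           mg_stock_mM: float = 1000.0,      # assumed 1 M MgCl2 stock
                           nacl_stock_mM: float = 5000.0):   # assumed 5 M NaCl stock
    """Return stock volumes (in uL) for a self-splicing reaction of total_ul."""
    # The text gives 10 mM to 100 mM for both Mg2+ and NaCl.
    for name, value in (("Mg2+", mg_mM), ("NaCl", nacl_mM)):
        if not 10 <= value <= 100:
            raise ValueError(f"{name} concentration outside the 10-100 mM range")
    mg_ul = stock_volume_ul(mg_mM, mg_stock_mM, total_ul)
    nacl_ul = stock_volume_ul(nacl_mM, nacl_stock_mM, total_ul)
    # Whatever is left is available for RNA, buffering agent, and water.
    return {"MgCl2_ul": mg_ul, "NaCl_ul": nacl_ul,
            "remaining_ul": total_ul - mg_ul - nacl_ul}


recipe = splicing_buffer_recipe(mg_mM=50, nacl_mM=50)
print(recipe)
```

The range check mirrors the 10 mM to 100 mM window stated above; other components of the reaction (buffering agent, RNA, water) are lumped into the remaining volume.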
In some embodiments, the self-splicing reaction is performed in vitro for about 5 min to about 1 h, such as about 5 min, about 10 min, about 15 min, about 20 min, about 25 min, about 30 min, about 35 min, about 40 min, about 45 min, about 50 min, about 55 min, and about 1 h.
In some embodiments, the self-splicing reaction is performed at a temperature between 20 and 60 ℃, between 20 and 50 ℃, between 20 and 40 ℃, between 20 and 30 ℃, between 30 and 40 ℃, between 40 and 50 ℃, or between 50 and 60 ℃.
In some embodiments, the precursor RNAs disclosed herein are capable of achieving a circularization rate of at least 30%, such as a circularization rate of at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, and at least 95%.
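As a worked illustration, a circularization rate can be estimated from quantified signals (for example, band or peak intensities). The sketch below assumes one possible definition, namely the circular product as a fraction of the total precursor-derived signal (circular product, unspliced precursor, and other linear species); this definition, the function names, and the example numbers are assumptions made for illustration and are not taken from this disclosure.

```python
# Thresholds recited in the text: at least 30%, 50%, 60%, 70%, 80%, 90%, 95%.
THRESHOLDS = (0.30, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95)


def circularization_rate(circular: float, precursor: float,
                         other_linear: float = 0.0) -> float:
    """Fraction of the total signal that is circular product (assumed definition)."""
    if min(circular, precursor, other_linear) < 0:
        raise ValueError("signals must be non-negative")
    total = circular + precursor + other_linear
    if total <= 0:
        raise ValueError("total signal must be positive")
    return circular / total


def thresholds_met(rate: float) -> list[float]:
    """Return every recited threshold that the measured rate reaches."""
    return [t for t in THRESHOLDS if rate >= t]


# Hypothetical densitometry readings in arbitrary units.
rate = circularization_rate(circular=720, precursor=200, other_linear=80)
print(rate, thresholds_met(rate))
```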
8.5.2 Purification
Provided herein are circRNAs prepared by self-splicing the RNAs (or cRNAzymes) disclosed herein. In some embodiments, the circRNAs disclosed herein are purified. Purification includes, but is not limited to, the removal of non-circularized linear RNAs, dsRNAs, and other unwanted components. In some embodiments, the circRNAs disclosed herein are purified before being transfected into cells. The phosphate groups at the ends of a linear RNA, as well as some dsRNAs, might activate the RIG-I signaling pathway, and the immune response resulting from RIG-I signaling can lead to the degradation of exogenous RNAs, thus affecting the function of circular RNAs.
For illustrative purposes, the methods of purification can include any of the following: enzymatic treatment; chromatography, including but not limited to affinity column chromatography, reversed-phase silica gel column liquid chromatography, gel filtration chromatography, and gel exclusion liquid chromatography; and electrophoresis, including but not limited to gel electrophoresis such as agarose gel electrophoresis, and capillary electrophoresis; and any combination thereof. Methods for removing linear RNAs, for example, include enzymatic treatment, such as treatment with RNase R; and chromatography, such as high performance liquid chromatography (HPLC) . Methods for removing terminal phosphate groups, for example, include treatment with alkaline phosphatases, such as calf intestinal alkaline phosphatase (CIP) .
In some embodiments, purification comprises one or more of the following steps: phosphatase treatment, HPLC size exclusion purification, and RNase R digestion. In some embodiments, purification comprises the following steps in order: RNase R digestion, phosphatase treatment, and HPLC size exclusion purification. In some embodiments, purification comprises reverse phase HPLC. In some embodiments, a purified composition contains less double stranded RNA, DNA splints, triphosphorylated RNA, phosphatase proteins, protein ligases, capping enzymes and/or nicked RNA than unpurified RNA.
In some embodiments, the RNAs (or cRNAzymes) provided herein comprise a purification tag that is removed during self-splicing, which can be used for negative selection of the circRNAs. Specifically, a sample containing the circRNAs, such as the product of a splicing reaction starting from the tagged linear precursors, can be mixed with a probe that is immobilized on a solid surface. The tag-containing precursors and any other tag-containing impurities bind to the probe and are removed from the solution, resulting in a circRNA solution substantially free from the precursor, the intron, and any other tag-containing impurities. In some embodiments, the purification tag is a polynucleotide of 15-40 nt. A purification matrix includes, for example, magnetic resin or beads, silicone resin, Sephadex resin, affinity resin, nanoparticles, and nanomaterial surfaces or coated surfaces.
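The logic of tag-based negative selection can be sketched as a simple filter: any species that still carries the tag is captured by the immobilized probe, and the tag-free circRNA stays in the flow-through. In the sketch below, hybridization is simplified to an exact substring match against the reverse complement of the probe, and the tag and all sequences are hypothetical placeholders chosen for illustration rather than sequences from this disclosure.

```python
# Watson-Crick complement table for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def negative_selection(sample: dict[str, str], probe: str):
    """Split a sample into (flow_through, bound) species.

    A species is treated as bound if it contains the sequence complementary
    to the immobilized probe, i.e. the purification tag.
    """
    tag = reverse_complement(probe)
    if not 15 <= len(tag) <= 40:
        raise ValueError("tag length outside the 15-40 nt range")
    bound = {name: seq for name, seq in sample.items() if tag in seq}
    flow_through = {name: seq for name, seq in sample.items() if tag not in seq}
    return flow_through, bound


TAG = "GGAUCCAAGCUUGAGCUCAA"        # hypothetical 20-nt purification tag
PROBE = reverse_complement(TAG)     # immobilized capture probe
SAMPLE = {
    "precursor": "GGGAGA" + TAG + "AUGGCUAGCAAAGGA",   # unspliced, still tagged
    "excised_intron": TAG + "GUGCGCCUAUAG",             # tag leaves with the intron
    "circRNA": "AUGGCUAGCAAAGGAGAAGAA",                 # tag removed during splicing
}

flow_through, bound = negative_selection(SAMPLE, PROBE)
print(sorted(flow_through), sorted(bound))
```

Because the tag is removed during self-splicing, only the circular product lacks a binding site for the probe, which is why the selection is negative with respect to the circRNA.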
In some embodiments, the RNAs (or cRNAzymes) provided herein comprise a purification tag that is linked to the 5'-end of the 3' intron fragment. In some embodiments, the RNAs (or cRNAzymes) provided herein comprise a purification tag that is linked to the 3'-end of the 5' intron fragment. In some embodiments, the RNAs (or cRNAzymes) provided herein comprise two purification tags, one linked to the 5'-end of the 3' intron fragment, and the other linked to the 3'-end of the 5' intron fragment.
8.6 Methods of use
The circRNAs, cRNAzymes, and vectors encoding the cRNAzymes disclosed herein can be used for a variety of purposes, depending on the function of the resulting target sequences. For example, the circRNAs, cRNAzymes, and vectors encoding the cRNAzymes provided herein can be used for protein expression if the resulting target sequence comprises or consists of a protein coding sequence. The circRNAs, cRNAzymes, and vectors encoding the cRNAzymes provided herein can also be used for various functions such as regulating miRNA activity, neutralizing binding of RNA-binding proteins, and expressing aptamers.
In some embodiments, the methods for protein expression comprise translation of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% of the total length of the circRNAs into polypeptides. In some embodiments, the methods for protein expression comprise translation of the circRNAs into polypeptides of at least 5 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, at least 50 amino acids, at least 100 amino acids, at least 150 amino acids, at least 200 amino acids, at least 250 amino acids, at least 300 amino acids, at least 400 amino acids, at least 500 amino acids, at least 600 amino acids, at least 700 amino acids, at least 800 amino acids, at least 900 amino acids, or at least 1000 amino acids. In some embodiments, the methods for protein expression comprise translation of the circRNAs into polypeptides of about 5 amino acids, about 10 amino acids, about 15 amino acids, about 20 amino acids, about 50 amino acids, about 100 amino acids, about 150 amino acids, about 200 amino acids, about 250 amino acids, about 300 amino acids, about 400 amino acids, about 500 amino acids, about 600 amino acids, about 700 amino acids, about 800 amino acids, about 900 amino acids, or about 1000 amino acids.
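The relationship between polypeptide length and the translated fraction of a circRNA is simple arithmetic: each amino acid corresponds to three nucleotides. The sketch below assumes that the stop codon is not counted as translated sequence and uses hypothetical example lengths; the function name is illustrative.

```python
def translated_fraction(polypeptide_aa: int, circrna_nt: int) -> float:
    """Fraction of the circRNA length that is translated into polypeptide.

    Assumes 3 nt per amino acid and does not count the stop codon.
    """
    if polypeptide_aa <= 0 or circrna_nt <= 0:
        raise ValueError("lengths must be positive")
    coding_nt = 3 * polypeptide_aa
    if coding_nt > circrna_nt:
        raise ValueError("coding region is longer than the circRNA")
    return coding_nt / circrna_nt


# Hypothetical example: a 300-amino-acid polypeptide from a 1500-nt circRNA.
fraction = translated_fraction(polypeptide_aa=300, circrna_nt=1500)
print(f"{fraction:.0%} of the circRNA length is translated")
```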
In some embodiments, the translation of at least a region of the circRNAs disclosed herein takes place in vitro, such as in rabbit reticulocyte lysate. In some embodiments, the translation of at least a region of the circRNAs disclosed herein takes place in vivo, for instance, after transfection of a eukaryotic cell, or transformation of a prokaryotic cell such as a bacterial cell.
In some embodiments, the circRNAs, cRNAzymes, and vectors encoding the cRNAzymes disclosed herein can be used for expressing a protein in a cell. Accordingly, in some embodiments, provided herein are methods of protein expression comprising (a) subjecting the RNAs (or cRNAzymes) disclosed herein to a self-splicing circularization reaction to form a circRNA, wherein the circRNA comprises a protein-encoding target sequence, and (b) transfecting the cell with the circRNA. Additionally, in some embodiments, provided herein are methods of protein expression in a cell comprising transfecting the cell with a vector that encodes the RNAs (or cRNAzymes) disclosed herein, which can self-splice in the cell to form a circRNA comprising a protein-encoding target sequence. Furthermore, in some embodiments, the cRNAzymes can be prepared in vitro by, e.g., in vitro transcription and transfected into the cell. Upon self-splicing in the cell, the cRNAzymes form circRNAs that encode the protein of interest, and the protein is then expressed in vivo.
The cell can be a eukaryotic cell. In some embodiments, the cell is a hepatocyte, epithelial cell, hematopoietic cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron), photoreceptor cell (e.g., rod and cone), retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver stellate cell, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC, immortalized cell line, primary cell, yeast cell, Saccharomyces cerevisiae, Pichia pastoris, bacterial cell, Escherichia coli, insect cell, Spodoptera frugiperda Sf9, Mimic Sf9, Sf21, or Drosophila S2.
The circRNAs disclosed herein can be delivered into cells or animals using any of a variety of delivery systems. For example, the delivery system is selected from one or more of a group of: liposomes, polyethyleneimine (PEI) , metal-organic frameworks (MOFs) , lipid nanoparticles (LNPs) , polycations, blood glycoproteins, red blood cell transport vehicles, Au nanoparticle (AuNP) vehicles, magnetic nanoparticle vehicles, carbon nanotubes, graphene molecular vehicles, quantum dot material vehicles, upconversion nanoparticles, layered double hydroxide material vehicles, silica nanoparticles, and calcium phosphate. In some embodiments, the circRNAs disclosed herein can be transfected into a cell using, for example, lipofection or electroporation.
In some embodiments, the present disclosure provides methods of in vivo expression of a protein of interest in a subject, comprising: administering a circRNA to a cell of the subject, wherein the circRNA comprises one or more expression sequences encoding the protein of interest; and expressing the one or more expression sequences from the circRNA in the cell. In some embodiments, the circRNA is configured such that expression of the one or more expression sequences in the cell at a later time point is equal to or higher than at an earlier time point. Additionally, in some embodiments, provided herein are methods of in vivo expression of a protein of interest in a subject, comprising: administering to the subject a vector that encodes the cRNAzyme, which can self-splice to form the circRNA that encodes the protein. Once the vector is administered, it can be transcribed in vivo, and the transcribed cRNAzyme can self-splice to form the circRNA that encodes the protein of interest, allowing the in vivo expression of the protein. Furthermore, in some embodiments, the cRNAzymes can be prepared in vitro and administered to a subject. Upon in vivo self-splicing, the cRNAzymes form circRNAs that encode the protein of interest, and the protein is then expressed in vivo. Methods of administering the circRNAs, cRNAzymes, or vectors disclosed herein to a subject are known to a person of ordinary skill in the art. Some exemplary methods are provided herein.
In some embodiments, the circRNA is configured such that expression of the one or more expression sequences in the cell over a time period of at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days does not decrease by greater than about 40%. In some embodiments, the circRNA is configured such that expression of the one or more expression sequences in the cell is maintained at a level that does not vary by more than about 40% for at least 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 23 or more days. In some embodiments, the administration of the circRNA is conducted using any delivery method described herein. In some embodiments, the circRNA is administered to the subject via intravenous injection. In some embodiments, the administration of the circRNA includes, but is not limited to, prenatal administration, neonatal administration, postnatal administration, oral administration, administration by injection (e.g., intravenous, intraarterial, intraperitoneal, intradermal, subcutaneous and intramuscular), ophthalmic administration, and intranasal administration.
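By way of illustration, the "does not decrease by greater than about 40%" criterion can be checked against a time course of expression measurements. The sketch below assumes that the decrease is measured relative to the first measurement in the series; that baseline choice, the function names, and the example readings are assumptions made for illustration only.

```python
def max_relative_decrease(levels: list[float]) -> float:
    """Largest fractional drop relative to the first measurement (assumed baseline)."""
    baseline = levels[0]
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return max(0.0, max((baseline - x) / baseline for x in levels))


def is_sustained(days: list[float], levels: list[float],
                 min_days: float = 7, max_drop: float = 0.40) -> bool:
    """True if the series spans at least min_days and never drops by more than max_drop."""
    if not days or len(days) != len(levels):
        raise ValueError("days and levels must be non-empty and of equal length")
    if days[-1] - days[0] < min_days:
        return False  # the observation window is too short to support the claim
    return max_relative_decrease(levels) <= max_drop


# Hypothetical reporter readings in arbitrary units.
stable = is_sustained(days=[0, 3, 7, 14], levels=[100, 120, 90, 65], min_days=14)
fading = is_sustained(days=[0, 7, 14], levels=[100, 80, 55], min_days=14)
print(stable, fading)
```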
In some embodiments, the methods for protein expression comprise modification, folding, or other post-translation modification of the translation product. In some embodiments, the methods for protein expression comprise post-translation modification in vivo, e.g., via cellular machinery.
8.6.1 Pharmaceutical Compositions
In some embodiments, the circRNAs disclosed herein that are produced by the RNAs (or cRNAzymes) provided herein can be used as a therapeutic in the treatment of a disease or a condition.
As used herein, the term “treat” refers to executing a protocol or plan, which can include administering one or more drugs or active agents to a patient, in an effort to alleviate signs or symptoms of the disease or the recurrence of the disease. Desirable effects of treatment include decreasing the rate of disease progression, ameliorating or palliating the disease state, and remission, increased survival, improved quality of life or improved prognosis. Alleviation or prevention can occur prior to signs or symptoms of the disease or condition appearing, as well as after their appearance. As used herein and understood in the art, a “treatment” does not require complete alleviation of signs or symptoms, and does not require a cure.
As used herein, the term “therapeutically beneficial” or “therapeutically effective” when used in connection with a therapeutic refers to the property of the therapeutic that promotes or enhances the well-being of the subject. This includes, but is not limited to, a reduction in the frequency, severity, or rate of progression of the signs or symptoms of a disease. For example, treatment of cancer may involve a reduction in the size of a tumor, a reduction in the invasiveness of a tumor, a reduction in the growth rate of the cancer, or a reduction in the rate of metastasis or recurrence. Treatment of cancer can also refer to prolonging survival of a subject with cancer.
As used herein, the term “pharmaceutically or pharmacologically acceptable” refers to molecular entities and compositions that do not produce an adverse, allergic, or other untoward reaction when administered to an animal, such as a human, as appropriate. For animal (e.g., human) administration, it will be understood that preparations should meet sterility, pyrogenicity, general safety, and purity standards as required, e.g., by the FDA Office of Biological Standards.
As used herein, the term “pharmaceutically acceptable carrier” includes any and all aqueous biocompatible solvents (e.g., saline solutions, phosphate buffered saline, parenteral vehicles, such as sodium chloride, Ringer's dextrose, etc. ) , antioxidants, preservatives (e.g., antibacterial or antifungal agents, anti-oxidants, chelating agents, and inert gases) , isotonic agents, such like materials and combinations thereof, as would be known to one of ordinary skill in the art. The pH and exact concentration of the various components in a pharmaceutical composition are adjusted according to well-known parameters.
In some embodiments, provided herein are compositions (e.g., pharmaceutical compositions) comprising a therapeutic agent provided herein. In some embodiments, the therapeutic agent is a circRNA provided herein. In some embodiments, the therapeutic agent is a vector provided herein, which encodes the RNA (or cRNAzyme) provided herein. In some embodiments, the therapeutic agent is a cell comprising a circRNA or vector provided herein. In some embodiments, the composition further comprises a pharmaceutically acceptable carrier. In some embodiments, the compositions provided herein comprise a therapeutic agent provided herein in combination with other pharmaceutically active agents or drugs. In some embodiments, the pharmaceutical composition comprises a cell provided herein or populations thereof.
With respect to pharmaceutical compositions, the pharmaceutically acceptable carrier can be any of those conventionally used and is limited only by chemico-physical considerations, such as solubility and lack of reactivity with the active agent(s), and by the route of administration. The pharmaceutically acceptable carriers described herein, for example, vehicles, adjuvants, excipients, and diluents, are well-known to those skilled in the art and are readily available to the public. It is preferred that the pharmaceutically acceptable carrier be one which is chemically inert to the therapeutic agent(s) and one which has no detrimental side effects or toxicity under the conditions of use.
The choice of carrier will be determined in part by the particular therapeutic agent, as well as by the particular method used to administer the therapeutic agent. Accordingly, there are a variety of suitable formulations of the pharmaceutical compositions provided herein.
In some embodiments, the pharmaceutical composition comprises a preservative. In some embodiments, suitable preservatives may include, for example, methylparaben, propylparaben, sodium benzoate, and benzalkonium chloride. Optionally, a mixture of two or more preservatives may be used. The preservative or mixtures thereof are typically present in an amount of about 0.0001% to about 2% by weight of the total composition.
In some embodiments, the pharmaceutical composition comprises a buffering agent. In some embodiments, suitable buffering agents may include, for example, citric acid, sodium citrate, phosphoric acid, potassium phosphate, and various other acids and salts. A mixture of two or more buffering agents optionally may be used. The buffering agent or mixtures thereof are typically present in an amount of about 0.001% to about 4% by weight of the total composition.
In some embodiments, the concentration of therapeutic agent in the pharmaceutical composition can vary, e.g., less than about 1%, or at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, or about 50% or more by weight, and can be selected primarily by fluid volumes and viscosities, in accordance with the particular mode of administration selected.
The following formulations for oral, aerosol, parenteral (e.g., subcutaneous, intravenous, intraarterial, intramuscular, intradermal, intraperitoneal, and intrathecal) , and topical administration are merely exemplary and are in no way limiting. More than one route can be used to administer the therapeutic agents provided herein, and in some instances, a particular route can provide a more immediate and more effective response than another route.
Formulations suitable for oral administration can comprise or consist of (a) liquid solutions, such as an effective amount of the therapeutic agent dissolved in diluents, such as water, saline, or orange juice; (b) capsules, sachets, tablets, lozenges, and troches, each containing a predetermined amount of the active ingredient, as solids or granules; (c) powders; (d) suspensions in an appropriate liquid; and (e) suitable emulsions. Liquid formulations may include diluents, such as water and alcohols, for example, ethanol, benzyl alcohol and the polyethylene alcohols, either with or without the addition of a pharmaceutically acceptable surfactant. Capsule forms can be of the ordinary hard or soft shelled gelatin type containing, for example, surfactants, lubricants, and inert fillers, such as lactose, sucrose, calcium phosphate, and corn starch. Tablet forms can include one or more of lactose, sucrose, mannitol, corn starch, potato starch, alginic acid, microcrystalline cellulose, acacia, gelatin, guar gum, colloidal silicon dioxide, croscarmellose sodium, talc, magnesium stearate, calcium stearate, zinc stearate, stearic acid, and other excipients, colorants, diluents, buffering agents, disintegrating agents, moistening agents, preservatives, flavoring agents, and other pharmacologically compatible excipients. Lozenge forms can comprise the therapeutic agent with a flavorant, usually sucrose, acacia or tragacanth. Pastilles can comprise the therapeutic agent with an inert base, such as gelatin and glycerin, or sucrose and acacia, as can emulsions, gels, and the like containing, in addition to the therapeutic agent, such excipients as are known in the art.
Formulations suitable for parenteral administration include aqueous and nonaqueous isotonic sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic with the blood of the intended recipient, and aqueous and nonaqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. In some embodiments, the therapeutic agents provided herein can be administered in a physiologically acceptable diluent in a pharmaceutical carrier, such as a sterile liquid or mixture of liquids including water, saline, aqueous dextrose and related sugar solutions, an alcohol such as ethanol or hexadecyl alcohol, a glycol such as propylene glycol or polyethylene glycol, dimethylsulfoxide, glycerol, ketals such as 2,2-dimethyl-1,3-dioxolane-4-methanol, ethers, poly(ethyleneglycol) 400, oils, fatty acids, fatty acid esters or glycerides, or acetylated fatty acid glycerides with or without the addition of a pharmaceutically acceptable surfactant such as a soap or a detergent, suspending agent such as pectin, carbomers, methylcellulose, hydroxypropylmethylcellulose, or carboxymethylcellulose, or emulsifying agents and other pharmaceutical adjuvants.
Oils, which can be used in parenteral formulations in some embodiments, include petroleum, animal oils, vegetable oils, or synthetic oils. Specific examples of oils include peanut, soybean, sesame, cottonseed, corn, olive, petrolatum, and mineral oil. Suitable fatty acids for use in parenteral formulations include oleic acid, stearic acid, and isostearic acid. Ethyl oleate and isopropyl myristate are examples of suitable fatty acid esters.
Suitable soaps for use in some embodiments of parenteral formulations include fatty alkali metal, ammonium, and triethanolamine salts, and suitable detergents include (a) cationic detergents such as, for example, dimethyl dialkyl ammonium halides and alkyl pyridinium halides, (b) anionic detergents such as, for example, alkyl, aryl, and olefin sulfonates, alkyl, olefin, ether, and monoglyceride sulfates, and sulfosuccinates, (c) nonionic detergents such as, for example, fatty amine oxides, fatty acid alkanolamides, and polyoxyethylenepolypropylene copolymers, (d) amphoteric detergents such as, for example, alkyl-β-aminopropionates and 2-alkyl-imidazoline quaternary ammonium salts, and (e) mixtures thereof.
In some embodiments, the parenteral formulations will contain, for example, from about 0.5% to about 25% by weight of the therapeutic agent in solution. Preservatives and buffers may be used. In order to minimize or eliminate irritation at the site of injection, such compositions may contain one or more nonionic surfactants having, for example, a hydrophile-lipophile balance (HLB) of from about 12 to about 17. The quantity of surfactant in such formulations will typically range, for example, from about 5% to about 15% by weight. Suitable surfactants include polyethylene glycol, sorbitan fatty acid esters such as sorbitan monooleate, and high molecular weight adducts of ethylene oxide with a hydrophobic base formed by the condensation of propylene oxide with propylene glycol. The parenteral formulations can be presented in unit-dose or multi-dose sealed containers, such as ampoules or vials, and can be stored in a freeze-dried (lyophilized) condition requiring only the addition of a sterile liquid excipient, for example, water for injections, immediately prior to use. Extemporaneous injection solutions and suspensions can be prepared from sterile powders, granules, and tablets of the kind previously described.
In some embodiments, injectable formulations are provided herein. The requirements for effective pharmaceutical carriers for injectable compositions are well-known to those of ordinary skill in the art (see, e.g., PHARMACEUTICS AND PHARMACY PRACTICE, J. B. Lippincott Company, Philadelphia, PA, Banker and Chalmers, eds., pages 238-250 (1982), and ASHP Handbook on Injectable Drugs, Trissel, 4th ed., pages 622-630 (1986)).
In some embodiments, topical formulations are provided herein. Topical formulations, including those that are useful for transdermal drug release, are suitable in the context of certain embodiments provided herein for application to skin. In some embodiments, the therapeutic agent alone or in combination with other suitable components, can be made into aerosol formulations to be administered via inhalation. These aerosol formulations can be placed into pressurized acceptable propellants, such as dichlorodifluoromethane, propane, nitrogen, and the like. They also may be formulated as pharmaceuticals for non-pressured preparations, such as in a nebulizer or an atomizer. Such spray formulations also can be used to spray mucosa.
In some embodiments, the therapeutic agents provided herein can be formulated as inclusion complexes, such as cyclodextrin inclusion complexes, or liposomes. Liposomes can serve to target the therapeutic agents to a particular tissue. Liposomes also can be used to increase the half-life of the therapeutic agents. Many methods are available for preparing liposomes, as described in, for example, Szoka et al., Ann. Rev. Biophys. Bioeng., 9, 467 (1980) and U.S. Patents 4,235,871, 4,501,728, 4,837,028, and 5,019,369.
In some embodiments, the therapeutic agents provided herein are formulated in time-released, delayed release, or sustained release delivery systems such that the delivery of the composition occurs prior to, and with sufficient time to cause, sensitization of the site to be treated. Such systems can avoid repeated administrations of the therapeutic agent, thereby increasing convenience to the subject and the physician, and can be particularly suitable for certain composition embodiments provided herein. In one embodiment, the compositions of the disclosure are formulated such that they are suitable for extended-release of the circRNA contained therein. Such extended-release compositions may be conveniently administered to a subject at extended dosing intervals. For example, in one embodiment, the compositions of the present disclosure are administered to a subject twice a day, daily or every other day. In an embodiment, the compositions of the present disclosure are administered to a subject twice a week, once a week, every ten days, every two weeks, every three weeks, every four weeks, once a month, every six weeks, every eight weeks, every three months, every four months, every six months, every eight months, every nine months or annually.
In some embodiments, a protein encoded by the circRNAs described herein is produced by a target cell for sustained amounts of time. For example, the protein can be produced for more than one, more than four, more than six, more than 12, more than 24, more than 48, or more than 72 hours after administration. In some embodiments, the therapeutic product is expressed at a peak level about six hours after administration. In some embodiments, the expression of the therapeutic product is sustained at least at a therapeutic level. In some embodiments, the therapeutic product is expressed at least at a therapeutic level for more than one, more than four, more than six, more than 12, more than 24, more than 48, or more than 72 hours after administration. In some embodiments, the therapeutic product is detectable at a therapeutic level in patient serum or tissue (e.g., liver or lung). In some embodiments, the level of detectable therapeutic product is from continuous expression from the circRNA composition over periods of time of more than one, more than four, more than six, more than 12, more than 24, more than 48, or more than 72 hours after administration.
In some embodiments, a protein encoded by a circRNA described herein is produced at levels above normal physiological levels. The level of protein can be increased as compared to a control. In some embodiments, the control is the baseline physiological level of the therapeutic product in a normal individual or in a population of normal individuals. In other embodiments, the control is the baseline physiological level of the therapeutic product in an individual having a deficiency in the relevant protein or polypeptide or in a population of individuals having a deficiency in the relevant protein or polypeptide. In some embodiments, the control can be the normal level of the relevant protein or polypeptide in the individual to whom the composition is administered. In other embodiments, the control is the expression level of the therapeutic product upon other therapeutic intervention, e.g., upon direct injection of the corresponding therapeutic product, at one or more comparable time points.
In some embodiments, the levels of a protein encoded by a circRNA described herein are detectable at 3 days, 4 days, 5 days, or 1 week or more after administration. Increased levels of secreted protein may be observed in the serum and/or in a tissue (e.g., liver or lung).
In some embodiments, the method yields a sustained circulation half-life of a protein encoded by a circRNA described herein. For example, the protein can be detected for hours or days longer than the half-life observed via subcutaneous injection of the protein or mRNA encoding the protein. In some embodiments, the half-life of the protein is 1 day, 2 days, 3 days, 4 days, 5 days, or 1 week or more.
Many types of release delivery systems are available and known to those of ordinary skill in the art. They include polymer based systems such as poly(lactide-glycolide), copolyoxalates, polycaprolactones, polyesteramides, polyorthoesters, polyhydroxybutyric acid, and polyanhydrides. Microcapsules of the foregoing polymers containing drugs are described in, for example, U.S. Patent 5,075,109. Delivery systems also include non-polymer systems that are lipids including sterols such as cholesterol, cholesterol esters, and fatty acids or neutral fats such as mono-, di-, and tri-glycerides; hydrogel release systems; silastic systems; peptide based systems; wax coatings; compressed tablets using conventional binders and excipients; partially fused implants; and the like. Specific examples include, but are not limited to: (a) erosional systems in which the active composition is contained in a form within a matrix such as those described in U.S. Patents 4,452,775, 4,667,014, 4,748,034, and 5,239,660 and (b) diffusional systems in which an active component permeates at a controlled rate from a polymer such as described in U.S. Patents 3,832,253 and 3,854,480. In addition, pump-based hardware delivery systems can be used, some of which are adapted for implantation.
In some embodiments, the therapeutic agent can be conjugated either directly or indirectly through a linking moiety to a targeting moiety. Methods for conjugating therapeutic agents to targeting moieties are known in the art. See, for instance, Wadwa et al., J. Drug Targeting 3: 111 (1995) and U.S. Patent 5,087,616.
In some embodiments, the therapeutic agents provided herein are formulated into a depot form, such that the manner in which the therapeutic agent is released into the body to which it is administered is controlled with respect to time and location within the body (see, for example, U.S. Patent 4,450,150). Depot forms of therapeutic agents can be, for example, an implantable composition comprising the therapeutic agents and a porous or non-porous material, such as a polymer, wherein the therapeutic agents are encapsulated by or diffused throughout the material and/or degradation of the non-porous material. The depot is then implanted into the desired location within the body and the therapeutic agents are released from the implant at a predetermined rate.
8.6.2 Target Cells
Provided herein are cells comprising the RNAs (or cRNAzymes) provided herein. Provided herein are also cells comprising the circRNAs disclosed herein. Also provided herein are methods of delivering the RNAs (or cRNAzymes) or circRNAs disclosed herein to a cell. As used herein, the term “target cell” refers to the cell or type of cells to which the RNAs (or cRNAzymes) or circRNAs disclosed herein are intended to be delivered.
In some embodiments, the RNAs (or cRNAzymes) or circRNAs disclosed herein can be formulated using liposomes, lipoplexes, lipid nanoparticles, polymer based delivery systems, and viral vectors. In some embodiments, the RNAs (or cRNAzymes) or circRNAs can be formulated in a lipid nanoparticle such as those described in WO2012170930, herein incorporated by reference in its entirety. In some embodiments, the lipid can be a cleavable lipid such as those described in WO2012170889, herein incorporated by reference in its entirety. In some embodiments, the pharmaceutical compositions of the RNAs (or cRNAzymes) or circRNAs can include at least one of the PEGylated lipids described in WO2012099755, herein incorporated by reference. In some embodiments, a lipid nanoparticle formulation can be formulated by the methods described in WO2011127255 or WO2008103276, each of which is herein incorporated by reference in its entirety. A lipid nanoparticle can be coated or associated with a co-polymer such as, but not limited to, a block co-polymer, such as a branched polyether-polyamide block copolymer described in WO2013012476, herein incorporated by reference in its entirety. Liposomes, lipoplexes, or lipid nanoparticles can be used to improve the efficacy of protein production directed by the RNAs (or cRNAzymes) or circRNAs, as these formulations may be able to increase cell transfection by the polynucleotide and circRNA, increase the in vivo or in vitro half-life of the polynucleotide and circRNA, and/or allow for controlled release.
The present disclosure contemplates the discriminatory targeting of target cells and tissues by both passive and active targeting means. The phenomenon of passive targeting exploits the natural distribution patterns of a transfer vehicle in vivo without relying upon the use of additional excipients or means to enhance recognition of the transfer vehicle by target cells. For example, transfer vehicles which are subject to phagocytosis by the cells of the reticulo-endothelial system are likely to accumulate in the liver or spleen, and accordingly, may provide a means to passively direct the delivery of the compositions to such target cells.
Alternatively, the present disclosure contemplates active targeting, which involves the use of targeting moieties that can be bound (either covalently or non-covalently) to the transfer vehicle to encourage localization of such transfer vehicle at certain target cells or target tissues. For example, targeting can be mediated by the inclusion of one or more endogenous targeting moieties in or on the transfer vehicle to encourage distribution to the target cells or tissues.
Recognition of the targeting moiety by the target tissues actively facilitates tissue distribution and cellular uptake of the transfer vehicle and/or its contents in the target cells and tissues (e.g., the inclusion of an apolipoprotein-E targeting ligand in or on the transfer vehicle encourages recognition and binding of the transfer vehicle to endogenous low density lipoprotein receptors expressed by hepatocytes). As provided herein, the composition can comprise a moiety capable of enhancing affinity of the composition to the target cell. Targeting moieties can be linked to the outer bilayer of the lipid particle during formulation or post-formulation. These methods are well known in the art. In addition, some lipid particle formulations can employ fusogenic polymers such as PEAA, hemagglutinin, other lipopeptides (see U.S. Pat. No. 6,417,326, which is incorporated herein by reference) and other features useful for in vivo and/or intracellular delivery. In some other embodiments, the compositions of the present disclosure demonstrate improved transfection efficacies, and/or demonstrate enhanced selectivity towards target cells or tissues of interest. Contemplated therefore are compositions which comprise one or more moieties (e.g., peptides, aptamers, oligonucleotides, a vitamin or other molecules) that are capable of enhancing the affinity of the compositions and their nucleic acid contents for the target cells or tissues. Suitable moieties can optionally be bound or linked to the surface of the transfer vehicle. In some embodiments, the targeting moiety can span the surface of a transfer vehicle or be encapsulated within the transfer vehicle. Suitable moieties are selected based upon their physical, chemical or biological properties (e.g., selective affinity and/or recognition of target cell surface markers or features). Cell-specific target sites and their corresponding targeting ligands can vary widely.
Suitable targeting moieties are selected such that the unique characteristics of a target cell are exploited, thus allowing the composition to discriminate between target and non-target cells. For example, compositions of the disclosure may include surface markers (e.g., apolipoprotein-B or apolipoprotein-E) that selectively enhance recognition of, or affinity to, hepatocytes (e.g., by receptor-mediated recognition of and binding to such surface markers). As an example, the use of galactose as a targeting moiety would be expected to direct the compositions of the present disclosure to parenchymal hepatocytes, or alternatively the use of mannose containing sugar residues as a targeting ligand would be expected to direct the compositions of the present disclosure to liver endothelial cells (e.g., mannose containing sugar residues that may bind preferentially to the asialoglycoprotein receptor present in hepatocytes). (See Hillery A M, et al. “Drug Delivery and Targeting: For Pharmacists and Pharmaceutical Scientists” (2002) Taylor & Francis, Inc.) The presentation of such targeting moieties that have been conjugated to moieties present in the transfer vehicle (e.g., a lipid nanoparticle) therefore facilitates recognition and uptake of the compositions of the present disclosure in target cells and tissues. Examples of suitable targeting moieties include one or more peptides, proteins, aptamers, vitamins and oligonucleotides.
In some embodiments, the RNAs (or cRNAzymes) or circRNAs disclosed herein are formulated according to a process described in U.S. Pub. No. 20180153822. In some embodiments, the present disclosure provides a process of encapsulating the RNAs (or cRNAzymes) or circRNAs in lipid nanoparticles comprising the steps of forming lipids into pre-formed lipid nanoparticles (i.e., formed in the absence of RNA) and then combining the pre-formed lipid nanoparticles with RNA. In some embodiments, the formulation process results in an RNA formulation with higher potency (peptide or protein expression) and higher efficacy (improvement of a biologically relevant endpoint) both in vitro and in vivo with potentially better tolerability as compared to the same RNA formulation prepared without the step of preforming the lipid nanoparticles (e.g., combining the lipids directly with the RNA).
In some embodiments, transfer vehicles are formulated and/or targeted as described in Shobaki N et al., Int J Nanomedicine 2018; 13: 8395-8410. In some embodiments, a transfer vehicle is made up of 3 lipid types. In some embodiments, a transfer vehicle is made up of 4 lipid types. In some embodiments, a transfer vehicle is made up of 5 lipid types. In some embodiments, a transfer vehicle is made up of 6 lipid types.
For certain cationic lipid nanoparticle formulations of RNA, in order to achieve high encapsulation of RNA, the RNA in buffer (e.g., citrate buffer) has to be heated. In those processes or methods, the heating is required to occur before the formulation process (i.e., heating the separate components) as heating post-formulation (post-formation of nanoparticles) does not increase the encapsulation efficiency of the RNA in the lipid nanoparticles. In contrast, in some embodiments, the order of heating of RNA does not appear to affect the RNA encapsulation percentage. In some embodiments, no heating (i.e., maintaining at ambient temperature) of one or more of the solution comprising the pre-formed lipid nanoparticles, the solution comprising the RNA and the mixed solution comprising the lipid nanoparticle encapsulated RNA is required to occur before or after the formulation process.
RNA can be provided in a solution to be mixed with a lipid solution such that the RNA may be encapsulated in lipid nanoparticles. A suitable RNA solution can be any aqueous solution containing RNA to be encapsulated at various concentrations. For example, a suitable RNA solution can contain an RNA at a concentration of or greater than about 0.01 mg/ml, 0.05 mg/ml, 0.06 mg/ml, 0.07 mg/ml, 0.08 mg/ml, 0.09 mg/ml, 0.1 mg/ml, 0.15 mg/ml, 0.2 mg/ml, 0.3 mg/ml, 0.4 mg/ml, 0.5 mg/ml, 0.6 mg/ml, 0.7 mg/ml, 0.8 mg/ml, 0.9 mg/ml, or 1.0 mg/ml. In some embodiments, a suitable RNA solution can contain an RNA at a concentration in a range from about 0.01-1.0 mg/ml, 0.01-0.9 mg/ml, 0.01-0.8 mg/ml, 0.01-0.7 mg/ml, 0.01-0.6 mg/ml, 0.01-0.5 mg/ml, 0.01-0.4 mg/ml, 0.01-0.3 mg/ml, 0.01-0.2 mg/ml, 0.01-0.1 mg/ml, 0.05-1.0 mg/ml, 0.05-0.9 mg/ml, 0.05-0.8 mg/ml, 0.05-0.7 mg/ml, 0.05-0.6 mg/ml, 0.05-0.5 mg/ml, 0.05-0.4 mg/ml, 0.05-0.3 mg/ml, 0.05-0.2 mg/ml, 0.05-0.1 mg/ml, 0.1-1.0 mg/ml, 0.2-0.9 mg/ml, 0.3-0.8 mg/ml, 0.4-0.7 mg/ml, or 0.5-0.6 mg/ml.
Typically, a suitable RNA solution can also contain a buffering agent and/or salt. Generally, buffering agents can include HEPES, ammonium sulfate, Tris, sodium bicarbonate, sodium citrate, sodium acetate, potassium phosphate or sodium phosphate. In some embodiments, a suitable concentration of the buffering agent can be in a range from about 0.1 mM to 100 mM, 0.5 mM to 90 mM, 1.0 mM to 80 mM, 2 mM to 70 mM, 3 mM to 60 mM, 4 mM to 50 mM, 5 mM to 40 mM, 6 mM to 30 mM, 7 mM to 20 mM, 8 mM to 15 mM, or 9 mM to 12 mM.
Exemplary salts can include sodium chloride, magnesium chloride, and potassium chloride. In some embodiments, suitable concentration of salts in an RNA solution can be in a range from about 1 mM to 500 mM, 5 mM to 400 mM, 10 mM to 350 mM, 15 mM to 300 mM, 20 mM to 250 mM, 30 mM to 200 mM, 40 mM to 190 mM, 50 mM to 180 mM, 50 mM to 170 mM, 50 mM to 160 mM, 50 mM to 150 mM, or 50 mM to 100 mM.
In some embodiments, a suitable RNA solution can have a pH in a range from about 3.5-6.5, 3.5-6.0, 3.5-5.5, 3.5-5.0, 3.5-4.5, 4.0-5.5, 4.0-5.0, 4.0-4.9, 4.0-4.8, 4.0-4.7, 4.0-4.6, or 4.0-4.5.
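As a purely illustrative aid, the broadest ranges recited above for an RNA solution can be collected into a simple check. The following Python sketch is not part of any disclosed process; the names `RnaSolution` and `out_of_range` are hypothetical, and treating the "about" ranges as hard numeric bounds (and requiring both a buffering agent and a salt to be present) is an assumption made only for the example.

```python
from dataclasses import dataclass

# Broadest ranges recited in the text; using them as strict bounds is an
# assumption of this sketch (the text says "about" for each range).
RNA_MG_PER_ML = (0.01, 1.0)   # RNA concentration, mg/ml
BUFFER_MM = (0.1, 100.0)      # buffering agent concentration, mM
SALT_MM = (1.0, 500.0)        # salt concentration, mM
PH = (3.5, 6.5)               # pH of the RNA solution


@dataclass(frozen=True)
class RnaSolution:
    """Hypothetical record of one RNA solution's composition."""
    rna_mg_per_ml: float
    buffer_mm: float
    salt_mm: float
    ph: float


def out_of_range(sol: RnaSolution) -> list[str]:
    """Return the names of parameters outside the broadest recited ranges."""
    checks = {
        "rna_mg_per_ml": (sol.rna_mg_per_ml, RNA_MG_PER_ML),
        "buffer_mm": (sol.buffer_mm, BUFFER_MM),
        "salt_mm": (sol.salt_mm, SALT_MM),
        "ph": (sol.ph, PH),
    }
    return [name for name, (value, (lo, hi)) in checks.items()
            if not lo <= value <= hi]
```

For instance, a solution at 0.5 mg/ml RNA, 10 mM buffering agent, 150 mM salt and pH 4.5 falls inside every range in this sketch, whereas a solution at pH 7.4 would be flagged on the `ph` entry.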
Various methods can be used to prepare an RNA solution suitable for the present disclosure. In some embodiments, RNA may be directly dissolved in a buffer solution described herein. In some embodiments, an RNA solution can be generated by mixing an RNA stock solution with a buffer solution prior to mixing with a lipid solution for encapsulation. In some embodiments, an RNA solution can be generated by mixing an RNA stock solution with a buffer solution immediately before mixing with a lipid solution for encapsulation.
According to the present disclosure, a lipid solution contains a mixture of lipids suitable to form transfer vehicles for encapsulation of RNA. In some embodiments, a suitable lipid solution is ethanol based. For example, a suitable lipid solution may contain a mixture of desired lipids dissolved in pure ethanol (i.e., 100% ethanol). In some embodiments, a suitable lipid solution is isopropyl alcohol based. In some embodiments, a suitable lipid solution is dimethylsulfoxide-based. In some embodiments, a suitable lipid solution is a mixture of suitable solvents including, but not limited to, ethanol, isopropyl alcohol and dimethylsulfoxide.
A suitable lipid solution can contain a mixture of desired lipids at various concentrations. In some embodiments, a suitable lipid solution may contain a mixture of desired lipids at a total concentration in a range from about 0.1-100 mg/ml, 0.5-90 mg/ml, 1.0-80 mg/ml, 1.0-70 mg/ml, 1.0-60 mg/ml, 1.0-50 mg/ml, 1.0-40 mg/ml, 1.0-30 mg/ml, 1.0-20 mg/ml, 1.0-15 mg/ml, 1.0-10 mg/ml, 1.0-9 mg/ml, 1.0-8 mg/ml, 1.0-7 mg/ml, 1.0-6 mg/ml, or 1.0-5 mg/ml.
Any desired lipids can be mixed at any ratios suitable for encapsulating RNAs. In some embodiments, a suitable lipid solution contains a mixture of desired lipids including cationic lipids, helper lipids (e.g., non-cationic lipids and/or cholesterol lipids) and/or PEGylated lipids. In some embodiments, a suitable lipid solution contains a mixture of desired lipids including one or more cationic lipids, one or more helper lipids (e.g., non-cationic lipids and/or cholesterol lipids) and one or more PEGylated lipids.
In some embodiments, the RNAs (or cRNAzymes) or circRNAs disclosed herein are formulated using viral vectors. Viral vectors can be derived from a variety of viruses including adenovirus, adeno-associated virus, lentivirus (e.g., HIV, FIV, and EIAV), and herpes virus. Examples of commercially available viral vectors include pSilencer adeno (Ambion, Austin, Tex.) and pLenti6/BLOCK-iT™-DEST (Invitrogen, Carlsbad, Calif.). Selection of viral vectors, methods for expressing the RNAs (or cRNAzymes) from the vector and methods of delivering the viral vector are within the ordinary skill of one in the art. In some embodiments, the viral vector is a recombinant AAV (rAAV) vector, such as those known in the art (PMID: 30245471, PMID: 33614232).
The present disclosure also provides a delivery system comprising the RNAs (or cRNAzymes) or circRNAs disclosed herein. In some embodiments, the delivery system is any one of a liposome, a nanoparticle, a polymer based delivery system or a ligand-conjugate delivery system. In some embodiments, the ligand-conjugate delivery system comprises one or more of an antibody, a peptide, a sugar moiety, or a combination thereof.
In some embodiments, the delivery system of the present disclosure comprises nanoparticles comprising the RNAs (or cRNAzymes) or circRNAs disclosed herein.
In some embodiments, the nanoparticle comprises a polymer-based nanoparticle, a lipid-polymer based nanoparticle, a metal based nanoparticle, a carbon nanotube based nanoparticle, a nanocrystal or a polymeric micelle. In some embodiments, the polymer-based nanoparticle comprises a multiblock copolymer, a diblock copolymer, a polymeric micelle or a hyperbranched macromolecule. In some embodiments, the polymer-based nanoparticle comprises a multiblock copolymer or a diblock copolymer. In some embodiments, the polymer-based nanoparticle is pH responsive. In some embodiments, the polymer-based nanoparticle further comprises a buffering component.
In some embodiments, the delivery system comprises a liposome. Liposomes are spherical vesicles having at least one lipid bilayer, and in some embodiments, an aqueous core. In some embodiments, the lipid bilayer of the liposome may comprise phospholipids. A non-limiting example of a phospholipid is phosphatidylcholine, but the lipid bilayer may comprise additional lipids, such as phosphatidylethanolamine. Liposomes may be multilamellar, i.e., consisting of several lamellar phase lipid bilayers, or unilamellar, with a single lipid bilayer. Liposomes can be made in a particular size range that makes them viable targets for phagocytosis. Liposomes can range in size from 20 nm to 100 nm, 100 nm to 400 nm, 1 μm and larger, or 200 nm to 3 μm. Examples of lipidoids and lipid-based formulations are provided in U.S. Pub. No. 20090023673. In some embodiments, the one or more lipids are one or more cationic lipids.
In some embodiments, the liposome or the nanoparticle of the present disclosure comprises a micelle. A micelle is an aggregate of surfactant molecules. An exemplary micelle comprises an aggregate of amphiphilic macromolecules, polymers or copolymers in aqueous solution, wherein the hydrophilic head portions contact the surrounding solvent, while the hydrophobic tail regions are sequestered in the center of the micelle.
In some embodiments, the nanoparticle comprises a nanocrystal. Exemplary nanocrystals are crystalline particles with at least one dimension of less than 1000 nanometers, preferably of less than 100 nanometers.
In some embodiments, the nanoparticle comprises a polymer based nanoparticle. In some embodiments, the polymer comprises a multiblock copolymer, a diblock copolymer, a polymeric micelle or a hyperbranched macromolecule. In some embodiments, the particle comprises one or more cationic polymers. In some embodiments, the cationic polymer is chitosan, protamine, polylysine, polyhistidine, polyarginine or poly(ethylene imine). In some embodiments, the one or more polymers contain the buffering component, degradable component, hydrophilic component, cleavable bond component or some combination thereof.
In some embodiments, the nanoparticles or some portion thereof are degradable. In some embodiments, the lipids and/or polymers of the nanoparticles are degradable.
In some embodiments, any of these delivery systems of the present disclosure can comprise a buffering component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component and a degradable component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component and a hydrophilic component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component and a cleavable bond component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component, a degradable component and a hydrophilic component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component, a degradable component and a cleavable bond component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component, a hydrophilic component and a cleavable bond component. In some embodiments, any of the delivery systems of the present disclosure can comprise a buffering component, a degradable component, a hydrophilic component and a cleavable bond component. In some embodiments, the particle is composed of one or more polymers that contain any of the aforementioned combinations of components.
In some embodiments, the delivery system comprises a ligand-conjugate delivery system. In some embodiments, the ligand-conjugate delivery system comprises one or more of an antibody, a peptide, a sugar moiety, a lipid, or a combination thereof.
In some embodiments, the RNAs (or cRNAzymes) or circRNAs disclosed herein are conjugated to, complexed to, or encapsulated by the one or more lipids or polymers of the delivery system. In some embodiments, the RNAs (or cRNAzymes) or circRNAs can be encapsulated in the hollow core of a nanoparticle. Alternatively, or in addition, the RNAs (or cRNAzymes) or circRNAs can be incorporated into the lipid or polymer based shell of the delivery system, for example via intercalation. Alternatively, or in addition, the RNAs (or cRNAzymes) or circRNAs can be attached to the surface of the delivery system. In some embodiments, the RNAs (or cRNAzymes) or circRNAs are conjugated to one or more lipids or polymers of the delivery system, e.g., via covalent attachment.
In some embodiments, the ligand conjugate delivery system further comprises a targeting agent. In some embodiments, the targeting agent comprises a peptide ligand, a nucleotide ligand, a polysaccharide ligand, a fatty acid ligand, a lipid ligand, a small molecule ligand, an antibody, an antibody fragment, an antibody mimetic or an antibody mimetic fragment.
In some embodiments, the delivery system of the present disclosure comprises a polymer based delivery system. In some embodiments, the polymer based delivery system comprises a blending polymer. In some embodiments, the blending polymer is a copolymer comprising a degradable component and a hydrophilic component. In some embodiments, the degradable component of the blending polymer is a polyester, poly(ortho ester), poly(ethylene imine), poly(caprolactone), polyanhydride, poly(acrylic acid), polyglycolide or poly(urethane). In some embodiments, the degradable component of the blending polymer is poly(lactic acid) (PLA) or poly(lactic-co-glycolic acid) (PLGA). In some embodiments, the hydrophilic component of the blending polymer is a polyalkylene glycol or a polyalkylene oxide. In some embodiments, the polyalkylene glycol is polyethylene glycol (PEG). In some embodiments, the polyalkylene oxide is polyethylene oxide (PEO).
In some embodiments, the delivery system of the present disclosure is a polymer based nanoparticle. Polymer based nanoparticles comprise one or more polymers. In some embodiments, the one or more polymers comprise a polyester, poly(ortho ester), poly(ethylene imine), poly(caprolactone), polyanhydride, poly(acrylic acid), polyglycolide or poly(urethane). In some embodiments, the one or more polymers comprise poly(lactic acid) (PLA) or poly(lactic-co-glycolic acid) (PLGA). In some embodiments, the one or more polymers comprise poly(lactic-co-glycolic acid) (PLGA). In some embodiments, the one or more polymers comprise poly(lactic acid) (PLA). In some embodiments, the one or more polymers comprise a polyalkylene glycol or a polyalkylene oxide. In some embodiments, the polyalkylene glycol is polyethylene glycol (PEG) or the polyalkylene oxide is polyethylene oxide (PEO).
In some embodiments, the polymer-based nanoparticle comprises poly(lactic-co-glycolic acid) (PLGA) polymers. In some embodiments, the PLGA nanoparticle further comprises a targeting agent, as described herein.
In some embodiments, the delivery system of the present disclosure is a nanoparticle of average characteristic dimension of less than about 500 nm, 400 nm, 300 nm, 250 nm, 200 nm, 180 nm, 150 nm, 120 nm, 100 nm, 90 nm, 80 nm, 70 nm, 60 nm, 50 nm, 40 nm, 30 nm or 20 nm. In some embodiments, the nanoparticle has an average characteristic dimension of 10 nm, 20 nm, 30 nm, 40 nm, 50 nm, 60 nm, 70 nm, 80 nm, 90 nm, 100 nm, 120 nm, 150 nm, 180 nm, 200 nm, 250 nm or 300 nm. In some embodiments, the nanoparticle has an average characteristic dimension of 10-500 nm, 10-400 nm, 10-300 nm, 10-250 nm, 10-200 nm, 10-150 nm, 10-100 nm, 10-75 nm, 10-50 nm, 50-500 nm, 50-400 nm, 50-300 nm, 50-200 nm, 50-150 nm, 50-100 nm, 50-75 nm, 100-500 nm, 100-400 nm, 100-300 nm, 100-250 nm, 100-200 nm, 100-150 nm, 150-500 nm, 150-400 nm, 150-300 nm, 150-250 nm, 150-200 nm, 200-500 nm, 200-400 nm, 200-300 nm or 200-250 nm.
In some embodiments, the target cells can be deficient in a protein or enzyme of interest. For example, where it is desired to deliver a nucleic acid to a hepatocyte, the hepatocyte represents the target cell. In some embodiments, some cells can be preferentially, or specifically, targeted. In some embodiments, the preferentially or specifically targeted cells can be, for example, hepatocytes, epithelial cells, hematopoietic cells, endothelial cells, lung cells, bone cells, stem cells, mesenchymal cells, neural cells (e.g., meninges, astrocytes, motor neurons, cells of the dorsal root ganglia and anterior horn motor neurons), photoreceptor cells (e.g., rods and cones), retinal pigmented epithelial cells, secretory cells, cardiac cells, adipocytes, vascular smooth muscle cells, cardiomyocytes, skeletal muscle cells, beta cells, pituitary cells, synovial lining cells, ovarian cells, testicular cells, fibroblasts, B cells, T cells, dendritic cells, macrophages, reticulocytes, leukocytes, granulocytes, tumor cells, NK cells, liver stellate cells, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC or other immortalized cell lines or primary cell lines.
In some embodiments, the compositions provided herein that comprise the RNAs (or cRNAzymes) or circRNAs disclosed herein can be optimized for certain target cells. The compositions provided herein can be prepared to preferentially distribute to and/or optimized for target cells such as in the heart, lungs, kidneys, liver, and spleen. In some embodiments, the compositions provided herein can distribute into the cells of the liver to facilitate the delivery and the subsequent expression of the circRNA comprised therein by the cells of the liver (e.g., hepatocytes). The targeted cells can function as a biological “reservoir” or “depot” capable of producing, and systemically excreting, a functional protein or enzyme. Accordingly, in some embodiments, the transfer vehicle can target hepatocytes and/or preferentially distribute to the cells of the liver upon delivery. In some embodiments, following transfection of the target hepatocytes, the circRNAs loaded in the vehicle are translated and a functional protein product is produced, excreted and systemically distributed. In other embodiments, cells other than hepatocytes (e.g., lung, spleen, heart, ocular, or cells of the central nervous system) can serve as a depot location for protein production.
In some embodiments, the compositions provided herein facilitate a subject’s endogenous production of one or more functional proteins and/or enzymes. In some embodiments, the transfer vehicles comprise circRNAs which encode a deficient protein or enzyme. Upon distribution of such compositions to the target tissues and the subsequent transfection of such target cells, the exogenous circRNAs loaded into the transfer vehicle (e.g., a lipid nanoparticle) can be translated in vivo to produce a functional protein or enzyme encoded by the exogenously administered circRNAs (e.g., a protein or enzyme in which the subject is deficient). Accordingly, the compositions provided herein exploit a subject’s ability to translate exogenously- or recombinantly-prepared circRNAs to produce an endogenously-translated protein or enzyme, and thereby produce (and where applicable excrete) a functional protein or enzyme. The expressed or translated proteins or enzymes can also be characterized by the in vivo inclusion of native post-translational modifications which may often be absent in recombinantly-prepared proteins or enzymes, thereby further reducing the immunogenicity of the translated protein or enzyme.
The administration of circRNAs encoding a deficient protein or enzyme avoids the need to deliver the nucleic acids to specific organelles within a target cell. Rather, upon transfection of a target cell and delivery of the nucleic acids to the cytoplasm of the target cell, the circRNA contents of a transfer vehicle can be translated and a functional protein or enzyme expressed.
In some embodiments, circRNAs provided herein comprise one or more miRNA binding sites. In some embodiments, circRNAs provided herein comprise one or more miRNA binding sites recognized by miRNA present in one or more non-target cells or non-target cell types (e.g., Kupffer cells) and not present in one or more target cells or target cell types (e.g., hepatocytes). In some embodiments, circRNAs provided herein comprise one or more miRNA binding sites recognized by miRNA present in an increased concentration in one or more non-target cells or non-target cell types (e.g., Kupffer cells) compared to one or more target cells or target cell types (e.g., hepatocytes). miRNAs are thought to function by pairing with complementary sequences within RNA molecules, resulting in gene silencing.
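The pairing principle described above can be sketched in a few lines of Python. The snippet below scans an RNA sequence for sites that are perfectly complementary to a miRNA seed. It is only an illustration: the seed definition (miRNA nucleotides 2-8), the function names, and the example sequences are assumptions and are not taken from this disclosure.

```python
# Watson-Crick complements for RNA (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_seed_sites(target: str, mirna: str, circular: bool = True) -> list:
    """Return 0-based start positions in `target` of perfect matches to the
    reverse complement of the miRNA seed (assumed here to be positions 2-8)."""
    seed = mirna.upper()[1:8]
    site = reverse_complement(seed)
    seq = target.upper()
    # For a circular RNA, append the first few bases so that a site
    # spanning the ligation junction is also detected.
    search = seq + seq[: len(site) - 1] if circular else seq
    return [
        i
        for i in range(len(search) - len(site) + 1)
        if search[i : i + len(site)] == site
    ]


# Hypothetical miRNA and hypothetical circRNA fragment.
example_mirna = "UACGGAUCCAGUUAGCAUGGCA"
example_circ = "CCGUAAAAAGAU"
print(find_seed_sites(example_circ, example_mirna, circular=True))
print(find_seed_sites(example_circ, example_mirna, circular=False))
```

Treating the sequence as circular matters here because a binding site in a circRNA can span the junction created by self-splicing, which a linear scan would miss.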
In some embodiments, the target cells can be yeast cells, which include, but are not limited to, Saccharomyces cerevisiae and Pichia pastoris. In some embodiments, the target cells can be bacterial cells, which include, but are not limited to, Escherichia coli. In some embodiments, the target cells can be insect cells, which include, but are not limited to, Spodoptera frugiperda Sf9, Mimic Sf9, Sf21, and Drosophila S2. In some embodiments, the compositions provided herein are optimized for yeast cells, which include, but are not limited to, Saccharomyces cerevisiae and Pichia pastoris.
In some embodiments, the compositions provided herein are optimized for a variety of bacterial cells, which include, but are not limited to, Escherichia coli.
In some embodiments, the compositions provided herein are optimized for a variety of insect cells, which include, but are not limited to, Spodoptera frugiperda Sf9, Mimic Sf9, Sf21, and Drosophila S2.
The recombinant circRNAs or DNA vectors encoding RNAs (or cRNAzymes) provided herein can be introduced into a cell by any method, including, for example, by transfection, transformation, or transduction. The terms “transfection,” “transformation,” and “transduction” are used interchangeably herein and refer to the introduction of one or more exogenous polynucleotides into a host cell by using physical or chemical methods. Many transfection techniques are known in the art and include, for example, calcium phosphate DNA co-precipitation (see, e.g., Murray E. J. (ed.), METHODS IN MOLECULAR BIOLOGY, Vol. 7, Gene Transfer and Expression Protocols, Humana Press (1991)); DEAE-dextran; electroporation; cationic liposome-mediated transfection; tungsten particle-facilitated microparticle bombardment (Johnston, Nature, 346: 776-777 (1990)); strontium phosphate DNA co-precipitation (Brash et al., Mol. Cell. Biol., 7: 2031-2034 (1987)); and magnetic nanoparticle-based gene delivery (Dobson, J. Gene Ther, 13 (4): 283-7 (2006)).
8.6.3 Articles of manufacture or kits
Articles of manufacture or kits comprising RNAs (or cRNAzymes), vectors, or circRNAs disclosed herein are also provided herein. An article of manufacture or kit can further comprise a package insert comprising instructions. Suitable containers include, for example, bottles, vials, bags and syringes. The container may be formed from a variety of materials such as glass, plastic (such as polyvinyl chloride or polyolefin), or metal alloy (such as stainless steel or hastelloy). In some embodiments, the container holds the formulation, and the label on, or associated with, the container may indicate directions for use. The article of manufacture or kit may further include other materials desirable from a commercial and user standpoint, including other buffers, diluents, filters, needles, syringes, and package inserts with instructions for use. In some embodiments, the article of manufacture further includes one or more other agents (e.g., a chemotherapeutic agent and an anti-neoplastic agent). Suitable containers for the one or more agents include, for example, bottles, vials, bags and syringes.
8.7 Assays
A variety of methods and assays are available in the art to predict the secondary and even tertiary structure of an RNA molecule and to determine whether it has group II intron self-splicing activity. Some of these methods and assays are provided below. Guided by teachings in the present disclosure, a person of ordinary skill in the art would be able to determine whether an RNA containing the sequence elements disclosed herein has self-splicing activity to form circRNAs.
8.7.1 Secondary structure analysis
A stem-loop structure is a type of RNA secondary structure, which can be determined by any suitable polynucleotide folding algorithm. The energy associated with the secondary structure of a nucleic acid (e.g., an RNA) is a free energy that incorporates both enthalpic and entropic contributions. Helix size and different base-pairing schemes are considered when minimizing the free energy of an RNA secondary structure. This energy is referred to as Gibbs free energy, or ΔG (kcal/mol). Different techniques have been proposed, but “minimum free energy” (MFE) is most widely used to predict the RNA secondary structure. Specifically, the structure having the lowest Gibbs free energy is the most stable; stacking of base pairs in a helix stabilizes the RNA molecule and minimizes its free energy, whereas loops and bulges have a negative effect on the stability of the structure. In general, stability and accuracy of the secondary structure of a nucleic acid (e.g., an RNA) are measured based on the amount of free energy released while forming the correct base pairs. The more negative the free energy released by a structure, the more stable its set of base pairs. (Zuker M (1994) Prediction of RNA secondary structure by energy minimization, Volume 25 of COMPUTER ANALYSIS OF SEQUENCE DATA: PART II, Griffin AM, Griffin HG (eds), Chapter 23, CRC Press, Inc., Totowa, NJ, pp 267-294.) Various algorithms have been developed to calculate the MFE value of a nucleic acid, which can be either a global algorithm or a local (sliding window) algorithm. In some embodiments, the MFE value of a nucleic acid (e.g., an RNA) is computed using a global algorithm. In some embodiments, the MFE value of a nucleic acid (e.g., an RNA) is computed using a local (sliding window) algorithm.
Among the various algorithms available in the art, dynamic programming (DP) is an approach that can be used to compute the MFE value of RNA molecules in the absence of pseudoknots. The benchmark RNA folding algorithm mFold is based on DP (Zuker and Stiegler, Nucleic Acids Res. 9 (1981), 133-148). Nature-inspired optimization algorithms such as evolutionary algorithms (EAs) and genetic algorithms (GAs) can also be used. Particle swarm optimization (PSO) techniques have also been adopted. Exemplary available algorithms for MFE value calculation include, for example, RnaPredict, RNAfold, HSRNAFold, SetPSO, HelixPSO, SARNAPredict, ACRNA, PSOfold, efold, and EAMD. RNAfold is an online folding algorithm developed by the Institute for Theoretical Chemistry at the University of Vienna using a centroid structure prediction algorithm (e.g., AR Gruber et al., 2008, Cell 106 (1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27 (12): 1151-62). RnaPredict applies selection, recombination, and mutation operators to find the minimal energy structure. The HSRNAFold algorithm uses harmony search, a music-inspired technique, to improve search results. SetPSO is based on the PSO algorithm to solve the energy minimization problem in RNA. The HelixPSO algorithm works similarly to the RnaPredict algorithm and takes a set of helices to obtain a secondary structure from an RNA sequence. SARNAPredict, like HelixPSO, is a permutation-based algorithm; it uses the substance-cooling method of the simulated annealing algorithm with a simple thermodynamic model. ACRNA uses an ant colony optimizer, and PSOfold is a PSO-based algorithm. Additionally, environmental adaption method for dynamic environment (EAMD) is a meta-heuristic optimization algorithm. (e.g., Wiese and Glen (2003) Biosystems, 72 (1): 29-41; Clote (2005) J Comput Biol 12 (1): 83-101; Geem et al. (2001) Simulation 76 (2): 60-68; Gruber et al., 2008, Cell 106 (1): 23-24; Carr and Church, 2009, Nature Biotechnology 27 (12): 1151-62; Mohsen et al (2010) An optimization algorithm based on harmony search for RNA secondary structure prediction. In: Geem ZW (eds) RECENT ADVANCES IN HARMONY SEARCH ALGORITHM. STUDIES IN COMPUTATIONAL INTELLIGENCE, vol 270. Springer, Berlin, Heidelberg; Geis and Middendorf (2011) Int J Intell Comput Cybern 4 (2): 160-186; Liu et al. (2011) Chem Res Chin Univ 27 (1): 108-112; Yu et al. (2010) J Bionic Eng 7 (4): 382-389; Wu et al. (2011) A fuzzy adaptive particle swarm optimization for “RNA secondary structure” prediction. In: Information science and technology (ICIST), 2011 international conference on 2011 Mar 26, IEEE, pp 1390-1393.) Additional algorithms can be found in US Provisional Application No. 61/836,080, which is incorporated herein by reference.
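The dynamic-programming idea mentioned above can be illustrated with a deliberately simplified sketch. The Python code below implements Nussinov-style base-pair maximization, which is a stand-in for, and not an implementation of, the thermodynamic MFE models used by tools such as mFold or RNAfold. In this sketch every allowed pair (A-U, G-C, G-U) is scored equally, pseudoknots are excluded, and a hairpin loop must contain at least three unpaired nucleotides. The function name, the scoring, and the example sequences are assumptions made for illustration only.

```python
# Allowed base pairs in this toy model (Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fold(seq: str, min_loop: int = 3):
    """Return (number_of_pairs, dot_bracket) for a pseudoknot-free structure
    that maximizes the number of base pairs (Nussinov-style DP)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, ""
    # best[i][j] = maximum number of pairs in the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case 1: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    right = best[k + 1][j] if k < j else 0
                    score = max(score, 1 + best[i + 1][k - 1] + right)
            best[i][j] = score

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                right = best[k + 1][j] if k < j else 0
                if best[i][j] == 1 + best[i + 1][k - 1] + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    if k < j:
                        stack.append((k + 1, j))
                    break
    return best[0][n - 1], "".join(structure)


pairs, dot_bracket = fold("GAAAC")
print(pairs, dot_bracket)
```

A real MFE calculation replaces the uniform pair count with nearest-neighbor stacking and loop energies, but the recursion over subsequences has the same shape.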
In some embodiments, whether an RNA can function as a group II intron is determined using an online predicting tool or a predicting software. An example of such online predicting tool is the online web server created by Zimmerly lab, University of Calgary.
8.7.2 In vitro self-splicing assay
Various in vitro assays are available in the art to test and confirm the self-splicing activity of a given cRNAzyme. The procedure below is described for illustrative purposes. The DNA sequence encoding a subject cRNAzyme is cloned into a suitable vector, e.g., pUC57. Vectors are amplified and purified, and linearized using BamHI for in vitro transcription. Radiolabeled transcripts are prepared using T7 RNA polymerase, 5 mM MgCl2, 40 mM Tris-HCl pH 7.5, 0.05% Triton X-100, 10 μCi [α-32P] UTP (3000 Ci mmol-1), 0.5 mM UTP, and 1 mM other NTPs. In vitro transcription reactions are done for 1 h at 37 ℃ followed by purification of the RNA product on a denaturing 4% (19:1) polyacrylamide, 8 M urea, 1× TBE gel. Radiolabeled RNA product is refolded in 40 mM Tris-HCl pH 7.5 through heating at 90 ℃ for 1 min followed by incubation in 40 mM Tris-HCl pH 7.5, 10 mM MgCl2 for 15 min. To initiate self-splicing, RNA is mixed with an equal volume of 2× splicing buffer (2 M NH4Cl, 40 mM Tris-HCl pH 7.5). Reactions are quenched by mixing with an equal volume of 80% formamide, 100 mM EDTA. Splicing products are resolved using denaturing 4% (19:1) polyacrylamide, 8 M urea, 1× TBE gels. All splicing assays are done in triplicate.
The self-splicing products can be quantified using any methods known in the art. For example, splicing gels are exposed to storage phosphor screens and splicing products are quantitated using Quantity Analysis Software. Unequal loading can be accounted for by normalizing to an internal control RNA. Band intensity is determined by dividing the background subtracted intensity of each band by the number of uridine residues in the RNA sequence corresponding to the band. All band intensities are then normalized to an unspliced control to give fractional values of input RNA.
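The normalization described above can be sketched as follows. This Python snippet divides the background-subtracted intensity of each band by the number of uridines in the corresponding RNA, then expresses the result relative to an unspliced control band. The function names, the dictionary layout, and all numerical values are hypothetical and serve only to make the arithmetic concrete.

```python
def uridine_corrected(raw: float, background: float, n_uridines: int) -> float:
    """Background-subtracted band intensity per uridine residue."""
    if n_uridines <= 0:
        raise ValueError("n_uridines must be a positive integer")
    return (raw - background) / n_uridines


def fraction_of_input(bands: dict, control: dict) -> dict:
    """Normalize each band to the unspliced control band.

    `bands` maps a band name to a dict with keys raw, background, n_uridines;
    `control` is one such dict for the unspliced control RNA.
    """
    reference = uridine_corrected(**control)
    return {
        name: uridine_corrected(**band) / reference
        for name, band in bands.items()
    }


# Hypothetical phosphorimager readings (arbitrary units).
control_band = {"raw": 10500.0, "background": 500.0, "n_uridines": 200}
lane = {
    "precursor": {"raw": 4500.0, "background": 500.0, "n_uridines": 200},
    "circRNA": {"raw": 2900.0, "background": 500.0, "n_uridines": 80},
}
print(fraction_of_input(lane, control_band))
```

Dividing by the uridine count matters because the label is incorporated at uridines, so a longer or more U-rich species would otherwise appear over-represented.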
cDNA can be generated from the self-splicing products according to the Takara RT-PCR manufacturer’s instructions. RNA samples (1 μg), 5x gDNA Eraser buffer, RT primer mix, 5x PrimeScript buffer 2 and PrimeScript RT enzyme mix I can be combined to synthesize cDNA at 37 ℃ for 15 minutes. The reaction can be inactivated by heating to 85 ℃ for 5 seconds. The cDNA products can then be amplified by PCR with 2x PrimeSTAR Max Premix and qPCR primers (a mixture of primer 1 and primer 2) using the following protocol: 95 ℃ for 1 minute as an initial denaturation step, followed by 30 cycles of denaturation at 95 ℃ for 15 seconds, annealing at 55 ℃ for 15 seconds and elongation at 72 ℃ for 30 seconds, with a final elongation at 72 ℃ for 10 minutes. The RT-PCR products can be resolved using 1% agarose gels, and the self-splicing site can be verified by Sanger sequencing.
8.7.3 Electron Microscopy (EM)
EM is an effective tool to visualize the structure of a cRNAzyme. An exemplary EM procedure is provided below to determine the structure of a group II intron.
For negative staining EM, 4 μl of the solution containing the cRNAzyme is applied to a glow-discharged holey carbon grid that is pre-coated with a thin layer of continuous carbon film over the holes. After 1 min, the grid is washed consecutively with three droplets of 2% (w/v) uranyl acetate solution for a total of 30 sec. After another 1 min, the residual stain is blotted off and the grid is air-dried. The EM data for the negatively-stained specimen are collected on an FEI Tecnai-12 transmission electron microscope, equipped with a LaB6 filament and operated at 120 kV acceleration voltage. Images are collected at a nominal magnification of 68,000×, using a total dose of about and defocus value ranging from -1.0 to -1.5 μm on a Gatan Ultrascan4000 CCD camera, with a pixel size of on the object scale. A total of 66 tilt-pair images of the specimen at 0° and 50° are recorded manually for the random-conical tilt 3D reconstruction.
For cryo-EM, frozen-hydrated specimens are prepared using the FEI Vitrobot Mark IV plunger. 4 μl of the diluted sample is placed on a glow-discharged holey carbon grid (Quantifoil Cu-Rh R1.2/1.3) pre-coated with continuous carbon film (with ~2 nm thickness) . The excess of solution from the grid is blotted for 2.0 s at 100%humidity at 22 ℃ before the grid is flash frozen into liquid ethane slush cooled at liquid nitrogen temperature. Cryo-EM data are collected on an FEI Titan Krios electron microscope, equipped with a Gatan K2 Summit direct-electron counting camera. The microscope is operated at 300 kV and images of the specimen are recorded with a defocus range of -1.2 to -3 μm at a calibrated magnification in super resolution mode of the K2 camera, yielding a pixel size of abouton the object scale. 1000 to 5000 movie stacks, each containing 32 sub-frames, are recorded using the semi-automated low-dose acquisition program UCSF-Image4, with an electron dose rate of 6.25 electrons perper second and total exposure time of 8 seconds.
8.8 Experimental
The examples provided below are for purposes of illustration only, which are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.
8.8.1 Example 1. Confirmation of group II self-splicing activity of cRNAzymes
This example relates to a method for determining the in vitro self-splicing activity of cRNAzymes provided herein.
A DNA sequence encoding a cRNAzyme provided herein is cloned into an expression vector, e.g., psiCHECK-2 (PromegaTM, C8021) . PCR amplification is performed to obtain template DNAs for transcription. The PCR reaction conditions are: 95℃ for 30 s, 60℃ for 20 s, and 72℃ for 60 s, for 23 to 25 cycles. The template DNAs obtained are extracted with phenol-chloroform at a volume ratio of 1 : 1, and then precipitated with 2.5 times by volume of absolute ethanol for purification.
The purified template DNAs are transcribed in vitro by T7 RNA polymerase. The transcripts are digested with DNase I at 37℃ for 30 min to degrade the PCR templates. The transcripts are then column purified to obtain high-purity RNAs.
The column-purified transcript RNAs are added to a self-splicing buffer (10, 20, 50 or 100 mM MgCl2, 50 mM NaCl, 40 mM Tris-HCl, pH = 7.5) for the self-splicing reaction. The reaction conditions are: 95℃ for 1 min, 75℃ to 45℃ (-0.5℃, 15 sec/cycle, for 60 cycles in total), holding at 45℃ with a buffer added, 45℃ for 5 min, and 53℃ for 15 to 30 min. Unspliced RNAs and spliced RNAs can be separated by electrophoresis. After the self-splicing reaction occurs in vitro, 200 ng of the product is used for electrophoresis analysis to detect the self-splicing efficiency of group II introns.
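The annealing ramp in the reaction conditions above can be written out as an explicit list of thermocycler steps. The sketch below assumes that each of the 60 cycles lowers the temperature by 0.5℃ and holds for 15 sec, so that the ramp ends at 45℃. The function name and the (temperature, seconds) tuple layout are hypothetical.

```python
def ramp_steps(start_c: float = 75.0, end_c: float = 45.0,
               step_c: float = -0.5, hold_s: int = 15) -> list:
    """Return a list of (temperature_C, hold_seconds) for a stepwise ramp.

    The first step is one increment away from `start_c`; the last equals `end_c`.
    """
    if step_c == 0:
        raise ValueError("step_c must be non-zero")
    n_exact = (end_c - start_c) / step_c
    n_steps = round(n_exact)
    if n_steps <= 0 or abs(n_exact - n_steps) > 1e-9:
        raise ValueError("ramp does not divide into a whole number of steps")
    return [(start_c + step_c * (i + 1), hold_s) for i in range(n_steps)]


steps = ramp_steps()
total_seconds = sum(hold for _, hold in steps)
print(len(steps), steps[0], steps[-1], total_seconds)
```

Under these assumptions the ramp has 60 steps and takes 900 sec (15 min) of hold time, which matches the 30℃ drop from 75℃ to 45℃ in 0.5℃ decrements.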
8.8.2 Example 2. Expression of the Therapeutic Product Using circRNAs disclosed herein
The expression of circRNA products of different target sequences after transfection of cells is further tested. To minimize immunodegradation caused by linear RNAs, the RNA products are treated in the following three steps prior to transfection.
(1) RNase R treatment is performed to digest linear RNAs. The reaction conditions are at 37℃ for 30 min.
(2) CIP treatment is performed to remove phosphate groups at both ends of the linear RNAs. The reaction conditions are to add quick CIP (NEB) , and react at 37℃ for 30 min.
(3) HPLC purification is performed to remove small linear RNAs. HPLC conditions: gel exclusion chromatographic column: Waters BEH450A, column temperature: 40℃, flow rate: 1 ml/min, and elution conditions: 0 to 30 min, 100% buffer A (prepared with 10 mM Tris, 0.5 mM EDTA, and DEPC-treated water).
The target RNAs are transfected using lipo RNAmax under the transfection conditions according to the supplier’s instructions, for 24 hours.
Using the construction methods described in the Example above, different target sequences are tested. To facilitate detection of protein expression, constructs comprise a fluorescent protein coding sequence.
For different target sequences, different methods are used to detect protein expression.
In the case that the translation product is GFP, the fluorescence is observed under a microscope. In addition, the cells are lysed with RIPA lysis solution (Beyotime) , and then the protein expression is detected by Western blotting.
In the case that the translation product is luciferase, the cells are first lysed with Passive lysis solution (PromegaTM) and then the protein expression is detected by a microplate reader using a luciferase detection kit (PromegaTM) .
8.8.3 Example 3. circRNAs mediate prolonged protein production
A major advantage of circular mRNAs is their superior stability owing to the lack of free ends; the circRNA should therefore have a better shelf life for protein expression compared to its linear counterparts. To test this directly, both linear and circular mRNAs encoding Gaussia luciferase (Gluc) are synthesized and stored in parallel in pure water at room temperature for different numbers of days before being transfected into 293T cells and/or A549 cells. The activity of the circRNA in directing protein translation is tested in comparison to that of the linear mRNA. Prolonged protein translation is expected for the circRNA.
8.8.4 Example 4. Purified circRNAs direct robust translation of target proteins
The mRNA purity is found to be a key factor for protein production and induction of innate immunity, as the removal of dsRNA by HPLC can eliminate immune activation and improve translation of linear nucleoside-modified mRNA (Kariko, K. et al., (2011)). To examine whether the circRNAs produced using the cRNAzymes disclosed herein induce an innate immune response and cell toxicity, the circRNAs can be purified by gel purification or HPLC, and the cellular immune response upon transfection of the circRNAs can be measured by testing cell survival and cytokine production (e.g., IL-10, IL-6, TNFα, MCP-1, IP-10, RIG-I, IFN-α2 and IFN-B1). In-life parameters are monitored, such as viability, clinical observations, local tolerance, necropsy observations, and cytokine detection. PK serum samples are collected at pre-dose (0 h), 6 h, 1 day (24 h), 2 day (48 h), 3 day (72 h), 5 day (120 h), 7 day (168 h), and 10 day (240 h). IVIS images are collected at pre-dose (0 h), 6 h, 1 day (24 h), 2 day (48 h), 3 day (72 h), 5 day (120 h), 7 day (168 h), and 10 day (240 h).
8.8.5 Example 5. LNP encapsulated circRNAs direct robust protein production in mouse
An important question for the therapeutic application of circRNAs is whether the production of circRNAs can be scaled up reliably and how reproducible it is between different production batches. To confirm the scalability of this system, the IVT and circularization reaction system can be expanded 50-fold (from 20 μl to 1 ml), and the circularization efficiency is tested. The efficacy is expected to stay essentially unchanged while the total amount of RNA products reaches 7.5 mg in a single reaction.
The circRNAs produced herein can be further encapsulated with lipid nanoparticles (LNP) for their in vivo delivery. The circRNAs in the aqueous solution are packed by ionizable cationic lipids, which form a nanoparticle with other lipid components such as DMG-PEG2000 and cholesterol (see methods), achieving ~95% encapsulation efficiency with an effective diameter of ~80 nm. Different ionizable cationic lipids can be tested in our formulation to encapsulate circRNAs that encode Gluc, and the resulting LNPs are administered to BALB/c mice and/or C57BL/6J mice through intramuscular (IM) or intraperitoneal (IP) injection (n=3 for each experimental group). Three formulations using different ionizable cationic lipids (MC3, SM-102 and ALC-0315) are tested in this experiment. The expression of luciferase is assayed using either the luciferase luminescence assay with serum or bioluminescence imaging of the animals.
8.8.6 Example 6. SDS-PAGE and RT-PCR
cDNA synthesis is carried out using self-splicing products as per the manufacturer's protocol. Initially, 1 microgram of RNA sample is mixed with 5x gDNA eraser buffer, RT primer mix, 5x PrimeScript buffer 2, and PrimeScript RT enzyme mix I. The mixture is then incubated at 37℃ for 15 minutes to facilitate cDNA synthesis. Subsequently, the reaction is terminated by a brief heat treatment at 85℃ for 5 seconds.
For the subsequent PCR amplification, the cDNA products are combined with 2x PrimeSTAR Max Premix and a mixture of qPCR primers (primer1 and primer2). The PCR protocol consists of an initial denaturation at 95℃ for 1 minute, followed by 30 cycles of denaturation at 95℃ for 15 seconds, annealing at 55℃ for 15 seconds, and elongation at 72℃ for 30 seconds. The final elongation step is performed at 72℃ for 10 minutes.
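The PCR program above can be captured as a small data structure, which also makes it easy to estimate the nominal run time. The sketch below sums only the programmed hold times and ignores instrument ramping between temperatures. The dictionary layout and function name are hypothetical.

```python
# Thermocycling program taken from the protocol text: (temperature_C, seconds).
PCR_PROGRAM = {
    "initial_denaturation": (95, 60),
    "cycles": 30,
    "cycle_steps": [
        ("denaturation", 95, 15),
        ("annealing", 55, 15),
        ("elongation", 72, 30),
    ],
    "final_elongation": (72, 600),
}


def total_hold_seconds(program: dict) -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(seconds for _, _, seconds in program["cycle_steps"])
    return (
        program["initial_denaturation"][1]
        + program["cycles"] * per_cycle
        + program["final_elongation"][1]
    )


seconds = total_hold_seconds(PCR_PROGRAM)
print(seconds, "s =", seconds / 60, "min")
```

With the values stated in the protocol, the holds add up to 60 + 30 × 60 + 600 = 2460 sec, i.e., 41 min before ramp time is taken into account.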
The RT-PCR products are then resolved on 1% agarose gels. The PCR is designed to amplify either a 300 or 1000 base pair DNA fragment from the RT products, with the Cte backbone serving as a positive control (the results are shown in FIGs. 14A and 14B). Verification of the self-splicing site is achieved through Sanger sequencing (see FIG. 14C).
9. Sequence Listing and Tables
Table 1.: Sequence Listing part 1
*As to the sequence of SEQ ID NO: 2 (Group II intron Cte_original) , the nucleotides from exon are represented by UPPER LETTERS. The nucleotides from exon and interacting with an EBS region are represented by UNDERLINED UPPER LETTERS. The six domains of the group II intron are represented by underlined lower letters. The EBS regions are indicated by italic bold underlined lower letters.
Table 2.: Group II Intron Sequences
*As to the sequences of SEQ ID NOs: 33-41 (Group II intron) , the nucleotides from exon are represented by UPPER LETTERS.
Table 3.: 3’ intron fragment
Table 4.: E2
Table 5.: E1
Table 6.: 5’ intron fragment
Table 7.: 5’ arm of target sequence
Table 8.: 3’ arm of target sequence
Table 9.: homology arm sequence
Table 10.: target sequence
Table 11.: amino acid sequence
Table 12.: EBS1
Table 13.: IBS1
Table 14.: δ and its upstream sequence
Table 15.: IBS3 and its downstream sequence
Table 16.: Group II intron sequence
*As to the sequence of Table 16 (SEQ ID NOs.: 135-142) , the nucleotides from exon are represented by UPPER LETTERS. The nucleotides from exon and interacting with an EBS region (IBS1, IBS3) are represented by UNDERLINED UPPER LETTERS. The six domains of the group II intron are each represented by underlined lower letters:
The bulged adenosine (A) on D6 is represented by an asterisk symbol (A*). The EBS regions (EBS1, EBS3) are indicated by italic bold underlined lower letters. The nucleotides from the δ region are represented by bold underlined lower letters.
* As to the sequence of Table 16 (SEQ ID NOs.: 264) , The six domains of the group II intron are each represented by underlined lower letters:
Table 17.: Artificial Group II intron sequence
* As to the sequence of Table 17 (SEQ ID NOs.: 143-145) , the nucleotides from exon are represented by UPPER LETTERS. The nucleotides from exon and interacting with an EBS region (IBS1, IBS3) are represented by UNDERLINED UPPER LETTERS. The six domains of the group II intron are each represented by underlined lower letters:
The bulged adenosine (A) on D6 is represented by an asterisk symbol (A*). The EBS regions (EBS1, EBS3) are indicated by italic bold underlined lower letters. The nucleotides from the δ region are represented by bold underlined lower letters.
Table 18.: Group II intron element
Table 19.: target sequence fragment
Table 20.: Z1 sequence
Table 21.: TI sequence
Table 22.: linker sequence
Table 23.: 3' intron fragment sequence and 5' intron fragment sequence
Table 24.: sequence combination
Table 25.: Eubacteria Introns
Table 26.: Archaea Introns
Table 27.: ORF-less Introns
Table 28.: Twintron (Outer intron)
Table 29.: Twintron (Inner intron)
Table 30.: Mitochondrial introns
Table 31.: Chloroplast introns
Table 32.: Bacterial introns Fragments
Table 33.: Archaebacteria Intron fragments
Table 34.: Exemplary sequence
*As to the sequence of Table 34, the nucleotides from IRES/IRES-like are represented by ITALIC UPPER LETTERS. The nucleotides from CDS are represented by lower letters.
Claims (83)
- A non-naturally occurring RNA comprising the following operably linked elements from 5’ to 3’:(1) a 3’ intron fragment;(2) a target sequence consisting of (i) a 3’ target sequence fragment and (ii) a 5’ target sequence fragment, from 5’ to 3’; and(3) a 5’ intron fragment;wherein the RNA has group II intron activity and, upon self-splicing, can form a circular RNA (circRNA) that comprises both the 5’ and 3’ target sequence fragments with the 3’-end of the 5’ target sequence fragment linked to the 5’-end of the 3’ target sequence fragment.
- The RNA of claim 1, wherein the circRNA comprises a translation initiation sequence (TI) and a protein-coding sequence (Z1) , wherein the 3’-end of TI is operatively linked to the 5’-end of Z1.
- The RNA of claim 2, wherein Z1 encodes a therapeutic product.
- The RNA of claim 2, wherein Z1 is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 107-112, 214-221 and 258-259.
- The RNA of claim 3, wherein the therapeutic product has an amino acid sequence selected from the group consisting of SEQ ID NOs: 113-118.
- The RNA of any one of claims 2 to 5, wherein TI comprises a sequence selected from the group consisting of: a spacer sequence of SEQ ID NOs: 4-6, a polyA sequence, a poly-A-C sequence, a poly-C sequence, a poly-U sequence, an internal ribosome entry site (IRES) , an IRES-like nucleotide sequence, a ribosome binding site, an aptamer sequence, an RNA scaffold, a riboswitch, a ribozyme other than a self-splicing ribozyme, an antisense oligonucleotide (ASO) , a scaffold, a small RNA binding site, a translational regulatory sequence, and a protein binding site.
- The RNA of claim 6, wherein TI comprises an IRES, an IRES-like nucleotide sequence, or a combination thereof.
- The RNA of claim 6, wherein TI is at least 95%, at least 98%, at least 99%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 222-225.
- The RNA of any one of claims 2 to 7, wherein the 3’ target sequence fragment comprises Z1 and the 5’ target sequence fragment comprises TI.
- The RNA of claim 8, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking Z1.
- The RNA of any one of claims 2 to 7, wherein the 3’ target sequence fragment comprises TI and the 5’ target sequence fragment comprises Z1.
- The RNA of claim 11, wherein the 3’ target sequence fragment further comprises two linkers (L) flanking TI.
- The RNA of any one of claims 2 to 7, wherein the 3’ target sequence fragment comprises, from 5’ to 3’, a 3’ fragment of TI (TIB) and Z1; and wherein the 5’ target sequence fragment comprises a 5’ fragment of TI (TIA) .
- The RNA of claim 13, wherein the 3’ target sequence further comprises two linkers (L) flanking Z1.
- The RNA of any one of claims 2 to 7, wherein the 3’ target sequence fragment comprises a 3’ fragment of Z1 (Z1B) ; and wherein the 5’ target sequence fragment comprises, from 5’ to 3’, TI and a 5’ fragment of Z1 (Z1A) .
- The RNA of claim 15, wherein the 3’ target sequence further comprises two linkers (L) flanking TI.
- The RNA of claim 1, having a structure selected from the group consisting of Formulae (I)-(IV): (I) 5’-(3’ IF)-(L)n-Z1-(L)n-TI-(5’ IF)-3’; (II) 5’-(3’ IF)-(L)n-TI-(L)n-Z1-(5’ IF)-3’; (III) 5’-(3’ IF)-TIB-(L)n-Z1-(L)n-TIA-(5’ IF)-3’; (IV) 5’-(3’ IF)-Z1B-(L)n-TI-(L)n-Z1A-(5’ IF)-3’; and wherein 3’ IF is the 3’ intron fragment; 5’ IF is the 5’ intron fragment; TI is a translation initiation sequence, which can be segmented into a 5’ fragment (TIA) and a 3’ fragment (TIB); Z1 is a protein-coding sequence, which can be segmented into a 5’ fragment (Z1A) and a 3’ fragment (Z1B); and each L is independently a linker sequence, and n=0, 1 or 2.
- The RNA of any one of claims 2 to 14, or of claim 17 having a structure selected from the group consisting of Formulae (I) - (III) , further comprising (1) an exon fragment 2 (E2) between the 3’ intron fragment and the target sequence, (2) an exon fragment 1 (E1) between the target sequence and the 5’ intron fragment; or (3) both (1) and (2) .
- The RNA of claim 18, wherein E2 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 53-63.
- The RNA of claim 18 or 19, wherein E1 has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 64-74.
- The RNA of any one of claims 18 to 20, wherein E1 and E2 together are 20 or less nucleotides in length.
- The RNA of any one of claims 1 to 17, wherein the circRNA consists of the target sequence.
- The RNA of any one of claims 1 to 22, wherein the 3’ intron fragment comprises D5-like sequence, and the 5’ intron fragment comprises a D1-like sequence.
- The RNA of claim 23, wherein the D1-like sequence comprises EBS1 sequence and EBS3 sequence that are each at least 60%complementarily paired with a region of a corresponding length flanking the target sequence.
- The RNA of claim 23, wherein the D1-like sequence comprises an EBS1 sequence and a δ nucleotide wherein the EBS1 sequence, and the δ nucleotide and the 4-10 nucleotides of its immediate upstream, are each at least 60%complementarily paired with a region of a corresponding length flanking a target sequence.
- The RNA of claim 23, wherein the D1-like sequence comprises EBS1’ sequence and EBS3’ sequence that are each at least 60%complementarily paired with a region of a corresponding length in the target sequence.
- The RNA of claim 23, wherein the D1-like sequence comprises an EBS1’ sequence and a δ” nucleotide wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides of its immediate upstream, are each at least 60%complementarily paired with a region of a corresponding length in a target sequence.
- The RNA of claim 26 or 27, wherein the complementarily paired regions are located at one or both ends of the target sequence.
- The RNA of any one of claims 23 to 28, wherein the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265.
- The RNA of any one of claims 23 to 29, wherein the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100%identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269.
- The RNA of any one of claims 23 to 30, wherein the 3’ intron fragment further comprises a D6-like sequence at the 3’ end of the D5-like sequence, wherein the D6-like sequence comprises a bulged adenosine or an atypical bulged adenosine.
- The RNA of claim 31, wherein the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270.
- The RNA of any one of claims 23 to 32, wherein the 5’ intron fragment further comprises a D2-like sequence or a D3-like sequence at the 3’ end of the D1-like sequence.
- The RNA of claim 33, wherein the 5’ intron fragment further comprises, from 5’ to 3’, a D2-like sequence and a D3-like sequence at the 3’ end of the D1-like sequence.
- The RNA of claim 33 or 34, wherein the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, 206, and 266, and the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, 207, and 267.
- The RNA of any one of claims 23 to 35, further comprising a 3’ D4 stem-like sequence at the 5’ end of the 3’ intron fragment and a 5’ D4 stem-like sequence at the 3’ end of the 5’ intron fragment, wherein the pair of D4 stem-like sequences each has a region that is 10-200 or 30-60 nucleotides in length, and the regions are at least 60% complementarily paired (a complementary region).
- The RNA of claim 36, wherein the 3’ and 5’ D4 stem-like sequences have two or more complementary regions.
- The RNA of any one of claims 1 to 8, comprising a structure selected from the group consisting of Formulae (1)-(12): (1) 5’-D5L-TS-D1L-3’; (2) 5’-D5L-TS-D1L-D2/D3L-3’; (3) 5’-D5L-TS-D1L-D2L-D3L-3’; (4) 5’-(3’ D4L)-D5L-TS-D1L-(5’ D4L)-3’; (5) 5’-(3’ D4L)-D5L-TS-D1L-D2/D3L-(5’ D4L)-3’; (6) 5’-(3’ D4L)-D5L-TS-D1L-D2L-D3L-(5’ D4L)-3’; (7) 5’-D5L-D6L-TS-D1L-3’; (8) 5’-D5L-D6L-TS-D1L-D2/D3L-3’; (9) 5’-D5L-D6L-TS-D1L-D2L-D3L-3’; (10) 5’-(3’ D4L)-D5L-D6L-TS-D1L-(5’ D4L)-3’; (11) 5’-(3’ D4L)-D5L-D6L-TS-D1L-D2/D3L-(5’ D4L)-3’; and (12) 5’-(3’ D4L)-D5L-D6L-TS-D1L-D2L-D3L-(5’ D4L)-3’; wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D2L is D2-like sequence; D3L is D3-like sequence; D2/D3L is D2/D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
- The RNA of claim 38, comprising a structure selected from the group consisting of Formulae (1)-(8): (1) 5’-D5L-TS-D1L-3’; (2) 5’-D5L-TS-D1L-D3L-3’; (3) 5’-(3’ D4L)-D5L-TS-D1L-(5’ D4L)-3’; (4) 5’-(3’ D4L)-D5L-TS-D1L-D3L-(5’ D4L)-3’; (5) 5’-D5L-D6L-TS-D1L-3’; (6) 5’-D5L-D6L-TS-D1L-D3L-3’; (7) 5’-(3’ D4L)-D5L-D6L-TS-D1L-(5’ D4L)-3’; and (8) 5’-(3’ D4L)-D5L-D6L-TS-D1L-D3L-(5’ D4L)-3’; wherein TS is the target sequence; (3’ D4L) is 3’ D4 stem-like sequence; (5’ D4L) is 5’ D4 stem-like sequence; D1L is D1-like sequence; D3L is D3-like sequence; D5L is D5-like sequence; and D6L is D6-like sequence.
- The RNA of claim 38 or 39, wherein: (1) the D1-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 147, 159, 170, 184, 194, 199, 205, and 265; (2) the D5-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 154, 165, 168, 179, 191, 197, 203, 209, and 269; (3) the D6-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 155, 166, 180, 192, 198, 204, 210, and 270; or (4) (a) the D2-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 151, 162, 176, 188, 195, 200, 206, and 266; or (b) the D3-like sequence has a nucleotide sequence that is at least 85%, at least 90%, at least 95%, at least 98%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 152, 163, 177, 189, 196, 201, 207, and 267; or both (a) and (b); or any combination of (1)-(4).
- The RNA of any one of claims 1 to 40, further comprising a 5’ homology arm operatively linked to the 5’-end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’-end of the 5’ intron fragment.
- The RNA of claim 41, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- The RNA of claim 41 or 42, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
- The RNA of claim 43, wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2).
- The RNA of any one of claims 1 to 44, wherein the RNA has group IIB intron activity.
- The RNA of any one of claims 1 to 22, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at an unpaired region into two fragments.
- The RNA of claim 46, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a loop region of a stem-loop structure of D1, D2, D3, D4, D5, or D6.
- The RNA of claim 46, wherein the 5’ intron fragment and the 3’ intron fragment are obtained by segmenting a group II intron at a linear region between D1 and D2, between D2 and D3, between D3 and D4, between D4 and D5, or between D5 and D6.
- The RNA of any one of claims 46 to 48, wherein the group II intron comprises a modification of one or more nucleotides relative to its wild-type form, and the modification is selected from one or more of a deletion, a substitution, and an addition.
- The RNA of claim 49, wherein the modification comprises a deletion of part or all of D4, such as a deletion of an intron-encoded protein (IEP) sequence in D4, preferably a deletion of all of D4.
- The RNA of claim 49, wherein the modification comprises a deletion of an open reading frame (ORF).
- The RNA of any one of claims 49 to 51, wherein the D1 of the group II intron comprises an EBS1 sequence and an EBS3 sequence that are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- The RNA of any one of claims 49 to 51, wherein the D1 of the group II intron comprises an EBS1 sequence and a δ nucleotide, wherein the EBS1 sequence, and the δ nucleotide and the 4-10 nucleotides of its immediate upstream, are each at least 60% complementarily paired with a region of a corresponding length flanking the target sequence.
- The RNA of any one of claims 49 to 51, wherein the D1 of the group II intron comprises an EBS1’ sequence and an EBS3’ sequence that are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- The RNA of any one of claims 49 to 51, wherein the D1 of the group II intron comprises an EBS1’ sequence and a δ” nucleotide, wherein the EBS1’ sequence, and the δ” nucleotide and the 4-10 nucleotides of its immediate upstream, are each at least 60% complementarily paired with a region of a corresponding length in the target sequence.
- The RNA of claim 54 or 55, wherein the complementarily paired regions are located at one or both ends of the target sequence.
- The RNA of any one of claims 46 to 56, further comprising a 5’ homology arm operatively linked to the 5’-end of the 3’ intron fragment, and a 3’ homology arm operatively linked to the 3’-end of the 5’ intron fragment.
- The RNA of claim 57, wherein the 5’ homology arm, the 3’ homology arm, or both are 15 to 60 nucleotides in length.
- The RNA of claim 57 or 58, wherein the 5’ and the 3’ homology arms have up to 10% base mismatches.
- The RNA of claim 57, wherein (1) the 5’ homology arm has the nucleotide sequence of SEQ ID NO: 105; or (2) the 3’ homology arm has the nucleotide sequence of SEQ ID NO: 106; or both (1) and (2).
- The RNA of any one of claims 46 to 60, wherein the group II intron is a group II intron derived from a microorganism.
- The RNA of any one of claims 46 to 61, wherein the group II intron is a group IIB intron.
- The RNA of claim 62, wherein the group II intron is Cte 1.
- The RNA of claim 62, wherein the group II intron is CL.
- The RNA of any one of claims 46 to 60, wherein the group II intron has a nucleotide sequence selected from the group consisting of SEQ ID NOs: 33-41, 135-145, and 264.
- The RNA of any one of claims 46 to 60, wherein the 3’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 42-52 and 228.
- The RNA of any one of claims 46 to 60, wherein the 5’ intron fragment has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 75-88 and 229.
- The RNA of claim 1, wherein the RNA has a nucleotide sequence that is at least 95%, at least 98%, at least 99%, or 100% identical to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 230-264.
- The RNA of any one of claims 1 to 67, comprising a modified RNA nucleotide and/or modified nucleoside.
- The RNA of claim 68, comprising at least 10% modified RNA nucleotides and/or modified nucleosides.
- The RNA of claim 68, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is m5C (5-methylcytidine), m5U (5-methyluridine), m6A (N6-methyladenosine), Y (pseudouridine), or m1A (1-methyladenosine).
- The RNA of any one of claims 68 to 70, wherein at least one of the modified RNA nucleotide and/or modified nucleoside is introduced during in vitro transcription (IVT).
- A circRNA produced by the self-splicing of the RNA of any one of claims 1 to 72.
- A vector encoding the RNA of any one of claims 1 to 72.
- A cell comprising the RNA of any one of claims 1 to 72, the circRNA of claim 73, or the vector of claim 74.
- A method of making a circRNA comprising subjecting the RNA of any one of claims 1 to 72 under conditions sufficient for it to self-splice.
- A method of expressing a protein in a cell comprising transfecting the cell with the circRNA of claim 73.
- The method of claim 77, wherein the cell is a hepatocyte, epithelial cell, hematopoietic cell, endothelial cell, lung cell, bone cell, stem cell, mesenchymal cell, neural cell (e.g., meninge, astrocyte, motor neuron, cell of the dorsal root ganglia and anterior horn motor neuron), photoreceptor cell (e.g., rod and cone), retinal pigmented epithelial cell, secretory cell, cardiac cell, adipocyte, vascular smooth muscle cell, cardiomyocyte, skeletal muscle cell, beta cell, pituitary cell, synovial lining cell, ovarian cell, testicular cell, fibroblast, B cell, T cell, dendritic cell, macrophage, reticulocyte, leukocyte, granulocyte, tumor cell, NK cell, liver starlet cell, HEK293, HEK293T, HeLa, MCF7, PC3, A549, NCI-H727, HCT-116, MCF10A, HPReC, FHC, immortalized cell lines, primary cell, yeast cell (e.g., Saccharomyces cerevisiae and Pichia pastoris), bacteria cell (e.g., Escherichia coli), insect cell (e.g., Spodoptera frugiperda sf9, Mimic Sf9 and sf21), or Drosophila S2.
- A method of expressing a protein in vivo comprising administering to a subject the circRNA of claim 73 or the vector of claim 74.
- A method of expressing an RNA in vivo comprising administering to a subject the vector of claim 74.
- The method of claim 79 or 80, wherein the subject is a human.
- A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence of SEQ ID NO: 263.
- A non-naturally occurring RNA, wherein the RNA has a nucleotide sequence comprising the following elements: 5’ HBB UTR, CDS, 3’ HBB UTR and polyA, wherein the CDS encodes a protein that is not HBB.
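
Several of the claims above recite numeric thresholds, for example EBS sequences that are "at least 60% complementarily paired" with a region of a corresponding length, and homology arms that are 15 to 60 nucleotides in length with up to 10% base mismatches. The following Python sketch shows one way such thresholds could be checked computationally. It is illustrative only and is not part of the claims or of the applicants' method. The function names `percent_complementary` and `homology_arms_ok` are hypothetical, pairing is assumed to be antiparallel Watson-Crick only (G-U wobble pairs are counted as mismatches), and the two homology arms are assumed to be of equal length and to pair with each other.

```python
# Illustrative sketch only; names and pairing rules are assumptions,
# not definitions taken from the patent text.

# Watson-Crick RNA base pairs (assumption: G-U wobble is NOT counted).
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementary(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which seq_a pairs with seq_b.

    Both sequences are written 5' to 3'. Pairing is antiparallel, so
    position i of seq_a is compared with position (n - 1 - i) of seq_b.
    DNA-style 'T' is treated as 'U'.
    """
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((x, y) in WC_PAIRS for x, y in zip(a, reversed(b)))
    return 100.0 * paired / len(a)


def homology_arms_ok(
    arm5: str,
    arm3: str,
    min_len: int = 15,
    max_len: int = 60,
    max_mismatch_pct: float = 10.0,
) -> bool:
    """Check arm length (15 to 60 nt) and mismatch level (up to 10%).

    Assumes equal-length arms; unequal lengths raise ValueError via
    percent_complementary.
    """
    for arm in (arm5, arm3):
        if not (min_len <= len(arm) <= max_len):
            return False
    mismatch_pct = 100.0 - percent_complementary(arm5, arm3)
    return mismatch_pct <= max_mismatch_pct
```

Under these assumptions, `percent_complementary("AAAAA", "UUUUC")` evaluates to 80.0, which would meet a 60% threshold, and `homology_arms_ok` returns `False` for any arm shorter than 15 nucleotides.
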
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNPCT/CN2023/123553 | 2023-10-09 | | |
| CN2023123553 | 2023-10-09 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025077734A1 (en) | 2025-04-17 |
Family
ID=95396557
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/123677 Pending WO2025077734A1 (en) | 2023-10-09 | 2024-10-09 | Constructs and methods for preparing circular rnas and uses thereof |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025077734A1 (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5498531A (en) * | 1993-09-10 | 1996-03-12 | President And Fellows Of Harvard College | Intron-mediated recombinant techniques and reagents |
| WO2020186991A1 (en) * | 2019-03-20 | 2020-09-24 | 中国科学院上海营养与健康研究所 | Protein translation using circular rna and application thereof |
| CN114438127A (en) * | 2022-03-02 | 2022-05-06 | 苏州科锐迈德生物医药科技有限公司 | A kind of recombinant nucleic acid molecule and its application in the preparation of circular RNA |
| CN114574483A (en) * | 2022-03-02 | 2022-06-03 | 苏州科锐迈德生物医药科技有限公司 | Recombinant nucleic acid molecule based on point mutation of translation initiation element and application thereof in preparation of circular RNA |
| CN115404240A (en) * | 2021-05-28 | 2022-11-29 | 上海环码生物医药有限公司 | Constructs, methods for making circular RNA and uses thereof |
| WO2023046153A1 (en) * | 2021-09-26 | 2023-03-30 | Center For Excellence In Molecular Cell Science, Chinese Academy Of Sciences | Circular rna and preparation method thereof |
| CN116323945A (en) * | 2020-06-25 | 2023-06-23 | 利兰·斯坦福青年大学托管委员会 | Genetic elements driving translation of circular RNAs and methods of use |

2024-10-09: WO application PCT/CN2024/123677 (WO2025077734A1), status active, pending
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5498531A (en) * | 1993-09-10 | 1996-03-12 | President And Fellows Of Harvard College | Intron-mediated recombinant techniques and reagents |
| WO2020186991A1 (en) * | 2019-03-20 | 2020-09-24 | 中国科学院上海营养与健康研究所 | Protein translation using circular rna and application thereof |
| CN116323945A (en) * | 2020-06-25 | 2023-06-23 | 利兰·斯坦福青年大学托管委员会 | Genetic elements driving translation of circular RNAs and methods of use |
| CN115404240A (en) * | 2021-05-28 | 2022-11-29 | 上海环码生物医药有限公司 | Constructs, methods for making circular RNA and uses thereof |
| WO2022247943A1 (en) * | 2021-05-28 | 2022-12-01 | Shanghai Circode Biomed Co., Ltd. | Constructs and methods for preparing circular rnas and use thereof |
| WO2023046153A1 (en) * | 2021-09-26 | 2023-03-30 | Center For Excellence In Molecular Cell Science, Chinese Academy Of Sciences | Circular rna and preparation method thereof |
| CN114438127A (en) * | 2022-03-02 | 2022-05-06 | 苏州科锐迈德生物医药科技有限公司 | A kind of recombinant nucleic acid molecule and its application in the preparation of circular RNA |
| CN114574483A (en) * | 2022-03-02 | 2022-06-03 | 苏州科锐迈德生物医药科技有限公司 | Recombinant nucleic acid molecule based on point mutation of translation initiation element and application thereof in preparation of circular RNA |
| US20230279389A1 (en) * | 2022-03-02 | 2023-09-07 | Purecodon (Hong Kong) Biopharma Ltd. | Recombinant nucleic acid molecule and application thereof in preparation of circular rna |
Non-Patent Citations (6)
| Title |
|---|
| "Ribozymes", 14 September 2021, WILEY, ISBN: 978-3-527-81452-7, article CHILLÓN ISABEL, MARCIA MARCO: "Introns", pages: 143 - 167, XP093303029, DOI: 10.1002/9783527814527.ch6 * |
| CHEN CHUYUN, WEI HUANHUAN, ZHANG KAI, LI ZEYANG, WEI TONG, TANG CHENXIANG, YANG YUN, WANG ZEFENG: "A flexible, efficient, and scalable platform to produce circular RNAs as new therapeutics", BIORXIV, 1 June 2022 (2022-06-01), XP055962269, Retrieved from the Internet <URL:https://www.biorxiv.org/content/10.1101/2022.05.31.494115v2.full.pdf> [retrieved on 20220919], DOI: 10.1101/2022.05.31.494115 * |
| MIKHEEVA, S. ET AL.: "Use of an engineered ribozyme to produce a circular human exon", NUCLEIC ACIDS RESEARCH, vol. 25, no. 24, 31 December 1997 (1997-12-31), XP002137476, DOI: 10.1093/nar/25.24.5085 * |
| PYLE ANNA MARIE: "Group II Intron Self-Splicing", ANNUAL REVIEW OF BIOPHYSICS, vol. 45, no. 1, 5 July 2016 (2016-07-05), pages 183 - 205, XP093303030, ISSN: 1936-122X, DOI: 10.1146/annurev-biophys-062215-011149 * |
| R. ALEXANDER WESSELHOEFT, PIOTR S. KOWALSKI, DANIEL G. ANDERSON: "Engineering circular RNA for potent and stable translation in eukaryotic cells", NATURE COMMUNICATIONS, vol. 9, no. 1, 1 December 2018 (2018-12-01), XP055622096, DOI: 10.1038/s41467-018-05096-6 * |
| SONJA PETKOVIC, SABINE MÜLLER: "RNA circularization strategies in vivo and in vitro", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, GB, vol. 43, no. 4, 27 February 2015 (2015-02-27), pages 2454 - 2465, XP055488942, ISSN: 0305-1048, DOI: 10.1093/nar/gkv045 * |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2022247943A1 (en) | Constructs and methods for preparing circular rnas and use thereof | |
| US11352641B2 (en) | Circular RNA for translation in eukaryotic cells | |
| ES2958832T3 (en) | Plasmid containing a sequence coding for an mRNA with a cleaved poly(A) tail | |
| US10927383B2 (en) | Cas9 mRNAs | |
| CN109154001B (en) | UTR sequence | |
| US9879254B2 (en) | Targeting RNAs to microvesicles | |
| EP2914721B1 (en) | A rna trans-splicing molecule (rtm) for use in the treatment of cancer | |
| EP4314265A2 (en) | Novel crispr enzymes, methods, systems and uses thereof | |
| WO2023231959A2 (en) | Synthetic circular rna compositions and methods of use thereof | |
| WO2025077734A1 (en) | Constructs and methods for preparing circular rnas and uses thereof | |
| TW202338091A (en) | Site-specific recombinases for efficient and specific genome editing | |
| WO2025218812A1 (en) | Novel internal ribosome entry sites and uses thereof | |
| CN120608081A (en) | Constructs, methods and uses of circular RNA | |
| CN120608082A (en) | Constructs, methods and uses for preparing circular RNA | |
| WO2025011529A2 (en) | Circular rna vaccines for seasonal flu and methods of uses | |
| US20250332287A1 (en) | Identification of tissue-specific extragenic safe harbors for gene therapy approaches |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24876547; Country of ref document: EP; Kind code of ref document: A1 |