
WO2025140537A1 - RNA circularization - Google Patents

RNA circularization

Info

Publication number
WO2025140537A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
nucleotide
seq
igs
intron
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2024/143138
Other languages
English (en)
Inventor
Shaojun QI
Peng Gao
Bo YING
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Abogen Biosciences Co Ltd
Original Assignee
Suzhou Abogen Biosciences Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Abogen Biosciences Co Ltd
Publication of WO2025140537A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/64 General methods for preparing the vector, for introducing it into the cell or for selecting the vector-containing host
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67 General methods for enhancing the expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/26 Preparation of nitrogen-containing carbohydrates
    • C12P19/28 N-glycosides
    • C12P19/30 Nucleotides
    • C12P19/34 Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/12 Type of nucleic acid catalytic nucleic acids, e.g. ribozymes

Definitions

  • the disclosure relates to novel RNA constructs encoding foreign proteins or functional RNAs, with a circularization system based on group I introns, which are capable of self-circularizing with high efficiency without introducing extraneous fragments, as well as to methods of using the constructs to make circular RNAs.
  • mRNA messenger RNA
  • IVTT In vitro transcribed mRNAs
  • mRNA vaccines have been rapidly validated for their safety and efficacy in combating infectious diseases such as Covid-19.
  • Nonvaccine mRNA therapies such as protein replacement are limited by several factors, including mRNA stability, poor persistence of expression in vivo, immunogenicity, and a limited range of expressing cell types.
  • Linear single-stranded mRNA requires a 5' cap and a 3' poly(A) tail, and in some cases modified nucleotides such as 1mΨ, to ensure stability and expression in vivo while reducing the risk of unwanted immunogenicity. Even with these modifications, mRNA is susceptible to exonuclease digestion, resulting in a short half-life both in vitro and in vivo.
  • Circular RNA (circRNA) is a type of single-stranded RNA that forms a 3’-5’ covalently closed loop. CircRNAs are created by a non-canonical splicing process termed "backsplicing", whereby the spliceosome fuses a splice donor site in a downstream exon (5’ splice site) to a splice acceptor site in an upstream exon (3’ splice site). Unlike linear mRNAs, circRNAs do not require a 5’-cap or 3’-poly(A) tail for their stability.
  • The closed ring structure of circRNAs protects them from exonuclease-mediated degradation, rendering them resistant to several mechanisms of RNA turnover and giving them a 2.5-fold longer half-life than their linear mRNA counterparts. Moreover, circRNAs have beneficial features not shared by mRNAs, such as reduced immunogenicity and extended translation duration. For these reasons, circRNAs have been explored as therapeutic agents.
  • CircRNAs are generally noncoding, as they lack the 5’-cap structure, but several studies have provided evidence that some circRNAs can be translated into proteins.
  • Engineered circRNAs with cap-independent translation elements such as internal ribosome entry sites (IRES) or N6-methyladenosine (m6A) modifications can also facilitate protein translation in vivo.
  • circRNAs can also be delivered via lipid nanoparticles (LNPs) to provide in vivo expression, which may be more sustained than linear mRNAs.
  • LNPs lipid nanoparticles
  • CircRNAs can be generated post-transcriptionally in living cells by plasmids carrying minigene sequences. Since spliceosome-mediated backsplicing is a major mechanism of circularization in vivo, most circRNA minigenes have at least exonic regions containing the sequence to be circularized, as well as 5' and 3' flanking intronic sequences containing splicing motifs. However, this vector transcription-dependent circularization can still produce variable amounts of unwanted heterologous by-products that cannot be easily identified or purified in vivo. In addition, this approach requires plasmid vectors to be efficiently delivered into the nucleus, making technical development difficult, while double-stranded DNAs also carry the risk of integrating into the genome.
  • Protein ligase and ribozyme assays are commonly used for in vitro preparation of circRNAs.
  • Enzyme ligation-mediated circularization usually requires a complementary splint (a DNA or RNA oligo) to bring the two ends of the RNA molecule close together, followed by ligation catalyzed by one of several enzymes from bacteriophage T4, including T4 DNA ligase, T4 RNA ligase 1, and T4 RNA ligase 2.
  • Ribozyme-mediated RNA circularizations can also be performed by the permuted intron and exon (PIE) method based on the group I intron or group II intron self-splicing system.
  • PIE permuted intron and exon
  • Group I introns are naturally occurring cis-splicing ribozymes that can splice an RNA transcript and remove themselves from the primary transcript by autocatalyzing two consecutive transesterification reactions and joining the two flanking exons (see FIG. 67).
  • Native group I introns do not require assistance from the spliceosome or other proteins to self-splice but rely on magnesium and free guanosine nucleotides to initiate and complete the reaction. This process leads to ligation of the exons flanking the intron and circularization of the internal intron to generate an intronic circRNA.
  • Helices P1 to P9 (and the intervening junctions and loops) assemble to form the catalytic core of group I introns.
  • helix P1 comprises at least 4-6 base pairs from the 5' intron and 5' exon, ending with a conserved G-U wobble base pair (5'-GNNNNN-3' in the intron or 5'-NNNNNU-3' in the exon), which contributes to 5' splice site recognition.
  • the P1 extension region (or “P1ex”) is important for the 5' splicing reaction rate and splice site recognition.
  • the sequence 'GNNNNN' is also known as the internal guide sequence (IGS).
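The P1 pairing rule described above (a 5'-NNNNNU-3' exon sequence pairing with a 5'-GNNNNN-3' IGS through a terminal G-U wobble) can be sketched in a few lines of Python. This is an illustrative sketch only, not a procedure given in the disclosure: the function name `design_igs` is hypothetical, it assumes a target site that ends in U, and it assumes every other position pairs by strict Watson-Crick complementarity.

```python
# Watson-Crick partners for RNA bases.
WC = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_igs(target_site: str) -> str:
    """Return a 5'->3' IGS that pairs antiparallel with `target_site`.

    Hypothetical helper: assumes the target site ends in U, which is
    matched by a G at the 5' end of the IGS (G-U wobble pair).
    """
    site = target_site.upper()
    if len(site) < 2 or site[-1] != "U":
        raise ValueError("target site must end in U, e.g. 'NNNNNU'")
    if any(base not in WC for base in site):
        raise ValueError("target site must contain only A, C, G, U")
    # Read the remaining target bases 3'->5' and complement each one.
    paired = "".join(WC[base] for base in reversed(site[:-1]))
    return "G" + paired


print(design_igs("ACGUAU"))
```

The returned string is written 5' to 3', so its first base (G) sits opposite the last base (U) of the target site.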
  • helix P10 is formed after the first step of splicing and involves base pairing between the 3' intron and 3' exon.
  • the 3' splice site is partially recognized through a conserved guanine at the 3' end of the intron, termed Omega G (ΩG).
  • ΩG Omega G
  • the 3' splice site accuracy can be improved by introducing or enhancing P9.0 or even P9.2 structures.
  • This method achieves RNA circularization by a regular group I intron self-splicing reaction that includes two transesterifications at defined splice sites. Attack of the 5' splice site by free GTP leads to the release of the 3' end sequence (5' half intron) of the PIE construct (first transesterification). The free 3'-OH group of the newly generated 3' half exon attacks the 3' splice site in the second transesterification reaction. This leads to the release of circRNA and 3' half intron.
  • the PIE method can be used to circularize larger linear RNA precursors; it does not require an additional protein ligase, and the reaction conditions and purification methods are easier to develop and optimize.
  • Circular RNAs encoding foreign proteins synthesized by the PIE method have been validated both in vitro and in vivo and retain the characteristics of low immunogenicity and longer translation duration, which broaden their applications. Based on these advantages, the PIE system is currently the most studied and widely used method for RNA circularization. Although the PIE system can achieve circularization of long fragments more efficiently than ligase-mediated methods, the splicing reaction introduces additional fragments (E1, E2, and spacer) from phage or Anabaena exons that may activate immune responses.
  • a PIE-group II intron system can achieve scarless circularization by optimizing exon binding sites (EBS) sequences to match the intron binding sites (IBS) .
  • EBS exon binding sites
  • IBS intron binding sites
  • the PIE system splits the ribozyme into two parts placed at the RNA construct's 3' and 5' terminals, which requires that the ribozyme fragments at both ends are correctly folded and spatially brought closer to form the complete ribozyme catalytic domain.
  • the structure of the internal sequences may interfere with the ribozyme structure at both ends, which requires additional spacer sequences to separate the internal sequences and the ribozyme fragments at both ends.
  • the disclosure provides novel RNA constructs (also referred to as “circular RNA precursors”) encoding foreign proteins or functional RNAs, with a circularization system based on group I introns that differs from the PIE constructs, e.g., in having an intact ribozyme core.
  • the RNA constructs are capable of self-circularizing with high efficiency without introducing extraneous fragments.
  • the novel RNA construct comprising, a first recognizer sequence (R1) comprising a first pairing sequence; a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end; a ribozyme core sequence operably linked to an internal guide sequence (IGS) , wherein the ribozyme core sequence encodes a ribozyme core having the catalytic activity of a group I intron ribozyme; and a second recognizer sequence (R2) comprising a second pairing sequence substantially complementary to the first pairing sequence; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non- Watson-Crick base pair to define a 5' splice site; R1 and R2 are positioned at opposite ends of the RNA construct, such that hybridization of the first and second pairing sequences results in the formation of a duplex-containing structure to define a 3’ splice site; the GOI is positioned 5’ to the
  • This design retains the complete core domain of the ribozyme, which is more conducive to the correct folding of the ribozyme than the traditional PIE system.
  • This method can also circularize the nucleotide sequence of interest without including exogenous sequence residues, by mimicking the formation of a P1 duplex. An arbitrary sequence (for example, 'nnnnnu' or 'nnnnnc') in the nucleotide sequence to be circularized is selected as the target site sequence; the only requirement is that the target site sequence is unique in the RNA construct. The sequence from downstream of the target site to the 3’ end of the nucleotide sequence to be circularized is placed at the 5’ region of the GOI, and the sequence from the 5’ end of the nucleotide sequence to be circularized through the target site is placed at the 3’ region of the GOI. A corresponding IGS is then designed.
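The rearrangement described above, in which the sequence to be circularized is rotated so that the chosen target site ends up at the 3' end of the GOI, can be illustrated with a short Python sketch. The function name `permute_goi` and the example sequence are hypothetical. The sketch checks uniqueness of the target site only within the sequence passed in, whereas the text requires uniqueness across the whole RNA construct.

```python
def permute_goi(sequence: str, target_site: str) -> str:
    """Rotate `sequence` so that `target_site` ends at the 3' end.

    Hypothetical helper. The part downstream of the target site moves
    to the 5' region; the part from the 5' end through the target site
    moves to the 3' region.
    """
    seq, site = sequence.upper(), target_site.upper()
    # Count overlapping occurrences so that a repeated site is rejected.
    starts = [i for i in range(len(seq) - len(site) + 1)
              if seq.startswith(site, i)]
    if len(starts) != 1:
        raise ValueError(
            f"target site occurs {len(starts)} times; need exactly 1")
    end = starts[0] + len(site)
    return seq[end:] + seq[:end]


goi = permute_goi("AAACCGUUGG", "CCGU")
print(goi)  # the permuted sequence ends with the target site
```

Joining the 3' end of the permuted sequence back to its 5' end restores the original order of nucleotides, which is why no extra residues are introduced by the permutation itself.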
  • the RNA construct comprises, from 5’ end to 3’ end, R1 comprising a first pairing sequence and a 3’ end nucleotide 'N' (ΩN); GOI comprising a target site at its 3’ end; IGS; ribozyme core sequence; and R2 comprising a second pairing sequence; wherein ΩN is any naturally occurring or modified nucleotide; and the first pairing sequence and the second pairing sequence are substantially complementary to form a duplex-containing structure upstream of the ΩN to define the 3' splice site.
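The 5'-to-3' element order described above can be captured in a small data structure. This is a minimal Python sketch that assumes nothing beyond the listed order: the class name `RnaConstruct`, its field names, and the placeholder sequences are all hypothetical, and optional elements such as spacers or homology arms are left out.

```python
from dataclasses import dataclass


@dataclass
class RnaConstruct:
    """Hypothetical container for the elements, listed 5' to 3'."""
    r1: str             # first recognizer, ending in the omega nucleotide
    goi: str            # sequence of interest, target site at its 3' end
    igs: str            # internal guide sequence
    ribozyme_core: str  # group I intron ribozyme core
    r2: str             # second recognizer

    def precursor(self) -> str:
        """Concatenate the elements in 5'->3' order."""
        return self.r1 + self.goi + self.igs + self.ribozyme_core + self.r2

    def omega_nucleotide(self) -> str:
        """Return the 3' end nucleotide of R1."""
        return self.r1[-1]


# Placeholder sequences for illustration only, not taken from the disclosure.
construct = RnaConstruct(r1="AAG", goi="UGGAAACCGU", igs="GCGGUU",
                         ribozyme_core="CCCC", r2="CUU")
print(construct.precursor())
print(construct.omega_nucleotide())
```

Keeping the elements as separate fields makes it easy to swap one element, for example a different omega nucleotide at the end of `r1`, without touching the others.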
  • the disclosure further provides an RNA construct comprising, from 5’ end to 3’ end, a first recognizer sequence (R1) comprising a nucleotide sequence '(Nx)s(Ny)t(ΩN)' at its 3’ end; a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end; an internal guide sequence (IGS); a ribozyme core sequence encoding a ribozyme core which has the catalytic activity of a group I intron ribozyme; and a second recognizer sequence (R2) comprising a nucleotide sequence '(nx)w'; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non-Watson-Crick base pair to define a 5' splice site; ΩN, 'Nx', 'nx', and
  • ΩN is guanine (ΩG). In some embodiments, ΩN is cytosine (ΩC). In some other embodiments, ΩN is adenine (ΩA). In some other embodiments, ΩN is uracil (ΩU).
  • the ribozyme core sequence comprises a nucleotide sequence encoding the scaffold domain and catalytic domain of a group I intron; preferably, the ribozyme core sequence comprises or consists of the sequence from the IGS end to the sequence before the 5’ half of P9.0 duplex of a group I intron.
  • the ribozyme core sequence is derived from a group IC1 (e.g., from Tetrahymena sp. (e.g., T. thermophila, T. cosmopolitanis, T. hyperangularis, T. malaccensis or T. pigmentosa) or Pneumocystis sp. (e.g., Pneumocystis carinii)), IC2, IC3 (e.g., from Anabaena sp. PCC7120 or Azoarcus sp. BH72) or IA2 (e.g., from Bacteriophage Twort) intron.
  • RNA construct comprising, from 5’ end to 3’ end, a first nucleotide sequence comprising a sequence from a nucleotide 'N q ' to the 3’ end of a group I intron, a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end, an internal guide sequence (IGS) , and a second nucleotide sequence comprising a sequence from the IGS end to a nucleotide 'N p ' of a group I intron; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non- Watson-Crick base pair to define a 5' splice site; 'N p ' and 'N q ' are independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide of the 3’ half of P9.0 duplex of
  • the 3’ end guanine of the group I intron is substituted with cytosine, adenine or uracil.
  • the disclosure also provides DNA constructs encoding the novel RNA constructs and methods of making circular RNAs using the novel constructs.
  • FIG. 1 shows a schematic diagram of essential RNA sequence elements based on the cis-splicing circularization system.
  • FIG. 2 shows a schematic diagram of the circularization element with ribozyme P as an example according to some embodiments.
  • the 5’ and 3’ recognizer sequences (Recognizer 1 and Recognizer 2) can simulate the formation of a P9.0 duplex mimic structure with at least two base pairs. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 3 shows a schematic diagram of circularization elements with ribozyme T as an example according to some embodiments.
  • R1 can pair with R2 to form a P9.0 duplex mimic, a P9.2 duplex mimic as well as a double-stranded region through base pairing between Homology Arm 1 (5’ homology arm) and Homology Arm 2 (3’ homology arm) . Dashed boxes indicate unnecessary circularization elements.
  • FIG. 4A shows the GOI sequence structure when the target site 'NNNNNU' is located in the internal ribosome entry site (IRES) according to some embodiments. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 4B shows the GOI sequence structure when the target site 'NNNNNU' is located in the open reading frame (ORF) according to some embodiments. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 5 shows the sequence elements in the IGS region and the sequence elements in the GOI that form base pairs with the IGS according to some embodiments. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 6A shows a schematic diagram of the circularization elements designed with ribozyme T according to some embodiments, where the backsplicing site is located in the ORF-GFP. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 6B shows a schematic diagram of the circularization elements designed with ribozyme T according to some embodiments, where the backsplicing site is inside the IRES.
  • FIG. 7B shows the effect of Mg²⁺ concentrations on circularization rate (% circRNA in total, FA data) in the IVT system.
  • the dotted line indicates the 40% circularization rate.
  • FIG. 7C shows the effect of Mg²⁺ concentrations on yield (total RNA) in the IVT system.
  • the dotted line indicates the 200 μg yield.
  • FIG. 8B shows the purity of the product with or without RNase R digestion detected by FA.
  • FIG. 9 shows the cell expression detection (FITC-GFP) of products prepared under different Mg²⁺ concentrations before and after treatment with RNase R.
  • FIG. 10 shows a schematic diagram of the circularization elements designed with ribozyme P as an example according to some embodiments, where the backsplicing site is located inside the IRES. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 11 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the product types of the corresponding bands are indicated on the right, respectively.
  • RNase R digests almost all the linear RNAs but not circular RNAs, allowing circular RNAs to be enriched.
  • FIG. 12 shows the purity of the product with or without RNase R digestion detected by FA.
  • FIG. 13 shows a schematic diagram of the circularization elements designed based on ribozyme T with the IGS and target site split to the precursor’s 5’ and 3’ regions, respectively, and the backsplicing site is located inside the IRES. Dashed boxes indicate unnecessary circularization elements.
  • FIG. 14 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digests almost all the linear RNAs but not circular RNAs, allowing circular RNAs to be enriched.
  • FIG. 15 shows the effect of Mg²⁺ concentrations on circularization rate (% circRNA in total, FA data) in the IVT system; the dotted line indicates the 40% circularization rate.
  • FIG. 16 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 17 shows the purity of the product with or without RNase R digestion detected by FA.
  • FIG. 19 shows a schematic diagram of the circularization element with ribozyme P as an example according to some embodiments.
  • the 5’ and 3’ recognizer sequences can simulate the formation of a P9.0 duplex mimic structure with at least two base pairs, including wobble base-pair G-U. Dashed boxes indicate nonessential circularization elements.
  • FIG. 20 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 21 shows the purity of the product with or without RNase R digestion detected by FA.
  • FIG. 22 shows the cell expression detection (FITC-GFP) of products prepared under different Mg²⁺ concentrations after treatment with RNase R.
  • FIG. 24 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 25 shows the purity of the product with or without RNase R digestion detected by FA.
  • FIG. 26 shows the cell expression detection (FITC-GFP) of products prepared under different Mg²⁺ concentrations before and after treatment with RNase R.
  • FIG. 27 shows a schematic diagram of the circularization elements with ribozyme T as an example according to some embodiments.
  • the P1 duplex formed between the target site and IGS includes a C-A wobble base pair. Dashed boxes indicate nonessential circularization elements.
  • FIG. 28 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 30 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the recognizers can also mediate circularization without P9.2 and homology arm elements.
  • FIG. 31 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 32 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the 5’ and 3’ recognizer sequences can also mediate circularization without Spacers. Dashed boxes indicate nonessential circularization elements.
  • FIG. 33 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • the product purity of FA analysis is presented in percentage form.
  • FIG. 34 shows the cell expression detection (FITC-GFP) of products prepared under different Mg²⁺ concentrations before and after treatment with RNase R.
  • FIG. 35 shows a schematic diagram of the circularization element with ribozyme P as an example according to some embodiments.
  • the recognizers can also mediate circularization without homology arm elements and spacers. Dashed boxes indicate nonessential circularization elements.
  • FIG. 36 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 37 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the recognizers can also mediate circularization without homology arm elements and linkers. Dashed boxes indicate nonessential circularization elements.
  • FIG. 38 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • the product purity of FA analysis is presented in percentage form.
  • FIG. 39 shows the fluorescence image of GFP expression in Huh7 cells transfected with circularized samples (before and after RNase R treatment).
  • FIG. 40 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the recognizers cannot effectively mediate circularization when no paired structure is present. Dashed boxes indicate nonessential circularization elements.
  • FIG. 41 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • FIG. 42 shows the fluorescence image of GFP expression in Huh7 cells transfected with circularized samples (before and after RNase R treatment).
  • FIG. 43 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the reintroduction of paired structures in R1 and R2 can restore circularization. Dashed boxes indicate nonessential circularization elements.
  • FIG. 44 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • the product purity of FA analysis is presented in percentage form.
  • FIG. 45 shows the fluorescence image of GFP expression in Huh7 cells transfected with circularized samples (before and after RNase R treatment).
  • FIG. 46 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the 5' and 3' nucleotide sequences that constitute the native P9.0 duplex are swapped. Dashed boxes indicate nonessential circularization elements.
  • FIG. 47 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • the product purity (CircRNA plus Nicked RNA, representing splicing efficiency) of FA analysis is presented in percentage form.
  • FIG. 48 shows the fluorescence image of GFP expression in Huh7 cells transfected with circularized samples (before and after RNase R treatment).
  • FIG. 49 shows a schematic diagram of the circularization element with ribozyme A as an example according to some embodiments. Dashed boxes indicate nonessential circularization elements.
  • FIG. 50 shows the migration mode of the purified products in the 2% E-Gel EX system for about 18 minutes.
  • the lines indicate the product types of the corresponding bands, respectively.
  • RNase R digestion allows circular RNAs to be enriched.
  • FIG. 51 shows the expression of firefly luciferase in A549 cells transfected with the circularized sample. Data are presented as relative luminescence units (RLU).
  • FIG. 52 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments. Dashed boxes indicate nonessential circularization elements.
  • FIG. 61 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the design of ΩA is also compatible with the cis-circularization system. Dashed boxes indicate nonessential circularization elements.
  • FIG. 64 shows a schematic diagram of the circularization element with ribozyme T as an example according to some embodiments.
  • the design of ΩU is also compatible with the cis-circularization system. Dashed boxes indicate nonessential circularization elements.
  • The expressions “comprising”, “including”, “containing” or “having” are open-ended and do not exclude additional unrecited elements, steps, or ingredients.
  • The expression “consisting essentially of” means that the scope is limited to the designated elements, steps or ingredients, plus elements, steps or ingredients that are optionally present and do not substantially affect the essential and novel characteristics of the claimed subject matter. It should be understood that the expression “comprising” encompasses the expressions “consisting essentially of” and “consisting of”.
  • The term “nucleic acid sequence” refers to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or modified nucleotides.
  • A nucleotide in a nucleotide sequence is referred to by the single-letter designation of its nucleobase as follows: “A (a)” for adenine or deoxyadenine (for RNA or DNA, respectively), “C (c)” for cytosine or deoxycytosine, “G (g)” for guanine or deoxyguanine, “U (u)” for uracil, “T (t)” for deoxythymine, “R” for purines (A or G), “Y” for pyrimidines (C or T), “I” for hypoxanthine, and “N” or “n” for any nucleotide.
  • The term “operably linked”, when referring to a first nucleotide sequence that is operably linked with a second nucleotide sequence, means that the first nucleotide sequence is placed in a functional relationship with the second nucleotide sequence.
  • The term “cis-splicing” refers to splicing from the same nucleic acid strand.
  • The term “back-splicing site” or “backsplicing site”, when used with reference to a circular RNA, refers to a dinucleotide serving as the point of reconnection during the back-splicing process, resulting in the two ends of a linear nucleotide sequence joining to form the circular RNA.
  • The term “splice site” refers to a dinucleotide between which a phosphodiester bond is cleaved during RNA circularization.
  • the terms “native” and “naturally-occurring” mean the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
  • a naturally occurring group I intron or native nucleotide sequence of a group I intron may be present in and isolated from a natural source, and is not intentionally modified by human manipulation.
  • nucleotide sequence As used herein, the first nucleotide starting from the 5’ end of a nucleotide sequence is designated as the 5’ end nucleotide and is numbered as nucleotide 1 of the nucleotide sequence. Similarly, the last nucleotide starting from the 5’ end of a nucleotide sequence is designated as the 3’ end nucleotide of the nucleotide sequence.
  • upstream is toward the 5’ direction of a nucleotide sequence and “downstream” is toward the 3’ direction of a nucleotide sequence.
  • the expression “from 5’ end to 3’ end” means that the listed elements of a nucleotide sequence are present in a 5’ to 3’ direction and does not limit the length of the nucleotide sequence and elements therein. Thus, such an expression does not exclude any other elements located upstream, downstream and/or inbetween of the listed elements.
  • a first nucleotide sequence is “at the 5’ end” or “at the 3’ end” of a second nucleotide sequence refers to the terminal position of the first nucleotide sequence (or the nucleotide) within the second nucleotide sequence.
  • first nucleotide sequence is “in the 5’ region” or “in the 3’ region” of a second nucleotide sequence or a similar expression means the first nucleotide sequence (or the nucleotide) is located at a position adjacent to the 5’ or 3’ end nucleotide of the second nucleotide sequence but not necessarily at the 5’ or 3’ end of the second nucleotide sequence.
  • RNA structure prediction tools such as RNAfold (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) or RNAstructure (https://rna.urmc.rochester.edu/RNAstructureWeb/index.html).
  • the term “pair with” means that two nucleic acid strands or two regions on the same nucleic acid strand form a duplex-containing structure through Watson-Crick base pairing and/or non-Watson-Crick base pairing.
  • the expression “form” , “can form” , “may form” or the like is open-ended and inclusive, and does not exclude additional unrecited structures.
  • the expression “the 5’ and 3’ flanking sequences can pair with each other to form a double-stranded region” means that a double-stranded region is formed through base pairing between at least a portion of the nucleotides in the 5’ and 3’ flanking sequences, but does not exclude any other structure that may be formed by the 5’ flanking sequence and 3’ flanking sequence alone or in combination.
  • the term “complementary” refers to Watson-Crick base pairing and/or non-Watson-Crick base pairing.
  • the term “reverse complementary” means that base pairing is formed between a first nucleotide sequence in the 5’ to 3’ direction and a second nucleotide sequence in the 3’ to 5’ direction. Base pairings between two reverse complementary nucleotide sequences include Watson-Crick base pairing and/or non-Watson-Crick base pairing. Preferably, all base pairings between two reverse complementary nucleotide sequences are Watson-Crick base pairings.
  • a “reverse complement” of a given nucleotide sequence can be obtained by reversing the order of all the nucleotides in the nucleotide sequence and then replacing all the nucleotides with their respective Watson-Crick complementary nucleotides.
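The two-step procedure just described (reverse the order, then substitute Watson-Crick partners) can be sketched as follows. This is a minimal illustration for unmodified RNA only; the names `RNA_COMPLEMENT` and `reverse_complement` are hypothetical.

```python
# Watson-Crick partners for RNA; modified nucleotides are not handled here.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse the sequence, then swap each base for its Watson-Crick partner."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.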
  • the degree of complementarity between two nucleotide sequences can be indicated by the percentage of nucleotides in a first nucleotide sequence which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleotide sequence (e.g., about 50%, about 60%, about 70%, about 80%, about 90%, and 100% complementary).
  • Two nucleotide sequences are “reverse complementary” or “perfectly complementary” if all the contiguous nucleotides of a first nucleotide sequence form hydrogen bonds with the same number of contiguous nucleotides in a second nucleotide sequence.
  • the term “at least partially (reverse) complementary” or “substantially complementary” means that at least about 50% (e.g., at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, about 100%) nucleotides of a nucleotide sequence (e.g., a 5’ homology arm sequence) can form base pairs with another nucleotide sequence (e.g., a 3’ homology arm sequence) .
  • Two substantially complementary nucleotide sequences may share a sufficient level of sequence identity to one another’s reverse complement to allow hybridization to occur.
  • Two nucleotide sequences are “substantially complementary” or “at least partially complementary” if the two nucleotide sequences are at least 50% (e.g., at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%) complementary over a region of at least 8 nucleotides (e.g., at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or more nucleotides) , or if the two nucleotide sequences hybridize under at least moderate, or, in some embodiments high, stringency conditions.
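The percentage-based definitions above can be illustrated with a simplified calculation. This sketch assumes two equal-length regions aligned antiparallel with no gaps or bulges, which is narrower than the definitions in the text; hybridization under stringency conditions is not modelled. The function names and the optional G-U wobble switch are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE_GU = {("G", "U"), ("U", "G")}


def percent_complementarity(first: str, second: str, allow_wobble: bool = False) -> float:
    """Percentage of positions in `first` (5'->3') paired with `second` read 3'->5'.

    Both sequences must have the same length; gaps and bulges are not modelled.
    """
    if len(first) != len(second):
        raise ValueError("sequences must have equal length")
    if not first:
        raise ValueError("sequences must not be empty")
    allowed = WATSON_CRICK | WOBBLE_GU if allow_wobble else WATSON_CRICK
    pairs = zip(first.upper(), reversed(second.upper()))
    paired = sum(1 for pair in pairs if pair in allowed)
    return 100.0 * paired / len(first)


def is_substantially_complementary(first: str, second: str,
                                   threshold: float = 50.0,
                                   min_length: int = 8) -> bool:
    """Apply the 'at least 50% over at least 8 nucleotides' criterion."""
    return (len(first) >= min_length
            and percent_complementarity(first, second) >= threshold)
```

The defaults mirror the lower bounds stated in the text (50% over 8 nucleotides); stricter embodiments would pass higher values.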
  • Exemplary moderate stringency conditions include overnight incubation at 37°C in a solution comprising 20% formamide, 5× SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1× SSC at about 37-50°C, or substantially similar conditions, e.g., the moderately stringent conditions described in Sambrook, J., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press; 4th edition (Jun. 15, 2012).
  • High stringency conditions are conditions that use, for example (1) low ionic strength and high temperature for washing, such as 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate (SDS) at 50°C, (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin (BSA)/0.1% Ficoll/0.1% polyvinylpyrrolidone (PVP)/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride and 75 mM sodium citrate at 42°C, or (3) employ 50% formamide, 5× SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5× Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1%S
  • a “homology arm sequence” is any contiguous sequence that can form base pairs with preferably at least about 50% (e.g., at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, about 100%) of another sequence (another homology arm sequence) in the RNA construct.
  • a “spacer” refers to a nucleotide sequence separating two other elements (segments) along a polynucleotide sequence.
  • a spacer may be of any length.
  • a spacer may be of 1-100 nucleotides, preferably 2-50 nucleotides in length.
  • a spacer may comprise a defined or random nucleotide sequence.
  • Watson-Crick base pairing refers to hydrogen-bond pairing that occurs between adenine and thymine (A-T) (DNA) or uracil (A-U) (RNA) , or guanine and cytosine (G-C) .
  • Wobble base pairing refers to a type of non-Watson-Crick base pairing.
  • Wobble base pairing may be formed between, but is not limited to, hypoxanthine and uracil (I-U, I for inosine) , guanine and uracil (G-U) , adenine and cytosine (A-C) , hypoxanthine and adenine (I-A) , or hypoxanthine and cytosine (I-C) .
  • base pair refers to two nitrogenous bases that are connected by hydrogen bonds.
  • a base pair can be a Watson-Crick base pair or a non-Watson-Crick base pair.
  • non-Watson-Crick base pairs may include, but are not limited to, wobble base pairs and Hoogsteen base pairs.
  • G-T (U) base pairing and A-C base pairing are the most frequent wobble base pairs.
  • Other non-Watson-Crick base pairs include but are not limited to C-U, A-G (or I) and A-A.
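The base-pair categories listed above can be captured in a small lookup. This is a minimal sketch: the function name and the returned labels are hypothetical, and a pair that appears in more than one list (such as I-A) is reported under the first category that matches.

```python
# Unordered pairs, so frozenset("AU") covers both A-U and U-A.
WATSON_CRICK = {frozenset("AU"), frozenset("AT"), frozenset("GC")}
WOBBLE = {frozenset("GU"), frozenset("GT"), frozenset("AC"),
          frozenset("IU"), frozenset("IA"), frozenset("IC")}
OTHER_LISTED = {frozenset("CU"), frozenset("AG"), frozenset("AA")}


def classify_pair(base1: str, base2: str) -> str:
    """Classify a pair of single-letter bases using the lists in the text."""
    pair = frozenset((base1.upper(), base2.upper()))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if pair in OTHER_LISTED:
        return "other non-Watson-Crick"
    return "not listed"
```

The "not listed" label only means the pair is absent from these definitions; it makes no claim about whether the two bases can interact.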
  • stem-loop also known as a “hairpin” , refers to a secondary structure that can occur in single-stranded nucleic acids.
  • the stem-loop may occur when two regions of the same strand pair with each other to form a double-stranded region that ends in an unpaired loop.
  • duplex As used herein, the terms “duplex” , “double-stranded region” and “helix” are used interchangeably to refer to a double-stranded structure comprising at least one base pair.
  • a duplex may comprise a Watson-Crick base pair, a non-Watson-Crick base pair or a combination thereof.
  • duplex mimic refers to a double-stranded structure that functionally mimics the native duplex structure of a group I intron ribozyme.
  • a duplex mimic may comprise at least one base pair.
  • a duplex mimic may comprise a Watson-Crick base pair, a non-Watson-Crick base pair or a combination thereof.
  • the sequences forming a duplex mimic are preferably, but not limited to, the corresponding native ribozyme sequences and can be truncated or designed as other alternative sequences.
  • free energy refers to the energy released by folding an unfolded nucleic acid molecule (e.g., RNA or DNA, etc. ) , or, conversely, the amount of energy that must be added in order to unfold a folded nucleic acid molecule (e.g., RNA or DNA, etc. ) .
  • the “minimum free energy (MFE) ” of a nucleic acid molecule e.g., DNA, RNA, etc.
  • melting temperature refers to the temperature at which about 50% of double-stranded nucleic acid structures (e.g., DNA/DNA, DNA/RNA, or RNA/RNA duplexes) denature and dissociate to single-stranded structures.
  • the melting temperature of a particular nucleic acid molecule can be determined using thermodynamic analyses and algorithms described herein and known in the art (see, e.g., Kibbe W.A., Nucleic Acids Res., 35 (Web Server issue): W43-W46 (2007), doi: 10.1093/nar/gkm234; and Dumousseau et al., BMC Bioinformatics, 13: 101 (2012), doi: 10.1186/1471-2105-13-101).
  • sequence similarity is used to denote similarity between two sequences. Sequence similarity or identity may be determined using standard techniques known in the art, including, but not limited to, the local sequence identity algorithm of Smith & Waterman, Adv. Appl. Math. 2, 482 (1981) , by the sequence identity alignment algorithm of Needleman & Wunsch, J Mol. Biol. 48, 443 (1970) , by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci.
  • the ribozyme core sequence comprises a nucleotide sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to the nucleotide sequence of a naturally occurring group I intron.
  • the percent sequence identity can be any percent of sequence identity that allows for hybridization to occur. In some embodiments, at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% of the nucleotides of the first pairing sequence form base pairs with the second pairing sequence. In some embodiments, the first pairing sequence is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% complementary to the second pairing sequence.
  • the relative location of the duplex formed to the ωN in the RNA construct is substantially identical to that of the P9.0 duplex to the ωG in the group I intron from which the ribozyme core sequence is derived.
  • 'N 1 ' is the 3’ end nucleotide of a first contiguous sequence of 2-6 nucleotides in the first pairing sequence
  • 'n 1 ' is the 5’ end nucleotide of a second contiguous sequence in the second pairing sequence, wherein the first contiguous sequence is reverse complementary to the second contiguous sequence
  • i is an integer of 1-21. In some particular embodiments, i is an integer of 1-11. In some preferable embodiments, i is 1 or 2.
  • R1 comprises a nucleotide sequence ' (N x ) s (N y ) t (ωN) ' at its 3’ end
  • R2 comprises a nucleotide sequence ' (n x ) w '
  • ωN, 'N x ' , 'n x ' and 'N y ' are each independently any naturally occurring or modified nucleotide
  • t is an integer of 0-20
  • s and w are each independently an integer of 1-200
  • ' (N x ) s ' and ' (n x ) w ' are substantially complementary to form a duplex-containing structure upstream of the ωN to define the 3' splice site.
  • ωN is guanine (ωG) . In some embodiments, ωN is cytosine (ωC) . In some embodiments, ωN is adenine (ωA) . In some other embodiments, ωN is uracil (ωU) .
  • t is an integer of 0-10. In some preferable embodiments, t is 0. In some other embodiments, t is 1.
  • each of s and w is an integer of h which is selected from 2-6, ' (N x ) h ' and ' (n x ) h ' are reverse complementary, and t is 0-20.
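The layout of the 3’ end of R1 in the embodiment where s and w both equal h can be illustrated with a short sketch. It reads the last nucleotide of R1 as the ω nucleotide, the t nucleotides before it as (N y ) t, and the h nucleotides before those as (N x ) h, then looks in R2 for a stretch that is the exact Watson-Crick reverse complement. The function names are hypothetical, only unmodified RNA letters are handled, and requiring exact Watson-Crick pairing is a simplifying assumption.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse the sequence and swap each base for its Watson-Crick partner."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(seq.upper()))


def split_r1_tail(r1: str, h: int, t: int):
    """Split the 3' end of R1 into ((N_x)_h, (N_y)_t, omega nucleotide)."""
    if not 2 <= h <= 6:
        raise ValueError("h must be 2-6")
    if not 0 <= t <= 20:
        raise ValueError("t must be 0-20")
    if len(r1) < h + t + 1:
        raise ValueError("R1 is too short for the requested h and t")
    r1 = r1.upper()
    end = len(r1) - 1                 # index of the omega nucleotide
    omega = r1[end]
    n_y = r1[end - t:end]             # t nucleotides just upstream of omega
    n_x = r1[end - t - h:end - t]     # h nucleotides upstream of (N_y)_t
    return n_x, n_y, omega


def pairing_site_in_r2(r1: str, r2: str, h: int, t: int) -> int:
    """Return the 0-based index in R2 of the reverse complement of (N_x)_h, or -1."""
    n_x, _, _ = split_r1_tail(r1, h, t)
    return r2.upper().find(reverse_complement(n_x))
```

With a toy R1 of `AAGGCUAG`, h = 3 and t = 1, the tail splits into `GCU`, `A` and `G`, and the function searches R2 for `AGC`.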
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) or Pneumocystis sp. group I intron, and t is an integer of 0-20.
  • the ribozyme core sequence comprises or consists of the nucleotide sequence of SEQ ID NO: 17 or SEQ ID NO: 19 or a nucleotide sequence having at least 95% sequence identity thereto, and t is 0.
  • R1 comprises a nucleotide sequence 'N 1 (N y ) t G' at its 3' end
  • R2 comprises a nucleotide 'n 1 ' , wherein 'G' is the ωG, 'N 1 ' , 'n 1 ' and 'N y ' are each independently any naturally occurring or modified nucleotide, t is an integer of 0-20, and 'N 1 ' and 'n 1 ' form a base pair.
  • t is 0.
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) or Pneumocystis sp. group I intron, and t is 0.
  • R1 comprises a nucleotide sequence 'N 1 (N y ) t C' at its 3' end
  • R2 comprises a nucleotide 'n 1 ' , wherein 'C' is the ωC, 'N 1 ' , 'n 1 ' and 'N y ' are each independently any naturally occurring or modified nucleotide, t is an integer of 0-20, and 'N 1 ' and 'n 1 ' form a base pair.
  • t is 0.
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) or Pneumocystis sp. group I intron, and t is 0.
  • R1 comprises a nucleotide sequence 'N 1 (N y ) t A' at its 3' end
  • R2 comprises a nucleotide 'n 1 ' , wherein 'A' is the ωA, 'N 1 ' , 'n 1 ' and 'N y ' are each independently any naturally occurring or modified nucleotide, t is an integer of 0-20, and 'N 1 ' and 'n 1 ' form a base pair.
  • t is 0.
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) or Pneumocystis sp. group I intron, and t is 0.
  • R1 comprises a nucleotide sequence 'N 1 (N y ) t U' at its 3' end
  • R2 comprises a nucleotide 'n 1 ' , wherein 'U' is the ωU, 'N 1 ' , 'n 1 ' and 'N y ' are each independently any naturally occurring or modified nucleotide, t is an integer of 0-20, and 'N 1 ' and 'n 1 ' form a base pair.
  • t is 0.
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) or Pneumocystis sp. group I intron, and t is 0.
  • t is an integer of 1-20, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20.
  • a group IC3 intron for example, an Azoarcus sp. group I intron (e.g., derived from Azoarcus sp. strain BH72)
  • t is 1.
  • the ribozyme core sequence is derived from a group IC1 intron, for example, a Tetrahymena sp. (e.g., T. thermophila) group I intron
  • t is an integer of 1-10, preferably t is 1.
  • R1 comprises a nucleotide sequence 'N 2 N 1 (N y ) t G' at its 3' end
  • R2 comprises a nucleotide sequence 'n 1 n 2 '
  • 'G' is the ωG
  • 'N 1 ' , 'n 1 ' , 'N 2 ' , 'n 2 ' and 'N y ' are each independently any naturally occurring or modified nucleotide
  • t is an integer of 0-20
  • 'N 1 ' and 'n 1 ' form a first base pair
  • 'N 2 ' and 'n 2 ' form a second base pair.
  • R1 comprises a nucleotide sequence 'N 2 N 1 G' at its 3' end.
  • IC3 intron e.g., an Azoarcus sp. or Annona cherimola group I intron
  • t is 1
  • R1 comprises a nucleotide sequence 'N 2 N 1 N y G' , wherein 'N y ' is any naturally occurring or modified nucleotide; for example, 'N y ' is 'G' , 'U' or 'A' .
  • R1 comprises a nucleotide sequence 'N 2 N 1 (N y ) t C' at its 3' end
  • R2 comprises a nucleotide sequence 'n 1 n 2 '
  • 'C' is the ωC
  • 'N 1 ' , 'n 1 ' , 'N 2 ' , 'n 2 ' and 'N y ' are each independently any naturally occurring or modified nucleotide
  • t is an integer of 0-20
  • 'N 1 ' and 'n 1 ' form a first base pair
  • 'N 2 ' and 'n 2 ' form a second base pair.
  • R1 comprises a nucleotide sequence 'N 2 N 1 C' at its 3' end.
  • IC3 intron e.g., an Azoarcus sp. or Annona cherimola group I intron
  • t is 1
  • R1 comprises a nucleotide sequence 'N 2 N 1 N y C' , wherein 'N y ' is any naturally occurring or modified nucleotide; for example, 'N y ' is 'G' , 'U' or 'A' .
  • R1 comprises a nucleotide sequence 'N 2 N 1 (N y ) t A' at its 3' end
  • R2 comprises a nucleotide sequence 'n 1 n 2 '
  • 'A' is the ωA
  • 'N 1 ' , 'n 1 ' , 'N 2 ' , 'n 2 ' and 'N y ' are each independently any naturally occurring or modified nucleotide
  • t is an integer of 0-20
  • 'N 1 ' and 'n 1 ' form a first base pair
  • 'N 2 ' and 'n 2 ' form a second base pair.
  • R1 comprises a nucleotide sequence 'N 2 N 1 A' at its 3' end.
  • IC3 intron e.g., an Azoarcus sp. or Annona cherimola group I intron
  • t is 1
  • R1 comprises a nucleotide sequence 'N 2 N 1 N y A' , wherein 'N y ' is any naturally occurring or modified nucleotide; for example, 'N y ' is 'G' , 'U' or 'A' .
  • R1 comprises a nucleotide sequence 'N 2 N 1 (N y ) t U' at its 3' end
  • R2 comprises a nucleotide sequence 'n 1 n 2 '
  • 'U' is the ωU
  • 'N 1 ' , 'n 1 ' , 'N 2 ' , 'n 2 ' and 'N y ' are each independently any naturally occurring or modified nucleotide
  • t is an integer of 0-20
  • 'N 1 ' and 'n 1 ' form a first base pair
  • 'N 2 ' and 'n 2 ' form a second base pair.
  • R1 comprises a nucleotide sequence 'N 2 N 1 U' at its 3' end.
  • IC3 intron e.g., an Azoarcus sp. or Annona cherimola group I intron
  • t is 1
  • R1 comprises a nucleotide sequence 'N 2 N 1 N y U' , wherein 'N y ' is any naturally occurring or modified nucleotide; for example, 'N y ' is 'G' , 'U' or 'A' .
  • the first and second base pairs are each selected from A-U, G-C, G-A, A-A, U-U, A-C, G-U and a combination thereof.
  • the 5’ recognizer sequence (R1) may further comprise a 5’ flanking sequence located upstream of the first pairing sequence.
  • the 3’ recognizer sequence (R2) may further comprise a 3’ flanking sequence located downstream of the second pairing sequence.
  • the 5’ flanking sequence and 3’ flanking sequence may pair with each other to form at least one RNA secondary structure that brings the 5’ and 3’ ends of the RNA construct into proximity.
  • the at least one RNA secondary structure may comprise a double-stranded region formed by base pairing between the 5’ and 3’ flanking sequences, and optionally one or more structures selected from a bulge loop, an interior loop and a hairpin loop.
  • RNA secondary structures include but are not limited to stem structures, stem-loop structures and stem-loop alternating structures.
  • the 5’ and 3’ flanking sequences each may independently comprise 1-500 nucleotides, for example, 10-500, 20-400, 30-300, 40-200, 50-100, 60-90 or 70-80 nucleotides.
  • the 5’ and 3’ flanking sequences each independently comprise 3-400, 4-200, 5-150, 10-100 or 20-50 nucleotides.
  • the double-stranded region may comprise one or more base pairs, e.g., about 2-500, about 5-100, about 2-50, about 10-50 or about 20-30 base pairs, consecutive or interrupted by one or more mismatches.
  • the double-stranded region comprises 2-50 base pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or 50 base pairs.
  • Preferable examples of the 5’ and 3’ flanking sequences may be homology arm sequences.
  • a double-stranded region can be formed by two homology arm sequences that are substantially reverse complementary.
  • the 5’ flanking sequence comprises a 5’ homology arm sequence
  • the 3’ flanking sequence comprises a 3’ homology arm sequence
  • the 5’ and 3’ homology arm sequences are substantially complementary.
  • R1 further comprises a 5’ homology arm sequence located upstream of the first pairing sequence
  • R2 further comprises a 3’ homology arm sequence located downstream of the second pairing sequence, wherein the 5’ and 3’ homology arm sequences are substantially complementary.
  • the 5’ and 3’ homology arm sequences each may independently comprise 5-50 nucleotides, for example, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides.
  • the 5’ and 3’ homology arm sequences are reverse complementary.
  • the 5’ and 3’ homology arm sequences are partially reverse complementary, for example, at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 99% of the nucleotides of the 5’ and 3’ homology arm sequences form base pairs.
  • the 5’ and 3’ homology arm sequences share a higher percent of identity to one another’s reverse complement than they do to a sequence located within the GOI and/or the ribozyme core sequence, such that formation of a double-stranded region between the 5’ and 3’ homology arm sequences is prioritized.
  • the 5’ and 3’ flanking sequences may form one or more structures mimicking the native structures of the group I intron ribozyme.
  • the 5’ and 3’ flanking sequences may form one or more structures mimicking the native P9 (P9a/9b) , P9.1, P9.1a or P9.2 duplex of the group I intron or a combination thereof.
  • the 5’ and 3’ flanking sequences in combination form a structure mimicking the P9.2 duplex of the group I intron.
  • RNA construct according to the present disclosure can be derived from a group I intron by inserting a nucleotide sequence of interest between a 3’ fragment (corresponding to R1) and a 5’ fragment (corresponding to Ribozyme core-R2) of a group I intron, wherein the 3’ fragment and 5’ fragment in combination retain the self-splicing ability of the group I intron.
  • a 3’ end portion e.g., a sequence from the 5’ half of P9.0 duplex to the 3’ end nucleotide
  • the formation of a duplex-containing structure comprising any sequence between the 5' and 3' ends of the RNA construct is only required to facilitate circularization through the self-splicing activity of the ribozyme core.
  • an RNA construct comprising, from 5’ end to 3’ end, a first nucleotide sequence comprising a sequence from a nucleotide 'N q ' to the 3’ end of a group I intron, a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end, an internal guide sequence (IGS) , and a second nucleotide sequence comprising a sequence from the IGS end to a nucleotide 'N p ' of a group I intron; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non-Watson-Crick base pair to define a 5' splice site; 'N p ' and 'N q ' are independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide
  • the 3’ end guanine of the group I intron is substituted with cytosine. In some embodiments, the 3’ end guanine of the group I intron is substituted with adenine. In some other embodiments, the 3’ end guanine of the group I intron is substituted with uracil.
  • the group I intron can be a group I intron as described above.
  • the group I intron is a group IC1 (e.g., from Tetrahymena sp. (e.g., T. thermophila, T. cosmopolitanis, T. hyperangularis, T. malaccensis or T. pigmentosa) or Pneumocystis sp. (e.g., Pneumocystis carinii)) , IC2, IC3 (e.g., from Anabaena sp. PCC7120 or Azoarcus sp. BH72) or IA2 (e.g., from Bacteriophage Twort) intron.
  • the group I intron is a Pneumocystis sp. group I intron comprising a nucleotide sequence selected from SEQ ID NOs: 32-36. In another embodiment, the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12. In another embodiment, the group I intron is an Anabaena sp. group I intron comprising the nucleotide sequence of SEQ ID NO: 49.
  • the first and second nucleotide sequences in combination retain the self-splicing ability of the group I intron, but do not necessarily constitute the full length of the group I intron.
  • the first and second nucleotide sequences in combination may lack one or more duplexes in the P9 domain of the group I intron that are not a P9.0 duplex.
  • the first and second nucleotide sequences in combination may lack one or more duplexes selected from a P9a duplex, a P9b duplex, a P9.1 duplex, a P9.1a duplex and a P9.2 duplex, when applicable.
  • the first and second nucleotide sequences in combination comprise at least one duplex selected from a P9a duplex, a P9b duplex, a P9.1 duplex, a P9.1a duplex and a P9.2 duplex, when applicable.
  • the group I intron is a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 316 (U316) to nucleotide 342 (G342) of SEQ ID NO: 32.
  • the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 313 (A313) to nucleotide 411 (U411) of SEQ ID NO: 12.
  • the group I intron is an Anabaena sp. group I intron comprising the nucleotide sequence of SEQ ID NO: 49, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 212 (C212) to nucleotide 243 (G243) of SEQ ID NO: 49.
  • 'N p ' may be located at any position upstream of 'N q ' in the group I intron. In some embodiments, 'N p ' is located immediately upstream of or adjacent to 'N q ' in the group I intron. In some embodiments, 'N p ' is located immediately upstream of 'N q ' in the group I intron. In some other embodiments, 'N p ' is located several nucleotides (for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides) upstream of 'N q ' in the group I intron.
  • 'N p ' can be the 3’ end nucleotide of the 5’ half of P9.0 duplex of the group I intron
  • 'N q ' can be the 5’ end nucleotide of the 3’ half of P9.0 duplex of the group I intron.
  • 'N p ' and 'N q ' can be nucleotide 316 (U316) and nucleotide 342 (G342) of SEQ ID NO: 32, respectively.
  • 'N p ' and 'N q ' can be nucleotide 212 (C212) and nucleotide 243 (G243) of SEQ ID NO: 49, respectively.
  • 'N p ' and 'N q ' can be independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide of the 3’ half of a duplex of the group I intron, wherein the duplex is not a P9.0 duplex.
  • the duplex can be a P9a, P9b, P9.1, P9.1a or P9.2 duplex.
  • the duplex is a P9.2 duplex.
  • 'N p ' and 'N q ' are located within the region connecting the 5’ half and 3’ half of a duplex, wherein the duplex is not a P9.0 duplex.
  • 'N p ' and 'N q ' can be located within the apical loop of a P9a/9b, P9.1, P9.1a or P9.2 duplex.
  • the group I intron is a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32, 'N p ' and 'N q ' are independently selected from nucleotide 325 (G325) to nucleotide 328 (A328) of SEQ ID NO: 32.
  • the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 383 (G383) to nucleotide 386 (A386) of SEQ ID NO: 12.
  • the group I intron is an Anabaena sp. group I intron comprising the nucleotide sequence of SEQ ID NO: 49, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 219 (A219) to nucleotide 222 (A222) of SEQ ID NO: 49; or 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 232 (G232) to nucleotide 235 (A235) of SEQ ID NO: 49.
  • 'N p ' is the 3’ end nucleotide of the 5’ half of a duplex and 'N q ' is the 5’ end nucleotide of the 3’ half of a duplex, wherein the duplex is not a P9.0 duplex.
  • 'N p ' and 'N q ' can be nucleotide 324 (C324) and nucleotide 329 (G329) of SEQ ID NO: 32, respectively.
  • 'N p ' and 'N q ' can be nucleotide 375 (U375) and nucleotide 394 (G394) of SEQ ID NO: 12, respectively; or 'N p ' and 'N q ' can be nucleotide 382 (C382) and nucleotide 387 (G387) of SEQ ID NO: 12, respectively.
  • for an Anabaena sp. group I intron comprising the nucleotide sequence of SEQ ID NO: 49, 'N p ' and 'N q ' can be nucleotide 218 (C218) and nucleotide 223 (G223) of SEQ ID NO: 49, respectively; or 'N p ' and 'N q ' can be nucleotide 231 (C231) and nucleotide 236 (G236) of SEQ ID NO: 49, respectively.
  • the IGS end of a group I intron can be readily identified by those skilled in the art in view of the present disclosure and the prior art.
  • the second nucleotide sequence (corresponding to Ribozyme core-R2) may comprise a nucleotide sequence lacking the IGS of the group I intron.
  • the second nucleotide sequence may comprise a nucleotide sequence starting from the nucleotide immediately downstream of the 3’ end nucleotide of the IGS of a group I intron.
  • the group I intron is a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32.
  • the second nucleotide sequence comprises a nucleotide sequence starting from nucleotide 18 (G18) to nucleotide 316 (U316) of SEQ ID NO: 32
  • the first nucleotide sequence may comprise a nucleotide sequence starting from any nucleotide selected from nucleotide 317 (C317) to nucleotide 342 (G342) to the 3’ end of SEQ ID NO: 32.
  • the second nucleotide sequence comprises a nucleotide sequence starting from nucleotide 18 (G18) to any nucleotide selected from nucleotide 316 (U316) to nucleotide 341 (U341) of SEQ ID NO: 32
  • the first nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 342 (G342) to the 3’ end of SEQ ID NO: 32.
  • the 3’ end guanine of the group I intron is substituted with cytosine.
  • the 3’ end guanine of the group I intron is substituted with adenine.
  • the 3’ end guanine of the group I intron is substituted with uracil.
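The way the first and second nucleotide sequences are cut out of a group I intron, and the order in which the elements are joined, can be sketched with 1-based inclusive coordinates as used in the text. This is an illustration under stated assumptions: the function names are hypothetical, the IGS is assumed to end at nucleotide `igs_end`, 'N p ' is assumed to lie upstream of 'N q ', and a short toy string stands in for the intron because the SEQ ID NO sequences are not reproduced here.

```python
from typing import Optional, Tuple


def split_intron(intron: str, igs_end: int, p: int, q: int,
                 omega: Optional[str] = None) -> Tuple[str, str]:
    """Return (first, second) nucleotide sequences from 1-based positions.

    first  = nucleotide q through the 3' end of the intron
    second = nucleotide igs_end + 1 through nucleotide p
    If `omega` is given, it replaces the 3' end nucleotide of `first`.
    """
    if not 1 <= igs_end < p < q <= len(intron):
        raise ValueError("expected 1 <= igs_end < p < q <= len(intron)")
    first = intron[q - 1:]
    second = intron[igs_end:p]
    if omega is not None:
        first = first[:-1] + omega
    return first, second


def assemble_construct(first: str, goi: str, igs: str, second: str) -> str:
    """Join the elements in 5' to 3' order: first, GOI, IGS, second."""
    return first + goi + igs + second
```

Under these assumptions, the Pneumocystis carinii embodiment in which the second nucleotide sequence runs from nucleotide 18 to nucleotide 316 and the first starts at nucleotide 342 would correspond to `igs_end=17, p=316, q=342`, with `omega="C"`, `"A"` or `"U"` for the substituted variants.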
  • the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12.
  • the second nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 27 (A27) to nucleotide 313 (A313) of SEQ ID NO: 12, and the first nucleotide sequence may comprise a nucleotide sequence starting from any nucleotide selected from nucleotide 314 (C314) to nucleotide 411 (U411) to the 3’ end of SEQ ID NO: 12.
  • the second nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 27 (A27) to any nucleotide selected from nucleotide 313 (A313) to nucleotide 410 (C410) of SEQ ID NO: 12, and the first nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 411 (U411) to the 3’ end of SEQ ID NO: 12.
  • the 3’ end guanine of the group I intron is substituted with cytosine.
  • the 3’ end guanine of the group I intron is substituted with adenine.
  • the 3’ end guanine of the group I intron is substituted with uracil.
  • the group I intron is an Anabaena sp. group I intron comprising the nucleotide sequence of SEQ ID NO: 49.
  • the second nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 12 (C12) to nucleotide 212 (C212) of SEQ ID NO: 49
  • the first nucleotide sequence may comprise a nucleotide sequence starting from any nucleotide selected from nucleotide 213 (A213) to nucleotide 243 (G243) to the 3’ end of SEQ ID NO: 49.
  • the second nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 12 (C12) to any nucleotide selected from nucleotide 212 (C212) to nucleotide 242 (A242) of SEQ ID NO: 49
  • the first nucleotide sequence may comprise a nucleotide sequence starting from nucleotide 243 (G243) to the 3’ end of SEQ ID NO: 49.
  • the 3’ end guanine of the group I intron is substituted with cytosine.
  • the 3’ end guanine of the group I intron is substituted with adenine.
  • the 3’ end guanine of the group I intron is substituted with uracil.
  • the RNA construct further comprises a 5’ homology arm sequence located upstream of the first nucleotide sequence and a 3’ homology arm sequence located downstream of the second nucleotide sequence, wherein the 5’ and 3’ homology arm sequences are as described above.
  • the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 13, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 14. In some embodiments, the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 15 and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 16. In some embodiments, the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 37 and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 38. In some embodiments, the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 52 and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 53.
  • an RNA construct having a pair of homology arm sequences located at opposite ends of the RNA construct may achieve a high circularization efficiency comparable to an RNA construct counterpart preserving the native 3’ end sequence of a group I intron. That is, according to some embodiments of the present application, a 3’ end portion of the group I intron (e.g., a sequence from the 5’ half of the P9.0 duplex to the ωG) may be entirely replaced by a pair of homology arm sequences that are placed upstream of the GOI and downstream of the ribozyme core sequence, respectively, without affecting the circularization efficiency.
  • using homologous arm sequences to replace the natural partial sequences of a group I intron offers several advantages, including design simplicity and flexibility.
  • a 3' end portion of the group I intron (e.g., a sequence from the 5' half of the P9.0 duplex to the ωG)
  • changes in the internal GOI sequence do not affect the circularization efficiency or interfere with the structure of the intron fragments at both ends. From a purification standpoint, the increased length difference between the 5' and 3' fragments generated after the splicing reaction facilitates their separation and purification.
  • using homologous arm sequences for replacement of a 3’ end portion of a group I intron simplifies design, maintains structural integrity, and enhances purification efficiency.
  • the present inventor further unexpectedly discovered that the 3’ end guanine of the group I intron can be substituted with cytosine, adenine or uracil, without affecting the circularization efficiency.
  • the RNA construct may further comprise additional nucleotide sequences, for example, a nucleotide sequence useful for replication, transcription, translation and/or purification of the RNA construct, for example, inserted between two elements of the RNA construct as a spacer, or extending at the 5’ and/or 3’ ends of the RNA construct, as long as the self-splicing activity is maintained.
  • additional nucleotide sequences may be conventionally selected by those skilled in the art as needed.
  • a spacer may be inserted between the 5’ homology arm and the first pairing sequence and/or between the 3’ homology arm and the second pairing sequence.
  • a spacer may be inserted between the 5’ homology arm and the first nucleotide sequence and/or between the 3’ homology arm and the second nucleotide sequence.
  • the 3' end of the RNA construct can be extended with a sequence that will not pair to form a stable secondary structure such as a stem (referred to as “Tail element” in the present disclosure) .
  • Such sequences may include but are not limited to a polyadenine (polyA) and polyadenine/cytosine (polyAC) sequence of, for example, 10-200, 20-180, 30-150, 40-120, 50-100 nucleotides in length.
  • the RNA construct further comprises a polyA sequence at its 3' end.
  • the polyA sequence may comprise 10 to 150, preferably more than 20 and less than 100, and more preferably 40 to 70 consecutive adenines. This design can facilitate RNase R digestion of the precursor and can also increase the precursor’s length difference versus the circRNA in favor of detection and purification (e.g., FIGs. 2 and 3) .
  • the nucleotide sequence of interest can include but is not limited to the structure elements shown in FIG. 4.
  • the nucleotide sequence of interest comprises a target site at its 3' end.
  • the target site sequence is unique in the RNA construct.
  • Base pairing between the target site and the IGS results in splicing at the 3' end of the target site. After circularization, the 3' end and 5' end nucleotides of the nucleotide sequence of interest are connected to define the backsplicing site.
  • the present application provides an RNA construct that may achieve circularization of the nucleotide sequence of interest without inclusion of an exogenous exon fragment, for example, by mimicking the formation of a P1 duplex (P1 duplex mimic) .
  • advantageous effects of the present invention may at least include simplicity in design, a broad target sequence compatibility and/or a lower immunogenicity in a host while maintaining a high circularization efficiency.
  • the circular RNA does not comprise an exogenous exon fragment.
  • neither the 3' end nor the 5' end of the GOI comprises a natural exon fragment flanking the group I intron from which the ribozyme core sequence is derived.
  • the ribozyme core sequence is derived from a Tetrahymena sp. group I intron. In some embodiments, the ribozyme core sequence is derived from a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12. In some embodiments, the ribozyme core sequence comprises or consists of the nucleotide sequence of SEQ ID NO: 17 or a nucleotide sequence having at least 95% sequence identity thereto. In some embodiments, the ribozyme core sequence is derived from a Pneumocystis sp. group I intron.
  • the ribozyme core sequence is derived from a Pneumocystis sp. group I intron comprising a nucleotide sequence selected from SEQ ID NOs: 32-36.
  • the ribozyme core sequence comprises or consists of the nucleotide sequence of SEQ ID NO: 19 or a nucleotide sequence having at least 95% sequence identity thereto.
  • a natural exon fragment flanking the group I intron may be desirable for a high circularization efficiency.
  • optimizing the 3’ end and/or 5’ end sequence of the GOI may be desirable to avoid the introduction of an exogenous exon sequence. This may be achieved by designing the backsplicing site in a non-coding region or codon optimization of a region in the nucleotide sequence to be circularized that is substantially homologous to an exon-exon junction fragment.
  • a 5’ end portion of the GOI (that is, a sequence that is downstream and adjacent to the ωN) may be designed to include a sequence that is substantially homologous, for example, at least 80%, 85%, 90%, 95%, 99% or 100% identical to a 5’ end portion of the 3’ exon (downstream exon) of the group I intron.
  • a 3’ end portion of the GOI may be designed to include a sequence that is substantially homologous, for example, at least 80%, 85%, 90%, 95%, 99% or 100% identical to a 3’ end portion of the 5’ exon (upstream exon) of the group I intron.
  • the structure formed by the 5' and 3' termini of the GOI resembles the exon sequence structure found on both sides of the natural group I intron, where the 5' and 3' termini of the GOI can form an internal duplex. This structure may be introduced independently or integrated with the homologous sequences in the GOI. See for example, Chu-Xiao Liu et al., 2022, Molecular Cell, 82 (2) : 420-434, for further description.
  • with a ribozyme core sequence derived from, for example, a Tetrahymena sp. group I intron or a Pneumocystis sp. group I intron, a high circularization efficiency may be achieved without the incorporation of a natural exon fragment. Accordingly, in some embodiments of the present application, a ribozyme core sequence derived from a Tetrahymena sp. group I intron or a Pneumocystis sp. group I intron as described herein may be preferable.
  • the backsplicing site can theoretically be set at any matching position of a nucleotide sequence to be circularized.
  • the backsplicing site can be designed inside the IRES (e.g., a sequence of 'nnnnnu' or 'nnnnnc' inside the IRES can be selected as the target site sequence) .
  • IRES fragments at both ends of GOI can be reconnected to form a complete IRES sequence, as shown in FIG. 4A.
  • the backsplicing site can be designed inside the ORF (e.g., a sequence of 'nnnnnu' or 'nnnnnc' inside the ORF can be selected as the target site sequence) .
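
Selecting a backsplicing site as described above amounts to scanning the sequence to be circularized for a short motif ending in 'u' or 'c' that occurs only once. The sketch below illustrates that scan under stated assumptions: the function name `find_unique_target_sites` is hypothetical, uniqueness is checked only within the sequence passed in (a real design would check the whole RNA construct), and only unmodified A/C/G/U are considered.

```python
from collections import Counter


def find_unique_target_sites(seq: str, m: int = 5) -> list[tuple[int, str]]:
    """List candidate target sites of the form 5'-(n)m x-3' with x = U or C.

    Returns (end_position, site) tuples, where end_position is the 1-based
    position of the site's 3' end nucleotide, i.e. the candidate splice point.
    Only sites occurring exactly once in `seq` are reported.
    """
    seq = seq.upper().replace("T", "U")
    k = m + 1
    # Count every k-mer, including overlapping occurrences.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    hits = []
    for i in range(len(seq) - k + 1):
        site = seq[i:i + k]
        if site[-1] in "UC" and counts[site] == 1:
            hits.append((i + k, site))
    return hits


# Hypothetical 11-nt example sequence, used only to show the output shape.
for end, site in find_unique_target_sites("AAGGCUCCAUG"):
    print(end, site)
```

Reporting the 3' end position of each site keeps the result directly usable as the split point when the sequence is later rearranged for circularization.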
  • UTRs can be native 3' UTR sequences or modified noncoding sequences.
  • Spacers can be native 5' UTR sequences or other noncoding sequences, including but not limited to aptamers, polyACs, translation-enhancing sequences, purification-related sequences, etc.
  • the IGS region comprises an internal guide sequence (IGS) .
  • the non-Watson-Crick base pair is a wobble base pair.
  • the wobble base pair formed between the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site is guanine-uracil (G-u) , wherein 'G' is the 5’ end nucleotide of the IGS and 'u' is the 3’ end nucleotide of the target site.
  • the wobble base pair is adenine-cytosine (A-c) , wherein 'A' is the 5’ end nucleotide of the IGS and 'c' is the 3’ end nucleotide of the target site.
  • the non-Watson-Crick base pair formed between the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site is guanine-adenine (G-a) , wherein 'G' is the 5’ end nucleotide of the IGS and 'a' is the 3’ end nucleotide of the target site.
  • when the ribozyme core sequence is derived from a Pneumocystis carinii or Tetrahymena sp. group I intron, the wobble base pair is adenine-cytosine (A-c) .
  • the IGS and the target site form a P1 duplex mimic.
  • the P1 duplex mimic may comprise a Watson-Crick base pair, a wobble base pair or a combination thereof.
  • the P1 duplex mimic may comprise at least one base pair.
  • the P1 duplex mimic may comprise 1-20 base pairs, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 base pairs.
  • the P1 duplex mimic comprises a substantially identical number of base pairs to that of the P1 duplex of the group I intron from which the ribozyme core sequence is derived.
  • the IGS has the structure of 5'-X (N) m -3'
  • the target site has the structure of 5'- (n) m x -3'
  • 'X' and 'x' are the nucleotides that form the non-Watson-Crick base pair
  • each 'N' and 'n' is a nucleotide independently selected from adenine (A) , cytosine (C) , guanine (G) , uracil (U) , pseudouridine (Ψ) , 1-methylpseudouridine (1mΨ) , 5-methoxyuridine (5moU) , 2-thiouridine, 4-thiouridine, 5-methylcytidine, and N6-methyladenosine, e.g., wherein each 'N' and 'n' is a nucleotide independently selected from adenine (A) , cytosine (C) , guanine (G)
  • m is an integer of 3-6. In some embodiments, m is an integer of 4-5. In a particular embodiment, m is 5.
  • the base pairs formed between 5'- (N) m -3' and 5'- (n) m -3' comprise a Watson-Crick base pair, a wobble base pair or a combination thereof.
  • 5'- (N) m -3' and 5'- (n) m -3' are reverse complementary.
  • the IGS comprises a sequence 'GNNNNN' and the target site comprises a sequence 'nnnnnu' . In some embodiments, the IGS comprises a sequence 'ANNNNN' and the target site comprises a sequence 'nnnnnc' . In some embodiments, 'NNNNN' and 'nnnnn' are reverse complementary.
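
The pairing rules above (an IGS of the form 5'-X(N)m-3', a target site of the form 5'-(n)m x-3', a non-Watson-Crick X/x pair, and Watson-Crick or wobble pairs elsewhere) can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `forms_p1_mimic` and the `allow_wobble` switch are hypothetical, only unmodified A/C/G/U are handled, and the X/x pairs accepted are the three named in this disclosure (G-u, A-c, G-a).

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
TERMINAL_PAIRS = {("G", "U"), ("A", "C"), ("G", "A")}  # (X of IGS, x of target site)


def forms_p1_mimic(igs: str, target: str, allow_wobble: bool = True) -> bool:
    """Check whether an IGS and a target site can form a P1 duplex mimic.

    The strands pair antiparallel: igs[0] (X) pairs with target[-1] (x),
    and igs[i] pairs with target[-1 - i] for the remaining positions.
    """
    igs, target = igs.upper(), target.upper()
    if len(igs) != len(target) or len(igs) < 2:
        return False
    if (igs[0], target[-1]) not in TERMINAL_PAIRS:
        return False
    for i in range(1, len(igs)):
        pair = (igs[i], target[-1 - i])
        if pair in WATSON_CRICK:
            continue
        if allow_wobble and pair in WOBBLE:
            continue
        return False
    return True


# Illustrative sequences only.
print(forms_p1_mimic("GGAGGG", "CUCUCU"))                      # G-u terminal pair, one internal wobble
print(forms_p1_mimic("GGAGGG", "CUCUCU", allow_wobble=False))  # strict reverse complement required
print(forms_p1_mimic("AGGCC", "GGCCC"))                        # A-c terminal pair
```

Setting `allow_wobble=False` corresponds to the embodiments in which (N)m and (n)m are strictly reverse complementary.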
  • the RNA construct may further comprise a linker sequence located between the target site and IGS.
  • the linker sequence can include, but is not limited to, the sequence elements shown in FIG. 5.
  • the linker sequence may comprise 1-50 nucleotides, for example, 2-45, 3-40, 4-30, 5-25, 6-20, 7-15 or 8-10 nucleotides.
  • the linker sequence comprises an unpaired sequence.
  • the unpaired sequence may form a loop structure between the target site and the IGS.
  • the linker sequence comprises an unpaired sequence, wherein the target site, the linker sequence and the IGS form a stem-loop structure.
  • the stem portion of the stem-loop structure may comprise at least two base pairs, for example, 2-20 base pairs, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 base pairs or more.
  • the loop portion of the stem-loop structure may comprise at least 3 nucleotides, for example, 3-50 nucleotides, for example, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25-50, 30-45 or 35-40 nucleotides.
  • the stem-loop structure may also have on either side of the stem one or more bulges (mismatches) .
  • the unpaired sequence may comprise at least 3 nucleotides, for example, 3-50 nucleotides, for example, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25-50, 30-45 or 35-40 nucleotides. Examples of an unpaired sequence may be a polyA or polyU sequence.
  • the IGS can extend 1 to 3 nucleotides at the 5' end and form a P1 extension (P1-ex) mimic with 1 to 3 nucleotides adjacent to the target site (e.g., 'nnnnnu' or 'nnnnnc' , respectively) at the 3' end of GOI.
  • the linker sequence comprises, from 5’ to 3’ end, a third pairing sequence, a loop sequence, and a fourth pairing sequence, wherein the third and fourth pairing sequences form a P1 extension mimic.
  • the P1 extension mimic may comprise a Watson-Crick base pair, a wobble base pair or a combination thereof.
  • the P1 extension mimic may comprise 1, 2, 3, 4, 5, 6, or more base pairs, preferably 1-3 base pairs.
  • the P1 extension mimic comprises 1-3 reverse complementary base pairs.
  • the third pairing sequence comprises a sequence of 1-3 contiguous nucleotides, which is reverse complementary to a sequence of the same number of contiguous nucleotides in the fourth pairing sequence to form a P1 extension mimic.
  • the RNA construct may further comprise a fifth pairing sequence which can pair with a sequence in the GOI which is adjacent to the ωN (e.g., ωG in some embodiments) to simulate the formation of a P10 duplex (also referred to as a “P10 duplex mimic” ) .
  • the linker sequence comprises a fifth pairing sequence which can pair with a sixth pairing sequence in the 5' region of the nucleotide sequence of interest to form a P10 duplex mimic.
  • the P10 duplex mimic may comprise a Watson-Crick base pair, a wobble base pair or a combination thereof.
  • the P10 duplex mimic may comprise at least two consecutive base pairs, for example, 3-10 base pairs, preferably 3, 4, 5, 6, 7, or 8 base pairs.
  • the P10 duplex mimic comprises 3-10 reverse complementary base pairs.
  • the fifth pairing sequence comprises a sequence of 3-10 contiguous nucleotides, which is reverse complementary to a sequence of the same number of contiguous nucleotides in the sixth pairing sequence to form a P10 duplex mimic.
  • the sixth pairing sequence may be located adjacent to the 5' end of the nucleotide sequence of interest, for example, starting from the nucleotide immediately downstream of the ωN (i.e., starting from nucleotide 1 of the nucleotide sequence of interest, which is the nucleotide at the ωN+1 position in the RNA construct) or starting from a few nucleotides downstream of the ωN (for example, starting from nucleotide 2 or 3 of the nucleotide sequence of interest, which is the nucleotide at the ωN+2 or ωN+3 position in the RNA construct) .
  • the sixth pairing sequence starts from a nucleotide at a ωN+r position in the RNA construct, wherein r is an integer greater than or equal to 1, for example r is an integer of 1-50, 10-40, 20-30, for example, r is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20.
  • the sixth pairing sequence starts from the nucleotide at the ωN+1 position in the RNA construct.
  • ωN is guanine.
  • ωN is cytosine.
  • ωN is adenine.
  • ωN is uracil.
  • the RNA construct comprises sequences for a P1 extension mimic but not a P10 duplex mimic.
  • the linker sequence comprises, from 5’ to 3’ end, a third pairing sequence, a loop sequence, and a fourth pairing sequence, wherein the third and fourth pairing sequences form a P1 extension mimic, and part or all of a 3’ end portion of the linker sequence does not pair with a sequence in the 5’ region of the GOI.
  • the RNA construct comprises sequences for a P10 duplex mimic but not a P1 extension mimic.
  • the linker sequence comprises a loop sequence, and a 3’ end portion of the loop sequence constitutes a fifth pairing sequence which can pair with a sixth pairing sequence in the 5’ region of the GOI to form a P10 duplex mimic.
  • the RNA construct comprises sequences for a P1 extension mimic and a P10 duplex mimic.
  • the fifth pairing sequence for the P10 duplex mimic and the fourth pairing sequence for the P1 extension mimic partially overlap.
  • the linker sequence comprises, from 5’ to 3’ end, a third pairing sequence, a loop sequence and a fourth pairing sequence, wherein the third and fourth pairing sequences form a P1 extension mimic, and a 3’ end portion of the loop sequence and a 5’ end portion or the entirety of the fourth pairing sequence constitute a fifth pairing sequence which can pair with a sixth pairing sequence in the 5' region of the GOI to form a P10 duplex mimic.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h (N y ) t G –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’ wherein 'N x ' , 'n x ' and 'N y ' each is independently any naturally occurring or modified nucleotide, ' (N x ) h ' and ' (n x ) h ' are reverse complementary, h is an integer of 2-6, t is an integer of 0-20, the 5’ and 3’ homology arm sequences are substantially complementary, and the ribozyme core sequence, the linker sequence, and the target site and the IGS are as defined above.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h (N y ) t C –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’ wherein 'N x ' , 'n x ' and 'N y ' each is independently any naturally occurring or modified nucleotide, ' (N x ) h ' and ' (n x ) h ' are reverse complementary, h is an integer of 2-6, t is an integer of 0-20, the 5’ and 3’ homology arm sequences are substantially complementary, and the ribozyme core sequence, the linker sequence, and the target site and the IGS are as defined above.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h (N y ) t A –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’ wherein 'N x ' , 'n x ' and 'N y ' each is independently any naturally occurring or modified nucleotide, ' (N x ) h ' and ' (n x ) h ' are reverse complementary, h is an integer of 2-6, t is an integer of 0-20, the 5’ and 3’ homology arm sequences are substantially complementary, and the ribozyme core sequence, the linker sequence, and the target site and the IGS are as defined above.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h (N y ) t U –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’ wherein 'N x ' , 'n x ' and 'N y ' each is independently any naturally occurring or modified nucleotide, ' (N x ) h ' and ' (n x ) h ' are reverse complementary, h is an integer of 2-6, t is an integer of 0-20, the 5’ and 3’ homology arm sequences are substantially complementary, and the ribozyme core sequence, the linker sequence, and the target site and the IGS are as defined above.
  • t is an integer of 0-10.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h G –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h C –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h A –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’.
  • the RNA construct has the structure of the following: 5’-5’ homology arm sequence – (N x ) h U –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’.
  • the IGS comprises a sequence 'GNNNNN' and the target site comprises a sequence 'nnnnnu' . In some embodiments, the IGS comprises a sequence 'ANNNNN' and the target site comprises a sequence 'nnnnnc' . In some embodiments, 'NNNNN' and 'nnnnn' are reverse complementary.
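
The linear layouts above (5' homology arm, (Nx)h, (Ny)t, the terminal G/C/A/U, GOI, linker, IGS, ribozyme core, (nx)h, 3' homology arm) can be illustrated as plain string assembly. This is a sketch under stated assumptions rather than a design tool: the function `assemble_construct` and all example sequences are hypothetical placeholders, only unmodified A/C/G/U are handled, (nx)h is derived as the strict reverse complement of (Nx)h, and complementarity of the two homology arms is not checked.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA string made of A, C, G and U."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def assemble_construct(arm5: str, nx: str, ny: str, omega: str, goi: str,
                       linker: str, igs: str, core: str, arm3: str) -> str:
    """Concatenate the elements in 5'-to-3' order.

    nx is (Nx)h with h = 2-6, ny is (Ny)t with t = 0-20, and omega is the
    single nucleotide (G, C, A or U) immediately upstream of the GOI.
    """
    if not 2 <= len(nx) <= 6:
        raise ValueError("h must be 2-6")
    if len(ny) > 20:
        raise ValueError("t must be 0-20")
    if len(omega) != 1 or omega.upper() not in "GCAU":
        raise ValueError("omega must be one of G, C, A, U")
    parts = [arm5, nx, ny, omega, goi, linker, igs, core,
             reverse_complement(nx), arm3]
    return "".join(part.upper() for part in parts)


# Hypothetical placeholder elements, not sequences from this disclosure.
construct = assemble_construct(
    arm5="AAAAAA", nx="GGA", ny="", omega="G",
    goi="CCAUGAAGGCU", linker="AAAA", igs="GGCCUU",
    core="ACGUACGU", arm3="UUUUUU",
)
print(construct)
```

Passing an empty `ny` gives the t = 0 layouts, in which (Nx)h is directly followed by the terminal nucleotide.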
  • the nucleotide sequence to be circularized can be split into a 5’ fragment ended with the selected target site and a 3’ fragment comprising the remaining sequence.
  • the nucleotide sequence of interest may be formed by placing the 3’ fragment at the 5’ region and the 5’ fragment at the 3’ region of the GOI.
  • the circular RNA is formed by connecting the nucleotide immediately downstream of the ωN (i.e., the nucleotide at the ωN+1 position in the RNA construct) and the 3’ end nucleotide of the target site through the self-splicing of the RNA construct. Accordingly, in some embodiments, the circular RNA may substantially consist of the nucleotide sequence of interest.
  • the circular RNA is formed by connecting the nucleotide at the ωN+1 position and the 3’ end nucleotide of the target site in the RNA construct.
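
The rearrangement described above, in which the sequence to be circularized is split after the target site and the two fragments are swapped, can be sketched in a few lines. The names `permute_at_target_site` and `is_rotation` are hypothetical, positions are 1-based, and the example sequence is a placeholder; the rotation check simply illustrates that joining the two ends of the permuted sequence restores the original order.

```python
def permute_at_target_site(seq: str, site_end: int) -> str:
    """Swap the fragments on either side of the chosen target site.

    `site_end` is the 1-based position of the 3' end nucleotide of the
    target site. The 5' fragment (ending with the target site) is moved to
    the 3' region and the remaining 3' fragment is moved to the 5' region.
    """
    if not 0 < site_end <= len(seq):
        raise ValueError("site_end must lie within the sequence")
    five_prime_fragment = seq[:site_end]   # ends with the target site
    three_prime_fragment = seq[site_end:]  # the remaining sequence
    return three_prime_fragment + five_prime_fragment


def is_rotation(a: str, b: str) -> bool:
    """True if circularizing `b` yields the same circle as circularizing `a`."""
    return len(a) == len(b) and b in a + a


# Hypothetical sequence with a target site 'AAGGCU' ending at position 6.
original = "AAGGCUCCAUG"
goi = permute_at_target_site(original, 6)
print(goi)
print(is_rotation(original, goi))
```

In the permuted sequence the target site sits at the 3' end, so the junction formed on circularization falls exactly where the original sequence was split.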
  • ωN is guanine.
  • ωN is cytosine.
  • ωN is adenine.
  • ωN is uracil.
  • the circular RNA comprises a noncoding sequence having a biological activity.
  • examples of a noncoding sequence having a biological activity include, but are not limited to, a micro RNA and a long non-coding (lnc) RNA.
  • the circular RNA comprises a protein-coding sequence.
  • the protein-coding sequence may encode any protein, for example, a protein for therapeutic or diagnostic use.
  • the protein-coding sequence encodes an antibody.
  • the circular RNA may further comprise sequences necessary for translation, e.g., an internal ribosomal entry site (IRES) sequence upstream of the protein-coding sequence.
  • the IRES sequence is intact within the nucleotide sequence of interest.
  • the IRES sequence is split to the 5’ and 3’ ends of the nucleotide sequence of interest and connected after circularization (e.g., FIG. 4A) .
  • the circular RNA comprises an IRES sequence operably linked to a protein-coding sequence.
  • the phrase "operably linked" means that the IRES sequence is positioned upstream of the protein-coding sequence such that the protein-coding sequence can be translated into a protein in vivo (inside eukaryotic cells, e.g., human cells) and/or in vitro.
  • the IRES sequence may be any IRES sequence known in the art.
  • the IRES sequence may be naturally occurring or recombinant, e.g., obtained by truncating or mutating a naturally occurring IRES sequence.
  • the IRES sequence is selected from an IRES sequence of Taura syndrome virus, Triatoma virus, Theiler's encephalomyelitis virus, Simian virus 40, Solenopsis invicta virus 1, Rhopalosiphum padi virus, Reticuloendotheliosis virus, Human poliovirus 1, Plautia stali intestine virus, Kashmir bee virus, Human rhinovirus 2, Human rhinovirus B, Homalodisca coagulata virus-1, Human Immunodeficiency Virus type 1, Himetobi P virus, Hepatitis C virus, Hepatitis A virus, Hepatitis GB virus, foot and mouth disease virus, Human enterovirus 71, Human enterovirus B, Equine rhinitis virus, Ectropis obliqua picorna-like virus, Encephalomyocarditis virus (EMCV) , Drosophila C Virus, Crucifer tobamo virus, Cricket paralysis
  • the nucleotide sequence of interest may comprise at least two protein-coding regions such that at least two different proteins can be expressed from the circular RNA.
  • a 2A or 2A-like sequence may be included between two protein-coding sequences to mediate co-translation of two proteins (also referred to as “Stop-Carry On” or “StopGo” translation) .
  • two or more different IRES sequences may be used to drive the expression of two or more different protein-coding regions.
  • the RNA construct may comprise unmodified or modified nucleotides.
  • the RNA construct does not comprise uridine, but comprises nucleosides selected from pseudouridine (Ψ) , 1-methylpseudouridine (1mΨ) , 5-methoxyuridine (5moU) , 2-thiouridine, or 4-thiouridine in place of uridine.
  • the RNA construct comprises 10%-100%, for example, 10%-90%, 20%-80%, 30%-70%, 40%-60%, or 50%-60% modified uridine in place of uridine, wherein the modified uridine is selected from pseudouridine (Ψ) , 1-methylpseudouridine (1mΨ) , 5-methoxyuridine (5moU) , 2-thiouridine, or 4-thiouridine.
  • the circular RNA may be of any length.
  • the circular RNA may comprise about 200-10,000 nucleotides (e.g., about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, about 6,000, about 7,000, about 8,000, or about 9,000 nucleotides, or a range defined by any two of the foregoing values) .
  • the circular RNA comprises about 500-6,000 nucleotides (e.g., about 550, about 650, about 750, about 850, about 950, about 1,100, about 1,200, about 1,300, about 1,400, about 1,500, about 1,600, about 1,700, about 1,800, about 1,900, about 2,100, about 2,200, about 2,300, about 2,400, about 2,500, about 2,600, about 2,700, about 2,800, about 2,900, about 3,100, about 3,300, about 3,500, about 3,700, about 3,800, about 3,900, about 4,100, about 4,300, about 4,500, about 4,700, about 4,900, about 5,100, about 5,300, about 5,500, about 5,700, or about 5,900 nucleotides, or a range defined by any two of the foregoing values) .
  • RNA construct comprising, a first recognizer sequence (R1) comprising a first pairing sequence; a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end; a ribozyme core sequence operably linked to an internal guide sequence (IGS) , wherein the ribozyme core sequence encodes a ribozyme core having the catalytic activity of a group I intron ribozyme; and a second recognizer sequence (R2) comprising a second pairing sequence substantially complementary to the first pairing sequence; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non-Watson-Crick base pair to define a 5' splice site; R1 and R2 are positioned at opposite ends of the RNA construct, such that hybridization of the first and second pairing sequences results in formation of a duplex-containing structure to define a 3’ splic
  • the present disclosure provides: 1.1. Construct 1, comprising, from 5’ end to 3’ end, R1 comprising a first pairing sequence and a 3’ end nucleotide 'N' (ωN) ; GOI comprising a target site at its 3’ end, IGS; Ribozyme core sequence; and R2 comprising a second pairing sequence; wherein ωN is any naturally occurring or modified nucleotide; and the first pairing sequence and the second pairing sequence are substantially complementary to form a duplex-containing structure upstream of the ωN to define the 3' splice site.
  • RNA construct of 1.1 wherein ωN is guanine (ωG) , cytosine (ωC) , adenine (ωA) or uracil (ωU) .
  • the ribozyme core sequence comprises a nucleotide sequence encoding the scaffold domain and catalytic domain of a group I intron; preferably, the ribozyme core sequence comprises or consists of the sequence from the IGS end to the sequence before the P9.0 duplex of a group I intron.
  • the ribozyme core sequence is derived from a group IC1 (e.g., from Tetrahymena sp.
  • T. thermophila e.g., T. thermophila, T. cosmopolitanis, T. hyperangularis, T. malaccensis or T. pigmentosa
  • Pneumocystis sp. e.g., Pneumocystis carinii
  • IC2 e.g., from Anabaena sp. PCC7120 or Azoarcus sp. BH72
  • IA2 e.g., from Bacteriophage Twort
  • the first and second pairing sequences form a Watson-Crick base pair, a non-Watson-Crick base pair or a combination thereof; preferably, the non-Watson-Crick base pair is a wobble base pair, for example, a G-U wobble base pair; more preferably, the first and second pairing sequences form a base pair selected from A-U, G-C, G-A, A-A, U-U, A-C, G-U and a combination thereof.
  • R1 further comprises a 5’ homology arm sequence located upstream of the first pairing sequence and R2 further comprises a 3’ homology arm sequence located downstream of the second pairing sequence, and the 5’ and 3’ homology arm sequences are substantially reverse complementary;
  • the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 13
  • the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 14
  • the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 15
  • the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 16
  • the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 37
  • the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 38
  • RNA construct of 1.13 wherein ωN is guanine (ωG) , cytosine (ωC) , adenine (ωA) or uracil (ωU) .
  • the RNA construct of any one of 1.13-1.16, wherein t is 0 or 1.
  • RNA construct of 1.13 wherein R1 comprises a nucleotide sequence 'N 2 N 1 G' , 'N 2 N 1 C' , 'N 2 N 1 A' or 'N 2 N 1 U' at its 3' end; or t is 1, R1 comprises a nucleotide sequence 'N 2 N 1 N y G' , 'N 2 N 1 N y C' , 'N 2 N 1 N y A' or 'N 2 N 1 N y U' at its 3' end; and R2 comprises a nucleotide sequence 'n 1 n 2 ' ; wherein 'N 1 ' , 'n 1 ' , 'N 2 ' , 'n 2 ' and 'N y ' are each independently any naturally occurring or modified nucleotide, preferably, 'N y ' is 'G' , 'U' or 'A' ; wherein 'N 1 ' and 'n 1 '
  • the RNA construct of 1.20 wherein the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 13, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 14; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 15, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 16; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 37, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 38; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 52, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 53.
  • RNA construct comprising, from 5’ end to 3’ end, a first nucleotide sequence comprising a sequence from a nucleotide 'N q ' to the 3’ end of a group I intron, a nucleotide sequence of interest (GOI) comprising a target site at its 3’ end, an internal guide sequence (IGS) , and a second nucleotide sequence comprising a sequence from the IGS end to a nucleotide 'N p ' of a group I intron; wherein the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site form a non- Watson-Crick base pair to define a 5' splice site; 'N p ' and 'N q ' are independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide of
  • the present disclosure provides: 1.22. Construct 2, wherein the 3’ end guanine of the group I intron is substituted with cytosine, adenine or uracil. 1.23. Construct 2 or the RNA construct of 1.23, wherein the group I intron is a group IC1 (e.g., from Tetrahymena sp. or Pneumocystis sp. (e.g., Pneumocystis carinii) , IC2, IC3 (e.g., from Anabaena sp. PCC7120 or Azoarcus sp.
  • the group I intron is a Pneumocystis sp. group I intron comprising a nucleotide sequence selected from SEQ ID NOs: 32-36, or a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12. 1.24.
  • the group I intron is a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 316 to nucleotide 342 of SEQ ID NO: 32; or the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 313 to nucleotide 411 of SEQ ID NO: 12; or the group I intron is an Anabaena sp.
  • group I intron comprising the nucleotide sequence of SEQ ID NO: 49, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 212 to nucleotide 243 of SEQ ID NO: 49. 1.25.
  • RNA construct 2 or the RNA construct of 1.22 or 1.23, wherein 'N p ' and 'N q ' are independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide of the 3’ half of a duplex of the group I intron, wherein the duplex is not a P9.0 duplex; for example, the duplex is a P9a/9b, P9.1, P9.1a or P9.2 duplex, preferably a P9.2 duplex. 1.26.
  • RNA construct 2 or the RNA construct of 1.22 or 1.23, wherein 'N p ' and 'N q ' are located within the region connecting the 5’ half and 3’ half of the duplex; or 'N p ' is the 3’ end nucleotide of the 5’ half of the duplex and 'N q ' is the 5’ end nucleotide of the 3’ half of the duplex. 1.27.
  • RNA construct 2 or the RNA construct of any one of 1.22-1.26, wherein the RNA construct further comprises a 5’ homology arm sequence located upstream of the first nucleotide sequence and a 3’ homology arm sequence located downstream of the second nucleotide sequence, wherein the 5’ and 3’ homology arm sequences are substantially complementary; for example, the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 13, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 14; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 15, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 16; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 37, and the 3’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 38; or the 5’ homology arm sequence comprises the nucleotide sequence of SEQ ID NO: 52, and
  • RNA construct wherein the non-Watson-Crick base pair formed between the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site is (a) guanine-uracil (G-u), wherein 'G' is the 5’ end nucleotide of the IGS and 'u' is the 3’ end nucleotide of the target site; or (b) adenine-cytosine (A-c), wherein 'A' is the 5’ end nucleotide of the IGS and 'c' is the 3’ end nucleotide of the target site; or (c) guanine-adenine (G-a), wherein 'G' is the 5’ end nucleotide of the IGS and 'a' is the 3’ end nucleotide of the target site.
  • RNA construct wherein the IGS and the target site form a P1 duplex mimic. 1.30. Any foregoing RNA construct, wherein the IGS has the structure of 5'-X(N)m-3', the target site has the structure of 5'-(n)m x-3', 'X' and 'x' are the nucleotides that form the non-Watson-Crick base pair, each 'N' and 'n' is a nucleotide independently selected from adenine (A), cytosine (C), guanine (G), uracil (U), pseudouridine (Ψ), 1-methylpseudouridine (1mΨ), 5-methoxyuridine (5moU), 2-thiouridine, 4-thiouridine, 5-methylcytidine, and N6-methyladenosine, e.g., wherein each 'N' and 'n' is a nucle
  • RNA construct wherein the IGS comprises a sequence 'GNNNNN' and the target site comprises a sequence 'nnnnnu'; or the IGS comprises a sequence 'ANNNNN' and the target site comprises a sequence 'nnnnnc'; wherein 'NNNNN' and 'nnnnn' are reverse complementary.
  • the RNA construct further comprises a linker sequence located between the target site and IGS. 1.33. The RNA construct of 1.32, wherein the linker sequence comprises an unpaired sequence, wherein the target site, the linker sequence and the IGS form a stem-loop structure. 1.34.
  • RNA construct of 1.32 wherein the linker sequence comprises, from 5’ end to 3’ end, a third pairing sequence, a loop sequence and a fourth pairing sequence, wherein the third and fourth pairing sequences form a P1 extension mimic; preferably, the P1 extension mimic comprises 1-3 reverse complementary base pairs. 1.35.
  • RNA construct having the structure of the following: (a) 5’-5’ homology arm sequence – (N x ) h (N y ) t G –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’; (b) 5’-5’ homology arm sequence – (N x ) h (N y ) t C –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’; (c) 5’-5’ homology arm sequence – (N x ) h (N y ) t A –GOI –linker sequence –IGS –ribozyme core sequence – (n x ) h –3’ homology arm sequence -3’; or (d) 5’-5’ homology arm sequence – (N x )
  • RNA construct of 1.36 wherein t is an integer of 0-10; preferably, t is 0. 1.38.
  • RNA construct of 1.13 having the structure of the following: (a) 5’-SEQ ID NO: 21 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 20 -3’; (b) 5’-SEQ ID NO: 23 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 22 -3’; (c) 5’-SEQ ID NO: 25 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 24 -3’; or (d) 5’-SEQ ID NO: 27 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 26 -3’; (e) 5’-GUG –GOI –linker sequence –IGS –SEQ ID NO: 19 –AU -3’; (f) 5’-ACG –GOI –linker sequence –IGS –S
  • RNA construct wherein the circular RNA does not contain an exogenous exon sequence. 1.41. Any foregoing RNA construct, wherein the circular RNA substantially consists of or consists of the GOI. 1.42. Any foregoing RNA construct, wherein the RNA construct does not comprise uridine, but comprises nucleosides selected from pseudouridine (Ψ), 1-methylpseudouridine (1mΨ), 5-methoxyuridine (5moU), 2-thiouridine, or 4-thiouridine in place of uridine. 1.43.
  • any foregoing RNA construct, wherein the circular RNA is formed by connecting the nucleotide at ωN+1 position and the 3’ end nucleotide of the target site in the RNA construct; preferably, ωN is guanine, cytosine, adenine or uracil. 1.44. Any foregoing RNA construct, wherein the circular RNA comprises a noncoding sequence having a biological activity, optionally wherein the noncoding sequence is micro RNA or long non-coding (lnc) RNA. 1.45. Any foregoing RNA construct, wherein the circular RNA comprises a protein-coding sequence, optionally wherein the protein-coding sequence is operably linked to an internal ribosome entry site (IRES) sequence. 1.46.
  • RNA construct wherein the circular RNA comprises an IRES; e.g., wherein the IRES sequence is intact within the nucleotide sequence of interest or is split at either end of the nucleotide sequence of interest but joined after circularization. 1.47.
  • the IRES sequence is selected from an IRES sequence of Taura syndrome virus, Triatoma virus, Theiler's encephalomyelitis virus, Simian virus 40, Solenopsis invicta virus 1, Rhopalosiphum padi virus, Reticuloendotheliosis virus, Human poliovirus 1, Plautia stali intestine virus, Kashmir bee virus, Human rhinovirus 2, Human rhinovirus B, Homalodisca coagulata virus-1, Human Immunodeficiency Virus type 1, Homalodisca coagulata virus-1, Himetobi P virus, Hepatitis C virus, Hepatitis A virus, Hepatitis GB virus, foot and mouth disease virus, Human enterovirus 71, Human enterovirus B, Equine rhinitis virus, Ectropis obliqua picorna-like virus, Encephalomyocarditis virus (EMCV), Drosophil
  • RNA construct wherein the circular RNA comprises an IRES, wherein the IRES sequence is an IRES sequence of Human rhinovirus B.
  • the RNA construct of the present disclosure has a sequence selected from: (a) 5’- [R1: SEQ ID NO: 21] – [GOI, wherein the target site is located within the ORF and has a nucleotide sequence of 'nnnnnu' ] – [Linker sequence] – [IGS having a nucleotide sequence of 'GNNNNN' ] – [ribozyme core sequence: SEQ ID NO: 17] – [R2: SEQ ID NO: 20] –polyadenylation sequence -3’ [i.e. framework of SEQ ID NO: 1, related to FIG.
  • the promoter is a T7 promoter and the RNA polymerase is a T7 virus RNA polymerase; or the promoter is a T6 promoter, and the polymerase is a T6 virus RNA polymerase; or the promoter is an SP6 virus RNA polymerase promoter and the polymerase is SP6 virus RNA polymerase; or the promoter is T3 virus RNA polymerase promoter and the polymerase is T3 virus RNA polymerase; or the promoter is T4 virus RNA polymerase promoter and the polymerase is T4 virus RNA polymerase.
  • the RNA polymerase promoter is a T7 virus RNA polymerase promoter and the polymerase is a T7 virus RNA polymerase.
  • Other examples of promoters may include but are not limited to cytomegalovirus (CMV) immediate early promoter, eukaryotic translation elongation factor 1α (EF-1α) promoter, simian virus 40 (SV40), U6 promoter, H1 promoter, chicken β-actin (CBA) promoter and human phosphoglycerate kinase 1 (hPGK) promoter.
  • the template DNA may be linear or circular.
  • the template DNA is prepared by linearizing a DNA plasmid, e.g., by a restriction enzyme.
  • the template is circular (e.g., a DNA plasmid) .
  • the template DNA may comprise an RNA polymerase terminator sequence element downstream of the region that encodes the RNA construct, especially when the template DNA is circular.
  • the template DNA comprises a sequence encoding the RNA construct, which as described above, is a linear RNA molecule that can self-splice, thereby producing a circular RNA (circRNA) .
  • the RNA construct contains the circRNA sequence plus splicing sequences (e.g., ribozyme core sequence and 5' and 3' recognizer sequences) necessary to circularize the RNA. These splicing sequences are removed from the RNA construct during the circularization, leaving a circRNA comprising the nucleotide sequence of interest.
  • the nucleoside moieties in the RNA construct are naturally occurring nucleosides, e.g., adenosine, guanosine, cytidine and uridine.
  • the DNA template comprises a promoter recognized by an RNA polymerase operably linked to a sequence encoding an RNA construct as described above.
  • operably linked means that the elements are positioned on the DNA template such that the RNA construct can be synthesized by in vitro or in vivo transcription of the template DNA.
  • the RNA construct can then form the desired circRNA, e.g., using the methods disclosed herein.
  • the disclosure thus further provides a DNA construct, e.g., a plasmid, comprising a sequence encoding the RNA construct of the present disclosure, operably linked to a promoter.
  • the disclosure further provides methods for production of a circRNA by (i) in vitro transcription of a DNA construct, e.g., a plasmid, comprising a sequence encoding the RNA construct of the present disclosure, and (ii) circularization (i.e., self-splicing) of the RNA construct thus transcribed, in a buffered reaction solution comprising magnesium and ingredients required for in vitro transcription, e.g., an RNA polymerase, an RNase inhibitor, ATP, GTP, CTP, UTP, DTT, and a monovalent cation (Na + or K + ) .
  • the concentration of Mg 2+ in the solution is from 30 mM to 100 mM, e.g., from 30 mM to 90 mM, from 30 mM to 80 mM, from 30 mM to 70 mM, from 30 mM to 60 mM, from 30 mM to 50 mM, from 30 mM to 40 mM, from 35 mM to 100 mM, from 35 mM to 90 mM, from 35 mM to 80 mM, from 35 mM to 70 mM, from 35 mM to 60 mM, from 35 mM to 50 mM, from 35 mM to 40 mM, from 38 to 66 mM, e.g., about 38 mM.
  • the concentration of Mg 2+ in the solution is from 38 mM to 66 mM.
  • the reaction solution comprises a pyrophosphatase at the concentration of from 1 U/ml to 5 U/ml, e.g., from 1 U/ml to 4 U/ml, from 1.5 U/ml to 3 U/ml, from 1.5 U/ml to 2.5 U/ml, about 1 U/ml, about 2 U/ml, or about 4 U/ml.
  • 1 U (unit) of pyrophosphatase is defined as the amount of enzyme that generates 1 µmol of phosphate per minute from inorganic pyrophosphate under standard reaction conditions (a 10-minute reaction at 25 °C in 20 mM Tris-HCl, pH 8.0, 2 mM MgCl2 and 2 mM PPi).
  • the reaction solution further comprises ingredients required for in vitro transcription.
  • the reaction solution comprises an RNA polymerase, an RNase inhibitor, ATP, GTP, CTP, UTP, DTT, and a monovalent cation (Na + or K + ) .
  • the reaction solution comprises about 5 U/µl RNA polymerase, about 1 U/µl RNase inhibitor, about 10 mM ATP, about 10 mM GTP, about 10 mM CTP, about 10 mM UTP, about 10 mM DTT, and 5 mM monovalent cation (Na+ or K+).
  • the reaction solution may comprise a buffer.
  • the pH of the reaction solution may be from 6 to 8, e.g., from 7 to 8, or about 7.5.
  • the RNA construct may be unmodified, partially modified or completely modified.
  • the RNA construct is unmodified, i.e., contains only naturally occurring nucleotides.
  • the RNA construct is partially modified or completely modified.
  • a part or all of at least one ribonucleoside triphosphate in the reaction solution may be replaced with a modified nucleoside triphosphate in order to synthesize partially modified or completely modified RNA construct.
  • examples of modified nucleoside triphosphates include, but are not limited to, pseudouridine-5′-triphosphate, 1-methylpseudouridine-5′-triphosphate, 2-thiouridine-5′-triphosphate, 4-thiouridine-5′-triphosphate and 5-methylcytidine-5′-triphosphate.
  • RNA polymerase used for in vitro transcription may be chosen based on the RNA polymerase promoter in the DNA template.
  • the reaction solution may comprise a T7 RNA polymerase.
  • the reaction solution comprises an RNA polymerase selected from T7 virus RNA polymerase, T6 virus RNA polymerase, SP6 virus RNA polymerase, T3 virus RNA polymerase, or T4 virus RNA polymerase.
  • the RNA polymerase promoter in the DNA template is a T7 virus RNA polymerase promoter and the reaction solution comprises a T7 virus RNA polymerase.
  • the in vitro transcription of the template DNA and the circularization (i.e., self-splicing) of the RNA construct are carried out at a temperature of from 37 °C to 55 °C, e.g., from 39 °C to 55 °C, from 41 °C to 55 °C, from 43 °C to 55 °C, from 37 °C to 50 °C, from 39 °C to 50 °C, from 41 °C to 50 °C, from 43 °C to 50 °C, from 37 °C to 47 °C, from 39 °C to 47 °C, from 41 °C to 47 °C, from 43 °C to 47 °C, from 47 °C to 55 °C, from 50 °C to 55 °C, from 39 °C to 43 °C, about 37 °C, about 39 °C, about 41 °C, about 43 °C, about 47 °C, about 53 °C, or about 55 °C
  • a genetically modified RNA polymerase exhibiting increased thermostability may be preferred if the in vitro transcription of the template DNA and the circularization (i.e., self-splicing) of the RNA construct are carried out at a high temperature.
  • the in vitro transcription of the template DNA and the circularization (i.e., self-splicing) of the RNA construct are carried out at a temperature of from 47 °C to 55 °C, e.g., from 50 °C to 55 °C, about 47 °C, about 53 °C, or about 55 °C and the RNA polymerase is a thermostable polymerase (e.g., T7 Toyobo) .
  • the in vitro transcription of the template DNA and the circularization (i.e., self-splicing) of the RNA construct may be carried out for at least 1 hour, e.g., at least 1.5 hours, at least 2.5 hours, at least 3 hours, from 1 hour to 3 hours, from 1.5 hours to 3 hours, from 2 hours to 3 hours, or from 2.5 hours to 3 hours.
  • a reaction time of no less than 1.5 hours is preferred to ensure sufficient circularization.
  • the in vitro transcription of the template DNA and the circularization (i.e., self-splicing) of the RNA construct are carried out for 2.5-3 hours.
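The ranges quoted above for Mg2+ concentration, temperature and reaction time can be collected into a simple planning check. The following is a minimal sketch, not part of the disclosed method: the class and function names are hypothetical, and the thresholds are taken only from the ranges stated in the text (30-100 mM Mg2+ with a 38-66 mM sub-range, 37-55 °C, at least 1 hour with 1.5 hours or more preferred).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Conditions:
    """Hypothetical container for one combined IVT/self-splicing reaction."""
    mg_mM: float                           # Mg2+ concentration in the solution
    temp_C: float                          # reaction temperature
    hours: float                           # transcription + circularization time
    thermostable_polymerase: bool = False  # e.g., a thermostable T7 variant


def check_conditions(c: Conditions) -> List[str]:
    """Return notes for values outside the ranges quoted in the text."""
    notes = []
    if not 30 <= c.mg_mM <= 100:
        notes.append("Mg2+ outside 30-100 mM")
    elif not 38 <= c.mg_mM <= 66:
        notes.append("Mg2+ outside the 38-66 mM sub-range")
    if not 37 <= c.temp_C <= 55:
        notes.append("temperature outside 37-55 C")
    elif c.temp_C >= 47 and not c.thermostable_polymerase:
        notes.append("47-55 C: a thermostable polymerase may be preferred")
    if c.hours < 1:
        notes.append("reaction time below 1 hour")
    elif c.hours < 1.5:
        notes.append("reaction time below the preferred 1.5 hours")
    return notes
```

An empty list means every value falls inside the quoted ranges; the check says nothing about circularization yield, which the text determines experimentally.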
  • the method further comprises a step of removing the DNA template after the self-splicing of the RNA construct.
  • the DNA template may be removed by adding a DNase I, e.g., for 30 min at 37 °C.
  • the method further comprises a step of purifying the circular RNA after the self-splicing of the RNA construct or after the step of removing the DNA template, if the method comprises a step of removing the DNA template.
  • the purification step is selected from a precipitation step, a tangential flow filtration step and a chromatographic step, and a combination thereof.
  • the precipitation step may be an alcoholic precipitation step or LiCl precipitation.
  • the tangential flow filtration step may be a diafiltration step using tangential flow filtration and/or a concentration step using tangential flow filtration.
  • the chromatographic step may be selected from HPLC, anion exchange chromatography, affinity chromatography, hydroxyapatite chromatography, magnetic bead chromatography and core bead chromatography.
  • the purification step comprises a precipitation step, e.g., LiCl precipitation.
  • the purification step comprises a chromatographic step, e.g., magnetic bead chromatography.
  • the disclosure thus provides, in an aspect, a method of preparing a circular RNA (Method 1) , comprising (i) providing a template DNA, wherein the template DNA comprises a sequence encoding the RNA construct of the present disclosure, operably linked to a promoter, in a reaction solution, thereby allowing synthesis of the RNA construct by in vitro transcription of the template DNA and allowing the RNA construct to self-splice, to produce a circular RNA, and (ii) recovering the circular RNA thus produced.
  • the invention includes: 1.1.
  • any of the preceding methods wherein the promoter is a T7 virus RNA polymerase promoter, T6 virus RNA polymerase promoter, SP6 virus RNA polymerase promoter, T3 virus RNA polymerase promoter, or T4 virus RNA polymerase promoter.
  • the promoter is a T7 virus RNA polymerase promoter.
  • the reaction solution comprises Mg 2+ at the concentration greater than 26 mM, e.g., greater than 30 mM or greater than 35 mM. 1.8.
  • the concentration of Mg 2+ in the solution is from 30 mM to 100 mM, e.g., from 30 mM to 90 mM, from 30 mM to 80 mM, from 30 mM to 70 mM, from 30 mM to 60 mM, from 30 mM to 50 mM, from 30 mM to 40 mM, from 35 mM to 100 mM, from 35 mM to 90 mM, from 35 mM to 80 mM, from 35 mM to 70 mM, from 35 mM to 60 mM, from 35 mM to 50 mM, from 35 mM to 40 mM, from 38 to 66 mM, e.g., about 38 mM, optionally wherein the concentration of Mg 2+ in the solution is from 38 mM to 66 mM.
  • Embodiment 3 The RNA construct according to embodiment 1 or 2, wherein the ribozyme core is derived from a group IC1 (e.g., from Tetrahymena sp. (e.g., T. thermophila, T. cosmopolitanis, T. hyperangularis, T. malaccensis or T. pigmentosa) or Pneumocystis sp. (e.g., Pneumocystis carinii)), IC2, IC3 (e.g., from Anabaena sp. PCC7120 or Azoarcus sp. BH72) or IA2 (e.g., from Bacteriophage Twort) intron.
  • Embodiment 4 The RNA construct according to embodiment 1 or 2, wherein the ribozyme core is derived from a Pneumocystis sp. group I intron; for example, a Pneumocystis sp. group I intron comprising a nucleotide sequence selected from SEQ ID NOs: 32-36; preferably, the ribozyme core is derived from a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32, for example, the ribozyme core comprises or consists of the nucleotide sequence of SEQ ID NO: 19 or a nucleotide sequence having at least 95% sequence identity thereto.
  • Embodiment 7 The RNA construct according to any one of embodiments 1-6, wherein the first and second pairing sequences each independently comprises 2-100 nucleotides; for example, the first pairing sequence comprises 2-20, 2-12, 4-10, 6, 7 or 8 nucleotides; and/or the second pairing sequence comprises 2-100, 5-80, 8-60, 10-50, 15, 20, 30, 40, 50, 60, 70, 80, 90 or 100, preferably 5-80 or 8-60 nucleotides.
  • Embodiment 14 The RNA construct according to Embodiment 13, wherein the group I intron is a group IC1 (e.g., from Tetrahymena sp. (e.g., T. thermophila, T. cosmopolitanis, T. hyperangularis, T. malaccensis or T. pigmentosa) or Pneumocystis carinii), IC2, IC3 (e.g., from Anabaena sp. PCC7120 or Azoarcus sp.
  • the group I intron is a group IC1 intron, for example, a Pneumocystis sp. or Tetrahymena sp. group I intron, more preferably, the group I intron comprises a nucleotide sequence selected from SEQ ID NOs: 32-36 and SEQ ID NO: 12.
  • Embodiment 15 The RNA construct according to embodiment 13, wherein the group I intron is a Pneumocystis carinii group I intron comprising the nucleotide sequence of SEQ ID NO: 32, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 316 to nucleotide 342 of SEQ ID NO: 32; or the group I intron is a Tetrahymena thermophila group I intron comprising the nucleotide sequence of SEQ ID NO: 12, 'N p ' and 'N q ' are independently selected from any nucleotide from nucleotide 313 to nucleotide 411 of SEQ ID NO: 12.
  • Embodiment 16 The RNA construct according to embodiment 13 or 14, wherein 'N p ' and 'N q ' are independently selected from any nucleotide from the 3’ end nucleotide of the 5’ half to the 5’ end nucleotide of the 3’ half of a duplex of the group I intron, wherein the duplex is not a P9.0 duplex; for example, the duplex is a P9a/9b, P9.1, P9.1a or P9.2 duplex, preferably a P9.2 duplex.
  • Embodiment 17 The RNA construct according to embodiment 16, wherein 'N p ' and 'N q ' are located within the region connecting the 5’ half and 3’ half of the duplex; or 'N p ' is the 3’ end nucleotide of the 5’ half of the duplex and 'N q ' is the 5’ end nucleotide of the 3’ half of the duplex.
  • Embodiment 18 The RNA construct according to any one of embodiments 13-17, wherein the RNA construct further comprises a 5’ homology arm sequence located upstream of the first nucleotide sequence and a 3’ homology arm sequence located downstream of the second nucleotide sequence; wherein the 5’ and 3’ homology arm sequences are at least partially reverse complementary.
  • Embodiment 19 The RNA construct according to any one of embodiments 1-18, wherein the non-Watson-Crick base pair formed between the 5' end nucleotide of the IGS and the 3' end nucleotide of the target site is (a) guanine-uracil (G-u), wherein 'G' is the 5’ end nucleotide of the IGS and 'u' is the 3’ end nucleotide of the target site; or (b) adenine-cytosine (A-c), wherein 'A' is the 5’ end nucleotide of the IGS and 'c' is the 3’ end nucleotide of the target site; or (c) guanine-adenine (G-a), wherein 'G' is the 5’ end nucleotide of the IGS and 'a' is the 3’ end nucleotide of the target site.
  • Embodiment 20 The RNA construct according to any one of embodiments 1-17, wherein the IGS and the target site form a P1 duplex mimic.
  • Embodiment 21 The RNA construct according to any one of embodiments 1-20, wherein the IGS has the structure of 5'-X (N) m -3' the target site has the structure of 5'- (n) m x -3' 'X' and 'x' are the nucleotides that form the non-Watson-Crick base pair, each 'N' and 'n' is a nucleotide independently selected from A, G, C and U, and m is an integer of 2-8, preferably 3-6, most preferably 4-5; preferably, 5' - (N) m -3' and 5' - (n) m -3' are reverse complementary.
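The pairing rule in Embodiment 21 can be expressed as a short check on a candidate IGS and target site. The following is a minimal sketch under stated assumptions: the function names and example sequences are hypothetical, only unmodified A, C, G and U are handled, and the allowed splice-site pairs are the three named in the text (G-u, A-c, G-a).

```python
# Watson-Crick partners for unmodified RNA bases.
WC = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Non-Watson-Crick pairs named in the text, written as
# (IGS 5' end nucleotide 'X', target-site 3' end nucleotide 'x').
SPLICE_SITE_PAIRS = {("G", "U"), ("A", "C"), ("G", "A")}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string made of A, C, G, U."""
    return "".join(WC[base] for base in reversed(rna.upper()))


def is_p1_mimic(igs: str, target: str) -> bool:
    """Check 5'-X(N)m-3' (IGS) against 5'-(n)m x-3' (target site).

    Requires m between 2 and 8, an allowed X/x pair, and (n)m equal to
    the reverse complement of (N)m.
    """
    igs, target = igs.upper(), target.upper()
    if set(igs + target) - set(WC):
        return False  # only A, C, G, U are handled in this sketch
    if len(igs) != len(target) or not 3 <= len(igs) <= 9:
        return False
    x_igs, n_igs = igs[0], igs[1:]
    n_target, x_target = target[:-1], target[-1]
    if (x_igs, x_target) not in SPLICE_SITE_PAIRS:
        return False
    return n_target == reverse_complement(n_igs)
```

For instance, the toy pair `is_p1_mimic("GGAGGG", "CCCUCU")` satisfies the rule with m = 5 and a G-u pair at the splice site, while changing the last target nucleotide to C gives a Watson-Crick G-C pair and is rejected.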
  • Embodiment 22 The RNA construct according to any one of embodiments 1-21, wherein the IGS comprises a sequence 'GNNNNN' and the target site comprises a sequence 'nnnnnu'; or the IGS comprises a sequence 'ANNNNN' and the target site comprises a sequence 'nnnnnc'; wherein 'NNNNN' and 'nnnnn' are reverse complementary.
  • Embodiment 23 The RNA construct according to any one of embodiments 1-22, wherein the RNA construct further comprises a linker sequence located between the target site and IGS.
  • Embodiment 24 The RNA construct according to embodiment 23, wherein the linker sequence comprises an unpaired sequence, wherein the target site, the linker sequence and the IGS form a stem-loop structure.
  • Embodiment 25 The RNA construct according to embodiment 23, wherein the linker sequence comprises, from 5’ end to 3’ end, a third pairing sequence, a loop sequence and a fourth pairing sequence, wherein the third and fourth pairing sequences form a P1 extension mimic; preferably, the P1 extension mimic comprises 1-3 reverse complementary base pairs.
  • Embodiment 27 The RNA construct according to embodiment 1, having the structure of: (a) 5’-SEQ ID NO: 21 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 20 -3’; (b) 5’-SEQ ID NO: 23 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 22 -3’; (c) 5’-SEQ ID NO: 25 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 24 -3’; (d) 5’-SEQ ID NO: 27 –GOI –linker sequence –IGS –SEQ ID NO: 17 –SEQ ID NO: 26 -3’; (e) 5’-GUG –GOI –linker sequence –IGS –SEQ ID NO: 19 –AU -3’; (f) 5’-ACG –GOI –linker sequence –IG
  • R2 (the 3’ recognizer sequence) is designed to comprise the sequence from the 5’ half of P9.0 to the 5’ half of P9.2 of SEQ ID NO: 12 (the sequence connecting the 5’ half of P9.0 and the 5’ half of P9.2 is also referred to as “Spacer 2” for convenience) and a 3’ homology arm sequence (Arm I)
  • R1 (the 5’ recognizer sequence) is designed to comprise a 5’ homology arm sequence (Arm I) and the sequence from the 3’ half of P9.2 to ωG of SEQ ID NO: 12 (the sequence connecting the 3’ half of P9.2 and the 3’ half of P9.0 is also referred to as “Spacer 1” for convenience).
  • the plasmid linearized by BsaI enzymatic digestion is used as a template for the IVT reaction.
  • a single reaction system (20 µL in total) is prepared as follows: 1 U/µL RNase Inhibitor (Novoprotein E125), 6.67 mM ATP, 20 mM GTP, 6.67 mM CTP, 6.67 mM UTP, 1X Transcription buffer (Novoprotein GMP-EB121 containing 6 mM MgCl2), 10 mM DTT (Sigma 43816), 4 U/mL Pyrophosphatase Inorganic (Novoprotein GMP-M036), 5 mM NaCl (Invitrogen AM9760G), 18 mM MgCl2 (Invitrogen M1028), 5 U/µL T7 RNA polymerase (Novoprotein GMP-E121), 25 ng/µL linearized plasmid.
  • RNA construct is purified by precipitation with 7.5 M LiCl or column purification using a Monarch RNA cleanup kit (NEB) . A fragment analyzer is applied to evaluate the products. 1.3) Generation and purification of circular RNA:
  • the plasmid linearized by BsaI enzymatic digestion was used as a template for the IVT reaction.
  • a single reaction system (20 µL in total) was prepared as follows: 1 U/µL RNase Inhibitor (Novoprotein E125), 10 mM ATP, 10 mM GTP, 10 mM CTP, 10 mM UTP, 1X Transcription buffer (Novoprotein GMP-EB121; containing 6 mM MgCl2), 10 mM DTT (Sigma 43816), 4 U/mL Inorganic Pyrophosphatase (Novoprotein GMP-M036), 5 mM NaCl (Invitrogen AM9760G), MgCl2 (Invitrogen M1028) ranging from 30 mM to 50 mM, 5 U/µL T7 RNA polymerase (KactusBio GMP-T7P-EE101-12), 25 ng/µL linearized plasmid.
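Setting up such a reaction amounts to converting each final concentration into a pipetting volume with V_stock = C_final × V_total / C_stock. The following is a minimal sketch of that arithmetic for a 20 µL reaction. The final concentrations are those listed above, but every stock concentration is an assumption chosen for illustration (the text does not state them), and the function names are hypothetical. The MgCl2 entry is treated as a separately added component; the sketch does not decide whether the 6 mM MgCl2 in the buffer counts toward the stated range.

```python
def stock_volumes(total_uL, recipe):
    """Convert final concentrations into stock volumes.

    recipe maps component name -> (final_conc, stock_conc), both in the
    same unit. Returns volumes in microlitres, with the remainder
    assigned to water.
    """
    volumes = {}
    for name, (final, stock) in recipe.items():
        if stock <= 0 or final > stock:
            raise ValueError(f"cannot reach {final} from stock {stock}: {name}")
        volumes[name] = final * total_uL / stock
    used = sum(volumes.values())
    if used > total_uL + 1e-9:
        raise ValueError("components exceed the total reaction volume")
    volumes["water"] = total_uL - used
    return volumes


def example_recipe(mgcl2_mM):
    """Final concentrations from the text; stock values are ASSUMED."""
    return {
        "transcription buffer (X)": (1, 10),        # assumed 10X stock
        "ATP (mM)": (10, 100),                      # assumed 100 mM stocks
        "GTP (mM)": (10, 100),
        "CTP (mM)": (10, 100),
        "UTP (mM)": (10, 100),
        "DTT (mM)": (10, 100),
        "MgCl2": (mgcl2_mM, 1000),                  # assumed 1 M stock
        "NaCl (mM)": (5, 500),                      # assumed 500 mM dilution
        "RNase inhibitor (U/uL)": (1, 40),          # assumed 40 U/uL
        "T7 RNA polymerase (U/uL)": (5, 200),       # assumed 200 U/uL
        "pyrophosphatase (U/uL)": (0.004, 0.1),     # 4 U/mL final, assumed 100 U/mL
        "linearized plasmid (ng/uL)": (25, 500),    # assumed 500 ng/uL
    }
```

Calling `stock_volumes(20.0, example_recipe(30.0))` returns one volume per component plus the water needed to reach 20 µL; rerunning with 50 mM MgCl2 changes only the MgCl2 and water volumes.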
  • RNAs were purified by 7.5 M LiCl precipitation or column purification using a Monarch RNA cleanup kit (NEB) .
  • a fragment analyzer (FA) was applied to evaluate the products. Specifically, in the RNA mode, purified circular RNAs were further analyzed by capillary electrophoresis on an Agilent 5200 or 5300 Bioanalyzer. Samples were diluted to an appropriate concentration and analyzed according to the manufacturer’s instructions (Agilent DNF-471 RNA Kit, 15 nt). Agilent ProSize Data Analysis Software was used to analyze the results. The Smear analysis module was applied to identify the peak range corresponding to the circular RNA component. As the FA cannot distinguish between circRNA and nicked RNA, both components appeared as a single peak before the precursor peak, as shown in FIG. 7A.
  • RNA 9000 Ladder (SCIEX, AM7150) was used as a reference for the size of sample bands, and the area percent (%) and Quality (bp) of each component were obtained through integrated quantification. 1.4) RNase R digestion to remove linear RNA:
  • RNA sequencing across the putative splice junction of the RNA products after RNase R treatment also confirmed the correct ligation between the 5’ and 3’ ends of the GOI (data not shown) .
  • gel-purified RNA was subjected to reverse transcription using a PrimeScript RT Reagent Kit with random primers (TAKARA, RR037B) , followed by PCR amplification with primers capable of amplifying transcripts across the splice junction.
  • the resulting PCR products were then subjected to Sanger sequencing in order to validate the backsplice junction of the circular RNA.
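The junction validation described above can be mirrored computationally: the circular RNA joins the 3' end of the sequence of interest back to its 5' end, so a read across the junction should contain the last few nucleotides followed by the first few. The following is a minimal sketch, assuming the read is supplied as a plain DNA string; the function names, the flank length and the example sequence are hypothetical and are not taken from the disclosure.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp_dna(seq: str) -> str:
    """Reverse complement of a DNA string (N is kept as N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def junction_sequence(goi: str, flank: int = 10) -> str:
    """Expected cDNA sequence across the back-splice junction.

    goi is the sequence to be circularized, written 5' to 3' as it sits
    in the linear precursor. Circularization joins its 3' end to its
    5' end, so the junction is last-flank + first-flank nucleotides.
    """
    dna = goi.upper().replace("U", "T")
    if len(dna) < 2 * flank:
        raise ValueError("sequence shorter than twice the flank length")
    return dna[-flank:] + dna[:flank]


def read_spans_junction(read: str, goi: str, flank: int = 10) -> bool:
    """True if the read (either strand) contains the expected junction."""
    read = read.upper().replace("U", "T")
    junction = junction_sequence(goi, flank)
    return junction in read or junction in revcomp_dna(read)
```

A read taken from the linear precursor alone should return False, because the 3'-to-5' join does not exist there. An exact substring match is used for simplicity; real Sanger reads may need an alignment that tolerates base-calling errors.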
  • the circularization products, including the RNase R-treated samples, were transfected into HEK293 cells with the precursor as a control. Specifically, 50000 cells were seeded per well of a 96-well plate, 100 ng RNA sample was transfected into cells per well using transfection reagent (TransIT, Mirus), and reporter gene expression was detected by flow cytometry 48 h later. The results show that the circularization products and the RNase R-digested products effectively expressed GFP, whereas the RNA construct (precursor) did not (FIG. 9).
  • Ribozyme T can mediate the circularization where the backsplicing site is located in other regions, such as an IRES element
  • the circRNA precursor (SEQ ID NO: 39) was generated and purified through the same processes described in Examples 1.1 and 1.2.
  • the back-splicing site was designed inside the ORF (FIG. 6A) of the nucleotide sequence to be circularized (SEQ ID NO: 50).
  • the preparation of circular RNA was carried out following the procedures described in Example 1.3.
  • the circularization products were digested by RNase R as described in Example 1.4.
  • R1 and R2 were designed to form unpaired structures, such as loops, to generate a circRNA precursor comprising the sequence of SEQ ID NO: 40 (FIG. 40).
  • the circRNA precursor (SEQ ID NO: 40) was generated and purified through the same processes described in Examples 1.1 and 1.2.
  • the back-splicing site was designed inside the ORF (FIG. 6A) of the nucleotide sequence to be circularized (SEQ ID NO: 50).
  • the preparation of circular RNA was carried out following the procedures described in Example 1.3.
  • the circularization products were digested by RNase R as described in Example 1.4.
  • Based on the results obtained in Example 12, it was observed that the absence of a paired structure between R1 and R2 led to significant inhibition of the circularization reaction. To further validate the necessity of paired structure formation at both ends of the precursor, homology arms were reintroduced to generate a circRNA precursor comprising the sequence of SEQ ID NO: 41 (FIG. 43).
  • Examples 12 and 13 have demonstrated that incorporating complementary pairing structures between R1 and R2 is crucial for effective circularization.
  • the base pairs within the P9.0 duplex at the 5' and 3' positions were swapped while still maintaining the complementary design.
  • the resultant circRNA precursor has the sequence of SEQ ID NO: 44 (FIG. 46).
  • the circRNA precursor (based on Anabaena sp. strain PCC 7120, hereafter referred to as “ribozyme A”) was generated and purified through the same processes described in Examples 1.1 and 1.2 (SEQ ID NO: 45) (FIG. 49).
  • the nucleotide sequence to be circularized (SEQ ID NO: 51) comprises a 5’ UTR comprising an IRES from Enterovirus B, an ORF sequence encoding firefly luciferase (Fluc) and a 3’ UTR. The back-splicing site was designed in the 3’ UTR.
  • the target site, having the sequence 'CTT' ('nnu'; corresponding to the upstream exon fragment of the native Anabaena group I intron) and the sequence 'AAAA' (corresponding to the downstream exon fragment of the native Anabaena group I intron), was designed in the 3’ UTR of SEQ ID NO: 51.
  • the GOI was formed by placing the sequence 'AAAA' and its downstream sequence in SEQ ID NO: 51 at the 5’ end and the remaining sequence in SEQ ID NO: 51 at the 3’ end.
  • R1 and R2 were designed to include homology arm sequences, spacers and the sequences for the P9.0 duplex (see FIG. 49).
  • the preparation of circular RNA was carried out following the procedures described in Example 1.3.
  • the circularization products were digested by RNase R as described in Example 1.4.
  • the circRNA precursor (SEQ ID NO: 46) based on ribozyme T was generated and purified through the same processes described in Examples 1.1 and 1.2.
  • the back-splicing site was designed inside the ORF (FIG. 6A) of the nucleotide sequence to be circularized (SEQ ID NO: 50).
  • the preparation of circular RNA was carried out following the procedures described in Example 1.3.
  • the circularization products were digested by RNase R as described in Example 1.4.
  • the circRNA precursor (SEQ ID NO: 47) based on ribozyme T was generated and purified through the same processes described in Examples 1.1 and 1.2.
  • the back-splicing site was designed inside the ORF (FIG. 6A) of the nucleotide sequence to be circularized (SEQ ID NO: 50).
  • the preparation of circular RNA was carried out following the procedures described in Example 1.3.
  • the circularization products were digested by RNase R as described in Example 1.4.
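
The rearrangement described for the ribozyme A construct (moving the downstream exon fragment and everything after it to the 5’ end, and the remaining sequence to the 3’ end) amounts to rotating the target sequence at the back-splicing site. The following Python sketch is only an illustration of that idea under stated assumptions: the function names and the toy sequence are hypothetical and are not taken from the application or from SEQ ID NO: 51, and the sketch assumes the site occurs exactly once in the sequence.

```python
def permute_at_backsplice_site(seq: str, upstream: str, downstream: str) -> str:
    """Rotate `seq` at the boundary between `upstream` and `downstream`.

    The part starting with `downstream` is placed at the 5' end and the
    part ending with `upstream` is placed at the 3' end.
    Hypothetical helper; assumes the site is unique in `seq`.
    """
    site = upstream + downstream
    hits = [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one {site!r} site, found {len(hits)}")
    cut = hits[0] + len(upstream)  # index of the first downstream nucleotide
    return seq[cut:] + seq[:cut]


def backsplice_junction(permuted: str, flank: int) -> str:
    """Sequence spanning the junction formed when the 3' end joins the 5' end."""
    return permuted[-flank:] + permuted[:flank]


# Toy sequence (an assumption, not SEQ ID NO: 51) with one 'CTT' + 'AAAA' site.
toy = "GGCACCATGCTTAAAAGGTCC"
goi = permute_at_backsplice_site(toy, "CTT", "AAAA")
junction = backsplice_junction(goi, 4)

# Joining the ends of the rearranged sequence restores the original site,
# which is what sequencing across the splice junction would be expected to show.
assert "CTTAAAA" in junction
print(goi, junction)
```

A check of this kind could be run on a designed sequence before synthesis to confirm that the expected junction re-forms the chosen target site; it says nothing about circularization efficiency, which the examples assess experimentally.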

Abstract

The invention relates to novel ribozyme RNA constructs encoding foreign proteins or functional RNAs, with a circularization system based on group I introns, which are capable of self-circularizing with high efficiency without introducing foreign fragments, as well as methods of using the constructs to make circular RNAs.
PCT/CN2024/143138 2023-12-29 2024-12-27 RNA circularization Pending WO2025140537A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2023143065 2023-12-29
CNPCT/CN2023/143065 2023-12-29

Publications (1)

Publication Number Publication Date
WO2025140537A1 (fr) 2025-07-03

Family

ID=96216849

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/143138 Pending WO2025140537A1 (fr) 2023-12-29 2024-12-27 RNA circularization

Country Status (1)

Country Link
WO (1) WO2025140537A1 (fr)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100305197A1 (en) * 2009-02-05 2010-12-02 Massachusetts Institute Of Technology Conditionally Active Ribozymes And Uses Thereof
WO2017009376A1 (fr) * 2015-07-13 2017-01-19 Curevac Ag Method of producing RNA from circular DNA and corresponding template DNA
US20200080106A1 (en) * 2018-06-06 2020-03-12 Massachusetts Institute Of Technology Circular rna for translation in eukaryotic cells
WO2021158964A1 (fr) * 2020-02-07 2021-08-12 University Of Rochester Ribozyme-mediated RNA assembly and expression
WO2022191642A1 (fr) * 2021-03-10 2022-09-15 알지노믹스 주식회사 Self-circularized RNA structure
CN115786338A (zh) * 2022-10-10 2023-03-14 瑞思高科生物科技(北京)有限公司 Group I intron ribozyme, preparation method thereof, and use thereof in regulating RNA circularization
CN116286916A (zh) * 2023-02-24 2023-06-23 北京大学人民医院 Method for constructing circular RNA using an improved group I intron ribozyme sequence, and use thereof
CN116376978A (zh) * 2023-03-27 2023-07-04 清华大学 mRNA circularization and translation method based on ribozyme catalysis
WO2024140987A1 (fr) * 2022-12-29 2024-07-04 Suzhou Abogen Biosciences Co., Ltd. RNA circularization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHI SYLVIA IGHEM, DAHL MIKAEL, EMBLEM ÅSE, JOHANSEN STEINAR D.: "Giant group I intron in a mitochondrial genome is removed by RNA back-splicing", BMC MOLECULAR BIOLOGY, BIOMED CENTRAL LTD., GB, vol. 20, no. 1, 1 December 2019 (2019-12-01), GB, XP093185985, ISSN: 1471-2199, DOI: 10.1186/s12867-019-0134-y *

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24911457

Country of ref document: EP

Kind code of ref document: A1