WO2025046039A1 - Methods for making circular RNAs - Google Patents
- Publication number
- WO2025046039A1 (application PCT/EP2024/074233)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- ribozyme
- sequence
- interest
- gene
- modified
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/64—General methods for preparing the vector, for introducing it into the cell or for selecting the vector-containing host
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/12—Type of nucleic acid catalytic nucleic acids, e.g. ribozymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/50—Physical structure
- C12N2310/53—Physical structure partially self-complementary or closed
- C12N2310/532—Closed or circular
Definitions
- the invention relates to methods for making circular RNAs.
- the invention also relates to modified ribozymes for making circular RNAs, nucleic acid molecules for making circular RNAs, and kits for making circular RNAs.
- Ribosomes are molecular machines that translate the genetic code carried by messenger RNAs (mRNA) into functional proteins. When exogenous mRNA is introduced into cells, the gene of interest (GOI) encoded by that mRNA can be expressed, forming the basis of mRNA therapeutics.
- mRNA therapeutics did not receive significant investment due to challenges such as mRNA instability, high innate immunogenicity, and inefficient in vivo delivery (Pardi, N. et al. (2016) Nat Rev Drug Discov 17, 261-279).
- Recent advances in mRNA delivery and nucleotide modification techniques have led to the successful application of mRNA therapeutics, particularly in the field of vaccines.
- other types of mRNA therapeutics, such as protein replacement therapy, still face challenges due to mRNA instability.
- Circular RNAs have emerged as a promising solution to address the instability problem of mRNAs in therapeutic applications. Due to their covalently closed structure, circRNAs are inherently resistant to exonucleases, making them more stable compared to linear mRNAs (Jeck, W. R. & Sharpless, N. E. (2014) Nat Biotechnol 32, 453-461). CircRNAs are widely present in human cells and can serve as microRNA and protein sponges, as well as templates for peptide or protein production (Liu, C. X. & Chen, L. L. (2022) Cell 185, 2016-2034).
- Although circRNAs are primarily generated through back splicing within cells, they can also be synthesized in vitro using chemical or enzymatic ligation methods or through group I intron-based circularization methods (Petkovic, S. & Muller, S. (2015) Nucleic Acids Res 43, 2454-2465).
- PIE permuted intron-exon method
- TRIC trans ribozyme-based circularization
- Figure 1A-C Puttaraju, M. & Been, M. D. (1992) Nucleic Acids Res 20, 5357-5364; Wesselhoeft, R. A., Kowalski, P. S.
- the invention provides a method for producing a circular gene of interest, the method comprising: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
- the invention provides a circular RNA obtained by the methods described herein.
- the invention provides a modified ribozyme for use in a method of circularizing an RNA molecule, the modified ribozyme comprising, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
- EGS extended guide sequence
- the invention provides a nucleic acid molecule containing a gene of interest for use in a method of circularizing the gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
- the invention provides a kit for circularizing a gene of interest, the kit comprising a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, and b) the gene of interest.
- the modified ribozyme may be a modified ribozyme of the third aspect.
- the nucleic acid molecule may be a nucleic acid molecule of the fourth aspect.
- the invention provides a DNA molecule encoding a modified ribozyme as defined herein.
- the invention provides a DNA molecule encoding a nucleic acid molecule containing a gene of interest as defined herein.
- the invention provides a kit for circularizing a gene of interest, the kit comprising a DNA molecule of the sixth aspect and a DNA molecule of the seventh aspect.
- Figure 1 Comparison of prior art approaches and trans excision ribozyme-based circularisation (TERIC).
- A Schematic depicting splicing by the group I intron from cyanobacterium Anabaena (Ana), resulting in a linear molecule comprising joined exons.
- B Schematic depicting circularization of a gene of interest by the permuted intron-exon (PIE) method.
- C Schematic depicting circularization of a gene of interest using the TRIC approach.
- D Schematic depicting splicing by a trans excision ribozyme (TER).
- E Schematic depicting circularization of a gene of interest using the TERIC approach.
- Figure 2 (A) Secondary structure of the Ana group I intron. Lower case characters are exon sequences. TER V1 and TER V2 labels indicate points from which a 3’ portion of the intron is removed. (B) Details for designing of trans excision ribozymes (TERs) and corresponding precursor of GOIs (pGs) using the cyanobacterium Anabaena tRNALeu group I intron as an example.
- TERs trans excision ribozymes
- pGs corresponding precursor of GOIs
- FIG. 3 (A) Splicing reactions of TERIC V1 and V2 were loaded onto a 0.8% native agarose gel. TRIC V2-CVB3-EGFP was loaded as a positive control. (B) pG, TER and the circularized sample of TERIC V2 were loaded onto a 6 M urea-1.5% agarose gel. Circular CVB3-EGFP was present in the circularized TERIC sample. (C) RT-PCR of the pG and circularized sample of TERIC V2. The primers used are indicated in Figure 2B. (D-E) Sequences of the TERs and pGs of TERIC V1 (SEQ ID NOs: 5 and 6) and V2 (SEQ ID NOs: 7 and 8).
- FIG. 4 Optimization of (A) TER to pG ratio, (B) concentration of pGs, and (C) duration of circularization. A 0.8% native agarose gel was used. In (B) and (C), the TERs have run off the gel. TERIC V2 generally shows better circularization efficiency than TERIC V1, although both achieve circularization. The most efficient circularization protocol identified for TERIC V2 was 200-400 nM pG and 20-40 min of circularization in the presence of 2 mM GTP and a 4:1 ratio of TER to pG.
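- The conditions reported for FIG. 4 can be turned into a small reaction-planning helper. The sketch below is illustrative only: the function names are hypothetical, and it assumes the TER to pG ratio is a molar ratio, which the text does not state explicitly.

```python
def ter_concentration_nm(pg_nm: float, ter_to_pg_ratio: float = 4.0) -> float:
    """Return the TER concentration (nM) for a pG concentration and a TER:pG ratio.

    Assumption: the ratio is molar. The default of 4.0 follows the 4:1 ratio
    reported for TERIC V2.
    """
    if pg_nm <= 0 or ter_to_pg_ratio <= 0:
        raise ValueError("concentration and ratio must be positive")
    return pg_nm * ter_to_pg_ratio


def in_reported_window(pg_nm: float, minutes: float) -> bool:
    """Check a setup against the reported 200-400 nM pG and 20-40 min window."""
    return 200 <= pg_nm <= 400 and 20 <= minutes <= 40


if __name__ == "__main__":
    pg = 300.0  # nM, hypothetical choice inside the reported range
    print(ter_concentration_nm(pg), in_reported_window(pg, 30))
```

- The window check only encodes the ranges quoted in the figure legend; it says nothing about other constructs or ribozymes.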
- FIG. 5 (A) Circularization efficiencies between pGs with the eACA elements or the native tRNA sequences are compared. The results show the eACA is important for TERIC to work. (B) Sequence of the TERIC-pG-tRNA (SEQ ID NO: 9).
- FIG. 6 (A) Modified and unmodified precursors of PIE, TRIC, and TERIC were subjected to circularization and loaded onto a 0.8% native agarose gel. No modified circular RNA can be seen for any of these three constructs. (B) The IGSs of TRIC and TERIC were mutated from UUGAG to CCGCC, and circZnf609 was selected as the target sequence. Mutation of the IGS to CCGCC restored ribozyme activity for TERIC, as indicated by the presence of circular modified RNAs. (C) Sequences of the TER (1,226)-IGS CCGCC (SEQ ID NO: 10) and pG_circZnf609 (SEQ ID NO: 11).
- Figure 7 Schematic overview of the TERIC approach.
- Figure 8 The extended anticodon arm (eACA). As long as a stem-loop structure can be found in which the loop contains a uracil in the 3rd position and the stem is >5 bp, a circularization site can be assembled. Thus, circularization sites can be placed in either the UTR or the CDS.
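- The search criterion in Figure 8 (a loop with uracil in the third position, flanked by a stem of more than 5 bp) can be sketched as a simple scan. This is a minimal illustration, not a structure-prediction tool: the function name is hypothetical, the 7-nt default loop length is taken from the Ana anticodon loop, and only perfectly Watson-Crick-paired stems directly adjacent to the loop are counted.

```python
# Watson-Crick pairs only; wobble pairs in the stem are ignored in this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_eaca_candidates(seq: str, loop_len: int = 7, min_stem: int = 6):
    """Return (hairpin_start, stem_length, loop) for each candidate stem-loop.

    A candidate has U at the 3rd loop position and at least `min_stem`
    consecutive base pairs flanking the loop (>5 bp means min_stem = 6).
    """
    if loop_len < 3:
        raise ValueError("loop must be at least 3 nt to have a 3rd position")
    rna = seq.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    for loop_start in range(min_stem, len(rna) - loop_len - min_stem + 1):
        loop = rna[loop_start:loop_start + loop_len]
        if loop[2] != "U":
            continue
        # Extend the stem outward from the loop, one base pair at a time.
        stem, left, right = 0, loop_start - 1, loop_start + loop_len
        while left >= 0 and right < len(rna) and (rna[left], rna[right]) in PAIRS:
            stem += 1
            left -= 1
            right += 1
        if stem >= min_stem:
            hits.append((loop_start - stem, stem, loop))
    return hits


if __name__ == "__main__":
    print(find_eaca_candidates("GATCACCACTTTAAGGTGATC"))
```

- A real search would normally use a folding tool to account for mismatches, bulges and competing structures; this scan only flags the simplest perfect hairpins.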
- eACA extended anticodon arm
- Figure 9 Circularization of pGs using either 3’-truncated TER or 5’- and 3’-truncated TER.
- the 3’ bridge sequence in pGs is either identical to the truncated portion of the ribozyme or contains mutations. Circularization was performed at a 4:1 TER/pG ratio at 55 °C for 20 minutes and then analyzed on a 0.8% native agarose gel.
- the invention provides a new method for circularizing genes of interest to produce circular RNAs.
- the invention is based at least in part on the discovery that portions of a ribozyme can be relocated to a gene of interest to allow the ribozyme to interact with the gene of interest, and facilitate splicing in trans. This results in the circularization of the gene of interest, while the ribozyme itself can be reused.
- the invention enables the circularisation of genes of interest which contain modified nucleotides.
- a truncated ribozyme may be designed to lack a 3’ portion of its sequence.
- a bridge sequence can then be designed which corresponds to this missing portion of the ribozyme.
- the bridge sequence allows the truncated ribozyme to interact with the gene of interest and restore its tertiary structures, permitting splicing and circularisation of the gene of interest.
- the methods comprise providing a nucleic acid molecule comprising a bridge sequence and the gene of interest, and a modified ribozyme.
- the bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme.
- the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
- the methods described herein further comprise combining the nucleic acid molecule and modified ribozyme under conditions suitable for circularization to occur, in order to produce a circular gene of interest.
- Ribozymes, such as group I introns, are capable of folding to form tertiary structures consisting of paired segments, termed P1-P10 (as shown in Figure 2A).
- the methods of the invention exploit these self-interactions by relocating parts of the ribozyme sequence to a gene of interest.
- the truncated ribozyme, which might lack a 3’ portion of its sequence, or a 3’ portion and a 5’ portion, is then able to interact with the relocated parts of the ribozyme sequence on the gene of interest (the bridge sequence(s)) to restore its tertiary structure.
- the methods use modified ribozymes, which generally comprise a truncated ribozyme.
- By “truncated ribozyme” it is generally meant a ribozyme in which a 3’ portion of the corresponding wild-type ribozyme sequence has been removed.
- a truncated ribozyme may also refer to a ribozyme in which a 3’ portion and a 5’ portion of the corresponding wild-type ribozyme sequence have been removed.
- the ribozyme can be any ribozyme capable of acting as a trans excision ribozyme.
- the ribozyme is derived from or is a group I intron.
- suitable ribozymes are the Tetrahymena ribosomal intron, T4 phage thymidylate synthase intron, Anabaena (Ana) pre-tRNA intron, Azoarcus sp. BH72 Ile tRNA intron, and Staphylococcus phage Twort ribonucleotide reductase intron. Sequences of these ribozymes are shown below, with the internal guide sequence (IGS) shown in underlined, shaded letters.
- IGS internal guide sequence
- Staphylococcus phage Twort ribonucleotide reductase intron:
- Truncated ribozymes generally do not comprise a 3’ portion of a corresponding wild-type ribozyme.
- a 3’ portion corresponding to nucleotides 227-249 may be removed.
- the 3’ portion can also be shorter than this, for example corresponding to nucleotides 242-249.
- By “a 3’ portion” it is generally meant a 3’ end portion, such that the remaining ribozyme is essentially truncated at the 3’ end.
- the 3’ portion which is removed is generally between 1 and 1500 nucleotides in length.
- the truncated ribozyme may not comprise a 3’ portion of a corresponding wild-type ribozyme which is between 1 and 1250, 1 and 1000, 1 and 750, 1 and 500, 1 and 400, 1 and 300, 1 and 250, 1 and 200, 1 and 150, 1 and 100, 1 and 50, 5 and 1500, 5 and 1250, 5 and 1000, 5 and 750, 5 and 500, 5 and 400, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50, 10 and 1500, 10 and 1250, 10 and 1000, 10 and 750, 10 and 500, 10 and 400, 10 and 300, 10 and 250, 10 and 200, 10 and 150, 10 and 100, 10 and 50, or 10 and 30 nucleotides in length.
- the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme which is between 1 and 30 nucleotides in length.
- the 3’ portion which is missing or removed may be 23 nucleotides in length.
- the 3’ portion which is removed may be 8 nucleotides in length.
- the truncated ribozyme may not comprise a 3’ portion which is between 1 and 249 nucleotides in length.
- the 3’ portion may variously be referred to as a portion which is “removed” from the ribozyme.
- the method does not necessarily include a step of removing this portion (or a 5’ portion) from the ribozyme, since the ribozyme can be reused and therefore provided in an already truncated form.
- the 3’ portion which is missing from the truncated ribozyme may generally be one that is involved in the formation of a paired segment in a wild-type ribozyme.
- the 3’ portion may be one that forms, or is involved in the formation of, a P9 region in a wild-type ribozyme.
- the ribozyme is an Ana group I intron ribozyme.
- the ribozyme may comprise nucleotides 1-226 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- the ribozyme may comprise the nucleotide sequence of SEQ ID NO: 12 or a variant thereof.
- the ribozyme may comprise nucleotides 1-241 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- the ribozyme may comprise the nucleotide sequence of SEQ ID NO: 13 or a variant thereof.
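- The relationship between the truncated ribozyme and the removed 3’ portion can be expressed as a split at a 1-based nucleotide coordinate. The sketch below is illustrative only: `split_ribozyme` is a hypothetical helper, and because the sequence of SEQ ID NO: 1 is not reproduced here, a 249-nt placeholder of N characters stands in for the wild-type Ana ribozyme (the length is inferred from the coordinates 227-249 and 242-249 given above).

```python
from typing import Tuple


def split_ribozyme(wild_type: str, first_removed_nt: int) -> Tuple[str, str]:
    """Split a ribozyme into (truncated ribozyme, removed 3' portion).

    `first_removed_nt` is the 1-based position of the first nucleotide of the
    3' portion, e.g. 227 to remove nucleotides 227-249 of a 249-nt sequence.
    """
    if not 1 < first_removed_nt <= len(wild_type):
        raise ValueError("cut position must leave both parts non-empty")
    cut = first_removed_nt - 1  # convert to a 0-based slice index
    return wild_type[:cut], wild_type[cut:]


# Placeholder only: the real sequence is SEQ ID NO: 1, not shown here.
placeholder_ana = "N" * 249

ter_226, removed_23 = split_ribozyme(placeholder_ana, 227)  # keeps nt 1-226
ter_241, removed_8 = split_ribozyme(placeholder_ana, 242)   # keeps nt 1-241

if __name__ == "__main__":
    print(len(ter_226), len(removed_23), len(ter_241), len(removed_8))
```

- The removed portions of 23 and 8 nucleotides match the lengths stated in the text for the 3’ portion.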
- the truncated ribozymes described herein may also not comprise a 5’ portion of a corresponding wild-type ribozyme.
- By “a 5’ portion” it is generally meant a 5’ end portion, such that the remaining ribozyme is essentially truncated at the 5’ end.
- the 5’ portion which is removed is generally between 1 and 1500 nucleotides in length.
- the truncated ribozyme may not comprise a 5’ portion of a corresponding wild-type ribozyme which is between 1 and 1250, 1 and 1000, 1 and 750, 1 and 500, 1 and 400, 1 and 300, 1 and 250, 1 and 200, 1 and 150, 1 and 100, 1 and 50, 5 and 1500, 5 and 1250, 5 and 1000, 5 and 750, 5 and 500, 5 and 400, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50, 10 and 1500, 10 and 1250, 10 and 1000, 10 and 750, 10 and 500, 10 and 400, 10 and 300, 10 and 250, 10 and 200, 10 and 150, 10 and 100, 10 and 50, or 10 and 30 nucleotides in length.
- the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme which is between 1 and 30 nucleotides in length.
- the 5’ portion may variously be referred to as a portion which is “removed” from the ribozyme.
- the method does not necessarily include a step of removing this portion from the ribozyme, since the ribozyme can be reused and therefore provided in an already truncated form.
- the 5’ portion that is missing from the truncated ribozyme may generally be one that forms an internal guide sequence (IGS).
- the IGS forms P1 and P10 regions by complementary base pairing. This is shown in Figure 2A.
- the ribozyme used in the approaches described herein should generally match the ribozyme from which the bridge sequence on the gene of interest is derived.
- the bridge sequence added to the gene of interest should also be derived from the Ana group I intron, although this is not absolutely necessary as long as the bridge sequence enables interaction with the truncated ribozyme.
- the sequence corresponding to a 3’ portion of a ribozyme of the first bridge sequence may be different to the sequence of the 3’ portion of a corresponding wild-type ribozyme that is missing or absent from the truncated ribozyme.
- the sequence of the first bridge sequence may differ from the missing 3’ portion of the corresponding wild-type ribozyme by insertion, addition, substitution or deletion of 1 nucleotide, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more than 10 nucleotides.
- the sequence of the first bridge sequence may comprise a nucleotide sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% sequence identity to the missing 3’ portion.
- Sequence comparison may be made over the full length of the relevant sequence described herein using a standard algorithm, such as GAP, BLAST (which uses the method of Altschul et al. (1990) J. Mol. Biol. 215: 405-410), FASTA (which uses the method of Pearson and Lipman (1988) PNAS USA 85: 2444-2448), or the Smith-Waterman algorithm (Smith and Waterman (1981) J. Mol. Biol. 147: 195-197), or the TBLASTN program of Altschul et al. (1990) supra, generally employing default parameters.
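- For short bridge sequences of equal length, percent identity can be illustrated with a position-by-position comparison. This is a simplified sketch, not a substitute for the alignment algorithms named above: it assumes the two sequences are already aligned without gaps, and the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical 8-nt bridge differing from its reference at one position.
    print(percent_identity("ACGUACGU", "ACGUACGA"))
```

- Sequences that differ by insertions or deletions need a gapped alignment (for example, GAP or Smith-Waterman) before identity is counted.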
- the length of the bridge sequence generally corresponds to the length of the 3’ portion (or 5’ portion) which is removed from the ribozyme, although this is not a requirement. Differences in length between the missing 3’ portion (and 5’ portion) and the bridge sequence are well tolerated.
- the missing 3’ portion (and 5’ portion) may be 1, 2, 3, 4, 5, or 10 or more nucleotides longer or shorter than the bridge sequence.
- a bridge sequence may comprise a sequence corresponding to a 3’ portion of a ribozyme, such as a 3’ end portion of a ribozyme. This may be referred to as a 3’ bridge sequence, or a first bridge sequence.
- a bridge sequence may also comprise a sequence corresponding to a 5’ portion of a ribozyme, such as a 5’ end portion of a ribozyme. This may be referred to as a 5’ bridge sequence, or a second bridge sequence.
- the first bridge sequence may be between 5 and 1500, 5 and 1000, 5 and 750, 5 and 500, 5 and 450, 5 and 400, 5 and 350, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, or 5 and 50 nucleotides in length. In many cases, the first bridge sequence may be between 5 and 30 nucleotides in length, for example when an Ana group I intron is used. When an Ana group I intron is used, the first bridge sequence may also be between 2 and 248 nucleotides in length. The first bridge sequence may be 8 nucleotides in length. The first bridge sequence may be 23 nucleotides in length.
- the function of the bridge sequence is to facilitate interaction with a modified ribozyme as described herein.
- the first bridge sequence generally comprises a portion which is capable of complementary base pairing with a 3’ portion of a ribozyme, such as a portion between 1 and 10 nucleotides in length.
- the first bridge sequence may be capable of forming a paired segment with a modified ribozyme as described herein, or with a 3’ portion of a ribozyme.
- the first bridge sequence may be capable of forming a P9 region with a 3’ portion of a ribozyme.
- the first bridge sequence may comprise a portion which is capable of complementary base pairing with nucleotides 207-212 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- the precise nucleotide sequence of the first bridge sequence is not critical to the invention, as long as the nucleotide sequence of the first bridge sequence enables interaction with the modified ribozyme.
- the sequence corresponding to the 3’ portion of a ribozyme of the first bridge sequence may be identical or different to the sequence of the 3’ portion of a corresponding wild-type ribozyme that is missing or absent from the truncated ribozyme.
- the first bridge sequence may comprise a sequence corresponding to nucleotide residues 242-249 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- the first bridge sequence may comprise the nucleotide sequence set forth in SEQ ID NO: 14 or a variant thereof.
- the first bridge sequence may comprise a sequence corresponding to nucleotide residues 227-249 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- the first bridge sequence may comprise the nucleotide sequence set forth in SEQ ID NO: 15 or a variant thereof.
- the bridge sequence is joined (either directly or indirectly) to the 3’ end of a gene of interest.
- the length of the second bridge sequence may vary depending on the ribozyme from which it is derived.
- the second bridge sequence is generally at least two nucleotides in length.
- the second bridge sequence is generally less than 1500 nucleotides in length.
- the second bridge sequence may be between 2 and 1000, 2 and 750, 2 and 500, 2 and 450, 2 and 400, 2 and 350, 2 and 300, 2 and 250, 2 and 200, 2 and 150, 2 and 100, 2 and 50, or 2 and 30 nucleotides in length.
- the second bridge sequence may be between 5 and 1500, 5 and 1000, 5 and 750, 5 and 500, 5 and 450, 5 and 400, 5 and 350, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, or 5 and 50 nucleotides in length. In many cases, the second bridge sequence may be between 5 and 30 nucleotides in length.
- the second bridge sequence may comprise a portion that is capable of complementary base pairing with a 5’ portion of a modified ribozyme.
- Although bridge sequences are described as “corresponding to” equivalent portions of ribozyme sequences, they do not need to be identical to those sequences. Any bridge sequence may be used that allows interaction with the modified ribozyme such that circularisation can occur.
- a bridge sequence that “corresponds to” a portion of a ribozyme sequence may have at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with a portion of a ribozyme sequence as defined herein.
- the wild-type Ana ribozyme is modified to remove a 3’ portion that is 23 nucleotides in length, and corresponds to residues 227-249 of SEQ ID NO: 1 or SEQ ID NO: 49 (wild-type Ana) or a variant thereof.
- This modified ribozyme then comprises residues 1-226 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof.
- a corresponding first bridge sequence, which comprises a sequence corresponding to residues 227-249 of SEQ ID NO: 1, is then added to a 5’ end of a gene of interest.
- the first bridge sequence can interact with (i.e. complementary base pair with) the remaining portion of the ribozyme, for example forming the P9 region shown in Figure 2A.
- the IGS of the modified ribozyme can form P1 and P10 regions with the gene of interest.
- the gene of interest refers to the sequence which is to be circularized.
- the GOI can comprise a coding sequence, coding for a peptide or protein, or can be a noncoding sequence.
- the GOI can also comprise a combination of coding and noncoding sequence.
- the term “gene of interest” encompasses sequences which include additional sequence elements, such as internal ribosome entry site (IRES) sequences, multiple siRNA target sites (msiTS), spacer sequences such as polyAC sequences, start codons, stop codons, and any other sequence elements known to be useful in the art for producing circular RNA.
- IRES internal ribosome entry site
- msiTS multiple siRNA target sites
- Suitable IRESs for use in the invention include CVB3, CroV, CSFV, the DNA sequences of which are set out below.
- the TERIC method is suitable for genes of interest of any length. TERIC is particularly suitable for long genes of interest.
- “long” is generally considered to mean a sequence of at least 500 nucleotides.
- the gene of interest may be at least 100, at least 250, at least 500, at least 1000, at least 2000, at least 3000, at least 4000, at least 5000, at least 6000, at least 7000, or at least 8000 nucleotides in length.
- the gene of interest may also comprise an extended anticodon arm (eACA) sequence.
- the eACA may be part of the gene of interest sequence (for example, it may be naturally occurring within the gene of interest), but for clarity is referred to separately herein.
- An extended anticodon arm (eACA) sequence is one which is capable of forming a stem-loop structure (also known as a hairpin or hairpin loop). Stem-loop structures form when two regions of single-stranded RNA which are generally complementary to each other (when read in opposite directions) base-pair with each other. The base-pairing results in a double helix structure ending in an unpaired loop.
- the natural propensity of eACA sequences to form stem-loop structures can be utilised to enable and enhance circularisation of a gene of interest.
- Prior to circularization, a nucleic acid molecule to be circularized comprises an eACA sequence in two separate portions. A first portion of the eACA sequence is positioned at or near the 5’ end of the gene of interest, and a second portion of the eACA sequence is positioned at or near the 3’ end of the gene of interest, as shown in Figure 2B and Figure 7.
- splicing by the ribozyme causes the first and second portions of the eACA sequence to be covalently joined, in order to create a circular version of the gene of interest.
- the first and second portions are joined to form the eACA sequence, which is generally capable of forming a stem-loop structure as shown in Figure 2B and Figure 7.
- the first portion of the eACA sequence may comprise a first eACA stem portion and a first eACA loop portion.
- the second portion of the eACA sequence may comprise a second eACA stem portion and a second eACA loop portion.
- By “stem portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the stem of a stem-loop structure.
- By “loop portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the loop of a stem-loop structure.
- a stem-loop forming structure can be identified in a gene of interest, and that gene of interest can be rearranged to provide a first part of the stem-loop forming structure at one end, and the second part at the other end. This gene of interest can subsequently be used for circularization.
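- The rearrangement described above amounts to a circular permutation of the gene of interest at a point inside the loop. The sketch below is a minimal illustration under stated assumptions: the function name and the flanking sequences are hypothetical, the hairpin position is taken as already known, and the split leaves 3 loop nucleotides on the 3’ (second) portion, following the 4 + 3 division of a 7-nt loop described herein.

```python
def permute_at_loop(goi: str, stem_start: int, stem_len: int,
                    second_loop_len: int = 3) -> str:
    """Circularly permute `goi` so the hairpin is split across its two ends.

    `stem_start` is the 0-based index of the first stem nucleotide and
    `stem_len` the length of one stem strand. The cut falls after the first
    `second_loop_len` loop nucleotides, so the 5' stem strand plus those loop
    nucleotides end up at the 3' end of the returned precursor.
    """
    split = stem_start + stem_len + second_loop_len
    if not 0 < split < len(goi):
        raise ValueError("split point must lie inside the sequence")
    return goi[split:] + goi[:split]


if __name__ == "__main__":
    # Hypothetical GOI: arbitrary flanks around the hairpin of SEQ ID NO: 19.
    toy_goi = "AAAA" + "GATCACCACTTTAAGGTGATC" + "CCCC"
    print(permute_at_loop(toy_goi, stem_start=4, stem_len=7))
```

- Joining the 3’ end of the permuted sequence back to its 5’ end, as circularization does, restores the original hairpin.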
- the stem and loop portions of the eACA sequence are capable of forming a stem-loop structure in the nucleic acid molecules described herein.
- the eACA can be of any nucleotide sequence.
- the last nucleotide in the second eACA loop portion is one which can form a wobble base pair with a corresponding nucleotide in the internal guide sequence described herein.
- the last nucleotide in the second eACA loop portion is a uracil, and forms a wobble base pair with a corresponding guanine in the internal guide sequence.
- the last nucleotide in the second eACA loop portion may also be cytosine and form a wobble base pair with adenosine.
- the last nucleotide in the second eACA loop portion is one which can form a canonical base pair with a corresponding nucleotide in the internal guide sequence or one which does not form a wobble or canonical base pair with a corresponding nucleotide in the internal guide sequence.
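- The three pairing options for the last nucleotide of the second eACA loop portion can be summarised in a small classifier. This is an illustrative sketch: the function name is hypothetical, and the wobble set contains G-U together with the C-A pairing mentioned above, in either orientation.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# Wobble pairs as described in the text: U with G, and C with A.
WOBBLE = {("U", "G"), ("G", "U"), ("C", "A"), ("A", "C")}


def classify_pair(loop_nt: str, igs_nt: str) -> str:
    """Classify the pairing between a loop nucleotide and an IGS nucleotide."""
    pair = (loop_nt.upper().replace("T", "U"), igs_nt.upper().replace("T", "U"))
    if pair in CANONICAL:
        return "canonical"
    if pair in WOBBLE:
        return "wobble"
    return "none"


if __name__ == "__main__":
    for loop_nt, igs_nt in [("U", "G"), ("C", "A"), ("U", "A"), ("U", "U")]:
        print(loop_nt, igs_nt, classify_pair(loop_nt, igs_nt))
```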
- the first and second eACA stem portions may be complementary to each other, though this is not strictly necessary, and some non-complementarity may be tolerated.
- the first and second eACA stem portions are generally each at least 5 nucleotides in length but can be as short as 1 nucleotide in length each.
- the stem portion lengths can be adapted depending on the gene of interest to be circularized. For example, longer stem lengths (such as lengths greater than 15 nt) may be advantageous if circularizing long (>500 nt) genes of interest.
- first and second stem portions may each be at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, at least 25 or at least 30 nucleotides in length.
- first and second stem portions may each be at least 15 or at least 25 nucleotides in length.
- the first and second stem portions need not be the same length, for example one stem portion may be one or two nucleotides shorter than the other, provided a stem-loop structure can still be formed.
- the anticodon arm loop of the Ana tRNA group I intron is naturally 7 nucleotides in length. Consequently, in the circular RNAs described herein, the loop of the stem-loop structure may be 7 nucleotides in length, particularly when the ribozyme used is, or is derived from, the Ana group I intron. If other group I introns are used, the loop of the stem-loop structure may have a different nucleotide length.
- the loop of the stem-loop structure is generally between 3 and 10 nucleotides in length. In many cases, the loop of the stem-loop structure is at least 5 nucleotides in length, particularly if the ribozyme is or is derived from the Ana group I intron.
- the first eACA loop portion comprises 4 nucleotides
- the second eACA loop portion comprises 3 nucleotides.
- the first and second eACA stem portions may each be at least 15 nucleotides in length
- the first eACA loop portion may be 4 nucleotides in length
- the second eACA loop portion may be 3 nucleotides in length.
- the first portion of the eACA sequence may comprise as few as 5 nucleotides (for example, 1 stem nucleotide and 4 loop nucleotides).
- the second portion of the eACA sequence may comprise as few as 4 nucleotides (for example, 1 stem nucleotide and 3 loop nucleotides).
- the first portion of the eACA sequence may comprise, for example, 19 nucleotides (e.g. 15 stem nucleotides and 4 loop nucleotides), or 29 nucleotides (e.g. 25 stem nucleotides and 4 loop nucleotides).
- the second portion of the eACA sequence may comprise, for example, 18 nucleotides (e.g. 15 stem nucleotides and 3 loop nucleotides), or 28 nucleotides (e.g. 25 stem nucleotides and 3 loop nucleotides).
- An exemplary first portion of the eACA sequence comprising a 1 nucleotide stem and a 4 nucleotide loop may comprise the nucleotide sequence 5’-NNNNN-3’, wherein N is any nucleotide.
- an exemplary second portion of the eACA sequence comprising a 1 nucleotide stem and a 3 nucleotide loop may comprise the nucleotide sequence 5’-NNNU-3’, wherein N is any nucleotide.
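The portion lengths listed above follow from simple arithmetic: each portion is its stem length plus its share of the loop (4 nucleotides for the first portion, 3 for the second). A minimal Python sketch, assuming both stem portions have the same length; the function name `portion_lengths` is illustrative and does not come from the source:

```python
FIRST_LOOP_NT = 4   # loop nucleotides carried by the first eACA portion
SECOND_LOOP_NT = 3  # loop nucleotides carried by the second eACA portion


def portion_lengths(stem_nt: int) -> tuple[int, int]:
    """Return (first_portion_length, second_portion_length) for one stem length."""
    if stem_nt < 1:
        raise ValueError("each stem portion needs at least 1 nucleotide")
    return stem_nt + FIRST_LOOP_NT, stem_nt + SECOND_LOOP_NT


# Stem lengths mentioned in the text: 1, 15 and 25 nucleotides.
for stem in (1, 15, 25):
    print(stem, portion_lengths(stem))
```

Unequal stem lengths, which the text also permits, would need two separate stem arguments.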
- a possible ACA sequence is GATCACCACTTTAAGGTGATC (SEQ ID NO: 19).
- the NLuc GOI is rearranged such that the ACA sequence is provided in two portions (a 5’ first portion and a 3’ second portion).
- the first portion of the eACA sequence comprises the sequence TTAAGGTGATC (SEQ ID NO: 20).
- the second portion of the eACA sequence comprises the sequence GATCACCACT (SEQ ID NO: 21).
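The relationship between SEQ ID NOs: 19 to 21 can be checked directly. The sketch below splits the quoted ACA sequence into the two quoted portions and confirms that they rejoin to the original stem-loop. The 7-nucleotide stem and the 3 + 4 division of the loop are inferred here from the quoted sequences rather than stated in the text, and the helper names are illustrative:

```python
# Sequences as quoted in the text (DNA alphabet).
ACA = "GATCACCACTTTAAGGTGATC"    # SEQ ID NO: 19
FIRST_PORTION = "TTAAGGTGATC"    # SEQ ID NO: 20, placed 5' of the gene of interest
SECOND_PORTION = "GATCACCACT"    # SEQ ID NO: 21, placed 3' of the gene of interest

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_stem_loop(seq: str, stem_len: int, loop_5p_len: int) -> tuple[str, str]:
    """Cut a stem-loop inside its loop and return (first_portion, second_portion).

    The second portion is the 5' stem plus the first `loop_5p_len` loop
    nucleotides; the first portion is the rest of the loop plus the 3' stem.
    """
    cut = stem_len + loop_5p_len
    return seq[cut:], seq[:cut]


# Assumed geometry: 7 nt stem, loop split 3 nt / 4 nt.
first, second = split_stem_loop(ACA, stem_len=7, loop_5p_len=3)
rejoined = second + first                                 # junction formed on circularization
stems_pair = reverse_complement(ACA[:7]) == ACA[-7:]      # do the two stem halves pair?
print(first, second, rejoined == ACA, stems_pair)
```

Under this reading, the second portion ends in T (U in RNA), which is consistent with the GU wobble pair described for the circularization site.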
- the first and second eACA loop portions base pair with the internal guide sequence (IGS) to form the P1 and P10 regions, which are critical for ribozyme activity.
- the first eACA loop portion positioned towards the 5’ end of the gene of interest, base pairs with the IGS to form the P10 region. It is not necessary for all the nucleotides in the first eACA loop portion to form the P10 region, and in some cases only two nucleotides of the first eACA loop portion form the P10 region.
- the second eACA loop portion positioned towards the 3’ end of the gene of interest, base pairs with the IGS to form the P1 region.
- the last nucleotide of the second eACA loop portion (i.e. the nucleotide at the 3’ end of the second eACA loop portion) may form a wobble base pair with a corresponding nucleotide in the IGS.
- the wobble base pair is a GU wobble base pair, with G in the IGS and U in the second eACA loop portion.
- the wobble base pair provides the circularization site, such that once circularized, the nucleotide at the 3’ end of the second eACA loop portion forms the third nucleotide in the loop of the eACA stem-loop structure. This is depicted in Figure 8.
- the last nucleotide of the second eACA loop portion may form a canonical base pair with a corresponding nucleotide in the internal guide sequence. In other embodiments, the last nucleotide of the second eACA loop portion may not form a wobble or canonical base pair with a corresponding nucleotide in the internal guide sequence.
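The three cases described for the last nucleotide of the second eACA loop portion (wobble pair, canonical pair, or no pair) can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and placing adenosine on the IGS side of the cytosine wobble pair is an assumption, since the text names the pair without stating which strand carries which nucleotide.

```python
# Pairs are written as (IGS nucleotide, loop nucleotide), RNA alphabet.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# Wobble pairs named in the text: G (IGS) with U (loop), and adenosine with
# cytosine (adenosine is ASSUMED here to sit in the IGS).
WOBBLE = {("G", "U"), ("A", "C")}


def classify_pair(igs_nt: str, loop_nt: str) -> str:
    """Classify the pairing between an IGS nucleotide and a loop nucleotide."""
    pair = (igs_nt.upper(), loop_nt.upper())
    if pair in CANONICAL:
        return "canonical"
    if pair in WOBBLE:
        return "wobble"
    return "none"


print(classify_pair("G", "U"))  # the GU wobble pair at the circularization site
```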
- the P1 region may also be formed by base pairing of the IGS with a region adjacent to the second eACA loop portion in the 3’ direction, known as the “P1 extension”. If present, the P1 extension typically comprises between 2 and 4 nucleotides, which base pair with the IGS. The P1 region may therefore be formed by the P1 extension and second eACA loop portion base pairing with the IGS, as shown in Figure 2B and Figure 7. Consequently, in some embodiments, the second portion of the eACA sequence and the P1 extension together are capable of forming a P1 region. If the P1 extension is not present, the P1 region is formed by only the second eACA loop portion base pairing with the IGS. P1 extensions have been described in Olson & Muller (2012) RNA 18:581-589, the contents of which are incorporated herein by reference. Generally, if an extended guide sequence (EGS) is used, the P1 extension region will be present.
- EGS extended guide sequence
- a particular advantage of the TERIC method is that it can utilise eACA sequences which are already present in the gene of interest. For example, if a stem-loop forming eACA sequence can be found in a gene of interest, this gene can be circularized efficiently without introducing any additional sequences. This in turn means that the resulting circular RNA is far less likely to be immunogenic.
- An eACA sequence i.e. a stem-loop forming structure
- the gene of interest is rearranged such that the eACA sequence is split into two portions, one at each end of the gene of interest.
- This rearranged gene is then cloned into a TERIC construct for circularisation.
- An example of this is the protein coding circular RNA T2A Nano Luciferase.
- This circular RNA already comprises an eACA sequence in its natural sequence. This means a circularisation site can be introduced using the naturally occurring eACA sequence, without the need to perform mutations or introduce additional sequence.
- Codon redundancy means that mutations can be made to the nucleotide sequence of the GOI without affecting the resulting peptide sequence. Consequently, an eACA sequence can be provided in the GOI without requiring the introduction of additional sequences. Instead, only selective mutation of the existing sequence is needed, following the rules of codon redundancy.
- the circularization site can be created by introducing additional nucleotides. For example, as shown in Figure 8, 5 nucleotides (light grey nt) could be introduced to create a stem portion of the eACA sequence, using the existing sequence (black nt) of the GOI to provide the remainder of the eACA.
- the GOI may comprise, in the 5’ to 3’ direction: a stop codon, a polyAC sequence, multiple siRNA target sites (msiTS), an IRES, a start codon, and the coding sequence including the eACA.
- the GOI may comprise, in the 5’ to 3’ direction: multiple siRNA target sites (msiTS), an IRES, a start codon, a coding sequence, a stop codon, a polyAC sequence, and the eACA.
- the first and/or second portions of the eACA sequence may naturally occur in the gene of interest. In other words, they may be part of the gene of interest and so are present without having to mutate the existing sequence or introduce additional sequence.
- all or part of the eACA sequence may be derived from human ribosomal RNA (rRNA).
- rRNA ribosomal RNA
- the use of human rRNA has the potential to provide circular RNAs which are less immunogenic.
- the position of the first and second portions of the eACA sequence in the gene of interest has little impact on circularization.
- One portion of the eACA sequence may be identified in or placed in a coding sequence, whilst the other may be identified in or placed in an untranslated region. It is not necessary for both portions to be in the coding sequence, for example, or for both portions to be in the untranslated region.
- nucleic acid molecules containing a gene of interest wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, and b) the gene of interest.
- the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, and d) a second portion of the eACA sequence.
- first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme
- eACA extended anticodon arm
- nucleic acid molecules containing a gene of interest comprising, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, and c) a second bridge sequence, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
- the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence, and e) a second bridge sequence, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
- a first bridge sequence wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme
- Further components that may also be included in the nucleic acid molecules, such as extended guide sequences and loop sequences, are described below.
- a nucleic acid molecule, bridge sequence, gene of interest, ribozyme, ribozyme portion or truncated ribozyme, extended guide sequence, loop sequence, homology arm, extended anticodon arm (eACA) sequence or other sequence described herein may comprise or consist of the amino acid sequence of a reference sequence set out herein or may be a variant of a reference sequence set out herein.
- a variant sequence may have at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to the reference sequence.
- a variant sequence may differ from the reference sequence by insertion, addition, substitution or deletion of 1 nucleotide, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more than 10 nucleotides.
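As an illustration of the identity thresholds above, the sketch below computes percent identity for two sequences of equal length by comparing them position by position. This is a simplification and an assumption of this example: variants that contain insertions or deletions need a proper alignment before identity is counted, and the text does not specify any particular alignment method. The variant sequence is invented for the example.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Ungapped, position-by-position identity for equal-length sequences."""
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(r == v for r, v in zip(reference, variant))
    return 100.0 * same / len(reference)


reference = "GATCACCACTTTAAGGTGATC"   # SEQ ID NO: 19 as quoted in the text
variant = "GATCACCACTTTAAGGTGATT"     # hypothetical variant, one substitution
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity, meets 95% threshold: {identity >= 95}")
```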
- the gene of interest may comprise at least one modified nucleotide.
- the gene of interest may comprise any amount of modified nucleotides considered useful for the desired application of the circular RNA (e.g. therapeutic or research applications). Accordingly, the % of nucleotides in the gene of interest which are modified may vary from 0% to 100%. In some cases, the gene of interest may be fully modified, in that 100% of the nucleotides are modified.
- the gene of interest may comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% modified nucleotides.
- the modification may be a base modification.
- the modification may be a ribose modification.
- the modification may be selected from the group consisting of: m5C (5-methylcytidine); m5U (5-methyluridine); m6A (N6-methyladenosine); s2U (2-thiouridine); Ψ (pseudouridine); N1Ψ (N1-methylpseudouridine); Um (2'-O-methyluridine); m1A (1-methyladenosine); m2A (2-methyladenosine); Am (2'-O-methyladenosine); ms2m6A (2-methylthio-N6-methyladenosine); i6A (N6-isopentenyladenosine); ms2i6A (2-methylthio-N6-isopentenyladenosine); io6A (N6-(cis-hydroxyisopenteny
- the gene of interest may comprise at least one modified nucleotide, wherein the modification is an m6A (N6-methyladenosine) modification.
- the gene of interest may comprise at least 95% modified nucleotides, wherein the modification is an m6A (N6-methyladenosine) modification.
- the gene of interest may comprise 100% modified nucleotides, wherein the modification is an m6A (N6-methyladenosine) modification.
- the gene of interest may comprise at least one modified nucleotide, wherein the modification is an N1Ψ (N1-methylpseudouridine) modification.
- the gene of interest may comprise at least 95% modified nucleotides, wherein the modification is an N1Ψ (N1-methylpseudouridine) modification.
- the gene of interest may comprise 100% modified nucleotides, wherein the modification is an N1Ψ (N1-methylpseudouridine) modification.
- the gene of interest may comprise at least one modified nucleotide, wherein the modification is an m5C (5-methylcytidine) modification.
- the gene of interest may comprise at least 95% modified nucleotides, wherein the modification is an m5C (5-methylcytidine) modification.
- the gene of interest may comprise 100% modified nucleotides, wherein the modification is an m5C (5-methylcytidine) modification.
- both the internal guide sequence of the ribozyme and the sequence of the gene of interest can be adapted to minimise the disruptive effect of modifications on interaction between ribozyme and GOI.
- the IGS can be adapted to be CG rich, for example comprising 60% or more, 80% or more, or 100% CG content.
- CG rich for example comprising 60% or more, 80% or more, or 100% CG content.
- corresponding adaptations should be made to the sequence of the gene of interest, in order to maintain complementary base pairing. The precise nature of the sequence alterations or adaptations will depend on the nucleotide modifications that are desired in the resulting circular RNA.
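The CG-content thresholds mentioned for the IGS (60% or more, 80% or more, or 100%) can be checked with a one-line calculation. A minimal sketch; the function name and the example sequences are hypothetical and are not IGS sequences from the source:

```python
def cg_content(seq: str) -> float:
    """Percentage of nucleotides in `seq` that are C or G."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(nt in "CG" for nt in seq.upper()) / len(seq)


# Hypothetical candidate guide sequences, for illustration only.
for candidate in ("GCGGCC", "GCAGCC", "GCAU"):
    pct = cg_content(candidate)
    print(candidate, f"{pct:.0f}%", "CG rich" if pct >= 60 else "not CG rich")
```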
- the nucleic acid molecules and modified ribozymes described herein may further each comprise an extended guide sequence (EGS).
- a modified ribozyme may comprise a first EGS
- a nucleic acid molecule containing the gene of interest may comprise a second EGS.
- the first and the second EGS may be capable of complementary base pairing to each other.
- the function of the EGS is generally to increase the length of the complementary base-pairing region between the ribozyme and the nucleic acid molecule containing the gene of interest.
- the modified ribozymes described herein may comprise a first EGS positioned at a 5’ end of the truncated ribozyme.
- the nucleic acid molecules containing the gene of interest described herein may comprise a second EGS positioned at a 3’ end of the nucleic acid molecule.
- the first and second EGS may be partly or fully complementary to each other. Generally, mismatches are tolerated well and do not materially affect circularization. Accordingly, the first EGS may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 100% complementary to the second EGS. Generally, the first EGS may be substantially complementary to the second EGS, such as at least 70% complementary. If present, the first and second EGS may each be between 1 and 500 nucleotides in length. For example, the first and second EGS may each be between 10 and 50 nucleotides in length. The first and second EGS may each be 20, 30, or 40 nucleotides in length.
- An exemplary first EGS sequence is GGUCAAUCGGUUGGCUUCCG (SEQ ID NO: 22).
- An exemplary second EGS sequence is CGGAAGCCAACCGAUUGACC (SEQ ID NO: 23).
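The two exemplary EGS sequences can be checked for complementarity by pairing the first EGS, read 5’ to 3’, against the second EGS read 3’ to 5’. The sketch below counts Watson-Crick pairs only and assumes equal-length, ungapped sequences; that scoring scheme is an assumption of this example, since the text does not define how percent complementarity is calculated.

```python
FIRST_EGS = "GGUCAAUCGGUUGGCUUCCG"    # SEQ ID NO: 22
SECOND_EGS = "CGGAAGCCAACCGAUUGACC"   # SEQ ID NO: 23

WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(a: str, b: str) -> float:
    """Antiparallel, ungapped Watson-Crick complementarity of two RNA strands."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Position i of `a` pairs with position i of `b` read in reverse.
    pairs = sum(WATSON_CRICK.get(x) == y for x, y in zip(a, reversed(b)))
    return 100.0 * pairs / len(a)


score = percent_complementarity(FIRST_EGS, SECOND_EGS)
print(f"{score:.0f}% complementary")
```

By this measure the two exemplary sequences are fully complementary, well above the 70% level given for substantial complementarity.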
- the ribozymes and nucleic acid molecules described herein may further comprise loop sequences, such as a first loop sequence and a second loop sequence.
- the first and second loops may be configured to act as spacers.
- the first and second loops may act as spacers between, in the ribozyme, the internal guide sequence (IGS) and the first extended guide sequence (EGS), and in the nucleic acid molecule containing the gene of interest, the gene of interest and the second EGS.
- the loop sequences are preferably not complementary to each other, such that there is little or no base pair interaction between the first and second loop sequences. Because of the low or non-complementarity between the two loop sequences, the base-paired P1 region remains at a fixed length.
- the loop sequences are substantially non-complementary.
- the loop sequences may have less than 30%, less than 20%, less than 10%, or less than 5% complementarity.
- the first and second loop sequences may each be between 1 and 10 nucleotides in length. It is not necessary for the first and second loop sequences to have the same number of nucleotides, and in fact the TERIC method works well when the first and second loop sequences are different lengths.
- the first loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length.
- the second loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length.
- a preferred combination is a 6 nucleotide first loop sequence and a 5 nucleotide second loop sequence.
- Another preferred combination is a 3 nucleotide first loop sequence and a 2 nucleotide second loop sequence.
- the first loop sequence is positioned 3’ to the first EGS and 5’ to the truncated ribozyme, in other words, in between the first EGS and the truncated ribozyme.
- the second loop sequence is positioned 3’ to the gene of interest, and 5’ to the second EGS, in other words, between the gene of interest and the second EGS in the nucleic acid molecule containing the gene of interest. If the P1 extension region is present, the second loop sequence is 3’ to the P1 extension.
- Circularization can be achieved using only a first loop sequence, positioned in between the first EGS and the truncated ribozyme, without a second loop sequence.
- first loop sequence positioned between the first EGS and the truncated ribozyme
- second loop sequence positioned between the second portion of the gene of interest and the second EGS.
- An exemplary sequence for the first loop sequence is AAATAA.
- An exemplary sequence for the second loop sequence is ACACC.
- modified ribozymes comprising, in the 5’ to 3’ direction: a) an extended guide sequence (EGS) b) a loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
- modified ribozymes comprising, in the 5’ to 3’ direction: a) an extended guide sequence (EGS) b) a loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme and does not comprise a 5’ portion of a corresponding wildtype ribozyme.
- nucleic acid molecules containing a gene of interest comprising, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
- nucleic acid molecules containing a gene of interest comprising, in the 5’ to 3’ direction: a) a first bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, c) a second bridge sequence comprising a sequence corresponding to a 5’ portion of a ribozyme, d) a loop sequence, and e) an extended guide sequence.
- nucleic acid molecules containing a gene of interest comprising, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence; e) a loop sequence, and f) an extended guide sequence.
- a bridge sequence comprising a 3’ portion of a ribozyme
- nucleic acid molecules containing a gene of interest comprising, in the 5’ to 3’ direction: a) a first bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second bridge sequence comprising a sequence corresponding to a 5’ portion of a ribozyme, e) a second portion of the eACA sequence; f) a loop sequence, and g) an extended guide sequence.
- a first bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme
- the ribozyme and the nucleic acid molecule comprising the GOI may further be provided with homology arms.
- the purpose of the homology arms is to enable the 5’ end of the GOI to interact with the 3’ end of the ribozyme, as shown in Figure 2B.
- Figure 2B labels the homology arm on the GOI as “HR-G” and the homology arm on the ribozyme as “HR-R”.
- the homology arms perform a similar function to the EGS, by extending the region of complementary base pairing between the GOI and the ribozyme.
- the first and second homology arms may be partly or fully complementary to each other.
- the first homology arm may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 100% complementary to the second homology arm.
- the homology arms are substantially complementary to each other, such as at least 70% complementary to each other.
- the homology arms can be of any length which facilitates circularisation, and do not necessarily have to be of the same length.
- the homology arm on the GOI can be longer or shorter than the homology arm on the ribozyme.
- the homology arms may each be at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300 or 500 nucleotides in length.
- the homology arms may each be between 1 and 50, 1 and 40, 1 and 30, 1 and 20, 5 and 50, 5 and 40, 5 and 30, 5 and 20, 10 and 50, 10 and 40, 10 and 30, or 10 and 20 nucleotides in length. In some cases, the homology arms are each 20 nucleotides in length.
- HR-R ribozyme: CAGGACAACAGCATCACTAG (SEQ ID NO: 47)
- in the case of the modified ribozyme, the homology arm is generally placed at the 3’ end. In the case of the nucleic acid molecule containing the gene of interest, the homology arm is generally placed at the 5’ end.
- a modified ribozyme as described herein may comprise, in a 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and d) a homology arm sequence.
- a nucleic acid molecule comprising a gene of interest as described herein may comprise, in a 5’ to 3’ direction: a) a homology arm sequence, b) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, c) the gene of interest, d) a loop sequence, and e) an extended guide sequence.
- the methods generally comprise providing a bridge sequence (as described herein) at a 5’ end of the gene to be circularized, and optionally also providing a second bridge sequence at a 3’ end of the gene to be circularised.
- the bridge sequence(s) can be added to the gene of interest by methods known in the art.
- a DNA template can be synthesised which comprises a bridge sequence 5’ to the gene of interest, and/or a bridge sequence 3’ to the gene of interest.
- the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme
- the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
- the methods further comprise providing a modified ribozyme, as described elsewhere herein.
- the modified ribozyme is one in which a 3’ portion of the ribozyme has been removed.
- the modified ribozyme may comprise a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
- the modified ribozyme may correspond to nucleotides 1-226, or 1-242 of SEQ ID NO: 1, with nucleotides 227-249 or 243-249 removed, respectively.
- the modified ribozyme may also comprise a truncated ribozyme wherein the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme.
- the truncated ribozyme may be truncated at both the 5’ and 3’ ends compared to a wild-type ribozyme.
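The nucleotide ranges given for the truncated ribozyme use 1-based, inclusive numbering, which differs from Python's 0-based slicing. The sketch below shows the conversion. SEQ ID NO: 1 is not reproduced in this section, so a 249-nucleotide placeholder string stands in for it; the length of 249 is assumed from the quoted range "227-249", and the helper name is illustrative.

```python
def keep_range(seq: str, first: int, last: int) -> str:
    """Return nucleotides first..last using 1-based, inclusive numbering."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("range falls outside the sequence")
    return seq[first - 1:last]


# Placeholder only: NOT the real ribozyme sequence.
wild_type = "N" * 249

for last_kept in (226, 242):
    truncated = keep_range(wild_type, 1, last_kept)
    removed_3p = keep_range(wild_type, last_kept + 1, len(wild_type))
    print(f"kept 1-{last_kept}: {len(truncated)} nt, removed 3' portion: {len(removed_3p)} nt")
```

A ribozyme truncated at both ends would be obtained the same way by starting the kept range after the removed 5’ portion.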
- the methods comprise combining the gene of interest and the modified ribozyme under conditions suitable for circularization to occur.
- Circularisation protocols are generally known in the art.
- such a step may be carried out for between 10 and 60 minutes, in some cases between 20 and 40 minutes.
- Circularization may be achieved by heating the gene of interest and modified ribozyme together, for example to between 50°C and 60°C.
- Circularization may be achieved by heating the mixture of the gene of interest and the modified ribozyme to about 55°C for about 20 minutes.
- the gene of interest and the modified ribozyme may be combined and/or heated together in any suitable circularization buffer known in the art.
- Methods for producing a circular gene of interest may comprise: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
- the ratio of modified ribozyme to gene of interest is generally 1:1 or greater.
- the ratio of modified ribozyme to gene of interest may be 2:1, 3:1, 4:1, 5:1, 6:1, or greater.
- the ratio of modified ribozyme to gene of interest may be 4:1.
- the concentration of gene of interest is generally between about 100 nM and about 500 nM. In some embodiments, the concentration of gene of interest may be between 200 nM and 400 nM.
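To make the ratio and concentration ranges concrete, the sketch below converts a gene-of-interest concentration and a ribozyme-to-GOI ratio into the ribozyme concentration and the amount of each RNA in a reaction. It assumes the ratio is a molar ratio, which the text does not state explicitly, and uses a 10 µl reaction volume as in the examples; the function names are illustrative.

```python
def ribozyme_concentration_nM(goi_nM: float, ratio: float) -> float:
    """Ribozyme concentration for a given ribozyme:GOI molar ratio."""
    if ratio < 1:
        raise ValueError("the text describes ratios of 1:1 or greater")
    return goi_nM * ratio


def picomoles(conc_nM: float, volume_ul: float) -> float:
    """Amount of RNA in pmol (nM x µl gives fmol; divide by 1000 for pmol)."""
    return conc_nM * volume_ul / 1000.0


goi_nM = 200.0       # within the 200-400 nM range given in the text
volume_ul = 10.0     # reaction volume used in the examples
ribozyme_nM = ribozyme_concentration_nM(goi_nM, ratio=4)

print(f"GOI: {goi_nM} nM = {picomoles(goi_nM, volume_ul)} pmol")
print(f"Ribozyme: {ribozyme_nM} nM = {picomoles(ribozyme_nM, volume_ul)} pmol")
```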
- steps of the methods described herein may be preceded by steps providing DNA templates encoding any of the modified ribozymes or genes of interest. Such methods may also include a step of in vitro transcription of the DNA templates to provide RNA precursors. The methods may also include a step of refolding the modified ribozymes and genes of interest following in vitro transcription and prior to the addition of a circularization buffer.
- the methods described herein may further comprise a step of recovering the modified ribozyme.
- the recovered modified ribozyme may then be used in a future reaction.
- Suitable methods for recovering the modified ribozyme include separation and purification by gel filtration or gel extraction.
- the modified ribozyme can be recovered and/or purified at the same time as recovering the desired circular RNA.
- the TER can be immobilised on a solid surface, for example covalently linked to a bead or plate, or other suitable surface known in the art.
- the gene of interest can be added, circularised and washed out for recovery of the circular RNA, while the TER remains bound to the solid surface and can be reused for a second and further rounds of circularisation with new genes of interest.
- circular RNAs obtainable by the methods described above.
- the invention also provides circular RNAs comprising a sequence encoding a gene of interest, wherein the circular RNA comprises at least one modified nucleotide residue as described herein, and wherein the circular RNA does not comprise exogenous exon sequences, and/or does not comprise any ribozyme-derived sequence.
- the invention also provides circular RNAs comprising a sequence encoding a gene of interest, wherein the circular RNA comprises an eACA sequence, and wherein the circular RNA comprises at least one modified nucleotide residue. In some cases, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or 100% of the nucleotide residues in the circular RNA may be modified.
- kits for circularizing a gene of interest comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, as described herein.
- Kits for circularizing a gene of interest may comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, and b) the gene of interest.
- Kits for circularizing a gene of interest may comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, c) an internal guide sequence (IGS), and d) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of the ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence; e) a second loop sequence, and f) a second
- the disclosure also encompasses DNA precursor molecules encoding any of the modified ribozymes, genes of interest, and nucleic acid molecules containing genes of interest described herein.
- any of the ribozymes or nucleic acid molecules described herein can be provided as DNA templates, which subsequently undergo in vitro transcription (IVT) in order to provide ribozymes and RNA molecules that can be circularized.
- kits comprising a first DNA molecule encoding a modified ribozyme and a second DNA molecule encoding a nucleic acid molecule.
- nucleic acid molecule further comprises a second bridge sequence located 3’ of the gene of interest, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme, and wherein the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme.
- the 5’ portion of the corresponding wild-type ribozyme comprises a sequence corresponding to residues 1-10 of SEQ ID NO: 1 or 49.
- the gene of interest comprises at least one modified nucleotide.
- the method of clause 23 or 24, wherein the at least one modified nucleotide is selected from the group consisting of N6-methyladenosine, N1-methyl-pseudouridine, or combinations thereof.
- the modified ribozyme further comprises a first extended guide sequence (EGS) at a 5’ end
- the nucleic acid molecule further comprises a second EGS at a 3’ end.
- the method of clause 26, wherein the first and second EGS are substantially complementary to each other.
- step (c) is performed for between 20 and 40 minutes.
- step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme.
- step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to between 50°C and 60°C.
- step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to about 55°C for about 20 minutes.
- a modified ribozyme for use in a method of circularizing an RNA molecule comprising, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
- nucleic acid molecule of clause 43 further comprising a second bridge sequence at a 3’ end, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme sequence.
- kits for circularizing a gene of interest comprising a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, and b) the gene of interest.
- a kit comprising a first DNA molecule encoding the modified ribozyme of the kit of clause 41 or 42, and a second DNA molecule encoding the nucleic acid molecule containing the gene of interest of the kit of clause 43 or 44.
- a circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA comprises at least one modified nucleotide residue, and wherein the circular RNA does not comprise exogenous exon sequences.
- a circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA comprises an eACA sequence, and wherein the circular RNA comprises at least one modified nucleotide residue.
- the plasmids TRIC V1-CVB3-EGFP (SEQ ID NO: 24), TRIC V2-CVB3-EGFP (SEQ ID NO: 25), TRIC V2-circZnf609 (SEQ ID NO: 26), PIE-CVB3-EGFP (SEQ ID NO: 27), and PIE-circZnf609 (SEQ ID NO: 28) were generated as described in GB2308675.4.
- the inventors amplified them in TOP10 competent cells and purified them using the QIAGEN Maxi Plus plasmid purification kit. The purified plasmids were then linearized using EcoRV and cleaned by phenol:chloroform:isoamyl alcohol extraction.
- IVT In vitro transcriptions
- 1X IVT buffer included 80 mM HEPES-K (pH 7.5), 2 mM spermidine, 40 mM DTT, and 24 mM MgCl2.
- concentration of MgCl2 in 1X IVT buffer was adjusted to 14 mM.
- the IVT reactions were incubated at 37 °C for 3-5 hours, followed by digestion with RNase-free DNase I for 20 minutes. To remove any precipitate, 100 mM EDTA was added to achieve a final concentration of 25 mM. Subsequently, an equal volume of 7.5 M lithium chloride was added to precipitate the RNAs. This precipitation step was performed for a duration of 30 minutes to overnight at -20 °C. The resulting precipitates were centrifuged at 13,000 rpm for at least 20 minutes, and the RNA pellets were washed with 75% alcohol, air-dried, and dissolved in DEPC-treated H2O.
- Circular RNA synthesis
- RNAs underwent a refolding process. They were initially denatured at 95 °C for 2 minutes and then annealed on ice for 3 minutes. The circularization step was carried out in a 10 µl reaction volume and was terminated by adding 2 µl of 100 mM EDTA.
- RNAs at a final concentration of 200 nM were combined with 10X circularization buffer (composed of 500 mM Tris-HCl, pH 7.4, 100 mM MgCl2, 10 mM DTT, and 20 mM GTP). The mixture was heated at 55 °C for 8 minutes to allow circularization.
- protocol A TERs and pGs were mixed and refolded as described above in DEPC-treated H2O. Subsequently, they were supplied with the circularisation buffer for 20 min of circularization at 55 °C.
- protocol B the refolding of TERs and pGs took place in the circularisation buffer, followed by mixing and 20 min of circularization at 55 °C.
- the splicing conditions such as the concentration of GTP, the ratio of TER to pG, the concentration of pGs, and the reaction duration, were optimized accordingly.
- Reverse transcriptase and DNA polymerase used here are the SuperScript IV Reverse Transcriptase (Thermo Fisher) and the Q5 High-Fidelity DNA Polymerase (NEB). Manufacturer’s manuals were followed for reverse transcription and PCR.
- the pG and circularized pG of TERIC V2 were used as templates for reverse transcription using the RT-PCR_Reverse (GTGAACCGCATCGAGCTG (SEQ ID NO: 43)) as the reverse primer.
- RT-PCR_Forward TTTGCTGTATTCAACTTAACAATGAATTGTAATG (SEQ ID NO: 44)
- RT-PCR_Reverse RT-PCR product was gel extracted and submitted for Sanger sequencing.
- RNA sample (~100 ng) was mixed with equal volume of urea loading buffer (NEB) and denatured at 95 °C for 2 min. Gels were stained in 10 ml 1X TBE with SYBR Safe for 10 min before imaging.
- NEB urea loading buffer
- Group I introns utilize internal guide sequences (IGS) to form P1 and P10 structures with flanking exons, bringing the exons into proximity to facilitate splicing (Figure 2A).
- IGS internal guide sequences
- TER trans excision ribozyme
- the inventors utilised a group I intron which included an IGS sequence (the cyanobacterium Anabaena tRNA Leu intron, referred to hereafter as “Ana”).
- the inventors relocated a 3’ portion of the intron to the 5’ end of the gene of interest (Figure 2B).
- This 3’ portion also called a 3’ bridge sequence
- this 3’ portion would form the P9.0 region with the trans excision ribozyme, thus improving the interaction between ribozyme and gene of interest to facilitate splicing.
- EGS extended guide sequences
- eACA extended anticodon arm
- TER V1 (comprising nucleotides 1 to 241 of Ana, SEQ ID NO: 13) and TER V2 (comprising nucleotides 1 to 226 of Ana, SEQ ID NO: 12)
- Figure 2B Precursors of the corresponding genes of interest (pGs) were also generated, using a 3’ bridge sequence of nucleotides 242-249 of Ana (SEQ ID NO: 14) for TER V1, and nucleotides 227-249 of Ana (SEQ ID NO: 15) for TER V2 (Figure 2B). These 3’ bridge sequences were joined to the 5’ end of a gene of interest encoding EGFP and including other sequence elements such as CVB3 (an IRES), stop and start codons, and polyAC.
- protocol B TERs and pGs were individually refolded in splicing buffer, followed by mixing and heating at 55 °C for 20 minutes. The final concentration of pGs was 200 nM. Splicing reactions were halted using 2 µL of 100 mM EDTA and loaded onto a 0.8% native agarose gel.
- TERIC V2 exhibited the highest splicing efficiency at 2 mM GTP when protocol A was employed, although TER V1 also achieved efficient splicing with protocol A. Splicing was also achieved with protocol B, but was less efficient compared to protocol A.
- the bands marked by empty circles represent circCVB3-EGFP.
- the inventors optimized the ratio between TERs and pGs. Initially, the inventors maintained the concentration of pGs at 200 nM and gradually increased the concentration of TERs from 200 nM to 1600 nM.
- Figure 4A illustrates that within the TER/pG ratio range of 1-4, an increase in TER concentration resulted in improved circularization efficiency. However, when the TER/pG ratio exceeded 4, the circularization efficiency did not exhibit significant changes.
- the inventors fixed the ratio between TERs and pGs at 4 and investigated the effect of pG concentration and circularization time. As depicted in Figures 4B and 4C, pGs concentrations ranging from 200 nM to 400 nM, along with circularization times of 20-40 minutes, yielded the highest circularization efficiency while maintaining a relatively low level of nicking.
- unmodified PIE and TRIC precursors efficiently convert to circular RNAs, while no circular RNA is observed for the modified PIE or TRIC variants.
- the inventors synthesized modified precursors while keeping the ribozyme unmodified. Consistent with the previous observations, unmodified TERIC V2 successfully circularized the unmodified pG. However, surprisingly, no circular RNAs were observed for the modified pGs. The inventors identified two potential reasons for the failure to circularize modified pGs.
- the 3’ bridge sequence located at the 5’ end of the pGs would be modified, and this 3’ bridge sequence spans nucleotides 227-249 of the Ana intron. It is possible that m6A or N1-methyl-pseudouridine (N1Ψ) modifications within this short 3’ bridge sequence abolish TERIC activity.
- the IGS in TER V2, UUGAG, is an AU-rich sequence, and m6A or N1Ψ modifications in the corresponding eACA could weaken the P1 and P10 structures, thus disrupting TER activity.
- the purpose of relocating the 3’ portion of the ribozyme to the 5’ end of GOIs is to reconstitute the ribozyme structure between the ribozyme and the GOI. It is the structure, rather than the sequence or length, that is crucial for reconstitution to be functional. To confirm this, we introduced mutations in the 3’ portion of the ribozyme that was moved to the 5’ end of the GOI (SEQ ID NOs: 53 and 54).
- TERIC can also be used for circularization of modified RNAs where previous approaches such as PIE and TRIC are unsuitable.
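The concentrations quoted in the protocol excerpts above follow from ordinary dilution arithmetic, which the short Python sketch below works through: the volume of 100 mM EDTA needed to reach 25 mM final, the 1X composition of the 10X circularization buffer, and the TER concentrations for a titration at 200 nM pG. The function names and the 30 µl sample volume are illustrative assumptions and do not appear in the source.

```python
def volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the additive reaches final_conc in the mixture.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v.
    Units are arbitrary but must be consistent (e.g. µl and mM).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final_conc must be positive and below stock_conc")
    return sample_volume * final_conc / (stock_conc - final_conc)


def one_x(ten_x_buffer):
    """Working (1X) concentrations of a buffer given its 10X composition."""
    return {name: conc / 10 for name, conc in ten_x_buffer.items()}


def ter_titration(pg_nM, ratios):
    """TER concentrations (nM) for the given TER/pG molar ratios."""
    return [pg_nM * ratio for ratio in ratios]


# 10X circularization buffer as listed in the source (all values in mM).
CIRC_BUFFER_10X = {"Tris-HCl": 500, "MgCl2": 100, "DTT": 10, "GTP": 20}

if __name__ == "__main__":
    # Hypothetical 30 µl IVT reaction: EDTA stock 100 mM, target 25 mM final.
    print(volume_to_add(30.0, 100.0, 25.0), "µl of 100 mM EDTA")
    print(one_x(CIRC_BUFFER_10X))
    print(ter_titration(200, [1, 2, 4, 8]), "nM TER")
```

Under these assumptions the EDTA volume is one third of the sample volume, and the 1X buffer contains 2 mM GTP, the GTP concentration reported as giving the highest splicing efficiency for TERIC V2.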
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Biomedical Technology (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Organic Chemistry (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Wood Science & Technology (AREA)
- Molecular Biology (AREA)
- Physics & Mathematics (AREA)
- Microbiology (AREA)
- Plant Pathology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Cell Biology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
This invention relates to methods for producing a circular gene of interest. A nucleic acid molecule is provided that comprises, in the 5' to 3' direction, a first bridge sequence, and a gene of interest. The first bridge sequence comprises a sequence corresponding to a 3' portion of a ribozyme. Also provided is a modified ribozyme comprising a truncated ribozyme that does not comprise a 3' portion of a corresponding wild-type ribozyme. The nucleic acid molecule and the modified ribozyme are combined under conditions suitable for circularization to occur. Modified ribozymes, nucleic acid molecules and kits are also provided.
Description
METHODS FOR MAKING CIRCULAR RNAS
This application claims priority from GB2313326.7 filed 01 Sept 2023, the contents and elements of which are herein incorporated by reference for all purposes.
FIELD OF THE INVENTION
The invention relates to methods for making circular RNAs. The invention also relates to modified ribozymes for making circular RNAs, nucleic acid molecules for making circular RNAs, and kits for making circular RNAs.
BACKGROUND
Ribosomes are molecular machines that translate the genetic code carried by messenger RNAs (mRNA) into functional proteins. When exogenous mRNA is introduced into cells, the gene of interest (GOI) encoded by that mRNA can be expressed, forming the basis of mRNA therapeutics. However, despite promising results in animals since the 1990s, mRNA therapeutics did not receive significant investment due to challenges such as mRNA instability, high innate immunogenicity, and inefficient in vivo delivery (Pardi, N. et al. (2018) Nat Rev Drug Discov 17, 261-279). Recent advances in mRNA delivery and nucleotide modification techniques have led to the successful application of mRNA therapeutics, particularly in the field of vaccines. However, other types of mRNA therapeutics, such as protein replacement therapy, still face challenges due to mRNA instability.
Circular RNAs (circRNA) have emerged as a promising solution to address the instability problem of mRNAs in therapeutic applications. Due to their covalently closed structure, circRNAs are inherently resistant to exonucleases, making them more stable compared to linear mRNAs (Jeck, W. R. & Sharpless, N. E. (2014) Nat Biotechnol 32, 453-461). CircRNAs are widely present in human cells and can serve as microRNA and protein sponges, as well as templates for peptide or protein production (Liu, C. X. & Chen, L. L. (2022) Cell 185, 2016-2034). While circRNAs are primarily generated through back splicing within cells, they can also be synthesized in vitro using chemical or enzymatic ligation methods or through group I intron-based circularization methods (Petkovic, S. & Muller, S. (2015) Nucleic Acids Res 43, 2454-2465). Among these methods, the permuted intron-exon method (PIE) and the trans ribozyme-based circularization (TRIC) are highly efficient in circularizing long RNAs (Figure 1A-C) (Puttaraju, M. & Been, M. D. (1992) Nucleic Acids Res 20, 5357-5364; Wesselhoeft, R. A., Kowalski, P. S. & Anderson, D. G. (2018) Nat Commun 9, 2629). However, a limitation of PIE is the inability to circularize modified circRNAs (Wesselhoeft, R. A. et al. (2019) Mol Cell 74, 508-520 e504). This limitation arises due to the potential disruption of ribozyme structure caused by base modifications in the PIE construct. Although circRNAs have been shown to be less immunogenic than unmodified linear mRNAs, base modifications could still be beneficial since nicking of circRNAs is inevitable.
Thus, there remains a need in the art for improved methods for the efficient circularization of RNA, in particular long and/or modified RNA.
SUMMARY
In a first aspect, the invention provides a method for producing a circular gene of interest, the method comprising: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
In a second aspect, the invention provides a circular RNA obtained by the methods described herein.
In a third aspect, the invention provides a modified ribozyme for use in a method of circularizing an RNA molecule, the modified ribozyme comprising, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
In a fourth aspect, the invention provides a nucleic acid molecule containing a gene of interest for use in a method of circularizing the gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
In a fifth aspect, the invention provides a kit for circularizing a gene of interest, the kit comprising a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction:
a) a bridge sequence comprising a 3’ portion of a ribozyme, and b) the gene of interest.
The modified ribozyme may be a modified ribozyme of the third aspect. The nucleic acid molecule may be a nucleic acid molecule of the fourth aspect.
In a sixth aspect, the invention provides a DNA molecule encoding a modified ribozyme as defined herein.
In a seventh aspect, the invention provides a DNA molecule encoding a nucleic acid molecule containing a gene of interest as defined herein.
In an eighth aspect, the invention provides a kit for circularizing a gene of interest, the kit comprising a DNA molecule of the sixth aspect and a DNA molecule of the seventh aspect.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1: Comparison of prior art approaches and trans excision ribozyme-based circularisation (TERIC). (A) Schematic depicting splicing by the group I intron from cyanobacterium Anabaena (Ana), resulting in a linear molecule comprising joined exons. (B) Schematic depicting circularization of a gene of interest by the permuted intron-exon (PIE) method. (C) Schematic depicting circularization of a gene of interest using the TRIC approach. (D) Schematic depicting splicing by a trans excision ribozyme (TER). (E) Schematic depicting circularization of a gene of interest using the TERIC approach.
Figure 2: (A) Secondary structure of the Ana group I intron. Lower case characters are exon sequences. TER V1 and TER V2 labels indicate points from which a 3’ portion of the intron is removed. (B) Details for designing of trans excision ribozymes (TERs) and corresponding precursor of GOIs (pGs) using the cyanobacterium Anabaena tRNALeu group I intron as an example.
Figure 3: (A) Splicing reactions of TERIC V1 and V2 were loaded onto a 0.8% native agarose gel. TRIC V2-CVB3-EGFP was loaded as a positive control. (B) pG, TER and circularized sample of TERIC V2 were loaded onto a 6 M urea-1.5% agarose gel. Circular CVB3-EGFP was present in the circularized TERIC sample. (C) RT-PCR of the pG and circularized sample of TERIC V2. Primers used here are indicated in Figure 2B. (D-E) Sequence of TERs and pGs of TERIC V1 (SEQ ID NOs: 5 and 6) and V2 (SEQ ID NOs: 7 and 8).
Figure 4: Optimization of (A) TER to pG ratio, (B) concentration of pGs, and (C) duration of circularization. A 0.8% native agarose gel was used. In the case of (B) and (C), TERs have run out of the gel. TERIC V2 shows generally better circularization efficiency than TERIC V1, although both achieve circularization. The most efficient circularization protocol for TERIC V2 was identified as 200-400 nM pG and 20-40 min of circularization in the presence of 2 mM GTP and a 4-fold molar excess of TERs over pGs.
Figure 5: (A) Circularization efficiencies between pGs with the eACA elements or the native tRNA sequences are compared. The results show the eACA is important for TERIC to work. (B) Sequence of the TERIC-pG-tRNA (SEQ ID NO: 9).
Figure 6: (A) Modified and unmodified precursors of PIE, TRIC, and TERIC were subjected to circularization and loaded onto a 0.8% native agarose gel. No modified circular RNA can be seen for any of these three constructs. (B) IGSs of TRIC and TERIC are mutated from UUGAG to CCGCC and the circZnf609 was selected as the target sequence. Mutation of IGS to CCGCC for the TERIC restored ribozyme activity as indicated by the presence of circular modified RNAs. (C) Sequences of the TER (1,226)-IGS CCGCC (SEQ ID NO: 10) and pG_circZnf609 (SEQ ID NO: 11).
Figure 7: Schematic overview of the TERIC approach.
Figure 8: The extended anticodon arm (eACA). As long as a stem loop structure can be found where the loop contains a uracil in 3rd position and the stem is >5bp, a circularization site can be assembled. Thus, circularization sites can be either placed in UTR or CDS.
Figure 9: Circularization of pGs using either 3’-truncated TER or 5’- and 3’-truncated TER. The 3’ bridge sequence in pGs is either identical to the truncated portion of the ribozyme or contains mutations. Circularization was performed at a 4:1 TER/pG ratio at 55 °C for 20 minutes and then analyzed on a 0.8% native agarose gel.
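The rule given for Figure 8 (a stem loop whose loop has a uracil in the 3rd position and whose stem is longer than 5 bp) can be expressed as a simple check. The Python sketch below is illustrative only: the function name, the requirement that the stem be perfectly paired, and the decision to accept G-U wobble pairs alongside Watson-Crick pairs are assumptions that the source does not state.

```python
# Base pairs accepted in the stem: Watson-Crick plus G-U wobble (an assumption).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def is_candidate_circularization_site(stem_5p, loop, stem_3p, min_stem_bp=6):
    """Return True if the three RNA segments form a qualifying stem loop.

    stem_5p and stem_3p are the two strands of the stem, each written 5' to 3';
    loop is the unpaired sequence between them. The stem must be fully paired
    and longer than 5 bp, and the loop must carry a uracil at its 3rd position.
    """
    stem_5p, loop, stem_3p = stem_5p.upper(), loop.upper(), stem_3p.upper()
    if len(stem_5p) != len(stem_3p) or len(stem_5p) < min_stem_bp:
        return False
    if len(loop) < 3 or loop[2] != "U":
        return False
    # The first base of the 5' strand pairs with the last base of the 3' strand.
    return all(pair in ALLOWED_PAIRS for pair in zip(stem_5p, reversed(stem_3p)))


if __name__ == "__main__":
    # Hypothetical 6 bp stem with a 5-nt loop carrying U at position 3.
    print(is_candidate_circularization_site("GGCACG", "CUUAA", "CGUGCC"))
```

A real design would normally rely on a secondary-structure prediction tool rather than exact pairing, since stems with bulges or mismatches may still fold as intended.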
DETAILED DESCRIPTION
The invention provides a new method for circularizing genes of interest to produce circular RNAs. The invention is based at least in part on the discovery that portions of a ribozyme can be relocated to a gene of interest to allow the ribozyme to interact with the gene of interest, and facilitate splicing in trans. This results in the circularization of the gene of interest, while the ribozyme itself can be reused. The invention enables the circularisation of genes of interest which contain modified nucleotides.
As will be described in more detail below, the invention relies on the use of a “bridge sequence” and corresponding truncated ribozyme. For example, a truncated ribozyme may be designed to lack a 3’ portion of its sequence. A bridge sequence can then be designed which corresponds to this missing portion of the ribozyme. When the bridge sequence is added to the gene of interest, the bridge sequence allows the truncated ribozyme to interact with the gene of interest and restore its tertiary structures, permitting splicing and circularisation of the gene of interest.
The inventors term this approach trans excision ribozyme-based circularization, or TERIC.
With this new system, circularization is achieved by providing the ribozyme and gene of interest as separate constructs. This has the advantage that genes of interest comprising modified nucleotides can be circularized, something which is difficult to achieve with existing methods in the art, such as the permuted intron-exon (PIE) method. A further advantage is that the ribozyme can be reused, which again is not possible with methods such as PIE.
Abbreviations
ACA - anticodon arm
CDS - coding sequence
EGS - extended guide sequence
eIGS - extended internal guide sequence
eACA - extended anticodon arm
GOI - gene of interest
IGS - internal guide sequence
IVT - in vitro transcription
pG - precursor of gene of interest
PIE - permuted intron-exon
RNA - ribonucleic acid
TERIC - trans excision ribozyme-based circularization
TRIC - trans ribozyme based circularization
TER - trans excision ribozyme
tRNA - transfer RNA
UTR - untranslated region
Below are provided certain definitions of terms, technical means, and embodiments used herein.
Generally, described herein are methods for producing circular genes of interest. The methods comprise providing a nucleic acid molecule comprising a bridge sequence and the gene of interest, and a modified ribozyme. The bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme. The modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme. The methods described herein further comprise combining the nucleic acid molecule and modified ribozyme under conditions suitable for circularization to occur, in order to produce a circular gene of interest.
Ribozymes, such as group I introns, are capable of folding to form tertiary structures consisting of paired segments, termed P1-P10 (as shown in Figure 2A). The methods of the invention exploit these self-interactions by relocating parts of the ribozyme sequence to a gene of interest. The truncated ribozyme, which might lack a 3’ portion of its sequence, or a 3’ portion and a 5’ portion, is then able to interact with the relocated parts of the ribozyme sequence on the gene of interest (the bridge sequence(s)) to restore its tertiary structure.
Accordingly, disclosed herein are modified ribozymes, which generally comprise a truncated ribozyme.
By truncated ribozyme, it is generally meant a ribozyme in which a 3’ portion of the corresponding wild-type ribozyme sequence has been removed. A truncated ribozyme may also refer to a ribozyme in which a 3’ portion and a 5’ portion of the corresponding wild-type ribozyme sequence have been removed.
The ribozyme can be any ribozyme capable of acting as a trans excision ribozyme. In some cases, the ribozyme is derived from or is a group I intron. Examples of suitable ribozymes are the Tetrahymena ribosomal intron, T4 phage thymidylate synthase intron, Anabaena (Ana) pre-tRNA intron, Azoarcus sp. BH72 Ile tRNA intron, and Staphylococcus phage Twort ribonucleotide reductase intron. Sequences of these ribozymes are shown below, with the internal guide sequence (IGS) shown with underlined, shaded letters.
Ana group I intron:
AATAATTGAGCCTTAGAGAAGAAATTCTTTAAGTGGATGCTCTCAAACTCAGGGAAACCTAAATCT AGCTATAGACAAGGCAATCCTGAGCCAAGCCGAAGTAGTAATTAGTAAGTTAACAACAGATAACTT ACAGCTAATCGGAAGGTGCAGAGACTCGACGGGAGCTACCCTAACGTCAAGACGAGGGTAAAGA GAGAGTCCAATTCTCAAAGCCAATAGGCAGTAGCGAAAGCTGCGGGAGAATG (SEQ ID NO: 49)
T4 phage group I intron:
AATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACGGGGAACCTCTCTAGTAG ACAATCCCGTGCTAAATTGTAGGACTTGCCCTTTAATAAATACTTCTATATTTAAAGAGGTATTTAT GAAAAGCGGAATTTATCAGATTAAAAATACTTTAAACAATAAAGTATATGTAGGAAGTGCTAAAGAT TTTGAAAAGAGATGGAAGAGGCATTTTAAAGATTTAGAAAAAGGATGCCATTCTTCTATAAAACTT CAGAGGTCTTTTAACAAACATGGTAATGTGTTTGAATGTTCTATTTTGGAAGAAATTCCATATGAGA AAGATTTGATTATTGAACGAGAAAATTTTTGGATTAAAGAGCTTAATTCTAAAATTAATGGATACAA TATTGCTGATGCAACGTTTGGTGATACATGTTCTACGCATCCATTAAAAGAAGAAATTATTAAGAAA CGTTCTGAAACTGTTAAAGCTAAGATGCTTAAACTTGGACCTGATGGTCGGAAAGCTCTTTACAGT AAACCCGGAAGTAAAAACGGGCGTTGGAATCCAGAAACCCATAAGTTTTGTAAGTGCGGTGTTCG CATACAAACTTCTGCTTATACTTGTAGTAAATGCAGAAATCGTTCAGGTGAAAATAATTCATTCTTT AATCATAAGCATTCAGACATAACTAAATCTAAAATATCAGAAAAGATGAAAGGTAAAAAGCCTAGT AATATTAAAAAGATTTCATGTGATGGGGTTATTTTTGATTGTGCAGCAGATGCAGCTAGACATTTTA AAATTTCGTCTGGATTAGTTACTTATCGTGTAAAATCTGATAAATGGAATTGGTTCTACATAAATGC CTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCGAAACGATAGACAACTTGCTTTAACA AGTTGGAGATATAGTCTGCTCTGCATGGTGACATGCAGCTGGATATAATTCCGGGGTAAGATTAA CGACCTTATCTGAACATAATG (SEQ ID NO: 50)
Tetrahymena ribosomal intron:
AAATAGCAATATTTACCTTTGGAGGGAAAAGTTATCAGGCATGCACCTGGTAGCTAGTCTTTAAAC CAATAGATTGCATCGGTTTAAAAGGCAAGACCGTCAAATTGCGGGAAAGGGGTCAACAGCCGTTC AGTACCAAGTCTCAGGGGAAACTTTGAGATGGCCTTGCAAAGGGTATGGTAATAAGCTGACGGA CATGGTCCTAACCACGCAGCCAAGTCCTAAGTCAACAGATCTTCTGTTGATATGGATGCAGTTCA CAGACTAAATGTCGGTCGGGGAAGATGTATTCTTCTCATAAGATATAGTCGGACCTCTCCTTAATG GGAGCTAGCGGATGAAGTGATGCAACACTGGAGCCGCTGGGAACTAATTTGTATGCGAAAGTAT ATTGATTAGTTTTGGAGTACTCG (SEQ ID NO: 3)
Azoarcus sp. BH72 Ile tRNA intron:
ATTTCGATGTGCCTTGCGCCGGGAAACCACGCAAGGGATGGTGTCAAATTCGGCGAAACCTAAG CGCCCGCCCGGGCGTATGGCAACGCCGAGCCAAGCTTCGGCGCCTGCGCCGATGAAGGTGTAG AGACTAGACGGCACCCACCTAAGGCAAACGCTATGGTGAAGGCATAGTCCAGGGAGTGGCGAAA GTCACACAAACCGG (SEQ ID NO: 4)
Staphylococcus phage Twort ribonucleotide reductase intron:
AAACAATTCTGCCCCCTATATTAGTAATAGTGTAGGTTAAAAAACTTCCTTAATTCATGGGAAATCT CCCTGACTTTTATTATAAATTTTGGTATAGTAAAATGAGAAGGAGATAATCATGAGCAAGATAACCA ATAGCAAAGAAACCCAAAAAGTACCTAACGCTACTAAAAGTATTTACCACCATATAAAAAGTAAAA GAAGGATGGAAGTCATTAAATCACTTAATGAATTGGTAATTATCTTGTGCAACGACTAGAGAAAAG ATAGTTTATTGTTACAGGCAGTAAATGAAGACTGAGTATCGTACACACAAGTGAGTGGAAACAGG AAGTATCCTAGAGTAACGACTAGGATAATGATATAGTCTGAACATTGTAGGTGACTACAAGAAGGT AAGGAGTAACGAACCTTATCGTAACATAATTG (SEQ ID NO: 51)
To enable the ribozyme to act in trans (that is, on a separate RNA molecule), a 3’ portion of the ribozyme is generally removed, and a corresponding sequence introduced to the 5’ end of a gene of interest. This provides a truncated ribozyme. Truncated ribozymes generally do not comprise a 3’ portion of a corresponding wild-type ribozyme. For example, in the case of the Ana group I intron above, a 3’ portion corresponding to nucleotides 227-249 may be removed. The 3’ portion can also be shorter than this, for example corresponding to nucleotides 242-249.
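The relocation described above can be pictured as a simple string operation on the ribozyme sequence. The Python sketch below is a minimal illustration, not the inventors' procedure: the function names are hypothetical, positions are 1-based as in the text, and the 249-nucleotide input is a placeholder string rather than the Ana intron sequence.

```python
from typing import Tuple


def split_ribozyme(ribozyme: str, bridge_start: int) -> Tuple[str, str]:
    """Split a ribozyme into (truncated ribozyme, 3' bridge sequence).

    bridge_start is the 1-based position of the first nucleotide of the
    3' portion that is relocated to the gene of interest.
    """
    if not 1 < bridge_start <= len(ribozyme):
        raise ValueError("bridge_start must lie inside the ribozyme, after position 1")
    truncated = ribozyme[: bridge_start - 1]
    bridge = ribozyme[bridge_start - 1:]
    return truncated, bridge


def build_precursor(bridge: str, gene_of_interest: str) -> str:
    """Join the 3' bridge sequence directly to the 5' end of the gene of interest."""
    return bridge + gene_of_interest


if __name__ == "__main__":
    # Placeholder 249-nt sequence; substitute a real ribozyme sequence in practice.
    placeholder = ("ACGU" * 63)[:249]
    for start in (227, 242):  # split points quoted for the Ana intron in the text
        truncated, bridge = split_ribozyme(placeholder, start)
        print(start, len(truncated), len(bridge))
```

With a 249-nucleotide input, a split at position 227 leaves a 226-nucleotide truncated ribozyme and a 23-nucleotide bridge, and a split at position 242 leaves 241 and 8 nucleotides respectively. The sketch joins the bridge directly to the gene of interest; other sequence elements placed between or around them are outside its scope.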
By “a 3’ portion” it is generally meant a 3’ end portion, such that the remaining ribozyme is essentially truncated at the 3’ end. The 3’ portion which is removed is generally between 1 and 1500 nucleotides in length. The truncated ribozyme may not comprise a 3’ portion of a corresponding wild-type ribozyme which is between 1 and 1250, 1 and 1000, 1 and 750, 1 and 500, 1 and 400, 1 and 300, 1 and 250, 1 and 200, 1 and 150, 1 and 100, 1 and 50, 5 and 1500, 5 and 1250, 5 and 1000, 5 and 750, 5 and 500, 5 and 400, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50, 10 and 1500, 10 and 1250, 10 and 1000, 10 and 750, 10 and 500, 10 and 400, 10 and 300, 10 and 250, 10 and 200, 10 and 150, 10 and 100, 10 and 50, or 10 and 30 nucleotides in length. In some cases, the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme which is between 1 and 30 nucleotides in length. For example, the 3’ portion which is missing or removed may be 23 nucleotides in length. The 3’ portion which is removed may be 8 nucleotides in length. In the case of an Ana group I intron, the truncated ribozyme may not comprise a 3’ portion which is between 1 and 249 nucleotides in length. The 3’ portion may variously be referred to as a portion which is “removed” from the ribozyme. However, it will be understood that the method does not necessarily include a step of removing this portion (or a 5’ portion) from the ribozyme, since the ribozyme can be reused and therefore provided in an already truncated form.
The 3’ portion which is missing from the truncated ribozyme may generally be one that is involved in the formation of a paired segment in a wild-type ribozyme. For example, the 3’ portion may be one that forms, or is involved in the formation of, a P9 region in a wild-type ribozyme.
When the ribozyme is an Ana group I intron ribozyme, the ribozyme may comprise nucleotides 1-226 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. The ribozyme may comprise the nucleotide sequence of SEQ ID NO: 12 or a variant thereof. The ribozyme may comprise nucleotides 1-241 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. The ribozyme may comprise the nucleotide sequence of SEQ ID NO: 13 or a variant thereof.
The truncated ribozymes described herein may also not comprise a 5’ portion of a corresponding wildtype ribozyme. By “a 5’ portion” it is generally meant a 5’ end portion, such that the remaining ribozyme is essentially truncated at the 5’ end. The 5’ portion which is removed is generally between 1 and 1500 nucleotides in length. The truncated ribozyme may not comprise a 5’ portion of a corresponding wild-type ribozyme which is between 1 and 1250, 1 and 1000, 1 and 750, 1 and 500, 1 and 400, 1 and 300, 1 and 250, 1 and 200, 1 and 150, 1 and 100, 1 and 50, 5 and 1500, 5 and 1250, 5 and 1000, 5 and 750, 5 and 500, 5 and 400, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50, 10 and 1500, 10 and 1250, 10 and 1000, 10 and 750, 10 and 500, 10 and 400, 10 and 300, 10 and 250, 10 and 200, 10 and 150, 10 and 100, 10 and 50, or 10 and 30 nucleotides in length. In some cases, the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme which is between 1 and 30 nucleotides in length. As for the 3’ portion, the 5’ portion may variously be referred to as a portion which is “removed” from the ribozyme. However, it will be understood that the method does not necessarily include a step of removing this portion from the ribozyme, since the ribozyme can be reused and therefore provided in an already truncated form.
The 5’ portion that is missing from the truncated ribozyme may generally be one that forms an internal guide sequence (IGS). In a wild-type ribozyme, the IGS forms P1 and P10 regions by complementary base pairing. This is shown in Figure 2A. By relocating such a 5’ portion to the gene of interest (in the form of a bridge sequence), the truncated ribozyme is able to interact with the gene of interest.
It will be appreciated that the ribozyme used in the approaches described herein should generally match the ribozyme from which the bridge sequence on the gene of interest is derived. For example, where an Ana group I intron is used, the bridge sequence added to the gene of interest should also be derived from the Ana group I intron, although this is not absolutely necessary as long as the bridge sequence enables interaction with the truncated ribozyme. In a similar manner, it is not essential for the nucleotide sequence of the bridge sequence to match the nucleotide sequence of the portion of the ribozyme which is missing from the truncated ribozyme. Again, as long as the bridge sequence enables interaction with the truncated ribozyme, sequence differences between the two are tolerated well. For example, the sequence corresponding to a 3’ portion of a ribozyme of the first bridge sequence may be different to the sequence of the 3’ portion of a corresponding wild-type ribozyme that is missing or absent from the truncated ribozyme. The sequence of the first bridge sequence may differ from the missing 3’ portion of the corresponding wild-type ribozyme by insertion, addition, substitution or deletion of 1 nucleotide, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more than 10 nucleotides. For example, the sequence of the first bridge sequence may comprise a nucleotide sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% sequence identity to the missing 3’ portion.
Sequence comparison may be made over the full-length of the relevant sequence described herein using a standard algorithm, such as GAP, BLAST (which uses the method of Altschul et al. (1990) J. Mol. Biol. 215: 405-410), FASTA (which uses the method of Pearson and Lipman (1988) PNAS USA 85: 2444-2448), or the Smith-Waterman algorithm (Smith and Waterman (1981) J. Mol. Biol. 147: 195-197), or the TBLASTN program of Altschul et al. (1990) supra, generally employing default parameters.
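As a rough illustration of a full-length identity calculation between a bridge sequence and the missing 3’ portion, the Python sketch below performs a small global (Needleman-Wunsch style) alignment and reports matches as a percentage of alignment columns. It is not GAP, BLAST, FASTA or Smith-Waterman, and the scoring values (match +1, mismatch -1, linear gap -1) and the choice of alignment length as denominator are assumptions; the named programs use their own scoring schemes and defaults and may report different values.

```python
def percent_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity of two sequences over a global alignment.

    Identity is the number of matching columns divided by the total number
    of alignment columns (matches, mismatches and gaps), times 100.
    """
    n, m = len(a), len(b)
    # Dynamic-programming score matrix with linear gap penalties.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting matches and columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * matches / columns if columns else 0.0


if __name__ == "__main__":
    # Hypothetical 10-nt sequences differing by one substitution.
    print(percent_identity("ACGUACGUAC", "ACGUUCGUAC"))
```

For sequences of realistic length, an established alignment tool is the better choice; the sketch is only meant to make the percentage thresholds in the text concrete.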
In some cases, the length of the bridge sequence corresponds to the length of the 3’ portion (or 5’ portion) which is removed from the ribozyme, although this is not a requirement. Differences in length between the missing 3’ portion (and 5’ portion) and the bridge sequence are tolerated well. For example, the missing 3’ portion (and 5’ portion) may be 1, 2, 3, 4, 5, or 10 or more nucleotides longer or shorter than the bridge sequence.
As described herein, a bridge sequence is generally a sequence which is added to a gene of interest, in order to facilitate circularisation.
A bridge sequence may comprise a sequence corresponding to a 3’ portion of a ribozyme, such as a 3’ end portion of a ribozyme. This may be referred to as a 3’ bridge sequence, or a first bridge sequence. A bridge sequence may also comprise a sequence corresponding to a 5’ portion of a ribozyme, such as a 5’ end portion of a ribozyme. This may be referred to as a 5’ bridge sequence, or a second bridge sequence.
In the case of a first bridge sequence (corresponding to a 3’ portion of a ribozyme), the bridge sequence is joined (either directly or indirectly) to the 5’ end of a gene of interest.
The length of the first bridge sequence may vary depending on the ribozyme from which it is derived. The first bridge sequence is generally at least two nucleotides in length. The first bridge sequence is generally shorter than the full length of any given intron. The first bridge sequence may be between 2 and 1500 nucleotides in length. The first bridge sequence may be between 2 and 1000, 2 and 750, 2 and 500, 2 and 450, 2 and 400, 2 and 350, 2 and 300, 2 and 250, 2 and 200, 2 and 150, 2 and 100, 2 and 50, or 2 and 30 nucleotides in length. The first bridge sequence may be between 5 and 1500, 5 and 1000, 5 and 750, 5 and 500, 5 and 450, 5 and 400, 5 and 350, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50 nucleotides in length. In many cases, the first bridge sequence may be between 5 and 30 nucleotides in length.
For example, when an Ana group I intron is used, the first bridge sequence may be between 2 and 248 nucleotides in length. The first bridge sequence may be between 5 and 30 nucleotides in length, for example when an Ana group I intron is used. The first bridge sequence may be 8 nucleotides in length. The first bridge sequence may be 23 nucleotides in length.
The function of the bridge sequence is to facilitate interaction with a modified ribozyme as described herein. The first bridge sequence generally comprises a portion which is capable of complementary base pairing with a 3’ portion of a ribozyme, such as a portion between 1 and 10 nucleotides in length. The first bridge sequence may be capable of forming a paired segment with a modified ribozyme as described herein, or with a 3’ portion of a ribozyme. The first bridge sequence may be capable of forming a P9 region with a 3’ portion of a ribozyme. The first bridge sequence may comprise a portion which is capable of complementary base pairing with nucleotides 207-212 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. As will be appreciated, the precise nucleotide sequence of the first bridge sequence is not critical to the invention, as long as the nucleotide sequence of the first bridge sequence enables interaction with the modified ribozyme.
As discussed above, the sequence corresponding to the 3’ portion of a ribozyme of the first bridge sequence may be identical or different to the sequence of the 3’ portion of a corresponding wild-type ribozyme that is missing or absent from the truncated ribozyme.
The first bridge sequence may comprise a sequence corresponding to nucleotide residues 242-249 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. The first bridge sequence may comprise the nucleotide sequence set forth in SEQ ID NO: 14 or a variant thereof. The first bridge sequence may comprise a sequence corresponding to nucleotide residues 227-249 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. The first bridge sequence may comprise the nucleotide sequence set forth in SEQ ID NO: 15 or a variant thereof.
In the case of a second bridge sequence (corresponding to a 5’ portion of a ribozyme), the bridge sequence is joined (either directly or indirectly) to the 3’ end of a gene of interest.
The length of the second bridge sequence may vary depending on the ribozyme from which it is derived. The second bridge sequence is generally at least two nucleotides in length. The second bridge sequence is generally less than 1500 nucleotides in length. The second bridge sequence may be between 2 and 1000, 2 and 750, 2 and 500, 2 and 450, 2 and 400, 2 and 350, 2 and 300, 2 and 250, 2 and 200, 2 and 150, 2 and 100, 2 and 50, or 2 and 30 nucleotides in length. The second bridge sequence may be between 5 and 1500, 5 and 1000, 5 and 750, 5 and 500, 5 and 450, 5 and 400, 5 and 350, 5 and 300, 5 and 250, 5 and 200, 5 and 150, 5 and 100, 5 and 50 nucleotides in length. In many cases, the second bridge sequence may be between 5 and 30 nucleotides in length.
The second bridge sequence may comprise a portion that is capable of complementary base pairing with a 5’ portion of a modified ribozyme.
Although generally herein, bridge sequences are described as “corresponding to” equivalent portions of ribozyme sequences, they do not need to be identical to those sequences. Any bridge sequence may be used that allows interaction with the modified ribozyme such that circularisation can occur. For example, a bridge sequence that “corresponds to” a portion of a ribozyme sequence may have at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% sequence identity with a portion of a ribozyme sequence as defined herein.
The following sets out an example when an Ana group I intron is used as the modified ribozyme. The wild-type Ana ribozyme is modified to remove a 3’ portion that is 23 nucleotides in length, and corresponds to residues 227-249 of SEQ ID NO: 1 or SEQ ID NO: 49 (wild-type Ana) or a variant thereof. This modified ribozyme then comprises residues 1-226 of SEQ ID NO: 1 or SEQ ID NO: 49 or a variant thereof. A corresponding first bridge sequence, which comprises a sequence corresponding to residues 227-249 of SEQ ID NO: 1, is then added to a 5’ end of a gene of interest. When the modified ribozyme and the gene of interest are combined under circularizing conditions, the first bridge sequence can interact with (i.e. complementary base pair with) the remaining portion of the ribozyme, for example forming the P9 region shown in Figure 2A. At the same time, the IGS of the modified ribozyme can form P1 and P10 regions with the gene of interest. These interactions enable the splicing and circularization of the gene of interest by the modified ribozyme.
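The residue arithmetic of this example can be sketched in Python as follows. The sketch is illustrative only: the 249-nucleotide sequence is a placeholder and is not SEQ ID NO: 1 or SEQ ID NO: 49, the gene of interest is hypothetical, and the function name is an assumption. It simply shows that removing residues 227-249 leaves a truncated ribozyme of residues 1-226 and a 23-nucleotide first bridge sequence, which is then joined to the 5’ end of the gene of interest.

```python
from typing import Tuple


def split_ribozyme(ribozyme: str, last_retained_residue: int) -> Tuple[str, str]:
    """Split a ribozyme into (truncated ribozyme, removed 3' portion).

    Residues are numbered from 1, as in the text, so residues
    1..last_retained_residue are kept and the remainder is removed.
    """
    if not 0 < last_retained_residue < len(ribozyme):
        raise ValueError("cut position must fall inside the sequence")
    return ribozyme[:last_retained_residue], ribozyme[last_retained_residue:]


# Placeholder standing in for a 249-nt wild-type ribozyme (NOT SEQ ID NO: 1).
placeholder_ribozyme = "ACGU" * 62 + "A"

# Keep residues 1-226; residues 227-249 become the first bridge sequence.
truncated, first_bridge = split_ribozyme(placeholder_ribozyme, 226)

# Hypothetical gene of interest; the first bridge is joined to its 5' end.
gene_of_interest = "AUGGCCUAA"
substrate = first_bridge + gene_of_interest

print(len(truncated), len(first_bridge))
```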
Genes of interest
Provided herein are nucleic acid molecules containing a gene of interest. The gene of interest (GOI) refers to the sequence which is to be circularized. The GOI can comprise a coding sequence, coding for a peptide or protein, or can be a noncoding sequence. The GOI can also comprise a combination of coding and noncoding sequence.
It will also be understood that, as used herein, the term “gene of interest” encompasses sequences which include additional sequence elements, such as internal ribosome entry site (IRES) sequences, multiple siRNA target sites (msiTS), spacer sequences such as polyAC sequences, start codons, stop codons, and any other sequence elements known to be useful in the art for producing circular RNA.
Suitable IRESs for use in the invention include CVB3, CroV and CSFV, the DNA sequences of which are set out below.
CVB3-IRES:
TTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCATTGGGCGCTAGCACTCTGGTATCACG GTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAACACACACCGATC
AACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATC AATAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCT AGTAACACCGTGGAAGTTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGA GTCACCGCATTCCCCACGGGCGACCGTGGCGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAA CCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTGAGCTAGTTGGTAGTCCTCCG GCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGTGTGTCG TAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTG GCTGCTTATGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTA ATAGAGCTATTATATATCCCTTTGTTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACA ATTCATTGTTAAGTTGAATACAGCAAA (SEQ ID NO: 16)
CroV-IRES:
GTATAAGAGACAGGTGTTTGCCTTGTCTTCGGACTGGCATCTTGGGACCAACCCCCCTTTTCCCC AGCCATGGGTTAAATGGCAATAAAGGACGTAACAACTTTGTAACCATTAAGCTTTGTAATTTTGTA ACCACTAAGCTTTGTGCACATAATGTAACCATCAAGCTTGTTAGTCCCAGCAGGAGGTTTGCATG CTTGTAGCCGAAATGGGGCTCGACCCCCCATAGTAGGATACTTGATTTTGCATTCCATTGTGGAC CTGCAAACTCTACACATAGAGGCTTTGTCTTGCATCTAAACACCTGAGTACAGTGTGTACCTAGAC CCTATAGTACGGGAGGACCGTTTGTTTCCTCAATAACCCTACATAATAGGCTAGGTGGGCATGCC CAATTTGCAAGATCCCAGACTGGGGGTCGGTCTGGGCAGGGTTAGATCCCTGTTAGCTACTGCC TGATAGGGTGGTGCTCAACCATGTGTAGTTTAAATTGAGCTGTTCATATACC (SEQ ID NO: 17)
CSFV-IRES:
GTATACGAGGTTAGTTCATTCTCGTATACACGATTGGACAAATCAAAATTATAATTTGGTTCAGGG CCTCCCTCCAGCGACGGCCGAACTGGGCTAGCCATGCCCATAGTAGGACTAGCAAACGGAGGG ACTAGCCGTAGTGGCGAGCTCCCTGGGTGGTCTAAGTCCTGAGTACAGGACAGTCGTCAGTAGT TCGACGTGAGCAGAAGCCCACCTCGAGATGCTACGTGGACGAGGGCATGCCCAAGACACACCTT AACCCTAGCGGGGGTCGCTAGGGTGAAATCACACCACGTGATGGGAGTACGACCTGATAGGGC GCTGCAGAGGCCCACTATTAGGCTAGTATAAAAATCTCTGCTGTACATGGCAC (SEQ ID NO: 18)
The TERIC method is suitable for genes of interest of any length. TERIC is particularly suitable for long genes of interest. In this context, “long” is generally considered to mean a sequence of at least 500 nucleotides. Thus, the gene of interest may be at least 100, at least 250, at least 500, at least 1000, at least 2000, at least 3000, at least 4000, at least 5000, at least 6000, at least 7000, or at least 8000 nucleotides in length.
Extended anticodon arm sequence
The gene of interest may also comprise an extended anticodon arm (eACA) sequence. The eACA may be part of the gene of interest sequence (for example, it may be naturally occurring within the gene of interest), but for clarity is referred to separately herein.
An extended anticodon arm (eACA) sequence is one which is capable of forming a stem-loop structure (also known as a hairpin or hairpin loop). Stem-loop structures form when two regions of single-stranded RNA which are generally complementary to each other (when read in opposite directions) base-pair with each other. The base-pairing results in a double helix structure ending in an unpaired loop. The natural propensity of eACA sequences to form stem-loop structures can be utilised to enable and enhance circularisation of a gene of interest.
Prior to circularization, a nucleic acid molecule to be circularized comprises an eACA sequence in two separate portions. A first portion of the eACA sequence is positioned at or near the 5’ end of the gene of interest, and a second portion of the eACA sequence is positioned at or near the 3’ end of the gene of interest, as shown in Figure 2B and Figure 7. During circularization, splicing by the ribozyme causes the first and second portions of the eACA sequence to be covalently joined, in order to create a circular version of the gene of interest. In the resulting circular molecule, the first and second portions are joined to form the eACA sequence, which is generally capable of forming a stem-loop structure as shown in Figure 2B and Figure 7.
The first portion of the eACA sequence may comprise a first eACA stem portion and a first eACA loop portion. The second portion of the eACA sequence may comprise a second eACA stem portion and a second eACA loop portion. By “stem portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the stem of a stem-loop structure. By “loop portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the loop of a stem-loop structure. A stem-loop forming structure can be identified in a gene of interest, and that gene of interest can be rearranged to provide a first part of the stem-loop forming structure at one end, and the second part at the other end. This gene of interest can subsequently be used for circularization. Thus, the stem and loop portions of the eACA sequence are capable of forming a stem-loop structure in the nucleic acid molecules described herein.
The specific nucleotide sequence of the eACA sequence is not important for TERIC and does not determine whether circularization will occur. Rather, it is the structure as opposed to the sequence of the eACA that is important. Consequently, the eACA can be of any nucleotide sequence. In some preferred embodiments, the last nucleotide in the second eACA loop portion is one which can form a wobble base pair with a corresponding nucleotide in the internal guide sequence described herein. Generally, the last nucleotide in the second eACA loop portion is a uracil, and forms a wobble base pair with a corresponding guanine in the internal guide sequence. However, the last nucleotide in the second eACA loop portion may also be cytosine and form a wobble base pair with adenosine. In other embodiments, the last nucleotide in the second eACA loop portion is one which can form a canonical base pair with a corresponding nucleotide in the internal guide sequence or one which does not form a wobble or canonical base pair with a corresponding nucleotide in the internal guide sequence.
The first and second eACA stem portions may be complementary to each other, though this is not strictly necessary, and some non-complementarity may be tolerated.
The first and second eACA stem portions are generally each at least 5 nucleotides in length but can be as short as 1 nucleotide in length each. The stem portion lengths can be adapted depending on the gene of interest to be circularized. For example, longer stem lengths (such as lengths greater than 15 nt) may be advantageous if circularizing long (>500 nt) genes of interest.
Thus, the first and second stem portions may each be at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, at least 25 or at least 30 nucleotides in length. In particular, the first and second stem portions may each be at least 15 or at least 25 nucleotides in length.
The first and second stem portions need not be the same length, for example one stem portion may be one or two nucleotides shorter than the other, provided a stem-loop structure can still be formed.
The anticodon arm loop of the Ana tRNA group I intron is naturally 7 nucleotides in length. Consequently, in the circular RNAs described herein, the loop of the stem-loop structure may be 7 nucleotides in length, particularly when the ribozyme used is, or is derived from, the Ana group I intron. If other group I introns are used, the loop of the stem-loop structure may have a different nucleotide length. The loop of the stem-loop structure is generally between 3 and 10 nucleotides in length. In many cases, the loop of the stem-loop structure is at least 5 nucleotides in length, particularly if the ribozyme is or is derived from the Ana group I intron.
Generally, if the loop of the stem-loop structure comprises 7 nucleotides, then the first eACA loop portion comprises 4 nucleotides, and the second eACA loop portion comprises 3 nucleotides.
Accordingly, the first and second eACA stem portions may each be at least 15 nucleotides in length, the first eACA loop portion may be 4 nucleotides in length, and the second eACA loop portion may be 3 nucleotides in length. In the resulting circular RNA molecule, this produces an eACA sequence which is capable of forming a stem-loop structure, wherein the stem comprises at least 15 base pairs, and the loop is 7 nucleotides in length.
In total, the first portion of the eACA sequence may comprise as few as 5 nucleotides (for example, 1 stem nucleotide and 4 loop nucleotides). The second portion of the eACA sequence may comprise as few as 4 nucleotides (for example, 1 stem nucleotide and 3 loop nucleotides). Alternatively, the first portion of the eACA sequence may comprise, for example, 19 nucleotides (e.g. 15 stem nucleotides and 4 loop nucleotides), or 29 nucleotides (e.g. 25 stem nucleotides and 4 loop nucleotides). The second portion of the eACA sequence may comprise, for example, 18 nucleotides (e.g. 15 stem nucleotides and 3 loop nucleotides), or 28 nucleotides (e.g. 25 stem nucleotides and 3 loop nucleotides).
An exemplary first portion of the eACA sequence comprising a 1 nucleotide stem and a 4 nucleotide loop may comprise the nucleotide sequence 5’-NNNNN-3’, wherein N is any nucleotide, and an exemplary second portion of the eACA sequence comprising a 1 nucleotide stem and a 3 nucleotide loop may comprise the nucleotide sequence 5’-NNNU-3’, wherein N is any nucleotide. As an example, in the case of Nano Luciferase (NLuc), a possible ACA sequence is GATCACCACTTTAAGGTGATC (SEQ ID NO: 19). For circularization, the NLuc GOI is rearranged such that the ACA sequence is provided in two portions (a 5’ first portion and a 3’ second portion). The first portion of the eACA sequence comprises the sequence TTAAGGTGATC (SEQ ID NO: 20). The second portion of the eACA sequence comprises the sequence GATCACCACT (SEQ ID NO: 21). It will be realised that the provision of a thymine in a DNA precursor molecule will result in a uracil in the transcribed RNA molecule for circularization.
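The relationship between SEQ ID NOs: 19, 20 and 21 can be checked with a short Python sketch. This is an illustration only: the variable and function names are assumptions, and the division of each portion into stem and loop (a 7-nucleotide loop split 3 + 4, flanked by 7-nucleotide stems) is inferred here from the sequences rather than stated in the text. The sketch joins the 3’ end of the second portion to the 5’ end of the first portion, as happens on circularization, and confirms that the two stem portions are reverse complements of each other.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


aca = "GATCACCACTTTAAGGTGATC"   # SEQ ID NO: 19
first_portion = "TTAAGGTGATC"    # SEQ ID NO: 20, placed at the 5' end of the GOI
second_portion = "GATCACCACT"    # SEQ ID NO: 21, placed at the 3' end of the GOI

# Circularization joins the 3' end of the second portion to the 5' end of the first.
rejoined = second_portion + first_portion

# Assumed stem/loop split, inferred from the sequences for illustration.
stem_from_second, loop_from_second = second_portion[:-3], second_portion[-3:]
loop_from_first, stem_from_first = first_portion[:4], first_portion[4:]

print(rejoined == aca)                                        # True
print(stem_from_first == reverse_complement(stem_from_second))  # True
print(loop_from_second + loop_from_first)                     # ACTTTAA
```

The last nucleotide of the second loop portion is a T in this DNA representation, which corresponds to the U at the circularization site in the RNA.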
The first and second eACA loop portions base pair with the internal guide sequence (IGS) to form the P1 and P10 regions, which are critical for ribozyme activity. The first eACA loop portion, positioned towards the 5’ end of the gene of interest, base pairs with the IGS to form the P10 region. It is not necessary for all the nucleotides in the first eACA loop portion to form the P10 region, and in some cases only two nucleotides of the first eACA loop portion form the P10 region. The second eACA loop portion, positioned towards the 3’ end of the gene of interest, base pairs with the IGS to form the P1 region.
As described above, in order for circularization to occur, the last nucleotide of the second eACA loop portion (i.e. the nucleotide at the 3’ end of the second eACA loop portion) may form a wobble base pair with a corresponding nucleotide in the IGS. Generally, the wobble base pair is a GU wobble base pair, with G in the IGS and U in the second eACA loop portion. The wobble base pair provides the circularization site, such that once circularized, the nucleotide at the 3’ end of the second eACA loop portion forms the third nucleotide in the loop of the eACA stem-loop structure. This is depicted in Figure 8. In other embodiments, the last nucleotide of the second eACA loop portion may form a canonical base pair with a corresponding nucleotide in the internal guide sequence. In other embodiments, the last nucleotide of the second eACA loop portion may not form a wobble or canonical base pair with a corresponding nucleotide in the internal guide sequence.
The P1 region may also be formed by base pairing of the IGS with a region adjacent to the second eACA loop portion in the 3’ direction, known as the “P1 extension”. If present, the P1 extension typically comprises between 2 and 4 nucleotides, which base pair with the IGS. The P1 region may therefore be formed by the P1 extension and second eACA loop portion base pairing with the IGS, as shown in Figure 2B and Figure 7. Consequently, in some embodiments, the second portion of the eACA sequence and the P1 extension together are capable of forming a P1 region. If the P1 extension is not present, the P1 region is formed by only the second eACA loop portion base pairing with the IGS. P1 extensions have been described in Olson & Muller (2012) RNA 18:581-589, the contents of which are incorporated herein by reference. Generally, if an extended guide sequence (EGS) is used, the P1 extension region will be present.
A particular advantage of the TERIC method is that it can utilise eACA sequences which are already present in the gene of interest. For example, if a stem-loop forming eACA sequence can be found in a gene of interest, this gene can be circularized efficiently without introducing any additional sequences. This in turn means that the resulting circular RNA is far less likely to be immunogenic. An eACA sequence (i.e. a stem-loop forming structure) can first be identified in a gene of interest.
Subsequently, the gene of interest is rearranged such that the eACA sequence is split into two portions, one at each end of the gene of interest. This rearranged gene is then cloned into a TERIC construct for circularisation. An example of this is the protein coding circular RNA T2A Nano Luciferase. This circular RNA already comprises an eACA sequence in its natural sequence. This means a circularisation site can be introduced using the naturally occurring eACA sequence, without the need to perform mutations or introduce additional sequence.
If the natural coding sequence (CDS) does not contain an eACA sequence, it is possible to perform selective mutations to create one. Codon redundancy means that mutations can be made to the nucleotide sequence of the GOI without affecting the resulting peptide sequence. Consequently, an eACA sequence can be provided in the GOI without requiring the introduction of additional sequences. Instead, only selective mutation of the existing sequence is needed, following the rules of codon redundancy.
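The principle of codon redundancy described above can be illustrated with a minimal Python sketch that checks whether a mutated coding sequence still encodes the same peptide. The sketch uses the standard genetic code; the function names and the 9-nucleotide example sequences are hypothetical and are not taken from any gene of interest described herein. It does not design an eACA sequence; it only verifies that a candidate set of mutations is silent.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_synonymous(original: str, mutated: str) -> bool:
    """True if both sequences have the same length and encode the same peptide."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


# Hypothetical fragment encoding Leu-Lys-Ala, and two candidate mutants.
original = "CTGAAAGCC"
silent_mutant = "CTCAAGGCT"      # three third-position changes, peptide unchanged
missense_mutant = "CTGAATGCC"    # AAA -> AAT changes Lys to Asn

print(is_synonymous(original, silent_mutant))    # True
print(is_synonymous(original, missense_mutant))  # False
```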
If there is no eACA in a non-coding RNA, or if the circularization site is placed in an untranslated region (UTR) of a protein-coding RNA, the circularization site can be created by introducing additional nucleotides. For example, as shown in Figure 8, 5 nucleotides (light grey nt) could be introduced to create a stem portion of the eACA sequence, using the existing sequence (black nt) of the GOI to provide the remainder of the eACA.
For example, where an eACA is located in a coding sequence to be circularized, the GOI may comprise, in the 5’ to 3’ direction: a stop codon, a polyAC sequence, multiple siRNA target sites (msiTS), an IRES, a start codon, and the coding sequence including the eACA. Where an eACA is placed in an untranslated region of a sequence to be circularized, the GOI may comprise, in the 5’ to 3’ direction: multiple siRNA target sites (msiTS), an IRES, a start codon, a coding sequence, a stop codon, a polyAC sequence, and the eACA.
Accordingly, in the nucleic acid molecules described herein, the first and/or second portions of the eACA sequence may naturally occur in the gene of interest. In other words, they may be part of the gene of interest and so are present without having to mutate the existing sequence or introduce additional sequence.
Alternatively, in the nucleic acid molecules described herein, all or part of the eACA sequence may be derived from human ribosomal RNA (rRNA). The use of human rRNA has the potential to provide circular RNAs which are less immunogenic.
The location of the first and second portions of the eACA sequence in the gene of interest has little impact on circularization. One portion of the eACA sequence may be identified in or placed in a coding sequence, whilst the other may be identified in or placed in an untranslated region. It is not necessary for both portions to be in the coding sequence, for example, or for both portions to be in the untranslated region.
Thus, described herein are nucleic acid molecules containing a gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, and b) the gene of interest.
In some cases, the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, and d) a second portion of the eACA sequence.
Also described herein are nucleic acid molecules containing a gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, and c) a second bridge sequence, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
In some cases, the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a first bridge sequence, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence, and
e) a second bridge sequence, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
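The 5’ to 3’ ordering of components set out above can be represented with a small Python data structure. This is a sketch under stated assumptions: the class name, field names and all sequences are hypothetical placeholders chosen for illustration, and optional components default to empty strings so that the same structure covers each of the arrangements listed.

```python
from dataclasses import dataclass


@dataclass
class CircularisationSubstrate:
    """Components of a nucleic acid molecule, held separately for assembly."""
    first_bridge: str                 # corresponds to a 3' portion of a ribozyme
    gene_of_interest: str
    first_eaca_portion: str = ""      # optional
    second_eaca_portion: str = ""     # optional
    second_bridge: str = ""           # optional, corresponds to a 5' portion

    def sequence(self) -> str:
        """Concatenate the components in the 5' to 3' order given in the text."""
        return "".join([
            self.first_bridge,
            self.first_eaca_portion,
            self.gene_of_interest,
            self.second_eaca_portion,
            self.second_bridge,
        ])


# Hypothetical placeholder sequences, for illustration only.
substrate = CircularisationSubstrate(
    first_bridge="GGAUCCGA",
    first_eaca_portion="UUAAG",
    gene_of_interest="AUGGCCUAA",
    second_eaca_portion="CACU",
    second_bridge="CCGGA",
)
print(substrate.sequence())
```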
Further components that may also be included in the nucleic acid molecules, such as extended guide sequences and loop sequences, are described below.
A nucleic acid molecule, bridge sequence, gene of interest, ribozyme, ribozyme portion or truncated ribozyme, extended guide sequence, loop sequence, homology arm, extended anticodon arm (eACA) sequence or other sequence described herein may comprise or consist of the nucleotide sequence of a reference sequence set out herein or may be a variant of a reference sequence set out herein. For example, a variant sequence may have at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% sequence identity to the reference sequence. In some embodiments, a variant sequence may differ from the reference sequence by insertion, addition, substitution or deletion of 1 nucleotide, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more than 10 nucleotides.
Modifications
An important advantage of the TERIC approach is that it enables modified RNAs to be circularized. Thus, the gene of interest may comprise at least one modified nucleotide.
The gene of interest may comprise any amount of modified nucleotides considered useful for the desired application of the circular RNA (e.g. therapeutic or research applications). Accordingly, the % of nucleotides in the gene of interest which are modified may vary from 0% to 100%. In some cases, the gene of interest may be fully modified, in that 100% of the nucleotides are modified. The gene of interest may comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% modified nucleotides.
The modification may be a base modification. Alternatively, the modification may be a ribose modification. In some embodiments, the modification may be selected from the group consisting of: m5C (5-methylcytidine); m5U (5-methyluridine); m6A (N6-methyladenosine); s2U (2-thiouridine); Ψ (pseudouridine); N1Ψ (N1-methylpseudouridine); Um (2'-O-methyluridine); m1A (1-methyladenosine); m2A (2-methyladenosine); Am (2'-O-methyladenosine); ms2m6A (2-methylthio-N6-methyladenosine); i6A (N6-isopentenyladenosine); ms2i6A (2-methylthio-N6-isopentenyladenosine); io6A (N6-(cis-hydroxyisopentenyl)adenosine); ms2io6A (2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine); g6A (N6-glycinylcarbamoyladenosine); t6A (N6-threonylcarbamoyladenosine); ms2t6A (2-methylthio-N6-threonylcarbamoyladenosine); m6t6A (N6-methyl-N6-threonylcarbamoyladenosine); hn6A (N6-hydroxynorvalylcarbamoyladenosine); ms2hn6A (2-methylthio-N6-hydroxynorvalylcarbamoyladenosine); Ar(p) (2'-O-ribosyladenosine (phosphate)); I (inosine); m1I (1-methylinosine); m1Im (1,2'-O-dimethylinosine); m3C (3-methylcytidine); Cm (2'-O-methylcytidine); s2C (2-thiocytidine); ac4C (N4-acetylcytidine); f5C (5-formylcytidine); m5Cm (5,2'-O-dimethylcytidine); ac4Cm (N4-acetyl-2'-O-methylcytidine); k2C (lysidine); m1G (1-methylguanosine); m2G (N2-methylguanosine); m7G (7-methylguanosine); Gm (2'-O-methylguanosine); m2,2G (N2,N2-dimethylguanosine); m2Gm (N2,2'-O-dimethylguanosine); m2,2Gm (N2,N2,2'-O-trimethylguanosine); Gr(p) (2'-O-ribosylguanosine (phosphate)); yW (wybutosine); o2yW (peroxywybutosine); OHyW (hydroxywybutosine); OHyW* (undermodified hydroxywybutosine); imG (wyosine); mimG (methylwyosine); Q (queuosine); oQ (epoxyqueuosine); galQ (galactosyl-queuosine); manQ (mannosyl-queuosine); preQ0 (7-cyano-7-deazaguanosine); preQ1 (7-aminomethyl-7-deazaguanosine); G+ (archaeosine); D (dihydrouridine); m5Um (5,2'-O-dimethyluridine); s4U (4-thiouridine); m5s2U (5-methyl-2-thiouridine); s2Um (2-thio-2'-O-methyluridine); acp3U (3-(3-amino-3-carboxypropyl)uridine); ho5U (5-hydroxyuridine); mo5U (5-methoxyuridine); cmo5U (uridine 5-oxyacetic acid); mcmo5U (uridine 5-oxyacetic acid methyl ester); chm5U (5-(carboxyhydroxymethyl)uridine); mchm5U (5-(carboxyhydroxymethyl)uridine methyl ester); mcm5U (5-methoxycarbonylmethyluridine); mcm5Um (5-methoxycarbonylmethyl-2'-O-methyluridine); mcm5s2U (5-methoxycarbonylmethyl-2-thiouridine); nm5s2U (5-aminomethyl-2-thiouridine); mnm5U (5-methylaminomethyluridine); mnm5s2U (5-methylaminomethyl-2-thiouridine); mnm5se2U (5-methylaminomethyl-2-selenouridine); ncm5U (5-carbamoylmethyluridine); ncm5Um (5-carbamoylmethyl-2'-O-methyluridine); cmnm5U (5-carboxymethylaminomethyluridine); cmnm5Um (5-carboxymethylaminomethyl-2'-O-methyluridine); cmnm5s2U (5-carboxymethylaminomethyl-2-thiouridine); m6,2A (N6,N6-dimethyladenosine); Im (2'-O-methylinosine); m4C (N4-methylcytidine); m4Cm (N4,2'-O-dimethylcytidine); hm5C (5-hydroxymethylcytidine); m3U (3-methyluridine); cm5U (5-carboxymethyluridine); m6Am (N6,2'-O-dimethyladenosine); m6,2Am (N6,N6,O-2'-trimethyladenosine); m2,7G (N2,7-dimethylguanosine); m2,2,7G (N2,N2,7-trimethylguanosine); m3Um (3,2'-O-dimethyluridine); m5D (5-methyldihydrouridine); f5Cm (5-formyl-2'-O-methylcytidine); m1Gm (1,2'-O-dimethylguanosine); m1Am (1,2'-O-dimethyladenosine); τm5U (5-taurinomethyluridine); τm5s2U (5-taurinomethyl-2-thiouridine); imG-14 (4-demethylwyosine); imG2 (isowyosine); or ac6A (N6-acetyladenosine), or any combination thereof.
The gene of interest may comprise at least one modified nucleotide, wherein the modification is a m6A (N6-methyladenosine) modification. The gene of interest may comprise at least 95% modified nucleotides, wherein the modification is a m6A (N6-methyladenosine) modification. The gene of interest may comprise 100% modified nucleotides, wherein the modification is a m6A (N6-methyladenosine) modification.
The gene of interest may comprise at least one modified nucleotide, wherein the modification is a N1Ψ (N1-methylpseudouridine) modification. The gene of interest may comprise at least 95% modified nucleotides, wherein the modification is a N1Ψ (N1-methylpseudouridine) modification. The gene of interest may comprise 100% modified nucleotides, wherein the modification is a N1Ψ (N1-methylpseudouridine) modification.
The gene of interest may comprise at least one modified nucleotide, wherein the modification is a m5C (5-methylcytidine) modification. The gene of interest may comprise at least 95% modified nucleotides, wherein the modification is a m5C (5-methylcytidine) modification. The gene of interest may comprise 100% modified nucleotides, wherein the modification is a m5C (5-methylcytidine) modification.
When the methods described herein are used for the circularisation of RNAs containing modified bases, both the internal guide sequence of the ribozyme and the sequence of the gene of interest (in particular, the eACA sequence) can be adapted to minimise the disruptive effect of modifications on interaction between ribozyme and GOI. In some cases, the IGS can be adapted to be CG rich, for example comprising 60% or more, 80% or more, or 100% CG content. The skilled person will appreciate that corresponding adaptations should be made to the sequence of the gene of interest, in order to maintain complementary base pairing. The precise nature of the sequence alterations or adaptations will depend on the nucleotide modifications that are desired in the resulting circular RNA.
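As a simple illustration of the CG content thresholds mentioned above, the following Python sketch computes the percentage of C and G residues in a candidate internal guide sequence. The function name and the example sequences are hypothetical; no actual IGS from the constructs described herein is reproduced.

```python
def cg_content(seq: str) -> float:
    """Percentage of residues in seq that are C or G."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "CG" for base in seq) / len(seq)


# Hypothetical candidate IGS variants, for illustration only.
for candidate in ("GCAU", "GCCGAC", "GCCGGC"):
    print(candidate, cg_content(candidate) >= 60.0)
```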
Extended guide sequences
The nucleic acid molecules and modified ribozymes described herein may further each comprise an extended guide sequence (EGS). In particular, a modified ribozyme may comprise a first EGS, and a nucleic acid molecule containing the gene of interest may comprise a second EGS. The first and the second EGS may be capable of complementary base pairing to each other. The function of the EGS is generally to increase the length of the complementary base-pairing region between the ribozyme and the nucleic acid molecule containing the gene of interest.
The modified ribozymes described herein may comprise a first EGS positioned at a 5’ end of the truncated ribozyme. The nucleic acid molecules containing the gene of interest described herein may comprise a second EGS positioned at a 3’ end of the nucleic acid molecule. Generally, there is a loop sequence situated between the first EGS and the truncated ribozyme, as described in more detail below. Similarly, there is also generally a loop sequence situated between the second EGS and the gene of interest.
The first and second EGS may be partly or fully complementary to each other. Generally, mismatches are tolerated well and do not materially affect circularization. Accordingly, the first EGS may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or 100% complementary to the second EGS. Generally, the first EGS may be substantially complementary to the second EGS, such as at least 70% complementary.
If present, the first and second EGS may each be between 1 and 500 nucleotides in length. For example, the first and second EGS may each be between 10 and 50 nucleotides in length. The first and second EGS may each be 20, 30, or 40 nucleotides in length.
An exemplary first EGS sequence is GGUCAAUCGGUUGGCUUCCG (SEQ ID NO: 22).
An exemplary second EGS sequence is CGGAAGCCAACCGAUUGACC (SEQ ID NO: 23).
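The degree of complementarity between the two exemplary EGS sequences can be checked with the following minimal Python sketch. It assumes that the two strands are of equal length, pair in an antiparallel orientation, and that only Watson-Crick pairs count (G-U wobble pairs are ignored). The function names are illustrative only.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of positions that pair when two equal-length strands are
    aligned antiparallel (Watson-Crick pairs only, no gaps)."""
    if not first or len(first) != len(second):
        raise ValueError("this sketch only handles non-empty, equal-length sequences")
    target = reverse_complement(second)
    matches = sum(1 for a, b in zip(first.upper(), target) if a == b)
    return 100.0 * matches / len(first)


FIRST_EGS = "GGUCAAUCGGUUGGCUUCCG"   # SEQ ID NO: 22
SECOND_EGS = "CGGAAGCCAACCGAUUGACC"  # SEQ ID NO: 23

print(percent_complementarity(FIRST_EGS, SECOND_EGS))
```

Under these assumptions the exemplary pair is fully complementary, and a single substitution in either 20-nucleotide sequence reduces the value to 95%.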
Lengthening the base paired P1 region can have a deleterious effect on circularisation. To avoid this, the ribozymes and nucleic acid molecules described herein may further comprise loop sequences, such as a first loop sequence and a second loop sequence.
The first and second loops may be configured to act as spacers. For example, the first and second loops may act as spacers between, in the ribozyme, the internal guide sequence (IGS) and the first extended guide sequence (EGS), and in the nucleic acid molecule containing the gene of interest, the gene of interest and the second EGS. The loop sequences are preferably not complementary to each other, such that there is little or no base pair interaction between the first and second loop sequences. Because of the low or non-complementarity between the two loop sequences, the base-paired P1 region remains at a fixed length. Generally, the loop sequences are substantially non-complementary. For example, the loop sequences may have less than 30%, less than 20%, less than 10%, or less than 5% complementarity.
The first and second loop sequences may each be between 1 and 10 nucleotides in length. It is not necessary for the first and second loop sequences to have the same number of nucleotides, and in fact the TERIC method works well when the first and second loop sequences are of different lengths. The first loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. The second loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. A preferred combination is a 6 nucleotide first loop sequence and a 5 nucleotide second loop sequence. Another preferred combination is a 3 nucleotide first loop sequence and a 2 nucleotide second loop sequence.
The first loop sequence is positioned 3’ to the first EGS and 5’ to the truncated ribozyme, in other words, in between the first EGS and the truncated ribozyme. The second loop sequence is positioned 3’ to the gene of interest, and 5’ to the second EGS, in other words, between the gene of interest and the second EGS in the nucleic acid molecule containing the gene of interest. If the P1 extension region is present, the second loop sequence is 3’ to the P1 extension.
It is not always necessary to have two loop sequences. Circularization can be achieved using only a first loop sequence, positioned in between the first EGS and the truncated ribozyme, without a second loop sequence. Alternatively, only the loop sequence positioned between the second portion of the gene of interest and the second EGS may be used.
An exemplary sequence for the first loop sequence is AAATAA.
An exemplary sequence for the second loop sequence is ACACC.
Thus, described herein are modified ribozymes comprising, in the 5’ to 3’ direction: a) an extended guide sequence (EGS) b) a loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
Also described herein are modified ribozymes comprising, in the 5’ to 3’ direction: a) an extended guide sequence (EGS) b) a loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme and does not comprise a 5’ portion of a corresponding wildtype ribozyme.
Also described herein are nucleic acid molecules containing a gene of interest, where the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
Also described herein are nucleic acid molecules containing a gene of interest, wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a first bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) the gene of interest, c) a second bridge sequence comprising a sequence corresponding to a 5’ portion of a ribozyme, d) a loop sequence, and e) an extended guide sequence.
Also described herein are nucleic acid molecules containing a gene of interest, where the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence; e) a loop sequence, and f) an extended guide sequence.
Also described herein are nucleic acid molecules containing a gene of interest, wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a first bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second bridge sequence comprising a sequence corresponding to a 5’ portion of a ribozyme, e) a second portion of the eACA sequence; f) a loop sequence, and g) an extended guide sequence.
As part of the circularisation methods described herein, the ribozyme and the nucleic acid molecule comprising the GOI may further be provided with homology arms. The purpose of the homology arms is to enable the 5’ end of the GOI to interact with the 3’ end of the ribozyme, as shown in Figure 2B. Figure 2B labels the homology arm on the GOI as “HR-G” and the homology arm on the ribozyme as “HR-R”.
The homology arms perform a similar function to the EGS, by extending the region of complementary base pairing between the GOI and the ribozyme. As a result, the first and second homology arms may be partly or fully complementary to each other. Generally, mismatches are tolerated well and do not materially affect circularization. Accordingly, the first homology arm may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 100% complementary to the second homology arm. Generally, the homology arms are substantially complementary to each other, such as at least 70% complementary to each other.
The homology arms can be of any length which facilitates circularisation, and do not necessarily have to be of the same length. For example, the homology arm on the GOI can be longer or shorter than the homology arm on the ribozyme. The homology arms may each be at least 1, 2, 3, 4, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300 or 500 nucleotides in length. The homology arms may each be between 1 and 50, 1 and 40, 1 and 30, 1 and 20, 5 and 50, 5 and 40, 5 and 30, 5 and 20, 10 and 50, 10 and 40, 10 and 30, or 10 and 20 nucleotides in length. In some cases, the homology arms are each 20 nucleotides in length.
An exemplary pair of homology arm sequences is below:
HR-R (ribozyme): CAGGACAACAGCATCACTAG (SEQ ID NO: 47)
HR-G (GOI): CTAGTGATGCTGTTGTCCTG (SEQ ID NO: 48)
In the case of the modified ribozyme, the homology arm is generally placed at the 3’ end. In the case of the nucleic acid molecule containing the gene of interest, the homology arm is generally placed at the 5’ end. Accordingly, a modified ribozyme as described herein may comprise, in a 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and d) a homology arm sequence.
A nucleic acid molecule comprising a gene of interest as described herein may comprise, in a 5’ to 3’ direction: a) a homology arm sequence, b) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, c) the gene of interest, d) a loop sequence, and e) an extended guide sequence.
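The 5' to 3' ordering of elements in the two molecules described above can be sketched in Python as follows. The EGS, loop and homology arm sequences are the exemplary sequences given in this section. The truncated ribozyme, bridge and gene of interest are represented by runs of N of arbitrary length, which are placeholders and not real sequences. Writing T as U for the RNA form is also an assumption of this sketch.

```python
def to_rna(seq: str) -> str:
    """Write a sequence in the RNA alphabet (T replaced by U)."""
    return seq.upper().replace("T", "U")


def assemble(elements) -> str:
    """Concatenate (name, sequence) pairs in the given 5' to 3' order."""
    return "".join(to_rna(seq) for _name, seq in elements)


# Placeholders only: lengths and content are arbitrary.
TRUNCATED_RIBOZYME = "N" * 30
BRIDGE = "N" * 8
GENE_OF_INTEREST = "N" * 40

modified_ribozyme = [
    ("first EGS", "GGUCAAUCGGUUGGCUUCCG"),           # SEQ ID NO: 22
    ("first loop", "AAATAA"),
    ("truncated ribozyme", TRUNCATED_RIBOZYME),
    ("homology arm HR-R", "CAGGACAACAGCATCACTAG"),   # SEQ ID NO: 47
]

gene_molecule = [
    ("homology arm HR-G", "CTAGTGATGCTGTTGTCCTG"),   # SEQ ID NO: 48
    ("bridge", BRIDGE),
    ("gene of interest", GENE_OF_INTEREST),
    ("second loop", "ACACC"),
    ("second EGS", "CGGAAGCCAACCGAUUGACC"),          # SEQ ID NO: 23
]

print(" - ".join(name for name, _ in modified_ribozyme))
print(" - ".join(name for name, _ in gene_molecule))
print(len(assemble(modified_ribozyme)), len(assemble(gene_molecule)))
```

Keeping each element as a named pair makes it straightforward to omit optional elements, such as the homology arms or one of the loop sequences, without changing the order of the remaining elements.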
Methods of producing circular RNAs
Described herein are methods for producing a circular gene of interest. The methods generally comprise providing a bridge sequence (as described herein) at a 5’ end of the gene to be circularized, and optionally also providing a second bridge sequence at a 3’ end of the gene to be circularised. The bridge sequence(s) can be added to the gene of interest by methods known in the art. For example, a DNA template can be synthesised which comprises a bridge sequence 5’ to the gene of interest, and/or a bridge sequence 3’ to the gene of interest. As described elsewhere herein, the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, and the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme.
The methods further comprise providing a modified ribozyme, as described elsewhere herein. Generally, the modified ribozyme is one in which a 3’ portion of the ribozyme has been removed. Thus, the modified ribozyme may comprise a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme. For example, in the case of the Ana group I intron, the modified ribozyme may correspond to nucleotides 1-226, or 1-242 of SEQ ID NO: 1, with nucleotides 227-249 or 243-249 removed, respectively. The modified ribozyme may also comprise a truncated ribozyme wherein the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme. The truncated ribozyme may be truncated at both the 5’ and 3’ ends compared to a wild-type ribozyme.
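The coordinate arithmetic of the truncation can be illustrated with a minimal Python sketch. The 249-residue string below is a placeholder that only matches the coordinate range used above; it is not SEQ ID NO: 1. The function name is hypothetical.

```python
def split_ribozyme(full_length: str, last_retained: int):
    """Split a ribozyme at a 1-based position.

    Returns (truncated_ribozyme, removed_3prime_portion): the truncated
    ribozyme covers residues 1..last_retained and the removed portion covers
    residues last_retained+1..end.
    """
    if not 0 < last_retained < len(full_length):
        raise ValueError("split position must lie inside the sequence")
    return full_length[:last_retained], full_length[last_retained:]


# Placeholder of 249 residues (not a real ribozyme sequence).
PLACEHOLDER = "N" * 249

for position in (226, 242):
    truncated, removed = split_ribozyme(PLACEHOLDER, position)
    print(position, len(truncated), len(removed))
```

With a 249-residue ribozyme, retaining residues 1-226 removes 23 residues (227-249), and retaining residues 1-242 removes 7 residues (243-249).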
Generally, the methods comprise combining the gene of interest and the modified ribozyme under conditions suitable for circularization to occur. Circularisation protocols are generally known in the art.
Advantageously, such a step may be carried out for between 10 and 60 minutes, in some cases between 20 and 40 minutes. Circularization may be achieved by heating the gene of interest and modified ribozyme together, for example to between 50°C and 60°C. Circularization may be achieved by heating the mixture of the gene of interest and the modified ribozyme to about 55°C for about 20 minutes. The gene of interest and the modified ribozyme may be combined and/or heated together in any suitable circularization buffer known in the art.
Methods for producing a circular gene of interest may comprise: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
The ratio of modified ribozyme to gene of interest is generally 1:1 or greater. For example, the ratio of modified ribozyme to gene of interest may be 2:1, 3:1, 4:1, 5:1, 6:1, or greater. In particular, the ratio of modified ribozyme to gene of interest may be 4:1.
The concentration of gene of interest is generally between about 100 nM and about 500 nM. In some embodiments, the concentration of gene of interest may be between 200 nM and 400 nM.
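As a worked illustration of the ratio and concentration ranges above, the following Python sketch computes how much of each RNA stock would be pipetted into a reaction of a given volume. The stock concentrations and the 10 µl reaction volume in the example are hypothetical values chosen for the arithmetic, and the function name is illustrative only.

```python
def rna_volumes(total_ul: float, gene_final_nm: float, ratio: float,
                gene_stock_nm: float, ribozyme_stock_nm: float) -> dict:
    """Volumes (in µl) of gene-of-interest and modified-ribozyme stocks needed
    for the requested final concentration and ribozyme:gene molar ratio."""
    ribozyme_final_nm = gene_final_nm * ratio
    gene_ul = total_ul * gene_final_nm / gene_stock_nm
    ribozyme_ul = total_ul * ribozyme_final_nm / ribozyme_stock_nm
    remainder_ul = total_ul - gene_ul - ribozyme_ul
    if remainder_ul < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    return {"gene_ul": gene_ul, "ribozyme_ul": ribozyme_ul,
            "buffer_and_water_ul": remainder_ul}


# Hypothetical stocks: 2000 nM gene of interest, 4000 nM modified ribozyme.
print(rna_volumes(total_ul=10, gene_final_nm=200, ratio=4,
                  gene_stock_nm=2000, ribozyme_stock_nm=4000))
```

In this example a 200 nM gene of interest at a 4:1 ratio corresponds to 800 nM modified ribozyme, which takes 1 µl and 2 µl of the assumed stocks and leaves 7 µl for buffer and water.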
The steps of the methods described herein may be preceded by steps providing DNA templates encoding any of the modified ribozymes or genes of interest. Such methods may also include a step of in vitro transcription of the DNA templates to provide RNA precursors. The methods may also include a step of refolding the modified ribozymes and genes of interest following in vitro transcription and prior to the addition of a circularization buffer.
Because the TERIC method enables the TER to be reused, the methods described herein may further comprise a step of recovering the modified ribozyme. The recovered modified ribozyme may then be used in a future reaction. Suitable methods for recovering the modified ribozyme include separation and purification by gel filtration or gel extraction. In some cases, the modified ribozyme can be recovered and/or purified at the same time as recovering the desired circular RNA. In other cases, the TER can be immobilised on a solid surface, for example covalently linked to a bead or plate, or other suitable surface known in the art. The gene of interest can be added, circularised and washed out for recovery of the circular RNA, while the TER remains bound to the solid surface and can be reused for a second and further rounds of circularisation with new genes of interest.
Also disclosed herein are circular RNAs obtainable by the methods described above.
The invention also provides circular RNAs comprising a sequence encoding a gene of interest, wherein the circular RNA comprises at least one modified nucleotide residue as described herein, and wherein the circular RNA does not comprise exogenous exon sequences, and/or does not comprise any ribozyme-derived sequence.
The invention also provides circular RNAs comprising a sequence encoding a gene of interest, wherein the circular RNA comprises an eACA sequence, and wherein the circular RNA comprises at least one modified nucleotide residue. In some cases, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or 100% of the nucleotide residues in the circular RNA may be modified.
Kits
The invention provides kits for circularizing a gene of interest. Generally, the kits of the invention comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, as described herein.
Kits for circularizing a gene of interest may comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme, and b) the gene of interest.
Kits for circularizing a gene of interest may comprise a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, c) an internal guide sequence (IGS), and d) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of the ribozyme, b) a first portion of an extended anticodon arm (eACA) sequence, c) the gene of interest, d) a second portion of the eACA sequence, e) a second loop sequence, and f) a second EGS.
The disclosure also encompasses DNA precursor molecules encoding any of the modified ribozymes, genes of interest, and nucleic acid molecules containing genes of interest described herein. As shown in Figure 7, any of the ribozymes or nucleic acid molecules described herein can be provided as DNA templates, which subsequently undergo in vitro transcription (IVT) in order to provide ribozymes and RNA molecules that can be circularized. Also disclosed herein are kits comprising a first DNA molecule encoding a modified ribozyme and a second DNA molecule encoding a nucleic acid molecule.
The invention can further be described by reference to the following non-limiting numbered embodiments.
1. A method for producing a circular gene of interest, the method comprising: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
2. The method of clause 1, wherein the nucleic acid molecule further comprises a second bridge sequence located 3’ of the gene of interest, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme, and wherein the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme.
3. The method of clause 1 or 2, wherein at least a portion of the first bridge sequence is capable of complementary base pairing with a 3’ portion of the modified ribozyme.
4. The method of any preceding clause, wherein at least a portion of the first bridge sequence is capable of forming a P9 region with a 3’ portion of the modified ribozyme.
5. The method of any preceding clause, wherein the first bridge sequence is between 1 and 249 nucleotides in length.
6. The method of any preceding clause, wherein the first bridge sequence is between 1 and 30 nucleotides in length.
7. The method of any preceding clause, wherein the first bridge sequence is 8 nucleotides in length.
8. The method of any of clauses 1-7, wherein the first bridge sequence comprises a sequence corresponding to nucleotide residues 242-249 of SEQ ID NO: 1 or 49.
9. The method of any one of clauses 1-6, wherein the first bridge sequence is 23 nucleotides in length.
10. The method of any of clauses 1-6 or 9, wherein the first bridge sequence comprises a sequence corresponding to nucleotide residues 227-249 of SEQ ID NO: 1 or 49.
11. The method of any preceding clause, wherein the 3’ portion of the corresponding wild-type ribozyme is between 1 and 249 nucleotides in length.
12. The method of any preceding clause, wherein the 3’ portion of the corresponding wild-type ribozyme is between 1 and 30 nucleotides in length.
13. The method of any preceding clause, wherein the 3’ portion of the corresponding wild-type ribozyme is 8 nucleotides in length.
14. The method of any one of clauses 1-12, wherein the 3’ portion of the corresponding wild-type ribozyme is 23 nucleotides in length.
15. The method of any one of clauses 1-13, wherein the truncated ribozyme comprises the nucleotide sequence of SEQ ID NO: 13.
16. The method of any one of clauses 1-12 or 14, wherein the truncated ribozyme comprises the nucleotide sequence of SEQ ID NO: 12.
17. The method of any one of clauses 2-16, wherein at least a portion of the second bridge sequence is capable of complementary base pairing with a 5’ portion of the modified ribozyme.
18. The method of any one of clauses 2-17, wherein at least a portion of the second bridge sequence is capable of acting as an internal guide sequence.
19. The method of any one of clauses 2-18, wherein the second bridge sequence is between 1 and 1500 nucleotides in length.
20. The method of any one of clauses 2-19, wherein the second bridge sequence comprises a sequence corresponding to residues 1-10 of SEQ ID NO: 1 or 49.
21. The method of any one of clauses 2-20, wherein the 5’ portion of the corresponding wild-type ribozyme is between 1 and 1500 nucleotides in length.
22. The method of any one of clauses 2-21, wherein the 5’ portion of the corresponding wild-type ribozyme comprises a sequence corresponding to residues 1-10 of SEQ ID NO: 1 or 49.
23. The method of any preceding clause, wherein the gene of interest comprises at least one modified nucleotide.
24. The method of clause 23, wherein the modification is a base modification.
25. The method of clause 23 or 24, wherein the at least one modified nucleotide is selected from the group consisting of N6-methyladenosine, N1-methyl-pseudouridine, or combinations thereof.
26. The method of any preceding clause, wherein the modified ribozyme further comprises a first extended guide sequence (EGS) at a 5’ end, and the nucleic acid molecule further comprises a second EGS at a 3’ end.
27. The method of clause 26, wherein the first and second EGS are substantially complementary to each other.
28. The method of clause 26 or 27, wherein the modified ribozyme comprises, in a 5’ to 3’ direction: the first EGS, a first loop sequence, and the truncated ribozyme; and wherein the nucleic acid molecule comprises, in a 5’ to 3’ direction: the first bridge sequence, the gene of interest, a second loop sequence, and the second EGS.
29. The method of clause 28, wherein the first and second loop sequences are substantially non-complementary.
30. The method of clause 29 or 30, wherein the first and second loop sequences are configured to act as spacers.
31. The method of any preceding clause, wherein the modified ribozyme further comprises a first homology arm sequence at the 3’ end, and the nucleic acid molecule further comprises a second homology arm sequence at the 5’ end.
32. The method of clause 31, wherein the first and second homology arm sequences are substantially complementary to each other.
33. The method of any preceding clause, wherein the ratio of modified ribozyme to nucleic acid molecule is 1:1 or greater.
34. The method of any preceding clause, wherein the ratio of modified ribozyme to nucleic acid molecule is about 4:1.
35. The method of any preceding clause, wherein the concentration of the nucleic acid molecule is between about 200 nM and about 400 nM.
36. The method of any preceding clause wherein step (c) is performed for between 20 and 40 minutes.
37. The method of any preceding clause, wherein step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme.
38. The method of any preceding clause, wherein step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to between 50°C and 60°C.
39. The method of any preceding clause, wherein step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to about 55°C for about 20 minutes.
40. A circular RNA obtained by the method of any one of clauses 1 to 39.
41. A modified ribozyme for use in a method of circularizing an RNA molecule, the modified ribozyme comprising, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
42. The modified ribozyme of clause 41, where the truncated ribozyme further does not comprise a 5’ portion of a corresponding wild-type ribozyme.
43. A nucleic acid molecule containing a gene of interest for use in a method of circularizing the gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
44. The nucleic acid molecule of clause 43, further comprising a second bridge sequence at a 3’ end, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme sequence.
45. A kit for circularizing a gene of interest, the kit comprising a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, and b) the gene of interest.
46. A kit comprising a first DNA molecule encoding the modified ribozyme of the kit of clause 41 or 42, and a second DNA molecule encoding the nucleic acid molecule containing the gene of interest of the kit of clause 43 or 44.
47. A DNA molecule encoding a modified ribozyme according to any preceding clause.
48. A DNA molecule encoding a nucleic acid molecule containing a gene of interest, according to any preceding clause.
49. A circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA comprises at least one modified nucleotide residue, and wherein the circular RNA does not comprise exogenous exon sequences.
50. A circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA comprises an eACA sequence, and wherein the circular RNA comprises at least one modified nucleotide residue.
Aspects and embodiments described herein with the term “comprising” may include other features or steps within the scope. The terms “comprising” or “comprises” may be substituted with the terms “consisting of”, “consists of”, “consisting essentially of” or “consists essentially of” and vice versa, wherever they occur herein.
The phrase “selected from the group comprising” may be substituted with the phrase “selected from the group consisting of” and vice versa, wherever they occur herein.
It is also understood that the application discloses all combinations of any of the above aspects and embodiments described above with each other, unless the context demands otherwise. Similarly, the application discloses all combinations of the preferred and/or optional features either singly or together with any of the other aspects, unless the context demands otherwise.
Modifications of the above embodiments, further embodiments and modifications thereof will be apparent to the skilled person on reading this disclosure, and as such, these are within the scope of the present invention. Those skilled in the art will appreciate that the present invention is defined by the appended claims and not by the Examples or other description of certain embodiments included herein.
All documents and sequence database entries mentioned in this specification are incorporated herein by reference in their entirety for all purposes.
Similarly, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise.
“and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example, “A and/or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.
Unless defined otherwise above, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention. Generally, nomenclatures used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, genetics and protein and nucleic acid chemistry described herein are those well-known and commonly used in the art, or according to manufacturer’s specifications.
The invention will now be further described by way of the following Examples, which are meant to serve to assist one of ordinary skill in the art in carrying out the invention and are not intended in any way to limit the scope of the invention, with reference to the Figures.
EXAMPLES
Materials & Methods
DNA templates:
The plasmids TRIC V1-CVB3-EGFP (SEQ ID NO: 24), TRIC V2-CVB3-EGFP (SEQ ID NO: 25), TRIC V2-circZnf609 (SEQ ID NO: 26), PIE-CVB3-EGFP (SEQ ID NO: 27), and PIE-circZnf609 (SEQ ID NO: 28) were generated as described in GB2308675.4. To obtain these plasmids, the inventors amplified them in TOP10 competent cells and purified them using the QIAGEN Maxi Plus plasmid purification kit. The purified plasmids were then linearized using EcoRV and cleaned by phenol:chloroform:isoamyl alcohol extraction. For the TERIC constructs, the inventors PCR amplified the templates from the corresponding TRIC plasmids, using the primers specified in Table 1 (DNA primers for TERIC).
In vitro transcription (IVT):
In vitro transcriptions (IVT) were carried out using a DNA template concentration of 50 ng/µl. The IVT reaction mixture consisted of 14 µg/µl homemade T7 polymerase, 0.04 U/µl RNase inhibitor (Promega), 6 mM of each nucleotide triphosphate (NTP), and 1X IVT buffer. For TERIC, the 1X IVT buffer contained 80 mM HEPES-K (pH 7.5), 2 mM spermidine, 40 mM DTT, and 24 mM MgCl2. For TRIC and PIE, the concentration of MgCl2 in the 1X IVT buffer was adjusted to 14 mM.
The IVT reactions were incubated at 37 °C for 3-5 hours, followed by digestion with RNase-free DNase I for 20 minutes. To remove any precipitate, 100 mM EDTA was added to a final concentration of 25 mM. Subsequently, an equal volume of 7.5 M lithium chloride was added to precipitate the RNAs. Precipitation was carried out at -20 °C for between 30 minutes and overnight. The resulting precipitates were centrifuged at 13,000 rpm for at least 20 minutes, and the RNA pellets were washed with 75% alcohol, air-dried, and dissolved in DEPC-treated H2O.
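The volume of 100 mM EDTA needed to reach 25 mM in a given IVT reaction follows from a simple mass balance, sketched below in Python. The sketch assumes that the reaction contains no EDTA beforehand and that volumes are additive. The 30 µl reaction volume is a hypothetical example, and the function names are illustrative only.

```python
def stock_volume_to_add(sample_ul: float, stock_mm: float, final_mm: float) -> float:
    """Volume of stock to add to a sample so that the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (sample_ul + v) for v.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ul * final_mm / (stock_mm - final_mm)


def concentration_after_equal_volume(stock_conc: float) -> float:
    """Concentration of an additive after mixing with an equal volume of sample."""
    return stock_conc / 2.0


# Hypothetical 30 µl IVT reaction.
edta_ul = stock_volume_to_add(sample_ul=30, stock_mm=100, final_mm=25)
print(edta_ul)                                  # µl of 100 mM EDTA
print(concentration_after_equal_volume(7.5))    # M lithium chloride in the mixture
```

Under these assumptions, the EDTA volume is one third of the reaction volume, and adding an equal volume of 7.5 M lithium chloride gives 3.75 M in the mixture.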
Circular RNA synthesis:
To begin, all RNAs underwent a refolding process. They were initially denatured at 95 °C for 2 minutes and then annealed on ice for 3 minutes. The circularization step was carried out in a 10 µl reaction volume and was terminated by adding 2 µl of 100 mM EDTA.
For TRIC and PIE precursors, refolded RNAs at a final concentration of 200 nM were combined with 10X circularization buffer (composed of 500 mM Tris-HCl, pH 7.4, 100 mM MgCl2, 10 mM DTT, and 20 mM GTP). The mixture was heated at 55 °C for 8 minutes to allow circularization.
In the case of TERIC, two protocols were used. In protocol A, TERs and pGs were mixed and refolded as described above in DEPC-treated H2O. Subsequently, the circularisation buffer was added, followed by 20 min of circularization at 55 °C. In protocol B, the refolding of TERs and pGs took place in the circularisation buffer, followed by mixing and 20 min of circularization at 55 °C. The splicing conditions, such as the concentration of GTP, the ratio of TER to pG, the concentration of pGs, and the reaction duration, were optimized accordingly.
RT-PCR and Sanger sequencing:
The reverse transcriptase and DNA polymerase used were SuperScript IV Reverse Transcriptase (Thermo Fisher) and Q5 High-Fidelity DNA Polymerase (NEB). The manufacturers’ manuals were followed for reverse transcription and PCR. The pG and circularized pG of TERIC V2 were used as templates for reverse transcription using RT-PCR_Reverse (GTGAACCGCATCGAGCTG (SEQ ID NO: 43)) as the reverse primer. The reverse transcription products were then subjected to PCR using RT-PCR_Forward (TTTGCTGTATTCAACTTAACAATGAATTGTAATG (SEQ ID NO: 44)) and RT-PCR_Reverse. The RT-PCR product was gel extracted and submitted for Sanger sequencing.
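The reasoning behind using RT-PCR to confirm circularization can be sketched in Python: primers that face away from each other on a linear precursor can only give a product when the template ends are joined. The toy template and primers below are invented for illustration and are not SEQ ID NO: 43 or 44. The function assumes exact, unique primer matches and ignores rolling-circle products.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, fwd, rev, circular):
    """Length of the expected PCR product, or None if no product is possible.

    template: sense-strand sequence written as DNA
    fwd: primer identical to the sense strand
    rev: primer complementary to the sense strand
    """
    f = template.find(fwd)
    r = template.find(reverse_complement(rev))
    if f < 0 or r < 0:
        return None
    r_end = r + len(rev)
    if f < r:
        # Primers face each other: ordinary product on any template
        return r_end - f
    if circular:
        # Primers face away from each other: product must cross the junction
        return (len(template) - f) + r_end
    return None


# Toy example (hypothetical sequences)
template = "TTTT" + "GATTACAG" + "CCCCCCCC" + "ACTGACTG" + "AAAA"
fwd = "ACTGACTG"   # lies downstream of the reverse primer site
rev = "CTGTAATC"   # reverse complement of GATTACAG

print(amplicon_length(template, fwd, rev, circular=False))  # None
print(amplicon_length(template, fwd, rev, circular=True))   # 24
```

In this toy case the linear template gives no product, while the circular template gives a 24 nt product that spans the junction.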
Native agarose gel:
Agarose gels (1.0%, 100 ml) were prepared in 1X TBE (89 mM Tris, 89 mM boric acid, 3 mM EDTA) in the presence of 10 µL SYBR Safe (Thermo Fisher). Gels were run in an Owl EasyCast B2 Mini Gel Electrophoresis System (Thermo Scientific) at a constant 25 W at room temperature for 40-60 min, with 1X TBE as running buffer. For sample loading, each RNA sample (~500 ng) was mixed with an equal volume of formamide loading buffer (Thermo Fisher) and denatured at 95 °C for 2 min. All gels, including those described below, were imaged with a Bio-Rad ChemiDoc XRS+ Imaging System.
Urea agarose gel:
Gels were run vertically using the Bio-Rad Mini-PROTEAN Tetra Vertical Electrophoresis Cell system. Agarose was first boiled in DEPC H2O, mixed with urea, topped up with 10X TBE and DEPC H2O, and then poured into gel plates with 1.5 mm spacers. To prevent gels from slipping during the run, a 0.75 mm stopper was attached to the bottom edge of the bottom plate. Gels were placed at 4 °C for several hours to set. Because pulling the combs out directly would damage the gels, the combs were first exposed by sliding down the cover plate and then removed. The wells were then cleaned with pipette tips and the cover plate was slid back. Gels were run at room temperature at 20 for 30 min. For sample loading, each RNA sample (~100 ng) was mixed with an equal volume of urea loading buffer (NEB) and denatured at 95 °C for 2 min. Gels were stained in 10 ml 1X TBE with SYBR Safe for 10 min before imaging.
Example 1
Group I introns utilize internal guide sequences (IGS) to form P1 and P10 structures with flanking exons, bringing the exons into proximity to facilitate splicing (Figure 2A). In the case of trans excision ribozyme (TER), a type of ribozyme lacking exons and IGS, tertiary interactions enable recognition of substrates containing exons and IGS, leading to excision of bridge sequences between exons (Sargueil B & Tanner NK (1993) J Mol Biol. 233:639-643). Further research has demonstrated that retaining IGS in TER and restoring the P9.0 interaction between TER and the 3' bridge can enhance TER efficiency (Bell MA, Johnson AK & Testa SM (2002) Biochemistry, 41:15327-33; Bell MA, Johnson AK & Testa SM (2004) Biochemistry, 43:4323-31).
To develop a system that could circularize genes of interest in trans, the inventors utilised a group I intron which included an IGS sequence (the cyanobacterium Anabaena tRNALeu intron, referred to hereafter as “Ana”). To cause this ribozyme to excise material from a separate gene of interest, the inventors relocated a 3’ portion of the intron to the 5’ end of the gene of interest (Figure 2B). The inventors speculated that this 3’ portion (also called a 3’ bridge sequence) would form the P9.0 region with the trans excision ribozyme, thus improving the interaction between ribozyme and gene of interest to facilitate splicing.
In addition, the inventors discovered that introducing extended guide sequences (EGS) on each of the ribozyme and the gene of interest can further improve the interaction between the two to support splicing. Finally, the inventors incorporated the discovery that bacterial exons can be replaced with extended anticodon arm (eACA) sequences. Both the EGS and eACA sequence elements were used in the present invention to provide a trans excision ribozyme capable of efficient circularization of a gene of interest. This approach is termed trans excision ribozyme based circularization (TERIC).
Based on the above, the inventors constructed two trans excision ribozymes based on Ana, namely TER V1 (comprising nucleotides 1 to 241 of Ana, SEQ ID NO: 13) and TER V2 (comprising nucleotides 1 to 226 of Ana, SEQ ID NO: 12) (Figure 2). Precursors of the corresponding genes of interest (pGs) were also generated, using a 3’ bridge sequence of nucleotides 242-249 of Ana (SEQ ID NO: 14) for TER V1, and nucleotides 227-249 of Ana (SEQ ID NO: 15) for TER V2 (Figure 2B). These 3’ bridge sequences were joined to the 5’ end of a gene of interest encoding EGFP and including other sequence elements such as CVB3 (an IRES), stop and start codons, and polyAC.
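The coordinate split described above can be illustrated with a short Python sketch. It uses a placeholder string of 249 positions rather than the actual Ana sequence. The length of 249 is inferred from the bridge coordinates given in the text, and the function name is hypothetical.

```python
def split_intron(intron, ter_end):
    """Split an intron into a 3'-truncated ribozyme and its 3' bridge.

    ter_end is the last nucleotide retained in the ribozyme,
    counted from 1 and inclusive, as in the text.
    """
    if not 1 <= ter_end < len(intron):
        raise ValueError("ter_end must fall inside the intron")
    return intron[:ter_end], intron[ter_end:]


ANA_LENGTH = 249                         # inferred from the coordinates 242-249 and 227-249
placeholder_intron = "N" * ANA_LENGTH    # stand-in, not the real Ana sequence

parts = {}
for name, ter_end in (("TER V1", 241), ("TER V2", 226)):
    ter, bridge = split_intron(placeholder_intron, ter_end)
    parts[name] = (len(ter), len(bridge))
    print(f"{name}: ribozyme nt 1-{ter_end} ({len(ter)} nt), "
          f"bridge nt {ter_end + 1}-{ANA_LENGTH} ({len(bridge)} nt)")
```

With these coordinates, the 3’ bridge is 8 nt long for TER V1 and 23 nt long for TER V2.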
Successful circularization of these constructs would yield a 1593 nt circCVB3-EGFP. Previous studies have shown that the P. carinii Group I Intron TER does not require GTP cofactor for its function (Bell MA, Johnson AK & Testa SM (2002)). To investigate if the same holds true for the Ana TER, the inventors conducted splicing reactions using final concentrations of 0.1, 0.5, and 2 mM GTP in 10 µL reaction volumes. Two protocols were employed for setting up the splicing reactions. In protocol A, in vitro transcribed TERs and pGs were first mixed at a 5:1 ratio, refolded, supplemented with splicing buffer, and heated at 55 °C for 20 minutes. In protocol B, TERs and pGs were individually refolded in splicing buffer, followed by mixing and heating at 55 °C for 20 minutes. The final concentration of pGs was 200 nM. Splicing reactions were halted using 2 µL of 100 mM EDTA and loaded onto a 0.8% native agarose gel.
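For orientation, the amounts of RNA implied by these conditions can be worked out in Python. The sketch assumes that the 5:1 ratio is molar and that every combination of ribozyme version, protocol, and GTP concentration was tested. Neither point is stated explicitly in the text.

```python
from itertools import product

REACTION_UL = 10.0
PG_NM = 200.0
TER_TO_PG = 5  # assumed to be a molar ratio


def pmol(conc_nm, volume_ul):
    """Amount in pmol for a concentration in nM and a volume in microlitres."""
    return conc_nm * volume_ul / 1000.0


pg_pmol = pmol(PG_NM, REACTION_UL)
ter_pmol = pmol(PG_NM * TER_TO_PG, REACTION_UL)
print(f"pG: {pg_pmol:g} pmol, TER: {ter_pmol:g} pmol per {REACTION_UL:g} uL reaction")

# Hypothetical full-factorial layout of the conditions named in the text
conditions = [
    {"ter": ter, "protocol": protocol, "gtp_mM": gtp}
    for ter, protocol, gtp in product(("TER V1", "TER V2"), ("A", "B"), (0.1, 0.5, 2.0))
]
print(f"{len(conditions)} reactions in a full-factorial layout")
```

Under these assumptions each 10 µL reaction contains 2 pmol of pG and 10 pmol of TER.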
As depicted in Figure 3A, TERIC V2 exhibited the highest splicing efficiency at 2 mM GTP when protocol A was employed, although TER V1 also achieved efficient splicing with protocol A. Splicing was also achieved with protocol B, but was less efficient compared to protocol A. The bands marked by empty circles represent circCVB3-EGFP.
To validate the presence of circCVB3-EGFP, the inventors loaded samples of pGs before and after circularization onto a 6 M urea 1.5% agarose gel. The results indicated that circular RNA was present in the spliced sample but not in the pGs sample (Figure 3B). Further confirmation of circularization was achieved through RT-PCR using primers as indicated in Figure 2B. As anticipated, a 1236 bp RT-PCR product was amplified from the circularized reaction, while no product was observed in the pG sample (Figure 3C). In conclusion, TERIC demonstrates efficient synthesis of circular RNAs.
Example 2
Next, the inventors optimized the ratio between TERs and pGs. Initially, the inventors maintained the concentration of pGs at 200 nM and gradually increased the concentration of TERs from 200 nM to 1600 nM. Figure 4A illustrates that within the TER/pG ratio range of 1-4, an increase in TER concentration resulted in improved circularization efficiency. However, when the TER/pG ratio exceeded 4, the circularization efficiency did not exhibit significant changes. Subsequently, the inventors fixed the ratio between TERs and pGs at 4 and investigated the effect of pG concentration and circularization time. As depicted in Figures 4B and 4C, pG concentrations ranging from 200 nM to 400 nM, along with circularization times of 20-40 minutes, yielded the highest circularization efficiency while maintaining a relatively low level of nicking.
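The titration described above can be expressed as a short Python calculation. The doubling steps between 200 nM and 1600 nM are an assumption, since the text gives only the endpoints, and the helper name is illustrative.

```python
PG_NM = 200

# Assumed doubling series; only the 200 nM and 1600 nM endpoints are stated
ter_series_nm = [200, 400, 800, 1600]
ratios = [ter / PG_NM for ter in ter_series_nm]
print(dict(zip(ter_series_nm, ratios)))


def ter_for_ratio(pg_nm, ratio=4):
    """TER concentration (nM) needed for a given molar TER/pG ratio."""
    return pg_nm * ratio


# TER concentrations at the fixed 4:1 ratio across the preferred pG range
for pg_nm in (200, 400):
    print(f"pG {pg_nm} nM -> TER {ter_for_ratio(pg_nm)} nM")
```

The series covers TER/pG ratios from 1 to 8. At the fixed 4:1 ratio, the preferred pG range of 200-400 nM corresponds to 800-1600 nM TER.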
Example 3
Previous work by the inventors had demonstrated that inclusion of an eACA sequence is crucial for enhancing the efficiency of TRIC V2 compared to TRIC V1 (as described in GB2308675.4 and PCT/EP2024/065837, the contents of which are incorporated herein in their entirety) and to the permuted intron-exon method (PIE), which relies on bacterial tRNA exon sequences. To investigate if the same principle applies to TERIC, the inventors synthesized a pG in which the eACA was replaced with native tRNA sequences (Figure 5B). The pG sequence is shown in SEQ ID NO: 9, with the tRNA sequences set forth in SEQ ID NO: 45 (5’ end) and SEQ ID NO: 46 (3’ end). As depicted in Figure 5A, circular RNAs were observed in the sample containing the tRNA precursor; however, the inclusion of eACA yielded significantly superior results in terms of circularization efficiency.
Example 4
It is known that the permuted intron-exon method is unable to circularize modified RNAs because base modifications can disrupt the ribozyme structure (Wesselhoeft, R. A. et al. (2019) Mol Cell 74:508-520 e504). The inventors speculated that the activity of TRIC would also be abolished. To investigate this, the inventors synthesized PIE-CVB3-EGFP and TRIC V2-CVB3-EGFP precursors (SEQ ID NOs: 27 and 25 respectively) with either m6A or N1Ψ modifications (N6-methyladenosine and N1-methyl-pseudouridine, respectively) or without modifications, and tested the ability of PIE and TRIC approaches to circularize these base-modified genes of interest.
As shown in Figure 6A, unmodified PIE and TRIC precursors efficiently convert to circular RNAs, while no circular RNA is observed for the modified PIE or TRIC variants. To assess whether the TERIC approach could circularize modified pGs, the inventors synthesized modified precursors while keeping the ribozyme unmodified. Consistent with the previous observations, unmodified TERIC V2 successfully circularized the unmodified pG. However, surprisingly, no circular RNAs were observed for the modified pGs. The inventors identified two potential reasons for the failure to circularize modified pGs. Firstly, although the TER is unmodified, the 3’ bridge sequence located at the 5’ end of the pGs would be modified, and this 3’ bridge sequence spans nucleotides 227-249 of the Ana intron. It is possible that m6A or N1Ψ modifications within this short 3’ bridge sequence abolish TERIC activity. Secondly, the IGS in TER V2, UUGAG, is an AU-rich sequence, and m6A or N1Ψ modifications in the corresponding eACA could weaken the P1 and P10 structures, thus disrupting TER activity.
To test these hypotheses, the inventors replaced the IGS of TER V2 with CCGCC (as shown in Figure 6C with IGS shown in shaded underlined text). Accordingly, the pG was also mutated, and circularization of this pG would yield circZnf609. Since the IGS is now CCGCC, only the U of the GU wobble base pair in the P1 structure would be modified by N1Ψ. We synthesized pGs for PIE-circZnf609 (SEQ ID NO: 28), TRIC V2-circZnf609 (SEQ ID NO: 26), and TERIC V2-circZnf609 (SEQ ID NO: 11) with or without modifications, along with unmodified TER (SEQ ID NO: 10). As shown in Figure 6B, as expected, no circularization was observed from the modified RNAs in the PIE construct. Interestingly, for TRIC V2-circZnf609, no circularization was observed either, indicating that removing AUs in the P1 and P10 structures cannot restore ribozyme activity. However, for TERIC V2, circularized modified RNAs were clearly observed (Figure 6B). The replacement of IGS with CCGCC was able to restore circularization for both N1Ψ-modified precursors and m6A-modified precursors, although only for N1Ψ-modified precursors was this restored to the same level as unmodified pGs.
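The rationale for changing the IGS can be illustrated by counting how many residues in the strand that pairs with the IGS would carry a modification. The Python sketch below is a simplification: it considers only strict Watson-Crick partners, so the G-U wobble pair mentioned above is not modeled, and it assumes that every A (for m6A) or every U (for N1-methyl-pseudouridine) in the modified precursor is substituted.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def wc_partner(rna):
    """Strict Watson-Crick pairing partner of an RNA sequence, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def modifiable_residues(rna):
    """Count residues that full substitution would replace (A for m6A, U for N1-methyl-pseudouridine)."""
    return {"m6A": rna.count("A"), "N1-methyl-pseudouridine": rna.count("U")}


results = {}
for igs in ("UUGAG", "CCGCC"):
    partner = wc_partner(igs)
    results[igs] = (partner, modifiable_residues(partner))
    print(igs, "pairs with", partner, "->", results[igs][1])
```

Under these assumptions, the partner of UUGAG (CUCAA) carries two A and one U, whereas the partner of CCGCC (GGCGG) carries none. This is in line with the reasoning that a GC-only IGS keeps modified bases out of the Watson-Crick pairs it forms.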
Example 5
We have demonstrated that the ribozyme truncated at the 3’ end can function as TERs for RNA circularization. To assess whether a ribozyme further truncated at the 5’ end can also serve as TERs, we shortened the 3’ truncated ribozyme at the 5’ end. As shown in Figure 9, the 5’ and 3’ truncated ribozyme (TER-trunc; SEQ ID NO: 52) successfully converted pG into circRNA with an efficiency comparable to that of the 3’ truncated ribozyme.
The purpose of relocating the 3’ portion of the ribozyme to the 5’ end of GOIs is to reconstitute the ribozyme structure between the ribozyme and the GOI. It is the structure, rather than the sequence or length, that is crucial for reconstitution to be functional. To confirm this, we introduced mutations in the 3’ portion of the ribozyme that was moved to the 5’ end of the GOI (SEQ ID NOs: 53 and 54).
Mutations in the 3’ bridge sequence of the GOI, whether altering the sequence or the length, did not prevent circularization in the TERIC method (Figure 9). In summary, in TERIC, the ribozyme can be truncated at both the 5’ and 3’ ends, and the bridge sequences on GOIs do not need to be identical to the truncated part of the ribozyme.
Overall, the results demonstrate that the TERIC approach provides a further option for circularization of unmodified RNAs. Advantageously, TERIC can also be used for circularization of modified RNAs where previous approaches such as PIE and TRIC are unsuitable.
Claims
1. A method for producing a circular gene of interest, the method comprising: a) providing a nucleic acid molecule, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction, a first bridge sequence, and a gene of interest, wherein the first bridge sequence comprises a sequence corresponding to a 3’ portion of a ribozyme, b) providing a modified ribozyme comprising a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and c) combining the nucleic acid molecule and the modified ribozyme under conditions suitable for circularization to occur.
2. The method of claim 1, wherein the nucleic acid molecule further comprises a second bridge sequence located 3’ of the gene of interest, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme, and wherein the truncated ribozyme does not comprise a 5’ portion of a corresponding wild-type ribozyme.
3. The method of claim 1 or 2, wherein at least a portion of the first bridge sequence is capable of complementary base pairing with a 3’ portion of the modified ribozyme.
4. The method of any preceding claim, wherein the first bridge sequence is between 1 and 1500 nucleotides in length.
5. The method of any preceding claim, wherein the 3’ portion of the corresponding wild-type ribozyme is between 1 and 1500 nucleotides in length.
6. The method of any one of claims 2-5, wherein at least a portion of the second bridge sequence is capable of complementary base pairing with a 5’ portion of the modified ribozyme.
7. The method of any one of claims 2-6, wherein the second bridge sequence is between 1 and 1500 nucleotides in length.
8. The method of any one of claims 2-7, wherein the 5’ portion of the corresponding wild-type ribozyme is between 1 and 1500 nucleotides in length.
9. The method of any preceding claim, wherein the gene of interest comprises at least one modified nucleotide.
10. The method of any preceding claim, wherein 100% of the nucleotides of the gene of interest are modified nucleotides.
11. The method of any preceding claim, wherein the modified ribozyme further comprises a first extended guide sequence (EGS) at a 5’ end, and the nucleic acid molecule further comprises a second EGS at a 3’ end.
12. The method of claim 11, wherein the modified ribozyme comprises, in a 5’ to 3’ direction: the first EGS, a first loop sequence, and the truncated ribozyme; and wherein the nucleic acid molecule comprises, in a 5’ to 3’ direction: the first bridge sequence, the gene of interest, a second loop sequence, and the second EGS.
13. The method of any preceding claim, wherein the modified ribozyme further comprises a first homology arm sequence at the 3’ end, and the nucleic acid molecule further comprises a second homology arm sequence at the 5’ end.
14. The method of any preceding claim, wherein the ratio of modified ribozyme to nucleic acid molecule is 1:1 or greater.
15. The method of any preceding claim, wherein the ratio of modified ribozyme to nucleic acid molecule is about 4:1.
16. The method of any preceding claim, wherein the concentration of the nucleic acid molecule is between about 200 nM and about 400 nM.
17. The method of any preceding claim wherein step (c) is performed for between 20 and 40 minutes.
18. The method of any preceding claim, wherein step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to between 50°C and 60°C.
19. The method of any preceding claim, wherein step (c) comprises heating the mixture of the nucleic acid molecule and the modified ribozyme to about 55°C for about 20 minutes.
20. A circular RNA obtained by the method of any one of claims 1 to 19.
21. A modified ribozyme for use in a method of circularizing an RNA molecule, the modified ribozyme comprising, in the 5’ to 3’ direction: a) a first extended guide sequence (EGS), b) a first loop sequence, and c) a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme.
22. The modified ribozyme of claim 21, wherein the truncated ribozyme further does not comprise a 5’ portion of a corresponding wild-type ribozyme.
23. The modified ribozyme of claim 21 or claim 22, wherein the corresponding wild-type ribozyme is a group I intron or group II intron.
24. A nucleic acid molecule containing a gene of interest for use in a method of circularizing the gene of interest, wherein the nucleic acid molecule comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a sequence corresponding to a 3’ portion of a ribozyme sequence, b) the gene of interest, c) a loop sequence, and d) an extended guide sequence.
25. The nucleic acid molecule of claim 24, further comprising a second bridge sequence at a 3’ end, wherein the second bridge sequence comprises a sequence corresponding to a 5’ portion of a ribozyme sequence.
26. A kit for circularizing a gene of interest, the kit comprising a modified ribozyme and a nucleic acid molecule containing the gene of interest, wherein the modified ribozyme comprises a truncated ribozyme, wherein the truncated ribozyme does not comprise a 3’ portion of a corresponding wild-type ribozyme, and wherein the nucleic acid molecule containing the gene of interest comprises, in the 5’ to 3’ direction: a) a bridge sequence comprising a 3’ portion of a ribozyme, and b) the gene of interest.
27. A kit comprising a first DNA molecule encoding the modified ribozyme of the kit of claim 26 and a second DNA molecule encoding the nucleic acid molecule containing the gene of interest of the kit of claim 26.
28. The kit of claim 26 or 27, wherein the gene of interest comprises at least one modified nucleotide.
29. The kit of any one of claims 26-28, wherein 100% of the nucleotides of the gene of interest are modified nucleotides.
30. A DNA molecule encoding a modified ribozyme as defined in any preceding claim.
31. A DNA molecule encoding a nucleic acid molecule as defined in any preceding claim, wherein the nucleic acid molecule contains a gene of interest.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB2313326.7A GB202313326D0 (en) | 2023-09-01 | 2023-09-01 | Methods for making circular RNAs |
| GB2313326.7 | 2023-09-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025046039A1 (en) | 2025-03-06 |
Family
ID=88296885
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2024/074233 Pending WO2025046039A1 (en) | 2023-09-01 | 2024-08-29 | Methods for making circular rnas |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB202313326D0 (en) |
| WO (1) | WO2025046039A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6528640B1 (en) * | 1997-11-05 | 2003-03-04 | Ribozyme Pharmaceuticals, Incorporated | Synthetic ribonucleic acids with RNAse activity |
| WO2018237372A1 (en) * | 2017-06-23 | 2018-12-27 | Cornell University | RNA MOLECULES, METHODS FOR PRODUCING CIRCULAR RNA, AND METHODS OF TREATMENT |
| WO2021158964A1 (en) * | 2020-02-07 | 2021-08-12 | University Of Rochester | Ribozyme-mediated rna assembly and expression |
| EP4116421A1 (en) * | 2021-03-10 | 2023-01-11 | Rznomics Inc. | Self-circularized rna structure |
2023
- 2023-09-01 GB GBGB2313326.7A patent/GB202313326D0/en not_active Ceased
2024
- 2024-08-29 WO PCT/EP2024/074233 patent/WO2025046039A1/en active Pending
Non-Patent Citations (22)
| Title |
|---|
| ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 405 - 410 |
| BELINSKY M G ET AL: "NON-RIBOZYME SEQUENCES ENHANCE SELF-CLEAVAGE OF RIBOZYMES DERIVED FROM HEPATITIS DELTA VIRUS", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, GB, vol. 19, no. 3, 11 February 1991 (1991-02-11), pages 559 - 564, XP000605221, ISSN: 0305-1048, DOI: 10.1093/NAR/19.3.559 * |
| BELL MA, JOHNSON AK, TESTA SM, BIOCHEMISTRY, vol. 41, 2002, pages 15327 - 33 |
| BELL MA, JOHNSON AK, TESTA SM, BIOCHEMISTRY, vol. 43, 2004, pages 4323 - 31 |
| FORD ETHAN ET AL: "Synthesis of circular RNA in bacteria and yeast using RNA cyclase ribozymes derived from a group I intron of phage T4", PROC. NATL. ACAD. SCI. USA, 22 December 1993 (1993-12-22), pages 3117 - 3121, XP093229585 * |
| GAMBILL LAUREN ET AL: "A split ribozyme that links detection of a native RNA to orthogonal protein outputs", NATURE COMMUNICATIONS, vol. 14, no. 1, 1 February 2023 (2023-02-01), UK, XP093201429, ISSN: 2041-1723, Retrieved from the Internet <URL:https://www.nature.com/articles/s41467-023-36073-3> DOI: 10.1038/s41467-023-36073-3 * |
| JECK, W. R; SHARPLESS, N. E, NAT BIOTECHNOL, vol. 32, 2014, pages 453 - 461 |
| LITKE JACOB L ET AL: "Highly efficient expression of circular RNA aptamers in cells using autocatalytic transcripts", NATURE BIOTECHNOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 37, no. 6, 8 April 2019 (2019-04-08), pages 667 - 675, XP036900696, ISSN: 1087-0156, [retrieved on 20190408], DOI: 10.1038/S41587-019-0090-6 * |
| LITKE JACOB L ET AL: "Trans ligation of RNAs to generate hybrid circular RNAs using highly efficient autocatalytic transcripts", METHODS, ACADEMIC PRESS, NL, vol. 196, 13 May 2021 (2021-05-13), pages 104 - 112, XP086875344, ISSN: 1046-2023, [retrieved on 20210513], DOI: 10.1016/J.YMETH.2021.05.009 * |
| LITKE JACOB L. ET AL: "SUPPLEMENTARY INFORMATION: Highly efficient expression of circular RNA aptamers in cells using autocatalytic transcripts", NATURE BIOTECHNOLOGY, 8 April 2019 (2019-04-08), XP093233265, Retrieved from the Internet <URL:https://static-content.springer.com/esm/art:10.1038/s41587-019-0090-6/MediaObjects/41587_2019_90_MOESM1_ESM.pdf> DOI: 10.1038/s41587-019-0090-6 * |
| LIU, C. X; CHEN, L. L, CELL, vol. 185, 2022, pages 2016 - 2034 |
| OBI PRISCA ET AL: "The design and synthesis of circular RNAs", METHODS, ACADEMIC PRESS, NL, vol. 196, 2 March 2021 (2021-03-02), pages 85 - 103, XP086875211, ISSN: 1046-2023, [retrieved on 20210302], DOI: 10.1016/J.YMETH.2021.02.020 * |
| OLSON; MULLER, RNA, vol. 18, 2012, pages 581 - 589 |
| PARDI, N ET AL., NAT REV DRUG DISCOV, vol. 17, 2018, pages 261 - 279 |
| PEARSON; LIPMAN, PNAS USA, vol. 85, 1988, pages 2444 - 2448 |
| PETKOVIC, S; MULLER, S, NUCLEIC ACIDS RES, vol. 43, 2015, pages 2454 - 2465 |
| PUTTARAJU, M; BEEN, M. D, NUCLEIC ACIDS RES, vol. 20, 1992, pages 5357 - 5364 |
| SALVO J L G ET AL: "Deletion-tolerance and trans-splicing of the bacteriophage T4 td intron - Analysis of the P6-L6a region", JOURNAL OF MOLECULAR BIOLOGY, ACADEMIC PRESS, UNITED KINGDOM, vol. 211, no. 3, 5 February 1990 (1990-02-05), pages 537 - 549, XP024011301, ISSN: 0022-2836, [retrieved on 19900205], DOI: 10.1016/0022-2836(90)90264-M * |
| SARGUEIL B, TANNER NK, J MOL BIOL., vol. 233, 1993, pages 639 - 643 |
| SMITH; WATERMAN, J. MOL BIOL., vol. 147, 1981, pages 195 - 197 |
| WESSELHOEFT, R. A ET AL., MOL CELL, vol. 74, 2019, pages 508 - 520 |
| WESSELHOEFT, R. A.; KOWALSKI, P. S; ANDERSON, D. G, NAT COMMUN, vol. 9, 2018, pages 2629 |
Also Published As
| Publication number | Publication date |
|---|---|
| GB202313326D0 (en) | 2023-10-18 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24765107; Country of ref document: EP; Kind code of ref document: A1 |