WO2024252011A2 - Circular RNAs and methods for making the same - Google Patents
Circular RNAs and methods for making the same
- Publication number
- WO2024252011A2 (PCT/EP2024/065837)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- recombinant nucleic
- acid molecule
- sequence
- eaca
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/67—General methods for enhancing the expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/12—Type of nucleic acid catalytic nucleic acids, e.g. ribozymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/50—Physical structure
- C12N2310/53—Physical structure partially self-complementary or closed
- C12N2310/532—Closed or circular
Definitions
- an important feature of the invention is the presence of an extended anticodon arm (eACA) sequence provided in the recombinant nucleic acid molecule to be circularized.
- eACA extended anticodon arm
- This sequence provides a structure that is structurally similar to the anticodon arm found in Ana tRNA, comprising a stem and a loop, and enables circularization with greater efficiency compared to existing PIE methods and methods such as those described in WO2022/191642.
- eACA sequences are sequences capable of forming a stem-loop structure as described herein.
- the recombinant nucleic acid molecules described herein may produce circular RNAs comprising only the gene of interest, without exogenous splicing sequences or unwanted spacer sequences, and may do so in a highly efficient way by utilising the propensity of the eACA sequence to form a stem-loop structure.
- EGS: extended guide sequence
- eIGS: extended internal guide sequence
- eACA: anticodon arm-like structure
- GOI: gene of interest
- IGS: internal guide sequence
- nucleic acid molecules are provided herein for making circular RNAs.
- the nucleic acid molecules are generally linear prior to circularization.
- the nucleic acid molecule for circularization comprises the gene of interest (GOI), which is the gene to be circularized, a ribozyme, capable of performing the circularization, and an extended anticodon arm (eACA) sequence, split into two portions.
- GOI gene of interest
- the ribozyme may be any ribozyme capable of acting as a trans-splicing ribozyme.
- the ribozyme is derived from or is a group I intron.
- suitable ribozymes are the Tetrahymena ribosomal intron, T4 phage thymidylate synthase intron, Anabaena (Ana) pre-tRNA intron, Azoarcus sp. BH72 Ile tRNA intron, and Staphylococcus phage Twort ribonucleotide reductase intron. Sequences of these ribozymes are shown below, with the internal guide sequence (IGS) shown in underlined, shaded letters.
- IGS internal guide sequence
- Tetrahymena ribosomal intron AAATAG CAATATTTACCTTTGGAGGGAAAAGTTATCAGGCATG CACCTG GTAG CTAGTCTTTAAAC CAATAGATTGCATCGGTTTAAAAGGCAAGACCGTCAAATTGCGGGAAAGGTCAACAGCCGTTC AGTACCAAGTCTCAGGGGAAACTTTGAGATGGCCTTGCAAAGGGTATGGTAATAAGCTGACGGA CATGGTCCTAACCACGCAGCCAAGTCCTAAGTCAACAGATCTTCTGTTGATATGGATGCAGTTCA CAGACTAAATGTCGGTCGGGGAAGATGTATTCTTCTCATAAGATATAGTCGGACCTCCTTAATG GGAGCTAGCGGATGAAGTGATGCAACACTGGAGCCGCTGGGAACTAATTTGTATGCGAAAGTAT ATTGATTAGTTTTGGAGTACTCG (SEQ ID NO: 3)
- Staphylococcus phage Twort ribonucleotide reductase intron:
- Recombinant nucleic acid molecules described herein also generally comprise an internal guide sequence (IGS).
- IGS may be part of the ribozyme, such as part of the group I intron. Generally, this enables the TRIC approach to use an intact intron. However, it is possible to use a truncated ribozyme sequence in the TRIC method, with the native IGS removed and replaced by a different IGS.
- the function of the IGS is to base pair with the end regions of the gene of interest, in order to bring them into proximity with each other so that circularization may occur.
- the IGS binds through complementary base pairing to both first and second portions of the eACA sequence, which are located at either end of the GOI to be circularised.
- An extended anticodon arm (eACA) sequence is one which is capable of forming a stem-loop structure (also known as a hairpin or hairpin loop).
- Stem-loop structures form when two regions of single-stranded RNA which are generally complementary to each other (when read in opposite directions) base-pair with each other. The base-pairing results in a double helix structure ending in an unpaired loop.
- the natural propensity of eACA sequences to form stem-loop structures may be utilised to enable circularization of a gene of interest, as shown in Figure 12.
- Prior to circularization, the linear recombinant nucleic acid molecule comprises an eACA sequence in two separate portions. A first portion of the eACA sequence is positioned at or near the 5’ end of the gene of interest, and a second portion of the eACA sequence is positioned at or near the 3’ end of the gene of interest, as shown in Figure 10A (top panel).
- splicing by the trans-ribozyme causes the first and second portions of the eACA sequence to be covalently joined, creating a circular version of the gene of interest.
- the first and second portions are joined to form the eACA sequence, which is generally capable of forming a stem-loop structure as shown in Figure 8C, Figure 10A, and Figure 12.
- the first portion of the eACA sequence may comprise a first eACA stem portion and a first eACA loop portion.
- the second portion of the eACA sequence may comprise a second eACA stem portion and a second eACA loop portion.
- By “stem portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the stem of a stem-loop structure.
- By “loop portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the loop of a stem-loop structure.
- Figure 12 shows how a stem-loop forming structure can be identified in a gene of interest, and subsequently used for circularizing the RNA.
- the stem and loop portions of the eACA sequence are capable of forming a stem-loop structure in the recombinant nucleic acid molecules described herein.
- the specific nucleotide sequence of the eACA sequence is not important for TRIC and does not determine whether circularization will occur. Rather, it is the structure as opposed to the sequence of the eACA that is important. Consequently, the eACA may be of any nucleotide sequence, provided that the last nucleotide in the second eACA loop portion is one which can form a wobble base pair with a corresponding nucleotide in the internal guide sequence described herein.
- the last nucleotide in the second eACA loop portion may be a uracil that forms a wobble base pair with a corresponding guanine in the internal guide sequence.
- the last nucleotide in the second eACA loop portion may be a cytosine that forms a wobble base pair with a corresponding adenine in the internal guide sequence.
- the first and second eACA stem portions may be complementary to each other, though this is not necessary.
- the first and second eACA stem portions are generally each at least 5 nucleotides in length but may be as short as 1 nucleotide in length each.
- the stem portion lengths may be adapted depending on the gene of interest to be circularized. For example, longer stem lengths (such as lengths greater than 15 nt) may be advantageous if circularizing long (>500 nt) genes of interest.
- the first and second stem portions may each be at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, at least 25 or at least 30 nucleotides in length.
- the first and second stem portions may each be at least 15 or at least 25 nucleotides in length.
- the first and second eACA stem portions may be 1 to 50 nucleotides in length, for example 5 to 40 nucleotides.
- the first and second stem portions need not be the same length, for example one stem portion may be one or two nucleotides shorter than the other, provided a stem-loop structure can still be formed.
- the anticodon arm loop of the Ana tRNA group I intron is naturally 7 nucleotides in length. Consequently, in the circular RNAs described herein, the loop of the stem-loop structure may be 7 nucleotides in length, particularly when the ribozyme used is, or is derived from, the Ana group I intron. If other group I introns are used, the loop of the stem-loop structure may have a different nucleotide length.
- the loop of the stem-loop structure may be between 3 and 40 nucleotides in length and is generally between 3 and 10 nucleotides in length.
- the loop of the stem-loop structure is at least 3 nucleotides in length, at least 4 nucleotides in length or at least 5 nucleotides in length, particularly if the ribozyme is or is derived from the Ana group I intron. In some embodiments, the loop of the stem-loop structure is generally between 5 and 11 nucleotides in length.
- the first eACA loop portion comprises 4 nucleotides
- the second eACA loop portion comprises 3 nucleotides.
- the first and second eACA stem portions may each be at least 15 nucleotides in length
- the first eACA loop portion may be 4 nucleotides in length
- the second eACA loop portion may be 3 nucleotides in length.
- the first portion of the eACA sequence may comprise as few as 5 nucleotides (for example, 1 stem nucleotide and 4 loop nucleotides).
- the second portion of the eACA sequence may comprise as few as 4 nucleotides (for example, 1 stem nucleotide and 3 loop nucleotides).
- the first portion of the eACA sequence may comprise, for example, 19 nucleotides (e.g. 15 stem nucleotides and 4 loop nucleotides), or 29 nucleotides (e.g. 25 stem nucleotides and 4 loop nucleotides).
- the second portion of the eACA sequence may comprise, for example, 18 nucleotides (e.g. 15 stem nucleotides and 3 loop nucleotides), or 28 nucleotides (e.g. 25 stem nucleotides and 3 loop nucleotides).
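The portion lengths quoted above are the stem length plus a 4-nucleotide first loop portion or a 3-nucleotide second loop portion. The following Python sketch reproduces that arithmetic. The function name and its default arguments are illustrative assumptions, not part of the described method.

```python
def eaca_portion_lengths(stem_nt, first_loop_nt=4, second_loop_nt=3):
    """Return (first_portion_nt, second_portion_nt) for a given stem length.

    Hypothetical helper: the defaults reflect the 4 + 3 loop split
    described in the text; other splits may apply to other ribozymes.
    """
    if stem_nt < 1:
        raise ValueError("each stem portion needs at least 1 nucleotide")
    return stem_nt + first_loop_nt, stem_nt + second_loop_nt


# Stem lengths mentioned in the text: 1, 15 and 25 nucleotides.
for stem in (1, 15, 25):
    first, second = eaca_portion_lengths(stem)
    print(f"stem {stem} nt: first portion {first} nt, second portion {second} nt")
```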
- An exemplary first portion of the eACA sequence comprising a 1 nucleotide stem and a 4 nucleotide loop may comprise the nucleotide sequence 5’-NNNNN-3’, wherein N is any nucleotide
- an exemplary second portion of the eACA sequence comprising a 1 nucleotide stem and a 3 nucleotide loop may comprise the nucleotide sequence 5’-NNNU-3’, wherein N is any nucleotide.
- a possible ACA sequence is GATCACCACTTTAAGGTGATC (SEQ ID NO: 58).
- the NLuc GOI is rearranged such that the ACA sequence is provided in two portions (a 5’ first portion and a 3’ second portion).
- the first portion of the eACA sequence comprises the sequence TTAAGGTGATC (SEQ ID NO: 59).
- the second portion of the eACA sequence comprises the sequence GATCACCACT (SEQ ID NO: 60).
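As a consistency check on the three sequences above, the sketch below joins the second portion to the first portion (the order in which they become adjacent after circularization) and counts Watson-Crick pairs inward from the ends. The helper name `stem_length` is an illustrative assumption. Only strict Watson-Crick pairs are counted, so this is not a structure prediction.

```python
# Sequences quoted in the text, in the DNA alphabet used there.
FULL_EACA = "GATCACCACTTTAAGGTGATC"   # SEQ ID NO: 58
FIRST_PORTION = "TTAAGGTGATC"         # SEQ ID NO: 59 (5' of the GOI)
SECOND_PORTION = "GATCACCACT"         # SEQ ID NO: 60 (3' of the GOI)

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stem_length(seq):
    """Count Watson-Crick pairs from the outer ends inward (hypothetical helper)."""
    n = 0
    while 2 * (n + 1) <= len(seq) and COMPLEMENT[seq[n]] == seq[-1 - n]:
        n += 1
    return n


# After splicing, the 3' end of the second portion is joined to the
# 5' end of the first portion.
joined = SECOND_PORTION + FIRST_PORTION
stem = stem_length(joined)
loop = joined[stem:len(joined) - stem]

print(joined == FULL_EACA)  # the two portions reconstitute SEQ ID NO: 58
print(stem, loop)           # stem length and loop sequence
```

Under this simple pairing rule the joined sequence has a 7 base pair stem and a 7 nucleotide loop, and the last nucleotide of the second portion is the third nucleotide of the loop.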
- the first and second eACA loop portions base pair with the internal guide sequence (IGS) to form the P1 and P10 regions, which are critical for ribozyme activity.
- the first eACA loop portion positioned towards the 5’ end of the gene of interest, base pairs with the IGS to form the P10 region. It is not necessary for all the nucleotides in the first eACA loop portion to form the P10 region, and in some cases only two nucleotides of the first eACA loop portion form the P10 region.
- the second eACA loop portion positioned towards the 3’ end of the gene of interest, base pairs with the IGS to form the P1 region.
- the last nucleotide of the second eACA loop portion may form a wobble base pair with a corresponding nucleotide in the IGS.
- the wobble base pair is a GU wobble base pair, with G in the IGS and U in the second eACA loop portion or an AC wobble base pair with A in the IGS and C in the second eACA loop portion.
- the wobble base pair provides the circularization site, such that once circularized, the nucleotide at the 3’ end of the second eACA loop portion forms the third nucleotide in the loop of the eACA stem-loop structure. This is depicted in Figure 8C.
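The two wobble pairs named above can be written as a small lookup. The sketch below is a minimal illustration, assuming sequences may be given in either the DNA or RNA alphabet; the function names are hypothetical.

```python
# (IGS nucleotide, eACA nucleotide) pairs described in the text.
WOBBLE_PAIRS = {("G", "U"), ("A", "C")}


def to_rna(seq):
    """Normalise a DNA- or RNA-alphabet string to upper-case RNA."""
    return seq.upper().replace("T", "U")


def candidate_igs_partners(second_portion):
    """IGS nucleotides that could wobble-pair with the last nucleotide
    of the second eACA portion (empty list if none of the described
    pairs applies)."""
    last = to_rna(second_portion)[-1]
    return sorted(igs for igs, eaca in WOBBLE_PAIRS if eaca == last)


print(candidate_igs_partners("GATCACCACT"))  # second portion from SEQ ID NO: 60
```

A second portion ending in U (T in the DNA alphabet) calls for a G in the IGS, one ending in C calls for an A, and any other ending returns no candidate under the two pairs listed here.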
- the P1 region may also be formed by base pairing of the IGS with a region adjacent to the second eACA loop portion in the 3’ direction, known as the “P1 extension”. If present, the P1 extension typically comprises between 2 and 4 nucleotides, which base pair with the IGS. The P1 region may therefore be formed by the P1 extension and second eACA loop portion base pairing with the IGS, as shown in Figure 10A. Consequently, in some embodiments, the second portion of the eACA sequence and the P1 extension together are capable of forming a P1 region. If the P1 extension is not present, the P1 region is formed by only the second eACA loop portion base pairing with the IGS. P1 extensions have been described in Olson & Muller (2012) RNA 18:581-589, the contents of which are incorporated herein by reference. Generally, if an extended guide sequence (EGS) is used, the P1 extension region will be present.
- EGS extended guide sequence
- a particular advantage of the TRIC method is that it may utilise eACA sequences which are already present in the gene of interest. For example, if a stem-loop forming eACA sequence can be found in a gene of interest, this gene may be circularized efficiently without introducing any additional sequences. This in turn means that the resulting circular RNA is far less likely to be immunogenic.
- Figure 12 shows first the identification of an eACA sequence (i.e. a stem-loop forming structure) in a gene of interest. Subsequently, the gene of interest is rearranged such that the eACA sequence is split into two portions, one at each end of the gene of interest. This rearranged gene is then cloned into the TRIC construct for circularization.
- This circular RNA already comprises an eACA sequence in its natural sequence. This means a circularization site can be introduced using the naturally occurring eACA sequence, without the need to perform mutations or introduce additional sequence.
- Codon redundancy means that mutations may be made to the nucleotide sequence of the GOI without affecting the resulting peptide sequence. Consequently, an eACA sequence may be provided in the GOI without requiring the introduction of additional sequences. Instead, only selective mutation of the existing sequence is needed, following the rules of codon redundancy. Examples of this are the T2A-EGFP and circZNF609 circular RNAs described in Example 5, in which mutations are introduced based on codon redundancy to introduce the circularization site.
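The codon redundancy argument can be checked mechanically: a mutated coding sequence is acceptable in this sense if it translates to the same peptide. The sketch below uses the standard genetic code; the example sequences are invented for illustration and are not taken from Example 5.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def same_peptide(original, mutated):
    """True if both coding sequences encode the same peptide."""
    return translate(original) == translate(mutated)


# Hypothetical example: GGT -> GGC is silent (both glycine),
# GGT -> GAT is not (glycine -> aspartate).
print(same_peptide("ATGGGTTAA", "ATGGGCTAA"))
print(same_peptide("ATGGGTTAA", "ATGGATTAA"))
```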
- the circularization site may be created by introducing additional nucleotides. For example, as shown in Figure 8C, 5 nucleotides (light grey nt) may be introduced to create a stem portion of the eACA sequence, using the existing sequence (black nt) of the GOI to provide the remainder of the eACA.
- the first and/or second portions of the eACA sequence may naturally occur in the gene of interest. In other words, they may be part of the gene of interest and so are present without having to mutate the existing sequence or introduce additional sequence.
- all or part of the eACA sequence may be derived from human ribosomal RNA (rRNA).
- rRNA ribosomal RNA
- the use of human rRNA has the potential to provide circular RNAs which are less immunogenic.
- the recombinant nucleic acids described herein may further comprise an extended guide sequence (EGS), in particular a first EGS and a second EGS which are capable of complementary base pairing to each other.
- the function of the EGS is to increase the length of the complementary base-pairing region at the two ends of the recombinant nucleic acid molecule. In this way, the EGS may be included to compensate for a shorter IGS, in particular when longer (>500 nt) GOIs are circularized.
- the recombinant nucleic acids described herein may comprise a first EGS positioned 5’ of the IGS.
- the recombinant nucleic acids described herein may comprise a second EGS positioned 3’ to the second portion of the eACA sequence.
- there is a loop sequence situated between the first EGS and the IGS as described in more detail below.
- the first and second EGS may be partly or fully complementary to each other. Generally, mismatches are tolerated well and do not materially affect circularization.
- the first EGS may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or 100% complementary to the second EGS.
- the first and second EGS may each be between 1 and 500 nucleotides in length.
- the first and second EGS may each be between 10 and 50 nucleotides in length.
- the first and second EGS may each be 20, 30, or 40 nucleotides in length.
- An exemplary first EGS sequence is GGUCAAUCGGUUGGCUUCCG (SEQ ID NO: 56).
- An exemplary second EGS sequence is CGGAAGCCAACCGAUUGACC (SEQ ID NO: 57).
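The degree of complementarity between two EGS sequences can be estimated by pairing them antiparallel and counting Watson-Crick matches. The sketch below does this for the two exemplary sequences. The scoring rule (matches divided by the longer length, no gaps, no GU pairs) is an assumption made for illustration; the text does not define how percent complementarity is calculated.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

FIRST_EGS = "GGUCAAUCGGUUGGCUUCCG"    # SEQ ID NO: 56
SECOND_EGS = "CGGAAGCCAACCGAUUGACC"   # SEQ ID NO: 57


def percent_complementarity(first, second):
    """Percent of positions forming Watson-Crick pairs when the two
    strands are aligned antiparallel (first[i] against second[-1 - i])."""
    overlap = min(len(first), len(second))
    matches = sum(
        RNA_COMPLEMENT[first[i]] == second[-1 - i] for i in range(overlap)
    )
    return 100.0 * matches / max(len(first), len(second))


print(percent_complementarity(FIRST_EGS, SECOND_EGS))
```

With this rule the two exemplary sequences are fully complementary, and a single substitution in a 20-nucleotide EGS lowers the score to 95%.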
- the recombinant nucleic acids described herein may further comprise loop sequences, such as a first loop sequence and a second loop sequence.
- the first and second loops may act as spacers between, at the 5’ end, the internal guide sequence (IGS) and the first extended guide sequence (EGS), and at the 3’ end, the P1 region and the second EGS.
- the loop sequences are preferably not complementary to each other, such that there is little or no base pair interaction between the first and second loop sequences. Because of the low or noncomplementarity between the two loop sequences, the base-paired P1 region remains at a fixed length.
- the first loop may alternatively be described herein as “left loop”.
- the second loop may alternatively be described herein as “right loop”.
- the first and second loop sequences may each be between 1 and 10 nucleotides in length. It is not necessary for the first and second loop sequences to have the same number of nucleotides, and in fact the TRIC method works well when the first and second loop sequences are different lengths.
- the first loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length.
- the second loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length.
- a preferred combination is a 6 nucleotide first loop sequence and a 5 nucleotide second loop sequence.
- Another preferred combination is a 3 nucleotide first loop sequence and a 2 nucleotide second loop sequence.
- the first loop sequence is positioned 3’ to the first EGS and 5’ to the IGS, in other words, in between the first EGS and the IGS.
- the second loop sequence is positioned 3’ to the second portion of the eACA sequence, and 5’ to the second EGS, in other words, between the second portion of the eACA sequence and the second EGS. If the P1 extension region is present, the second loop sequence is 3’ to the P1 extension.
- the recombinant nucleic acids described herein may comprise only a first loop sequence, positioned in between the first EGS and the IGS, without a second loop sequence.
- the recombinant nucleic acids described herein may comprise only the loop sequence positioned between the second portion of the eACA sequence (or, if present, P1 extension) and the second EGS.
- An exemplary sequence for the first loop sequence is AAATAA (SEQ ID NO: 54).
- An exemplary sequence for the second loop sequence is ACACC (SEQ ID NO: 55).
- the gene of interest refers to the sequence which is to be circularized.
- the GOI may comprise a coding sequence, coding for a peptide or protein, or may be a noncoding sequence.
- the GOI may also comprise a combination of coding and noncoding sequence.
- the term “gene of interest” encompasses sequences which include additional sequence elements, for example, a translation initiation element, such as an internal ribosome entry site (IRES) sequence, multiple siRNA target sites (msiTS), spacer sequences such as polyAC sequences, start codons, stop codons, and any other sequence elements known to be useful in the art for producing circular RNA.
- the GOI may comprise, in the 5’ to 3’ direction: a stop codon, a polyAC sequence, multiple siRNA target sites (msiTS), an IRES, a start codon, and the coding sequence including the eACA.
- the GOI may comprise, in the 5’ to 3’ direction: multiple siRNA target sites (msiTS), an IRES, a start codon, a coding sequence, a stop codon, a polyAC sequence, and the eACA. See, for example, Figures 10A and 10B.
- IRESs for use in the invention may include viral IRESs, such as the Coxsackievirus B3 (CVB3), Cafeteria roenbergensis virus (CroV), or Classical Swine Fever Virus (CSFV) IRES, the DNA sequences of which are set out below.
- CVB3 Coxsackievirus B3
- CroV Cafeteria roenbergensis virus
- CSFV Classical Swine Fever Virus
- a viral IRES may be modified to remove stop codons in open reading frames.
- Suitable modified viral IRESs include modified CSFV IRESs.
- a modified CSFV IRES may for example comprise the DNA sequence of SEQ ID NO: 96; nucleotides 324 to 696 of SEQ ID NO: 75; or nucleotides 596 to 968 of SEQ ID NO: 77. Modified IRESs may be useful in the rolling circle translation of the circular RNA as described herein.
- TRIC is suitable for genes of interest of any length and is particularly suitable for long genes of interest.
- long is generally considered to mean a sequence of at least 500 nucleotides.
- the gene of interest may be at least 100, at least 250, at least 500, at least 1000, at least 2000, at least 3000, at least 4000, at least 5000, at least 6000, at least 7000, or at least 8000 nucleotides in length.
- recombinant nucleic acid molecules for making a circular RNA as described herein may comprise, in the 5’ to 3’ direction: a) an internal guide sequence (IGS), b) a ribozyme, c) a first portion of an extended anticodon arm (eACA) sequence, d) a gene of interest, and e) a second portion of the eACA sequence, wherein a nucleotide in the second portion of the eACA sequence forms a wobble base pair with a nucleotide in the IGS.
- recombinant nucleic acids described herein may comprise, in a 5’ to 3’ direction: a) a first EGS, b) a first loop sequence, c) an internal guide sequence (IGS), d) a ribozyme, e) a first portion of an extended anticodon arm (eACA) sequence, f) a gene of interest, g) a second portion of the eACA sequence, h) a P1 extension, i) a second loop sequence, and j) a second EGS.
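The 5’ to 3’ element order listed above lends itself to a simple assembly routine. The sketch below is illustrative only: the element names are labels invented here, and treating the EGS, loop, and P1 extension elements as optional is an inference from comparing the shorter and longer layouts described in the text.

```python
# 5' -> 3' order of the full layout described in the text.
ELEMENT_ORDER = [
    "first_EGS", "first_loop", "IGS", "ribozyme", "first_eACA_portion",
    "GOI", "second_eACA_portion", "P1_extension", "second_loop", "second_EGS",
]
# Elements absent from the shorter layout (assumed optional here).
OPTIONAL = {"first_EGS", "first_loop", "P1_extension", "second_loop", "second_EGS"}


def assemble_construct(elements):
    """Concatenate the supplied elements in the described 5' -> 3' order."""
    unknown = set(elements) - set(ELEMENT_ORDER)
    if unknown:
        raise ValueError(f"unknown elements: {sorted(unknown)}")
    missing = [n for n in ELEMENT_ORDER if n not in OPTIONAL and n not in elements]
    if missing:
        raise ValueError(f"missing required elements: {missing}")
    return "".join(elements[name] for name in ELEMENT_ORDER if name in elements)


# Placeholder tokens stand in for real sequences.
minimal = assemble_construct({
    "GOI": "[GOI]", "IGS": "[IGS]", "ribozyme": "[RZ]",
    "first_eACA_portion": "[eACA-1]", "second_eACA_portion": "[eACA-2]",
})
print(minimal)
```

Passing the elements as a mapping means the caller does not need to remember the order, and a misspelled element name fails loudly rather than being silently dropped.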
- the first and second eACA stem portions may be 15 or 25 base pairs and the loop of the resulting stem-loop structure may be 7 nucleotides.
- the recombinant nucleic acids described herein may further comprise elements to facilitate the transcription or circularization process.
- the recombinant nucleic acids described herein may further comprise a T7 high efficiency sequence, one or more restriction enzyme cleavage sites, and/or a poly(A) tail.
- where the recombinant nucleic acid is a DNA template, it may further comprise a T7 promoter sequence.
- a recombinant nucleic acid may further comprise a nucleotide sequence encoding a self-cleaving peptide to ensure the production of monomeric protein during rolling circle amplification. Such additional elements are known in the art.
- the recombinant nucleic acids described herein may lack stop codons.
- the recombinant nucleic acids may be engineered or modified to remove stop codons in open reading frames. This may facilitate rolling circle amplification.
Methods for producing a circular RNA
- methods for producing a circular RNA comprise the provision of a linear recombinant nucleic acid molecule, such as those described herein, and the splicing of that molecule to generate a circular RNA.
- methods for producing a circular RNA may comprise: a) providing a recombinant nucleic acid molecule as described herein, and b) circularizing the recombinant nucleic acid molecule.
- recombinant nucleic acid molecule may refer to either a DNA template molecule or an RNA precursor.
- the recombinant nucleic acid molecule is a linear DNA template molecule.
- in vitro transcription is performed using the DNA template molecule to obtain a linear RNA precursor molecule.
- a circularization/splicing step is performed with the RNA precursor, to generate a circular RNA molecule.
- methods for producing a circular RNA may comprise: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor.
- some splicing may occur during transcription, known as co-transcriptional splicing.
- the methods described herein may additionally comprise the suppression of splicing (and circularization) during step (b) (the transcription step).
- the methods described herein may comprise performing step (b) in the presence of NTPs at a concentration of approximately 24 mM, and Mg2+ at a concentration less than 18 mM, less than 16 mM, less than 14 mM, or less than 12 mM.
- the transcribing step is performed in the presence of nucleoside triphosphates (NTPs) at a concentration of approximately 24 mM and Mg2+ at a concentration of 16 mM or less. If the concentration of NTPs is more or less than 24 mM, the concentration of Mg2+ may also vary.
- NTPs nucleoside triphosphates
- Methods for producing a circular RNA may therefore comprise: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor; wherein step (b) is performed in the presence of Mg2+ at a concentration of 16 mM or less and NTPs at a concentration of approximately 24 mM.
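The transcription condition above can be expressed as a simple check. This sketch assumes a tolerance of 1 mM around the 24 mM NTP figure to interpret "approximately"; that tolerance and the function name are illustrative assumptions. Because the text says only that the Mg2+ concentration may vary when the NTP concentration differs, the function returns `None` rather than guessing in that case.

```python
from typing import Optional

NTP_REFERENCE_MM = 24.0   # NTP concentration given in the text
MG_MAX_MM = 16.0          # upper Mg2+ bound given for that NTP concentration


def mg_within_described_range(
    ntp_mM: float, mg_mM: float, ntp_tolerance_mM: float = 1.0
) -> Optional[bool]:
    """True/False if the NTP concentration is about 24 mM; None otherwise,
    since the text gives no Mg2+ bound for other NTP concentrations."""
    if abs(ntp_mM - NTP_REFERENCE_MM) > ntp_tolerance_mM:
        return None
    return mg_mM <= MG_MAX_MM


print(mg_within_described_range(24.0, 12.0))
print(mg_within_described_range(24.0, 18.0))
print(mg_within_described_range(30.0, 16.0))
```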
- methods for producing circular RNAs may comprise: a) providing a recombinant nucleic acid molecule as described herein, and b) transcribing and circularizing the recombinant nucleic acid molecule, wherein splicing occurs co-transcriptionally.
- an advantage of the TRIC method is that it can be used to circularise genes of interest that naturally comprise a sequence that is capable of forming an eACA stem-loop. This enables a circular RNA to be created that does not contain exogenous sequence material (such as exon sequences), which are frequently immunogenic, and so provides clear benefits over existing methods such as PIE.
- methods for producing a circular RNA may comprise: a) identifying a gene of interest comprising a sequence capable of forming an eACA stem-loop, b) preparing a recombinant nucleic acid molecule comprising, in a 5’ to 3’ direction: an internal guide sequence (IGS) - a ribozyme - a sequence encoding the gene of interest, and c) circularizing the recombinant nucleic acid molecule.
- Step (b) above may comprise rearranging the gene of interest to place a first portion of the sequence capable of forming an eACA stem-loop at the 5’ end, and a second portion of the sequence capable of forming an eACA stem-loop at the 3’ end.
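The rearrangement in step (b) can be pictured as a circular permutation of the gene of interest: the sequence is cut inside the eACA loop and the two halves are swapped, so that rejoining the ends restores the original order. The sketch below assumes this reading and a cut after the third loop nucleotide. The flanking sequence in the example is invented; only the eACA (SEQ ID NO: 58) comes from the text.

```python
def rearrange_for_circularization(goi, eaca, stem_nt, second_loop_nt=3):
    """Rotate `goi` so that it starts with the first eACA portion and
    ends with the second eACA portion (hypothetical helper)."""
    start = goi.find(eaca)
    if start < 0:
        raise ValueError("eACA sequence not found in the gene of interest")
    # Cut after the 5' stem strand plus the second loop portion.
    cut = start + stem_nt + second_loop_nt
    return goi[cut:] + goi[:cut]


EACA = "GATCACCACTTTAAGGTGATC"             # SEQ ID NO: 58
example_goi = "ATGGCC" + EACA + "GGATAA"   # invented flanks, for illustration

rearranged = rearrange_for_circularization(example_goi, EACA, stem_nt=7)
print(rearranged)
```

In this example the rearranged sequence begins with TTAAGGTGATC and ends with GATCACCACT, matching the first and second portions given as SEQ ID NO: 59 and SEQ ID NO: 60.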
- Suitable genes of interest comprising a sequence capable of forming an eACA stem-loop may be identified by the skilled person using approaches known in the art, including the use of software such as RNAfold (University of Vienna).
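For orientation only, the sketch below shows a very naive way to scan a sequence for candidate stem-loops with a fixed stem and loop length. It is not RNAfold and performs no thermodynamic folding: it only looks for perfectly complementary Watson-Crick stems, so it would miss structures with mismatches or GU pairs. The function name, parameters, and example flanking sequence are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_perfect_stem_loops(seq, stem_nt=7, loop_nt=7):
    """Return start positions of windows whose first `stem_nt` bases are the
    reverse complement of their last `stem_nt` bases, separated by `loop_nt`."""
    window = 2 * stem_nt + loop_nt
    hits = []
    for start in range(len(seq) - window + 1):
        five_prime = seq[start:start + stem_nt]
        three_prime = seq[start + window - stem_nt:start + window]
        if reverse_complement(five_prime) == three_prime:
            hits.append(start)
    return hits


# Invented flanks around the eACA of SEQ ID NO: 58.
example = "ATGGCC" + "GATCACCACTTTAAGGTGATC" + "GGATAA"
print(find_perfect_stem_loops(example))
```

A dedicated folding tool remains the appropriate choice for real sequences; this scan only illustrates what "a sequence capable of forming a stem-loop" means at the sequence level.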
- Methods for producing a circular RNA may also comprise the expression of the recombinant nucleic acid molecule in a cell, with subsequent circularization being performed in the cell.
- Circular RNAs can be generated using the recombinant nucleic acid molecules described herein.
- the circular RNAs comprise a sequence encoding a gene of interest.
- One advantage of the described circular RNAs is that, in some cases, they do not comprise exogenous splicing sequences, unlike circular RNAs obtained with the PIE method, which typically comprise exogenous exon sequences.
- Circular RNAs described herein may not comprise RNA from a circularization agent.
- circular RNAs described herein may not contain any RNA from a ribozyme, in particular from a group I intron.
- the circular RNAs described herein may comprise only the gene of interest. This in turn reduces the immunogenicity of the circular RNAs compared to circular RNAs generated using methods of the art (such as the PIE method).
- the circular RNAs described herein may be less immunogenic compared to circular RNAs (e.g. those encoding the same GOI) which comprise exogenous exon sequences.
- By less immunogenic, it is meant that transfection with the circular RNA produces less of an immune response by the host cell, for example, reduced production of cytokines, chemokines or other immune signalling molecules. Suitable methods for determining immunogenicity are known in the art. For example, immunogenicity can be determined by measuring the production of immune factors (cytokines, chemokines, etc.) in transfected cells for a period of time following transfection.
- the sequence encoding the gene of interest may comprise a sequence capable of forming an eACA stem-loop structure as described herein.
- the sequence capable of forming an eACA stem-loop structure may be naturally occurring in the gene of interest, or may have been introduced prior to circularization.
- Also described are circular RNAs obtainable by the methods disclosed herein.
- a circular RNA obtainable by a method comprising: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor.
- Circular RNAs obtained from the recombinant nucleic acids described herein may be used in various ways to exert a therapeutic effect.
- circular RNAs can act as a “sponge” for micro RNAs (miRNA), in turn preventing or reducing the degradation of a target mRNA by the miRNA.
- Circular RNAs may also influence protein trafficking and subcellular protein localisation. The specific therapeutic effect will be determined by the gene of interest, including whether the gene of interest is a coding or noncoding sequence.
- circular RNAs can also be used to drive expression of a gene of interest in vivo or in vitro.
- a recombinant nucleic acid molecule as described herein is circularized to provide a circular RNA comprising the gene of interest, and the circular RNA is administered to the cell.
- Circular RNAs obtained from the recombinant nucleic acid molecules as described herein may be used to drive expression of a gene of interest through rolling circle amplification.
- Suitable circular RNAs may lack in-frame stop codons.
- a circular RNA may comprise a modified viral IRES as described herein which lacks in frame stop codons.
- the circular RNA may be translated continuously multiple times by a ribosome to generate polyproteins.
- a suitable circular RNA may further comprise a DNA sequence encoding a self-cleaving peptide.
- the self-cleaving peptide causes ribosomal skipping during translation of the circular RNA by a ribosome to generate monomeric proteins.
- Suitable self-cleaving peptides are well-known in the art and include 2A peptides, such as T2A, P2A, E2A and F2A.
- Described herein are methods for treating a disease in a subject, the methods comprising (a) circularizing a recombinant nucleic acid molecule as described herein to provide a circular RNA, and (b) administering the circular RNA to the subject.
- recombinant nucleic acid molecules such as those described herein for use as a medicament; recombinant nucleic acid molecules as described herein for use in a method of treating a disease in a subject; circular RNA obtained from a recombinant nucleic acid molecule as described herein for use as a medicament; and circular RNA obtained from a recombinant nucleic acid molecule as described herein for use in a method of treating a disease in a subject
- compositions comprising a circular RNA or a recombinant nucleic acid molecule as described herein, and a pharmaceutically acceptable excipient.
- DNA templates were linearized and cleaned by phenol:chloroform:isoamyl alcohol extraction.
- IVT was performed with 50 ng/µl of DNA template, 14 µg/µl of homemade T7 polymerase, 0.04 U/µl of RNase inhibitor (Promega), 6 mM of each NTP and 1X IVT buffer.
- 1X IVT buffer contains 80 mM HEPES-K (pH 7.5), 2 mM spermidine, 40 mM DTT and 24 mM MgCl2.
- the concentration of MgCl2 in 1X IVT buffer is 14 mM.
- IVT reactions were incubated at 37 °C for 3-5 h and then digested with RNase-free DNase I for 20 min. Afterwards, 100 mM EDTA was added to a final concentration of 25 mM to clear any precipitate. An equal volume of 7.5 M lithium chloride was then added to precipitate the RNAs for 30 min to overnight at -20 °C. The precipitates were spun down at 13,000 rpm for at least 20 min. RNA pellets were washed with 75% alcohol, air dried and dissolved in DEPC-treated H2O.
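As a worked example of the EDTA step: bringing a reaction to 25 mM EDTA from a 100 mM stock means adding one third of the reaction volume, because the added stock also dilutes the mixture. The sketch below is illustrative only. It assumes volumes are simply additive and that the reaction contains no EDTA beforehand; the function name and the 20 µl reaction volume are hypothetical and do not come from the protocol.

```python
def stock_volume_to_add(reaction_volume_ul, stock_mm=100.0, final_mm=25.0):
    """Volume of stock (in µl) to add so that the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (reaction_volume_ul + v) for v,
    assuming additive volumes and no EDTA in the starting reaction.
    """
    if final_mm >= stock_mm:
        raise ValueError("final concentration must be below the stock concentration")
    return reaction_volume_ul * final_mm / (stock_mm - final_mm)


# Hypothetical 20 µl IVT reaction (volume chosen for illustration only).
edta_ul = stock_volume_to_add(20.0)
print(f"Add {edta_ul:.2f} µl of 100 mM EDTA")
```

The same function covers other stock and target concentrations by changing the two keyword arguments.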
- RNAs were circularized during IVT.
- DNA template digested IVT reactions were supplied with extra GTP at 2 mM concentration and heated at 55 °C for 20min.
- The concentration of ribozyme was fixed at 1 uM and the concentration of GTP was varied from 1 uM to 2000 uM. At each concentration of GTP, the time course of circularization of TRIC-V2 and PIE was monitored. The time to 50% circularization (t1/2) was then calculated for each sample and used to estimate the initial circularization speed (Vobs) of each construct at each GTP concentration. Vobs was then plotted against GTP concentration to calculate the kinetic parameters of TRIC-V2.
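The kinetic analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' analysis. The source does not give the formula relating t1/2 to Vobs, so the sketch assumes a first-order approximation, Vobs = ln(2)/t1/2. It also assumes the Michaelis-Menten form Vobs = Vmax·[GTP]/(Km + [GTP]) and fits it with a Hanes-Woolf linearization and ordinary least squares, a simple stand-in for nonlinear regression. All function names are hypothetical and the data are synthetic, generated from made-up parameters.

```python
import math


def vobs_from_half_time(t_half_min):
    """Approximate rate from the 50% completion time (assumed first-order)."""
    return math.log(2) / t_half_min


def fit_michaelis_menten(gtp_um, vobs):
    """Estimate (Vmax, Km) via the Hanes-Woolf linearization.

    [S]/v = [S]/Vmax + Km/Vmax, so a straight-line fit of [S]/v against [S]
    has slope 1/Vmax and intercept Km/Vmax.
    """
    xs = list(gtp_um)
    ys = [s / v for s, v in zip(gtp_um, vobs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data from made-up parameters (not measured values).
true_vmax, true_km = 0.5, 40.0
gtp = [1, 10, 50, 200, 1000, 2000]  # uM, spanning the range used in the text
t_half = [math.log(2) * (true_km + s) / (true_vmax * s) for s in gtp]
rates = [vobs_from_half_time(t) for t in t_half]
vmax, km = fit_michaelis_menten(gtp, rates)
print(f"Vmax ~ {vmax:.3f} per min, Km ~ {km:.1f} uM")
```

With real, noisy data a weighted or nonlinear fit would usually be preferable, since linearizations distort the error structure.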
- The reverse transcriptase and DNA polymerase used here are SuperScript IV Reverse Transcriptase (Thermo Fisher) and Q5 High-Fidelity DNA Polymerase (NEB). The manufacturers’ manuals were followed for reverse transcription and PCR. The IVT sample and RNA I and II were used as templates for reverse transcription and PCR using the primers indicated in Figure 2B.
- RNA clean & concentrator kit ZYMO RESEARCH
- A549 and HEK 293F cells were cultured in DMEM (with 10% FBS, High Glucose GlutaMAX, Life Technologies Ltd) and Freestyle (Gibco) media, respectively.
- Nluc expression and siRNA knockdown studies: 50 ng of circular Nluc and an equimolar amount of linear mRNA were transfected into 10,000 cells in 96-well plates using the MessengerMax transfection reagent. For protein expression, 5 µl of cells were taken for luciferase assay using the Nano-Glo® Luciferase Assay System (Promega). For siRNA knockdown, siRNAs were transfected into cells using RNAiMax (Invitrogen) on day 3. Five hours after siRNA transfection, expression of Nluc was measured using the above kit.
- the inventors selected the tRNA Leu intron from the cyanobacterium Anabaena (Ana) for the initial test of the TRIC approach. This first construct was designated TRIC-V0.
- Ana is short (249 nt) but highly active.
- the Ana intron divides the leucine transfer RNA (tRNA) into a 34 nt left half and a 51 nt right half (L34/R51) in the anticodon arm (ACA) (Figure 2A, insert labelled “tRNA”).
- the inventors set out to determine whether using some of the leucine tRNA sequence would enable circularization.
- from the Leu tRNA anticodon arm, an L15/R30 portion was retained and joined on either side of a gene of interest (in this case a 3*Flag coding sequence) for circularization.
- the construct is shown in Figure 2B (SEQ ID NO: 14).
- Both the major species I and II contained a fast-moving minor species, and both run faster than themselves in the 12% PAGE, indicating that they are circular RNAs. Since the minor bands in I and II run at the same place as linear intron and nicked 3*Flag, respectively, the inventors concluded that species I is circular intron and species II is circular 3*Flag (the gene of interest). To further confirm the circular identity of 3*Flag, the inventors performed reverse transcription followed by PCR (RT-PCR) on the IVT sample and species I and II (Figure 2D). As expected, a 109 bp DNA product was produced from IVT and II but not I. The inventors then cloned this PCR product into a sequencing vector and performed Sanger sequencing.
- RT-PCR reverse transcription followed by PCR
- the internal guide sequence (IGS) of the Ana group I intron is short (5 nucleotides), and the inventors speculated that a short IGS might make it difficult to circularize long genes of interest.
- the inventors introduced an extended guide sequence (EGS) at the 5’ end of the TRIC, which could form a 20 nucleotide base-paired structure with a corresponding region at the 3’ end (Figure 3A). This construct was designated TRIC-V1.
- the sequence connecting the IGS and EGS constitutes an internal loop, which the inventors also optimised in TRIC-V1.
- Three constructs were created, each containing a different loop configuration.
- TRIC-V1.0 SEQ ID NO: 15
- V1.1 SEQ ID NO: 16
- V1.2 SEQ ID NO: 17
- the 3’ loop length was 3 nucleotides in V1.1 and V1.2, whilst the 3’ loop length was 5 nucleotides in V1.0.
- the efficiency (that is, the ratio of full-length precursor to circular GOI) of the three V1 variants was similar, since full-length precursors mostly converted to circular RNAs during in vitro transcription.
- One advantage of the V1 variants compared to V0 is that the amount of circular intron is reduced. Reduction of circular introns is beneficial because these circular introns cannot be removed by RNase R digestion.
- V1.1 gave the highest ratio of circular 3*Flag to linear intron, so this variant was taken forward for further investigation.
- the 3*Flag sequence (SEQ ID NO: 18) is 141 nucleotides in length.
- the inventors also determined the capacity of TRIC-V1 for circularization of long genes of interest.
- Five new constructs were created with the aim of producing circular CVB3-EGFP (EGFP, 1638 nt) (SEQ ID NO: 19), CVB3-Firefly luciferase (Fluc, 2601 nt) (SEQ ID NO: 20), CVB3-Spike protein of SARS-CoV-2-EGFP (Spike, 5469 nt) (SEQ ID NO: 21), CVB3-spCas9-EGFP (Cas9, 5757 nt) (SEQ ID NO: 22) and CVB3-Factor 8-EGFP (Factor 8, 8706 nt) (SEQ ID NO: 23) (Figure 4A).
- Spike protein of SARS-CoV 2-EGFP AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA
- TRIC-V1 can circularize long genes of interest co-transcriptionally, as indicated by the multiple RNA species observed for each construct in Figure 4B, without the post-transcriptional circularization protocol.
- TRIC-V1 produces more splicing products than the PIE approach for all the genes of interest, suggesting that the co-transcriptional circularization efficiency of TRIC-V1 is higher than PIE.
- co-transcriptionally produced circular RNAs are heavily nicked, since circular RNAs for EGFP and Fluc are over 50% nicked and circular RNAs for Spike, Cas9 and Factor 8 were difficult to detect.
- the additional 20 minute post-transcriptional circularization did increase the amount of splicing product, but mainly by increasing the amount of nicking.
- Co-transcriptional nicking is tolerable for the short GOI 3*Flag but fatal for the long GOIs used in this experiment.
- the inventors sought to identify a protocol that could be used to produce long circular GOIs.
- Mg2+ is believed to be the major source of circular RNA nicking (Wesselhoeft et al. (2018)). Since Mg2+ is essential for both in vitro transcription and circularization, the inventors speculated that reasonable ways to reduce nicking of the circular RNAs would be to suppress co-transcriptional circularization and to increase the speed of post-transcriptional circularization.
- NTPs nucleoside triphosphates
- Both the TRIC-V1 and the PIE rely on native exon sequences, namely the tRNA sequence in the case of the Ana, to work. These native exon sequences will unavoidably be part of resulting circular GOIs (as shown in Figures 5A and 5B), although their inclusion is associated with certain drawbacks.
- the native exon sequences are immunogenic (Liu, C. X. et al. Mol Cell 82, 420-434 e426 (2022)), which hinders the biomedical applications of circular RNAs produced by group I intron-based methods such as PIE.
- the inclusion of the native exon sequences means that circularization sites must be positioned at untranslated regions (UTRs) of protein coding circular RNAs. Since UTRs are normally highly structured internal ribosome entry sites (IRESs), extra unwanted spacer sequences are needed to ensure the group I intron and remaining exons are separated from the IRESs.
- the inventors restored the left arm (L15/R2 - V1.32 (SEQ ID NO: 31)) or the right arm (L3/R30 - V1.33 (SEQ ID NO: 32)) of the tRNA sequence.
- Restoration of the right arm (V1.33) restored most of the circularization efficiency.
- the length of the right arm was reduced gradually in constructs V1.34 (L3/R25) (SEQ ID NO: 33), V1.35 (L3/R16) (SEQ ID NO: 34) and V1.36 (L3/R9) (SEQ ID NO: 35).
- The minimum length for the right arm was identified as 9 nucleotides (V1.36, L3/R9). Subsequently, the right arm was fixed at 9 nucleotides and an 8 nt left arm was tested (V1.39, L8/R9 (SEQ ID NO: 38)). The L8/R9 construct fully restored circularization efficiency to the V1.0 level. The requirement of the tRNA sequence in V1.39 has therefore been reduced to a 17-nucleotide structure which mimics the anticodon arm (ACA) found in the Ana tRNA.
- ACA anticodon arm
- the inventors sought to determine whether the group I intron activity could be replicated based on structure, rather than specific nucleotide sequences. To that end, the inventors reversed the sequence of the ACA stem, whilst keeping the sequence of the ACA loop the same (construct designated V2.0 (SEQ ID NO: 39)), and found that the circularization efficiency is at least as high as that of V1.0 and V1.39. With V2.0, the requirement on tRNA sequence is reduced to 5 nt (L-CTT/R-AA). A 5 nt sequence cannot be specific to any species; therefore TRIC-V2 does not rely on the original bacterial tRNA sequence.
- the inventors also identified that the stem length could be increased to 15 base pairs (V2.1 (SEQ ID NO: 40)) without abolishing circularization efficiency.
- V2.1 SEQ ID NO: 40
- the only tRNA sequence present is the L-CTT/R-AA sequence, and the remaining base pairs were not derived from the tRNA sequence.
- the longer stem is advantageous since it may enable better circularization of longer GOIs.
- the inventors subsequently reversed the sequence of the P1 and P10 regions (from L-CTT/R-AA to L-GAT/R-TT), except for the uracil which forms a wobble base pair with the internal guide sequence (construct designated V2.2 (SEQ ID NO: 41)). Again, circularization efficiency was found to be unaffected. In this way, the inventors have shown that the IGS requires only a GU wobble base pair for circularization and is not otherwise dependent on sequence.
- if an anticodon arm-like structure, which consists of a 7 nt loop with a uracil as the third nucleotide and a >5 base pair stem, can be found in a gene of interest, this gene of interest can be circularized efficiently without introducing any unwanted sequence (Figure 8). If the circularization site is placed in a coding sequence, the choice of sites can be extended because of codon redundancy: potentially, only a few nucleotide mutations would need to be made, without affecting the peptide sequence. If the circularization site is placed in a noncoding RNA or UTR of a protein coding circular RNA, the circularization site can be assembled by introducing as few as 5 extra nucleotides to create an eACA.
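The structural rule stated above (a 7 nt loop with a uracil at the third loop position, closed by a stem of more than 5 base pairs) lends itself to a simple sequence scan. The following is a minimal sketch, not the inventors' method. It assumes a perfectly contiguous stem without bulges or mismatches, Watson-Crick pairing only unless `allow_wobble` is set, and no thermodynamic folding. The function name, its parameters and the toy sequence are all hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def find_eaca_like_sites(seq, min_stem_bp=6, loop_len=7, allow_wobble=False):
    """Return (stem_start, loop_start, stem_bp) for each eACA-like hairpin.

    A hit has a loop of loop_len nt whose third nucleotide is U, closed by
    at least min_stem_bp contiguous base pairs (no bulges or mismatches).
    Indices are 0-based positions in seq; DNA input (T) is read as RNA (U).
    """
    rna = seq.upper().replace("T", "U")
    pairs = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    hits = []
    for loop_start in range(1, len(rna) - loop_len):
        if rna[loop_start + 2] != "U":
            continue
        stem_bp = 0
        left, right = loop_start - 1, loop_start + loop_len
        # Extend the stem outwards from the loop while the bases still pair.
        while left >= 0 and right < len(rna) and (rna[left], rna[right]) in pairs:
            stem_bp += 1
            left -= 1
            right += 1
        if stem_bp >= min_stem_bp:
            hits.append((left + 1, loop_start, stem_bp))
    return hits


# Toy sequence (made up for illustration): 6 bp stem around the loop ACUGAAA.
toy = "AAGGCACGACUGAAACGUGCCAA"
print(find_eaca_like_sites(toy))
```

For a coding sequence, one could run the same scan over synonymous-codon variants to look for sites that need only a few silent mutations, in line with the codon-redundancy point made above.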
- eACA anticodon arm-like structure
- the TRIC-V2 construct was then tested on three protein coding circular RNAs: circular T2A-EGFP, T2A-Nano Luciferase, and circular Znf609.
- the circularization site was positioned at the CDS. For all the circular RNAs tested, multiple circularization sites in the CDS were found. Full-length precursors of those three RNAs were produced and circularized for 20 minutes as already described and shown in Figure 12.
- the sequences of the full length precursors for T2A-EGFP, T2A-Nano Luciferase and Znf609 are set forth in SEQ ID NOs: 42, 43 and 44 respectively. As shown in Figure 8D (urea-agarose gel), these long circular RNAs were all produced efficiently with TRIC-V2.
- the CVB3-EGFP was cloned into the TRIC-V2 construct and two stem lengths were tested: 15 nt (SEQ ID NO: 45) and 25 nt (SEQ ID NO: 46).
- the effect of increasing the length of the extended guide sequence (EGS) was determined by providing a TRIC-V1 construct (that is, one having L15/R51 of the tRNA sequence) with a 40 nt EGS (SEQ ID NO: 47). Full length precursors of all these constructs were produced and circularization was performed for 4 minutes.
- TRIC-V1.0 converts ~50% of full-length precursor to circular RNA.
- Increasing the EGS from 20 to 40 did not substantially increase circularization efficiency.
- TRIC-V2 was then compared with the PIE method on circularization of long GOIs EGFP, Spike and Cas9, for 1 or 3 minutes of circularization.
- PIE converts a small number of EGFP precursors to circular RNAs
- TRIC-V2 converts many more.
- Figure 9B 2% native agarose gel
- PIE converts less than 50% of full-length EGFP precursor to circular RNAs but TRIC-V2 converts most of full-length precursors to circular RNAs.
- Circularization efficiency, circular RNA yield (ratio between circular RNAs and total RNAs) and nicking ratio were then calculated using the CVB3-EGFP as an example.
- a yield of 74.8% is 90.3% of the limit yield (MW ratio between circular RNA and its precursor).
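The relation between these two figures can be checked with a short calculation. The sketch below assumes that the limit yield is the mass fraction of the precursor that ends up in the circle (the MW ratio mentioned above) and that the reported percentage is the observed yield divided by that limit. The second helper approximates the MW ratio by a length ratio, which is an additional assumption. Function names are hypothetical.

```python
def limit_yield_from_report(observed_yield, fraction_of_limit):
    """Back-calculate the limit yield implied by a yield and its share of the limit."""
    return observed_yield / fraction_of_limit


def limit_yield_from_lengths(circle_nt, precursor_nt):
    """Approximate the MW ratio by the length ratio (assumes similar base composition)."""
    return circle_nt / precursor_nt


# Figures quoted in the text: 74.8% yield, stated to be 90.3% of the limit.
implied_limit = limit_yield_from_report(0.748, 0.903)
print(f"Implied limit yield: {implied_limit:.1%}")
```

On these figures the implied limit yield is roughly 83%, meaning the intron and flanking portions would account for about 17% of the precursor mass.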
- TRIC-V2 provides higher efficiency, higher yield, and lower nicking compared to PIE, all without the issues of native exon sequences in the resulting circular RNAs.
- TRIC-V2 can produce circular genes of interest without any bacterial sequence.
- the inventors determined whether circular RNAs produced by TRIC-V2 were less immunogenic than those produced by PIE.
- Circular RNAs were purified by gel filtration (using an SRT-2000 SEC column (Sepax)) and RNase R digestion.
- A549 cells were then transfected with the circular RNAs, and the expression levels of immune factors (IL6, CCL5, and IFN-beta) were monitored by RT-qPCR.
- Various controls were included in the experiment, such as mocks (Mock and Lipo), a positive control (poly I:C), unmodified linear mRNAs (lin. EGFP and lin. Nluc) and modified linear mRNAs (lin. EGFP-modi and lin. Nluc-modi).
- GAPDH was used as the internal reference.
- Figures 11 D-F depict the results.
- Circular RNAs are more stable than mRNAs and are thus promising alternatives to mRNAs for therapeutics.
- Natural circRNAs can serve as sponges for microRNAs and proteins but are also templates for translation.
- IRES internal ribosome entry site
- CircRNAs that lack an in-frame termination codon allow ribosomes to translate completely around the circRNA multiple times, resulting in a polyprotein, a process known as rolling circle translation (RCT) (Figure 15a).
- the polyprotein can in principle be cleaved by a protease or a self-cleaving sequence, thus allowing multiple copies of the GOI to be made in a single round of initiation.
- RCT can be 100-fold more efficient than single-shot translation.
- RCT encounters two challenges: low initiation efficiency and accessory sequences introduced by currently used in vitro circularization methods.
- This eACA structure is essential for circularization to happen.
- This eACA is a stem-loop structure which contains a stem >1 bp and a 7 nt loop with a U at the third position ( Figure 16A a-0). This minimal structure enables circularization of GOIs with little restriction without introducing unwanted sequences.
- the G and U wobble base pair between the IGS (internal guide sequence) and GOIs is known to be important for the ribozyme to function.
- Tetrahymena thermophila (Tetra) group I intron-derived constructs the Tetra-STS (RZ construct, WO2022/191642) and the Tetra-Rzy, for circRNA synthesis.
- CVB3-EGFP into Tetra-STS (AU-rich no. 16) and Tetra-Rzy (CVB3IRES-GFP) constructs (SEQ ID NOs: 94 and 95).
- V2 outperforms both constructs (Figure 18).
- the Tetra-V2 (SEQ ID NO: 93) also produced circCVB3-EGFP efficiently, demonstrating that optimizations from the Ana intron can be effectively applied to other group I introns.
Landscapes
- Genetics & Genomics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Chemical & Material Sciences (AREA)
- Wood Science & Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Zoology (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Plant Pathology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
This invention relates to a recombinant nucleic acid molecule for making a circular RNA, comprising in the 5' to 3' direction: a) an internal guide sequence (IGS), b) a ribozyme, c) a first portion of an extended anticodon arm (eACA) sequence, d) a gene of interest, and e) a second portion of the eACA sequence. A nucleotide in the second portion of the eACA sequence forms a wobble base pair with a nucleotide in the IGS. Recombinant nucleic acid molecules, circular RNA, and methods of producing and using circular RNA are provided.
Description
CIRCULAR RNAS AND METHODS FOR MAKING THE SAME
This application claims priority from GB2308675.4 filed 09 June 2023, the contents and elements of which are herein incorporated by reference for all purposes.
FIELD OF THE INVENTION
The invention relates to recombinant nucleic acid molecules for making circular RNAs, methods for making circular RNAs, and circular RNAs which do not comprise exogenous exon sequences.
BACKGROUND
Circular RNAs (circRNAs) are covalently closed single-stranded RNA molecules. Over the last decade, tens of thousands of circRNAs have been identified from viruses to humans. Within human cells, they are mainly produced by a back splicing process. Due to their ubiquitous distribution, there is an increasing interest in them. Although their natural function is not yet well determined, some hypotheses are that they act as sponges for microRNAs and proteins, or as templates to produce peptides or proteins. At the same time, circRNAs are now becoming the basis of next-generation mRNA therapeutics, mainly because of their high stability.
RNA in vitro circularization can be achieved by chemical or enzymatic ligation, but such strategies are inefficient in producing long circRNAs and prone to forming intermolecular ligations (Petkovic, S. & Muller, S. RNA circularization strategies in vivo and in vitro. Nucleic Acids Res 43, 2454-2465, (2015)). An alternative method, the so-called permuted intron-exon method (PIE) is efficient in circularizing long RNAs (Puttaraju, M. & Been, M. D. Group I permuted intron-exon (PIE) sequences self-splice to produce circular exons. Nucleic Acids Res 20, 5357-5364 (1992); Wesselhoeft, R. A., Kowalski, P. S. & Anderson, D. G. Engineering circular RNA for potent and stable translation in eukaryotic cells. Nat Commun 9, 2629, doi:10.1038/s41467-018-05096-6 (2018); WO2023/046153). The PIE method takes advantage of the self-splicing property of the group I intron, a well-studied ribozyme that can join flanking exons by a two-step transesterification reaction in the absence of any protein. The PIE method moves the 5' half of the intron to the 3' end of the RNA and thereby connects flanking exons, so that the splicing product is a circular form of the preconnected exons.
Although the PIE method is an attractive one-step method to produce circRNA, it has some drawbacks. The key for PIE to work is the formation of the ribozyme core by the two intron halves that are separated by the preconnected exon. The long preconnected exon has the potential to interact with adjacent intron halves and could block the formation of the functional ribozyme core. Also, circRNAs produced by PIE contain unwanted sequences which have been reported to induce immune responses (Liu, C. X. et al. Mol Cell 82, 420-434 e426 (2022)). Group I introns can also act as trans ribozymes to target a second molecule (Inoue T et al, Intermolecular exon ligation of the rRNA precursor of Tetrahymena: Oligonucleotides can function as 5' exons. Cell, vol. 43(2) (1985); Been, MD et al, One binding site determines sequence specificity of Tetrahymena pre-rRNA self-splicing, trans-splicing, and RNA enzyme activity. Cell, vol. 47(2) (1986)). Many efforts have been made to engineer the trans ribozyme to deliver correct sequences to defective mRNAs in vivo (Sullenger BA et al, Ribozyme-mediated repair of defective mRNA by targeted trans-splicing. Nature, vol. 371(6498) (1994); Lan, N., Howrey, R. P., Lee, S. W., Smith, C. A. & Sullenger, B. A. Ribozyme-mediated repair of sickle beta-globin mRNAs in erythrocyte precursors. Science 280, 1593-1596 (1998); Jones JT et al, Tagging ribozyme reaction sites to follow trans-splicing in mammalian cells. Nat Med, vol. 2(6) (1996); Kohler U et al, Trans-splicing Ribozymes for Targeted Gene Delivery, J Mol Biol vol. 285(5) (1999)), but unfortunately the efficiency is low.
Various attempts have been made to provide circular RNAs which do not comprise unwanted sequences, for example, which do not comprise spacer sequences or native exon sequences such as those introduced with the PIE method. One such approach is described in WO2022/191642, in which the internal guide sequence (IGS) of the ribozyme is reverse complementary to a target site region in the gene of interest, in order to self-splice and circularize. The IGS region and the target site region flank the ribozyme and a gene of interest, respectively, and a guanine at the 5’ end of the IGS region forms a wobble base pair with a uracil at the 3’ end of the target site region. However, such approaches are limited by poor efficiency in converting precursor to circular RNAs and produce high levels of nicking with low yield.
Accordingly, there is a need in the art for more efficient methods of RNA circularization which also enable circular RNAs to be produced without unwanted sequences.
SUMMARY
In a first aspect, the invention provides a recombinant nucleic acid molecule for making a circular RNA, comprising in the 5’ to 3’ direction: a) an internal guide sequence (IGS), b) a ribozyme, c) a first portion of an extended anticodon arm (eACA) sequence, d) a gene of interest, and e) a second portion of the eACA sequence, wherein a nucleotide in the second portion of the eACA sequence forms a wobble base pair with a nucleotide in the IGS.
A gene of interest in a recombinant nucleic acid molecule of the first aspect may comprise a coding sequence. The gene of interest may further comprise an internal ribosome entry site (IRES) sequence.
In some embodiments, a recombinant nucleic acid molecule of the first aspect may be devoid of stop codons in frame with the coding sequence.
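Whether a candidate circular sequence is devoid of in-frame stop codons can be checked computationally. The sketch below is illustrative only. It treats the sequence as a closed circle and reads codons with wrap-around from a chosen start position. It also makes explicit an assumption the text does not state: if the circle length is not a multiple of three, the reading frame shifts on each lap, so the scan continues for three laps to cover every frame the ribosome would visit. The function name and example sequences are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_stops_on_circle(circ_seq, start_index=0):
    """List 0-based positions of stop codons in the frame set by start_index.

    The sequence is treated as circular: codons are read with wrap-around
    for as many laps as needed to return to the starting frame (one lap if
    the length is a multiple of 3, otherwise three laps covering all frames).
    """
    rna = circ_seq.upper().replace("T", "U")
    n = len(rna)
    if n < 3:
        raise ValueError("sequence too short to contain a codon")
    laps = 1 if n % 3 == 0 else 3
    stops = []
    for k in range(0, laps * n, 3):
        pos = (start_index + k) % n
        codon = "".join(rna[(pos + j) % n] for j in range(3))
        if codon in STOP_CODONS:
            stops.append(pos)
    return stops


# Toy circles (made-up sequences): the first has a stop in frame, the second does not.
print(in_frame_stops_on_circle("AUGGCCUAAGGC"))
print(in_frame_stops_on_circle("AUGGCCGGC"))
```

An empty result for the frame of the start codon is the property this embodiment describes. The wrap-around also catches a stop codon that spans the junction where the two ends of the precursor are joined.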
In a second aspect, the invention provides the use of a recombinant nucleic acid molecule as described herein in a method of making a circular RNA.
In a third aspect, the invention provides a method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule as described herein, and b) circularizing the recombinant nucleic acid molecule.
In a fourth aspect, the invention provides a method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the recombinant nucleic acid molecule.
In a fifth aspect, the invention provides a method for producing a circular RNA, comprising: a) identifying a gene of interest comprising a sequence capable of forming an eACA stem-loop, b) preparing a recombinant nucleic acid molecule comprising, in a 5’ to 3’ direction: an internal guide sequence (IGS) - a ribozyme - a sequence encoding the gene of interest, and c) circularizing the recombinant nucleic acid molecule.
In a sixth aspect, the invention provides a method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule as described herein, and b) transcribing and circularizing the recombinant nucleic acid molecule, wherein circularization occurs co-transcriptionally.
In a seventh aspect, the invention provides a circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA does not comprise exogenous splicing sequences.
In an eighth aspect, the invention provides a circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA does not comprise RNA from a circularization agent.
In a ninth aspect, the invention provides a circular RNA obtainable by the methods described herein, wherein the circular RNA is less immunogenic compared to a circular RNA which comprises exogenous exon sequences.
In a tenth aspect, the invention provides a circular RNA obtainable by the methods described herein, wherein the circular RNA does not comprise exogenous exon sequences.
A circular RNA of any of the seventh to the tenth aspects may be produced by a method of any of the third to the sixth aspects.
In an eleventh aspect, the invention provides the use of a circular RNA as described herein in an in vitro method of expressing a protein in a cell.
In a twelfth aspect, the invention provides a method for expressing a gene of interest in a cell, the method comprising (a) circularizing a recombinant nucleic acid molecule as described herein to provide a circular RNA comprising the gene of interest, and (b) administering the circular RNA to the cell.
In a thirteenth aspect, the invention provides a method of treating a disease in a subject, the method comprising (a) circularizing a recombinant nucleic acid molecule as described herein to provide a circular RNA, and (b) administering the circular RNA to the subject.
In a fourteenth aspect, the invention provides a recombinant nucleic acid molecule as described herein, for use as a medicament.
In a fifteenth aspect, the invention provides a recombinant nucleic acid molecule as described herein, for use in a method of treating a disease in a subject.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 : Comparison of prior art approaches and trans ribozyme-based circularization (TRIC). (A) Schematic depicting splicing by the group I intron from cyanobacterium Anabaena (Ana), resulting in a linear molecule comprising joined exons. (B) Schematic depicting circularization of a gene of interest by the permuted intron-exon (PIE) method. (C) Schematic depicting splicing by a group I intron acting as a trans ribozyme, in which a target sequence is edited. (D) Schematic depicting circularization of a gene of interest using the TRIC approach.
Figure 2: (A) Schematic of the Ana tRNA-intron. (B) Initial version of TRIC (V0). The tRNA sequence of Ana has been retained (L15/R30). Insert: reaction steps of TRIC-V0. (C) (right of Figure) 12% and 6% denaturing PAGE to identify circular 3*Flag produced by TRIC-V0. (D) RT-PCR of in vitro transcription product, band I product and band II product. (E) Sanger sequencing of RT-PCR product.
Figure 3: (A) Schematic of TRIC-V1 (L15/R30) in which EGS and loop regions are present. (B) 6% denaturing PAGE was run with product of in vitro transcription of V1 variants. (C) RNase R digestion of the in vitro transcription product of TRIC-V1.
Figure 4: (A) Relative lengths of long genes of interest. (B) 0.8% native agarose gel run with in vitro transcription samples and circularized samples of TRIC-V1. Expected band positions for circular products are labelled with black circles on the gel.
Figure 5: (A) Schematic of DNA template, RNA precursor and resultant circular RNA of the PIE-Ana approach. Sequence of the linear precursor is shown on the right. (B) Schematic of DNA template, RNA precursor and resultant circular RNA of the TRIC-V1 approach. Sequence of the linear precursor is shown on the right.
Figure 6: (A) Results of in vitro transcription performed at varying concentrations of Mg2+. (B) Full length precursors of the TRIC-V1 were subject to circularization for 10, 20, 40 min and separated in a 0.8% native agarose gel. (C) Circularization efficiency of TRIC-V1 for long GOIs. (D) Comparison of TRIC-V1 to the PIE-Ana method. Expected band positions for circular products are labelled with black circles on the gels.
Figure 7: (A) Schematic of TRIC-V1 with a short GOI. (B) Upper panel and middle table show sequence information for TRIC-V1. Lower table shows the right and left arm configurations of the tRNA sequence in TRIC variants 1.30-1.39.
Figure 8: (A) Schematic of TRIC-V2. (B) 6% denaturing PAGE showing circularization with V1.30-V1.39 variants and with V2.0, 2.1 and 2.2 variants. L/R configurations are shown under the gel. (C) The extended anticodon arm (eACA). As long as a stem loop structure can be found where the loop contains a uracil in 3rd position and the stem is >5 bp, a circularization site can be assembled. Thus, circularization sites can be placed in either UTR or CDS. (D) Circularization of RNAs with the eACAs in CDS. Expected band positions for circular products are labelled with black circles on the gels.
Figure 9: (A) Comparison between TRIC-V1 and TRIC-V2 on 2% native agarose gel, with varying EGS and eACA stem lengths. (B, C) Comparison between PIE and TRIC-V2 on CVB3-EGFP, CVB3-Spike-EGFP and CVB3-Cas9-EGFP. (D) Michaelis-Menten fitting of the TRIC-V2 and the PIE for RNA circularization. Data presented as means ± SDs; n = 3; **p < 0.01. (E) Comparison of efficiency, yield and nicking between TRIC-V1, TRIC-V2 and PIE. Data presented as means ± SDs; n = 3; **p < 0.01.
Figure 10: (A) Schematic overview of TRIC-V2 approach. (B) Legend for components of TRIC-V2.
Figure 11: (A) Schematic of circular RNAs produced by TRIC-V2, including position of siRNA targeting sites (msiTS). Left: arrangement of the resultant circ GOIs: eACA (UTR)-msiTS-IRES-start codon-CDS-stop codon-PolyAC-eACA (UTR). The eACA is assembled in the UTR of the circ GOIs. Right: arrangement of the resultant circ GOIs: eACA (CDS)-stop codon-PolyAC-msiTS-IRES-start codon-eACA (CDS). The eACA is assembled in the CDS of the circ GOIs. A multiple siRNA target site (msiTS) is introduced for siRNA targeting and to separate the IRES from the eACA. (B) Upper panel: HPLC profile of a circularized RNA sample. Lower panel: HPLC fractions were resolved in a 0.8% native agarose gel. (C) Two-step purified (HPLC and RNase R digestion) circular RNAs on a 6M-1.5% urea-agarose gel. (D-F) Expression of IFN-beta (D), IL6 (E) and CCL5 (F) by A549 cells. Data presented as means + SDs; n = 3; *p<0.05, **p < 0.01, ***p<0.001. (G) Expression of Nluc in HEK293 cells following transfection with linear or circular constructs. Data presented as means + SDs; n > 3; *p<0.05, **p < 0.01. (H) Expression of Nluc following siRNA treatment. Data presented as means + SDs; n = 6; **p < 0.01.
Figure 12: Schematic depicting how a circular RNA can be provided from a gene of interest which comprises an extended anticodon arm sequence.
Figure 13: Schematic showing the positioning of circularization site in either UTR or CDS. (A) Overview for TRIC-V2, showing resulting circular constructs. (B) Exemplary sequences with eACA placed in UTR (SEQ ID NO: 61) or placed in CDS (SEQ ID NO: 62).
Figure 14: Schematic showing the different recombinant nucleic acid constructs used in the experiments described herein.
Figure 15: (A) Schematic illustrating that when circRNAs contain only a CDS but no stop codons, ribosomes can translate such circRNAs indefinitely and produce a polyprotein, a process known as rolling circle translation (RCT). (B) Schematic illustrating that when a 2A sequence is present, such as T2A or P2A, the polyprotein will be converted to protein monomers. (C) Secondary structure of the 373 nt CSFV IRES. Multiple stop codons are present in all possible frames: 4, 10, and 8 in frames 1-3, respectively. Frame 1 was cleaned of stop codons by three mutations (U52C, A171G, and A191G), and one deletion (d349U) to align with the CDS. (D) Equimolar amounts of circOR4F17-Nluc-RCT (SEQ ID NO: 78), circCSFV-Nluc (SEQ ID NO: 74), circCSFV-Nluc-RCT (SEQ ID NO: 75), V2 circNluc, and N1Ψ-modified linear Nluc were transfected into HEK 293F cells. 24 hours later, Nluc expression was monitored. (E) Western blot analysis of Nluc expression from either single shot translation or RCT using CSFV IRES. Actin was used as internal reference. Polyproteins were observed. Data in (D) are mean ± SD for 4 biological replicates. *p < 0.05, **p = 0.01 and ***p < 0.001, unpaired two-tailed t-test.
Figure 16: (A) Schematic showing the original eACA that contains a 7 nt loop with a U (T) at the third position and a stem structure (a-0) and new constructs with different loop lengths and U positions (b-1 to m-12). (B) Precursor RNAs were diluted to 400 ng/ul, denatured at 95 °C for 2 min and placed on ice for at least 3 min. Then 1 ul of 10X splicing buffer was added to each tube containing 9 ul of refolded RNA precursors. After 1 or 3 min, 2 ul of 100 mM EDTA was added to each tube to stop the reaction. Then ~500 ng of each RNA was loaded onto a 1.5% native agarose gel and run at 25 W for 30 min.
Figure 17: (A) Full length precursors of TRICv2 constructs with GU or CA base pairs were circularized for 1 to 20 min and analysed on a 1.2% native agarose gel. (B) Circularization of TRICv2 (C®A) was confirmed by urea-agarose gel.
Figure 18: (A-B) Comparison between V2 using either Ana group I intron or Tetrahymena thermophila group I intron (Tetra) and the Tetra-STS and the Tetra-Rzy constructs. CVB3 (coxsackievirus B3)-EGFP was cloned into the best Tetra-STS (AU-rich no. 16) and Tetra-Rzy (CVB3IRES-GFP) constructs. FL were circularized for the designated time and analyzed on native agarose gels (A) or urea-agarose gels (B).
DETAILED DESCRIPTION
The inventors have determined that if a 5’ end of an RNA molecule is connected to a 3’ end of a trans-splicing ribozyme, the splicing product is a circular RNA. The approach described herein is named “Trans Ribozyme-based Circularization”, or TRIC. TRIC uses an intact intron, joined upstream of the sequence to be circularized. This approach better enables the folding of the catalytic structural core compared to the PIE method, in which the intron is split in two. This in turn permits the circularization of longer sequences or genes of interest because there is less chance of interference between the gene of interest and the intron.
An important feature of the invention is the presence of an extended anticodon arm (eACA) sequence provided in the recombinant nucleic acid molecule to be circularized. This sequence provides a structure that is structurally similar to the anticodon arm found in Ana tRNA, comprising a stem and a loop, and enables circularization with greater efficiency compared to existing PIE methods and methods such as those described in WO2022/191642. Because the advantages of the eACA are dependent on structure, rather than specific nucleotide sequence, eACA sequences (that is, sequences capable of forming a stem-loop structure as described herein) may be readily identified in genes of interest so that they may be circularized without the need to add additional sequence or perform extensive mutations. Consequently, the recombinant nucleic acid molecules described herein may produce circular RNAs comprising only the gene of interest, without exogenous splicing sequences or unwanted spacer sequences, and may do so in a highly efficient way by utilising the propensity of the eACA sequence to form a stem-loop structure.
Abbreviations
ACA - anticodon arm
CDS - coding sequence
EGS - extended guide sequence
eIGS - extended internal guide sequence
eACA - anticodon arm-like structure
GOI - gene of interest
IGS - internal guide sequence
IVT - in vitro transcription
PIE - permuted intron-exon
RNA - ribonucleic acid
TRIC - trans ribozyme based circularization
tRNA - transfer RNA
UTR - untranslated region
RCT - rolling circle translation
Recombinant nucleic acid molecules are provided herein for making circular RNAs. The nucleic acid molecules are generally linear prior to circularization. In general, the nucleic acid molecule for circularization comprises the gene of interest (GOI), which is the gene to be circularized, a ribozyme, capable of performing the circularization, and an extended anticodon arm (eACA) sequence, split into two portions.
The ribozyme may be any ribozyme capable of acting as a trans-splicing ribozyme. In some cases, the ribozyme is derived from or is a group I intron. Examples of suitable ribozymes are the Tetrahymena ribosomal intron, T4 phage thymidylate synthase intron, Anabaena (Ana) pre-tRNA intron, Azoarcus sp. BH72 Ile tRNA intron, and Staphylococcus phage Twort ribonucleotide reductase intron. Sequences of these ribozymes are shown below, with the internal guide sequence (IGS) shown with underlined, shaded letters.
Ana group I intron:
AATAATTGAGCCTTAGAGAAGAAATTCTTTAAGTGGATGCTCTCAAACTCAGGGAAACCTAAATCT AGCTATAGACAAGGCAATCCTGAGCCAAGCCGAAGTAGTAATTAGTAAGTTAACAACAGATAACTT ACAGCTAATCGGAAGGTGCAGAGACTCGACGGGAGCTACCCTAACGTCAAGACGAGGGTAAAGA GAGAGTCCAATTCTCAAAGCCAATAGGCAGTAGCGAAAGCTGCGGGAGAATG (SEQ ID NO: 1)
T4 phage group I intron:
AATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACGGGGAACCTCTCTAGTAG ACAATCCCGTGCTAAATTGTAGGACTTGCCCTTTAATAAATACTTCTATATTTAAAGAGGTATTTAT GAAAAGCGGAATTTATCAGATTAAAAATACTTTAAACAATAAAGTATATGTAGGAAGTGCTAAAGAT TTTGAAAAGAGATGGAAGAGGCATTTTAAAGATTTAGAAAAAGGATGCCATTCTTCTATAAAACTT CAGAGGTCTTTTAACAAACATGGTAATGTGTTTGAATGTTCTATTTTGGAAGAAATTCCATATGAGA AAG ATTTG ATTATTG AACG AG AAAATTTTTG G ATTAAAG AG CTTAATTCTAAAATTAATG G ATACAA TATTGCTGATGCAACGTTTGGTGATACATGTTCTACGCATCCATTAAAAGAAGAAATTATTAAGAAA CGTTCTGAAACTGTTAAAGCTAAGATGCTTAAACTTGGACCTGATGGTCGGAAAGCTCTTTACAGT AAACCCGGAAGTAAAAACGGGCGTTGGAATCCAGAAACCCATAAGTTTTGTAAGTGCGGTGTTCG CATACAAACTTCTGCTTATACTTGTAGTAAATGCAGAAATCGTTCAGGTGAAAATAATTCATTCTTT AATCATAAGCATTCAGACATAACTAAATCTAAAATATCAGAAAAGATGAAAGGTAAAAAGCCTAGT AATATTAAAAAGATTTCATGTGATGGGGTTATTTTTGATTGTGCAGCAGATGCAGCTAGACATTTTA AAATTTCGTCTGGATTAGTTACTTATCGTGTAAAATCTGATAAATGGAATTGGTTCTACATAAATGC CTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCGAAACGATAGACAACTTGCTTTAACA AGTTGGAGATATAGTCTGCTCTGCATGGTGACATGCAGCTGGATATAATTCCGGGGTAAGATTAA CGACCTTATCTGAACATAATG (SEQ ID NO: 2)
Tetrahymena ribosomal intron:
AAATAG CAATATTTACCTTTGGAGGGAAAAGTTATCAGGCATG CACCTG GTAG CTAGTCTTTAAAC CAATAGATTGCATCGGTTTAAAAGGCAAGACCGTCAAATTGCGGGAAAGGGGTCAACAGCCGTTC AGTACCAAGTCTCAGGGGAAACTTTGAGATGGCCTTGCAAAGGGTATGGTAATAAGCTGACGGA CATGGTCCTAACCACGCAGCCAAGTCCTAAGTCAACAGATCTTCTGTTGATATGGATGCAGTTCA CAGACTAAATGTCGGTCGGGGAAGATGTATTCTTCTCATAAGATATAGTCGGACCTCTCCTTAATG GGAGCTAGCGGATGAAGTGATGCAACACTGGAGCCGCTGGGAACTAATTTGTATGCGAAAGTAT ATTGATTAGTTTTGGAGTACTCG (SEQ ID NO: 3)
Azoarcus sp. BH72 Ile tRNA intron:
ATTTCGATGTGCCTTGCGCCGGGAAACCACGCAAGGGATGGTGTCAAATTCGGCGAAACCTAAG CGCCCGCCCGGGCGTATGGCAACGCCGAGCCAAGCTTCGGCGCCTGCGCCGATGAAGGTGTAG AGACTAGACGGCACCCACCTAAGGCAAACGCTATGGTGAAGGCATAGTCCAGGGAGTGGCGAAA GTCACACAAACCGG (SEQ ID NO: 4)
Staphylococcus phage Twort ribonucleotide reductase intron:
AAACAATTCTGCCCCCTATATTAGTAATAGTGTAGGTTAAAAAACTTCCTTAATTCATGGGAAATCT CCCTGACTTTTATTATAAATTTTGGTATAGTAAAATGAGAAGGAGATAATCATGAGCAAGATAACCA ATAGCAAAGAAACCCAAAAAGTACCTAACGCTACTAAAAGTATTTACCACCATATAAAAAGTAAAA GAAGGATGGAAGTCATTAAATCACTTAATGAATTGGTAATTATCTTGTGCAACGACTAGAGAAAAG ATAGTTTATTGTTACAGGCAGTAAATGAAGACTGAGTATCGTACACACAAGTGAGTGGAAACAGG AAGTATCCTAGAGTAACGACTAGGATAATGATATAGTCTGAACATTGTAGGTGACTACAAGAAGGT AAGGAGTAACGAACCTTATCGTAACATAATTG (SEQ ID NO: 5)
Recombinant nucleic acid molecules described herein also generally comprise an internal guide sequence (IGS). The IGS may be part of the ribozyme, such as part of the group I intron. Generally, this enables the TRIC approach to use an intact intron. However, it is possible to use a truncated ribozyme sequence in the TRIC method, with the native IGS removed and replaced by a different IGS to that which would normally be present.
The function of the IGS is to base pair with the end regions of the gene of interest, in order to bring them into proximity with each other so that circularization may occur. The IGS binds through complementary base pairing to both first and second portions of the eACA sequence, which are located at either end of the GOI to be circularised.
Extended anticodon arm (eACA) sequence
An extended anticodon arm (eACA) sequence is one which is capable of forming a stem-loop structure (also known as a hairpin or hairpin loop). Stem-loop structures form when two regions of single-stranded RNA which are generally complementary to each other (when read in opposite directions) base-pair with each other. The base-pairing results in a double helix structure ending in an unpaired loop. The natural propensity of eACA sequences to form stem-loop structures may be utilised to enable circularization of a gene of interest, as shown in Figure 12.
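By way of illustration only, the stem-loop criterion described above can be sketched in Python. The sketch assumes strict Watson-Crick pairing in the stem (wobble pairs and mismatches are ignored) and takes the stem and loop lengths as inputs; the function names are hypothetical and are not part of the described method. The test sequence is the RNA form of the Nano Luciferase example sequence given herein (SEQ ID NO: 58), read as a 7 bp stem with a 7 nt loop.

```python
# Watson-Crick complements for RNA (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def is_stem_loop(rna: str, stem_len: int, loop_len: int) -> bool:
    """True if rna is exactly stem + loop + stem and the two stem strands pair."""
    if len(rna) != 2 * stem_len + loop_len:
        return False
    five_prime_stem = rna[:stem_len]
    three_prime_stem = rna[len(rna) - stem_len:]
    return three_prime_stem == reverse_complement(five_prime_stem)


# RNA form of SEQ ID NO: 58, assumed here to fold as a 7 bp stem and 7 nt loop.
print(is_stem_loop("GAUCACCACUUUAAGGUGAUC", stem_len=7, loop_len=7))
```

A real structure prediction would also consider thermodynamic stability; this check only tests whether the two stem strands are perfectly complementary.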
Prior to circularization, the linear recombinant nucleic acid molecule comprises an eACA sequence in two separate portions. A first portion of the eACA sequence is positioned at or near the 5’ end of the gene of interest, and a second portion of the eACA sequence is positioned at or near the 3’ end of the gene of interest, as shown in Figure 10A (top panel). During circularization, splicing by the trans-ribozyme causes the first and second portions of the eACA sequence to be covalently joined, in order to create a circular version of the gene of interest. In the resulting circular molecule, the first and second portions are joined to form the eACA sequence, which is generally capable of forming a stem-loop structure as shown in Figure 8C, Figure 10A, and Figure 12.
The first portion of the eACA sequence may comprise a first eACA stem portion and a first eACA loop portion. The second portion of the eACA sequence may comprise a second eACA stem portion and a second eACA loop portion. By “stem portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the stem of a stem-loop structure. By “loop portion” it is meant a part of the first (or second) portion of the eACA sequence that is capable of forming the loop of a stem-loop structure. Figure 12 shows how a stem-loop forming structure can be identified in a gene of interest, and subsequently used for circularizing the RNA. Thus, the stem and loop portions of the eACA sequence are capable of forming a stem-loop structure in the recombinant nucleic acid molecules described herein.
The specific nucleotide sequence of the eACA sequence is not important for TRIC and does not determine whether circularization will occur. Rather, it is the structure as opposed to the sequence of the eACA that is important. Consequently, the eACA may be of any nucleotide sequence, provided that the last nucleotide in the second eACA loop portion is one which can form a wobble base pair with a corresponding nucleotide in the internal guide sequence described herein. The last nucleotide in the second eACA loop portion may be a uracil that forms a wobble base pair with a corresponding guanine in the internal guide sequence. Alternatively, the last nucleotide in the second eACA loop portion may be a cytosine that forms a wobble base pair with a corresponding adenine in the internal guide sequence.
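The wobble base pair requirement described above may be expressed as a simple check. The following Python sketch is illustrative only; the function names are hypothetical, and the only pairs accepted are the two named in the text (U in the loop with G in the IGS, or C in the loop with A in the IGS).

```python
# Wobble pairs named in the text, keyed as (loop nucleotide, IGS nucleotide).
WOBBLE_PAIRS = {("U", "G"), ("C", "A")}


def forms_wobble_pair(loop_nt: str, igs_nt: str) -> bool:
    """True if the loop nucleotide and IGS nucleotide form a listed wobble pair."""
    return (loop_nt.upper(), igs_nt.upper()) in WOBBLE_PAIRS


def has_valid_splice_site(second_loop_portion: str, igs_partner: str) -> bool:
    """Check the last (3') nucleotide of the second eACA loop portion."""
    if not second_loop_portion:
        return False
    return forms_wobble_pair(second_loop_portion[-1], igs_partner)


print(has_valid_splice_site("ACU", "G"))  # U-G wobble pair
print(has_valid_splice_site("ACC", "A"))  # C-A wobble pair
```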
The first and second eACA stem portions may be complementary to each other, though this is not necessary.
The first and second eACA stem portions are generally each at least 5 nucleotides in length but may be as short as 1 nucleotide in length each. The stem portion lengths may be adapted depending on the gene of interest to be circularized. For example, longer stem lengths (such as lengths greater than 15 nt) may be advantageous if circularizing long (>500 nt) genes of interest.
Thus, the first and second stem portions may each be at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, at least 25 or at least 30 nucleotides in length. In particular, the first and second stem portions may each be at least 15 or at least 25 nucleotides in length. For example, the first and second eACA stem portions may be 1 to 50 nucleotides in length, for example 5 to 40 nucleotides.
The first and second stem portions need not be the same length, for example one stem portion may be one or two nucleotides shorter than the other, provided a stem-loop structure can still be formed.
The anticodon arm loop of the Ana tRNA group I intron is naturally 7 nucleotides in length. Consequently, in the circular RNAs described herein, the loop of the stem-loop structure may be 7 nucleotides in length, particularly when the ribozyme used is, or is derived from, the Ana group I intron. If other group I introns are used, the loop of the stem-loop structure may have a different nucleotide length. The loop of the stem-loop structure may be between 3 and 40 nucleotides in length and is generally between 3 and 10 nucleotides in length. In many cases, the loop of the stem-loop structure is at least 3 nucleotides in length, at least 4 nucleotides in length or at least 5 nucleotides in length, particularly if the ribozyme is or is derived from the Ana group I intron. In some embodiments, the loop of the stem-loop structure is generally between 5 and 11 nucleotides in length.
Generally, if the loop of the stem-loop structure comprises 7 nucleotides, then the first eACA loop portion comprises 4 nucleotides, and the second eACA loop portion comprises 3 nucleotides.
Accordingly, the first and second eACA stem portions may each be at least 15 nucleotides in length, the first eACA loop portion may be 4 nucleotides in length, and the second eACA loop portion may be 3 nucleotides in length. In the resulting circular RNA molecule, this produces an eACA sequence which is capable of forming a stem-loop structure, wherein the stem comprises at least 15 base pairs, and the loop is 7 nucleotides in length.
In total, the first portion of the eACA sequence may comprise as few as 5 nucleotides (for example, 1 stem nucleotide and 4 loop nucleotides). The second portion of the eACA sequence may comprise as few as 4 nucleotides (for example, 1 stem nucleotide and 3 loop nucleotides). Alternatively, the first portion of the eACA sequence may comprise, for example, 19 nucleotides (e.g. 15 stem nucleotides and 4 loop nucleotides), or 29 nucleotides (e.g. 25 stem nucleotides and 4 loop nucleotides). The second portion of the eACA sequence may comprise, for example, 18 nucleotides (e.g. 15 stem nucleotides and 3 loop nucleotides), or 28 nucleotides (e.g. 25 stem nucleotides and 3 loop nucleotides).
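The portion lengths given above follow from simple arithmetic on the stem length, the loop length and the position of the splice site within the loop. The Python sketch below is illustrative; the function name and the `splice_pos` parameter are hypothetical, and the default values (a 7 nucleotide loop with the splice-site nucleotide at the third loop position) are taken from the description of the Ana-derived eACA.

```python
def eaca_portion_lengths(stem_len: int, loop_len: int = 7, splice_pos: int = 3):
    """Return (first_portion_len, second_portion_len) in nucleotides.

    splice_pos is the 1-based loop position of the splice-site nucleotide;
    the second loop portion ends there and the first loop portion holds the rest.
    """
    if stem_len < 1:
        raise ValueError("stem_len must be at least 1")
    if not 1 <= splice_pos <= loop_len:
        raise ValueError("splice_pos must lie inside the loop")
    second_loop = splice_pos
    first_loop = loop_len - splice_pos
    return stem_len + first_loop, stem_len + second_loop


for stem in (1, 15, 25):
    print(stem, eaca_portion_lengths(stem))
# 1 (5, 4)
# 15 (19, 18)
# 25 (29, 28)
```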
An exemplary first portion of the eACA sequence comprising a 1 nucleotide stem and a 4 nucleotide loop may comprise the nucleotide sequence 5’-NNNNN-3’, wherein N is any nucleotide, and an exemplary second portion of the eACA sequence comprising a 1 nucleotide stem and a 3 nucleotide loop may comprise the nucleotide sequence 5’-NNNU-3’, wherein N is any nucleotide. As an example, in the case of Nano Luciferase (NLuc), a possible eACA sequence is GATCACCACTTTAAGGTGATC (SEQ ID NO: 58). For circularization, the NLuc GOI is rearranged such that the eACA sequence is provided in two portions (a 5’ first portion and a 3’ second portion). The first portion of the eACA sequence comprises the sequence TTAAGGTGATC (SEQ ID NO: 59). The second portion of the eACA sequence comprises the sequence GATCACCACT (SEQ ID NO: 60).
It will be realised that the provision of a thymine in a DNA precursor molecule will result in a uracil in the transcribed RNA molecule for circularization.
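The relationship between SEQ ID NOs: 58, 59 and 60 can be checked with a short Python sketch. It assumes that SEQ ID NO: 58 folds as a 7 bp stem with a 7 nucleotide loop and that the splice site follows the third loop nucleotide; the function name `split_eaca` is hypothetical and is used for illustration only.

```python
def split_eaca(eaca: str, stem_len: int, second_loop_len: int = 3):
    """Split a full eACA into (first_portion, second_portion).

    In the circle the eACA reads 5'-stem, loop, stem-3'. The splice site lies
    inside the loop, so the linear precursor ends with the 5' stem strand plus
    the first loop nucleotides (second portion) and starts with the remaining
    loop nucleotides plus the 3' stem strand (first portion).
    """
    cut = stem_len + second_loop_len
    return eaca[cut:], eaca[:cut]


EACA = "GATCACCACTTTAAGGTGATC"  # SEQ ID NO: 58 (DNA form)
first, second = split_eaca(EACA, stem_len=7)
print(first)   # TTAAGGTGATC, matching SEQ ID NO: 59
print(second)  # GATCACCACT, matching SEQ ID NO: 60
```

Joining the second portion to the first portion (3’ end to 5’ end), as occurs on circularization, regenerates SEQ ID NO: 58.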
The first and second eACA loop portions base pair with the internal guide sequence (IGS) to form the P1 and P10 regions, which are critical for ribozyme activity. The first eACA loop portion, positioned towards the 5’ end of the gene of interest, base pairs with the IGS to form the P10 region. It is not necessary for all the nucleotides in the first eACA loop portion to form the P10 region, and in some cases only two nucleotides of the first eACA loop portion form the P10 region. The second eACA loop portion, positioned towards the 3’ end of the gene of interest, base pairs with the IGS to form the P1 region.
As described above, in order for circularization to occur, the last nucleotide of the second eACA loop portion (i.e. the nucleotide at the 3’ end of the second eACA loop portion) may form a wobble base pair with a corresponding nucleotide in the IGS. Generally, the wobble base pair is a GU wobble base pair, with G in the IGS and U in the second eACA loop portion or an AC wobble base pair with A in the IGS and C in the second eACA loop portion. The wobble base pair provides the circularization site, such that once circularized, the nucleotide at the 3’ end of the second eACA loop portion forms the third nucleotide in the loop of the eACA stem-loop structure. This is depicted in Figure 8C.
The P1 region may also be formed by base pairing of the IGS with a region adjacent to the second eACA loop portion in the 3’ direction, known as the “P1 extension”. If present, the P1 extension typically comprises between 2 and 4 nucleotides, which base pair with the IGS. The P1 region may therefore be formed by the P1 extension and second eACA loop portion base pairing with the IGS, as shown in Figure 10A. Consequently, in some embodiments, the second portion of the eACA sequence and the P1 extension together are capable of forming a P1 region. If the P1 extension is not present, the P1 region is formed by only the second eACA loop portion base pairing with the IGS. P1 extensions have been described in Olson & Muller (2012) RNA 18:581-589, the contents of which are incorporated herein by reference. Generally, if an extended guide sequence (EGS) is used, the P1 extension region will be present.
A particular advantage of the TRIC method is that it may utilise eACA sequences which are already present in the gene of interest. For example, if a stem-loop forming eACA sequence can be found in a gene of interest, this gene may be circularized efficiently without introducing any additional sequences. This in turn means that the resulting circular RNA is far less likely to be immunogenic. This is depicted in Figure 12, which shows first the identification of an eACA sequence (i.e. a stem-loop forming structure) in a gene of interest. Subsequently, the gene of interest is rearranged such that the eACA sequence is split into two portions, one at each end of the gene of interest. This rearranged gene is then cloned into the TRIC construct for circularization. An example of this is the protein coding circular RNA T2A Nano Luciferase. This circular RNA already comprises an eACA sequence in its natural sequence. This means a circularization site can be introduced using the naturally occurring eACA sequence, without the need to perform mutations or introduce additional sequence.
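The identification and rearrangement steps described above can be sketched as follows. This Python example is a simplified illustration under stated assumptions: it searches a DNA-form GOI for a perfectly complementary stem of a chosen length flanking a 7 nucleotide loop with T (U in the RNA) at the third loop position, and then rotates the sequence so that the splice-site nucleotide becomes the 3’ end. The function names and the toy GOI are hypothetical; real sequences would normally be assessed with structure prediction tools as well.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(PAIR[base] for base in reversed(dna))


def find_eaca(goi: str, stem_len: int = 5, loop_len: int = 7, u_pos: int = 3) -> int:
    """Return the start index of the first stem-loop candidate, or -1 if none."""
    size = 2 * stem_len + loop_len
    for i in range(len(goi) - size + 1):
        window = goi[i:i + size]
        stem5 = window[:stem_len]
        loop = window[stem_len:stem_len + loop_len]
        stem3 = window[stem_len + loop_len:]
        if stem3 == revcomp(stem5) and loop[u_pos - 1] == "T":
            return i
    return -1


def rearrange(goi: str, start: int, stem_len: int = 5, u_pos: int = 3) -> str:
    """Rotate the GOI so that it begins just after the splice-site nucleotide."""
    cut = start + stem_len + u_pos
    return goi[cut:] + goi[:cut]


# Toy GOI: SEQ ID NO: 58 flanked by hypothetical filler nucleotides.
toy_goi = "AAAA" + "GATCACCACTTTAAGGTGATC" + "CCCC"
start = find_eaca(toy_goi, stem_len=7)
rearranged = rearrange(toy_goi, start, stem_len=7)
print(start)
print(rearranged)
```

In the rearranged sequence the first portion of the eACA lies at the 5’ end and the second portion, ending in the splice-site T, lies at the 3’ end.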
If the natural coding sequence (CDS) does not contain an eACA sequence, it is possible to perform selective mutations to create one. Codon redundancy means that mutations may be made to the nucleotide sequence of the GOI without affecting the resulting peptide sequence. Consequently, an eACA sequence may be provided in the GOI without requiring the introduction of additional sequences. Instead, only selective mutation of the existing sequence is needed, following the rules of codon redundancy. Examples of this are the T2A-EGFP and circZNF609 circular RNAs described in Example 5, in which mutations are introduced based on codon redundancy to introduce the circularization site.
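Codon redundancy can be illustrated with the standard genetic code. The Python sketch below builds the standard codon table and lists the synonymous alternatives for a codon; the short coding sequences used in the example are hypothetical and are not taken from T2A-EGFP or circZNF609.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, ignoring any incomplete final codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon: str) -> list:
    """Other codons that encode the same amino acid as the given codon."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


original = "GGACTGAGC"  # hypothetical fragment encoding Gly-Leu-Ser
mutated = "GGTTTATCA"   # every codon changed, same peptide
print(translate(original), translate(mutated))
print(synonymous_codons("CTG"))
```

Candidate mutations that create a stem can be screened in this way by confirming that the translated peptide is unchanged.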
If there is no eACA in a non-coding RNA, or if the circularization site is placed in an untranslated region (UTR) of a protein-coding RNA, the circularization site may be created by introducing additional nucleotides. For example, as shown in Figure 8C, 5 nucleotides (light grey nt) may be introduced to create a stem portion of the eACA sequence, using the existing sequence (black nt) of the GOI to provide the remainder of the eACA.
Accordingly, in the recombinant nucleic acids described herein, the first and/or second portions of the eACA sequence may naturally occur in the gene of interest. In other words, they may be part of the gene of interest and so are present without having to mutate the existing sequence or introduce additional sequence.
Alternatively, in the recombinant nucleic acids described herein, all or part of the eACA sequence may be derived from human ribosomal RNA (rRNA). The use of human rRNA has the potential to provide circular RNAs which are less immunogenic.
The location of the first and second portions of the eACA sequence in the gene of interest has little impact on circularization. One portion of the eACA sequence may be identified in or placed in a coding sequence, whilst the other may be identified in or placed in an untranslated region. It is not necessary for both portions to be in the coding sequence, for example, or for both portions to be in the untranslated region.
Extended guide sequence (EGS)
The recombinant nucleic acids described herein may further comprise an extended guide sequence (EGS), in particular a first EGS and a second EGS which are capable of complementary base pairing to each other. The function of the EGS is to increase the length of the complementary base-pairing region at the two ends of the recombinant nucleic acid molecule. In this way, the EGS may be included to compensate for a shorter IGS, in particular when longer (>500 nt) GOIs are circularized.
The recombinant nucleic acids described herein may comprise a first EGS positioned 5’ of the IGS. The recombinant nucleic acids described herein may comprise a second EGS positioned 3’ to the second portion of the eACA sequence. Generally, there is a loop sequence situated between the first EGS and the IGS, as described in more detail below. Similarly, there is also generally a loop sequence situated between the second EGS and the second portion of the eACA sequence.
The first and second EGS may be partly or fully complementary to each other. Generally, mismatches are tolerated well and do not materially affect circularization. In the recombinant nucleic acids described herein, the first EGS may be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or 100% complementary to the second EGS.
If present, the first and second EGS may each be between 1 and 500 nucleotides in length. For example, the first and second EGS may each be between 10 and 50 nucleotides in length. The first and second EGS may each be 20, 30, or 40 nucleotides in length.
An exemplary first EGS sequence is GGUCAAUCGGUUGGCUUCCG (SEQ ID NO: 56).
An exemplary second EGS sequence is CGGAAGCCAACCGAUUGACC (SEQ ID NO: 57).
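The degree of complementarity between two EGS sequences may be estimated as in the following Python sketch. It assumes an ungapped antiparallel alignment of two sequences of equal length and counts only Watson-Crick pairs; the function name is hypothetical. Applied to the exemplary sequences above, it indicates that SEQ ID NO: 57 is the exact reverse complement of SEQ ID NO: 56.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementary(first_egs: str, second_egs: str) -> float:
    """Percentage of positions that pair in an ungapped antiparallel alignment."""
    if not first_egs or len(first_egs) != len(second_egs):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        RNA_PAIR[a] == b for a, b in zip(first_egs, reversed(second_egs))
    )
    return 100.0 * matches / len(first_egs)


FIRST_EGS = "GGUCAAUCGGUUGGCUUCCG"   # SEQ ID NO: 56
SECOND_EGS = "CGGAAGCCAACCGAUUGACC"  # SEQ ID NO: 57
print(percent_complementary(FIRST_EGS, SECOND_EGS))  # 100.0
```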
Loops
Lengthening the base paired P1 region may have a deleterious effect on circularization. To avoid this, the recombinant nucleic acids described herein may further comprise loop sequences, such as a first loop sequence and a second loop sequence.
The first and second loops may act as spacers between, at the 5’ end, the internal guide sequence (IGS) and the first extended guide sequence (EGS), and at the 3’ end, the P1 region and the second EGS. The loop sequences are preferably not complementary to each other, such that there is little or no base pair interaction between the first and second loop sequences. Because of the low or non-complementarity between the two loop sequences, the base-paired P1 region remains at a fixed length. The first loop may alternatively be described herein as the “left loop”. The second loop may alternatively be described herein as the “right loop”.
The first and second loop sequences may each be between 1 and 10 nucleotides in length. It is not necessary for the first and second loop sequences to have the same number of nucleotides, and in fact the TRIC method works well when the first and second loop sequences are different lengths. The first loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. The second loop sequence may be 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleotides in length. A preferred combination is a 6 nucleotide first loop sequence and a 5 nucleotide second loop sequence. Another preferred combination is a 3 nucleotide first loop sequence and a 2 nucleotide second loop sequence.
The first loop sequence is positioned 3’ to the first EGS and 5’ to the IGS, in other words, in between the first EGS and the IGS. The second loop sequence is positioned 3’ to the second portion of the eACA sequence, and 5’ to the second EGS, in other words, between the second portion of the eACA sequence and the second EGS. If the P1 extension region is present, the second loop sequence is 3’ to the P1 extension.
It is not always necessary to have two loop sequences. The recombinant nucleic acids described herein may comprise only a first loop sequence, positioned in between the first EGS and the IGS, without a second loop sequence. Alternatively, the recombinant nucleic acids described herein may comprise only the loop sequence positioned between the second portion of the eACA sequence (or, if present, P1 extension) and the second EGS.
An exemplary sequence for the first loop sequence is AAATAA (SEQ ID NO: 54).
An exemplary sequence for the second loop sequence is ACACC (SEQ ID NO: 55).
Gene of interest (GOI)
The gene of interest (GOI) refers to the sequence which is to be circularized. The GOI may comprise a coding sequence, coding for a peptide or protein, or may be a noncoding sequence. The GOI may also comprise a combination of coding and noncoding sequence. It will also be understood that, as used herein, the term “gene of interest” encompasses sequences which include additional sequence elements, for example, a translation initiation element, such as an internal ribosome entry site (IRES) sequence, multiple siRNA target sites (msiTS), spacer sequences such as polyAC sequences, start codons, stop codons, and any other sequence elements known to be useful in the art for producing circular RNA. For example, where an eACA is located in a coding sequence to be circularized, the GOI may comprise, in the 5’ to 3’ direction: a stop codon, a polyAC sequence, multiple siRNA target sites (msiTS), an IRES, a start codon, and the coding sequence including the eACA. Where an eACA is placed in an untranslated region of a sequence to be circularized, the GOI may comprise, in the 5’ to 3’ direction: multiple siRNA target sites (msiTS), an IRES, a start codon, a coding sequence, a stop codon, a polyAC sequence, and the eACA. See, for example, Figures 10A and 10B.
Suitable translation initiation elements include internal ribosome entry site (IRES) sequences. IRESs for use in the invention may include viral IRESs, such as the Coxsackievirus B3 (CVB3), Cafeteria roenbergensis Virus (CroV), or Classical Swine Fever Virus (CSFV) IRES, the DNA sequences of which are set out below.
CVB3-IRES:
TTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCATTGGGCGCTAGCACTCTGGTATCACG GTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAACACACACCGATC AACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATC AATAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCT
AGTAACACCGTGGAAGTTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGA GTCACCGCATTCCCCACGGGCGACCGTGGCGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAA CCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTGAGCTAGTTGGTAGTCCTCCG GCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGTGTGTCG TAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTG GCTGCTTATGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTA ATAGAGCTATTATATATCCCTTTGTTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACA ATTCATTGTTAAGTTGAATACAGCAAA (SEQ ID NO: 71)
CroV-IRES:
GTATAAGAGACAGGTGTTTGCCTTGTCTTCGGACTGGCATCTTGGGACCAACCCCCCTTTTCCCC AGCCATGGGTTAAATGGCAATAAAGGACGTAACAACTTTGTAACCATTAAGCTTTGTAATTTTGTA ACCACTAAGCTTTGTGCACATAATGTAACCATCAAGCTTGTTAGTCCCAGCAGGAGGTTTGCATG CTTGTAGCCGAAATGGGGCTCGACCCCCCATAGTAGGATACTTGATTTTGCATTCCATTGTGGAC CTGCAAACTCTACACATAGAGGCTTTGTCTTGCATCTAAACACCTGAGTACAGTGTGTACCTAGAC CCTATAGTACGGGAGGACCGTTTGTTTCCTCAATAACCCTACATAATAGGCTAGGTGGGCATGCC CAATTTGCAAGATCCCAGACTGGGGGTCGGTCTGGGCAGGGTTAGATCCCTGTTAGCTACTGCC TGATAGGGTGGTGCTCAACCATGTGTAGTTTAAATTGAGCTGTTCATATACC (SEQ ID NO: 72)
CSFV-IRES:
GTATACGAGGTTAGTTCATTCTCGTATACACGATTGGACAAATCAAAATTATAATTTGGTTCAGGG CCTCCCTCCAGCGACGGCCGAACTGGGCTAGCCATGCCCATAGTAGGACTAGCAAACGGAGGG ACTAGCCGTAGTGGCGAGCTCCCTGGGTGGTCTAAGTCCTGAGTACAGGACAGTCGTCAGTAGT TCGACGTGAGCAGAAGCCCACCTCGAGATGCTACGTGGACGAGGGCATGCCCAAGACACACCTT AACCCTAGCGGGGGTCGCTAGGGTGAAATCACACCACGTGATGGGAGTACGACCTGATAGGGC GCTGCAGAGGCCCACTATTAGGCTAGTATAAAAATCTCTGCTGTACATGGCAC (SEQ ID NO: 73)
In some preferred embodiments, a viral IRES may be modified to remove stop codons in open reading frames. Suitable modified viral IRESs include modified CSFV IRESs. A modified CSFV IRES may for example comprise the DNA sequence of SEQ ID NO: 96; nucleotides 324 to 696 of SEQ ID NO: 75; or nucleotides 596 to 968 of SEQ ID NO: 77. Modified IRESs may be useful in the rolling circle translation of the circular RNA as described herein.
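When modifying an IRES for rolling circle translation, it can be helpful to count the stop codons in each reading frame. The Python sketch below is illustrative only; the function name is hypothetical, frame 1 is assumed to start at the first nucleotide of the supplied DNA sequence, and the toy input is not an IRES sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codons_per_frame(dna: str) -> list:
    """Count stop codons in the three forward reading frames of a DNA sequence."""
    counts = []
    for frame in range(3):
        codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


# Toy sequence: frame 1 reads ATG TAA CCT GAG, frame 3 reads GTA ACC TGA.
print(stop_codons_per_frame("ATGTAACCTGAG"))  # [1, 0, 1]
```

A frame reported with a count of zero is open across the whole element and can be aligned with the coding sequence.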
The TRIC method is suitable for genes of interest of any length. TRIC is particularly suitable for long genes of interest. In this context, “long” is generally considered to mean a sequence of at least 500 nucleotides. Thus, the gene of interest may be at least 100, at least 250, at least 500, at least 1000, at least 2000, at least 3000, at least 4000, at least 5000, at least 6000, at least 7000, or at least 8000 nucleotides in length.
Thus, recombinant nucleic acid molecules for making a circular RNA as described herein may comprise, in the 5’ to 3’ direction: a) an internal guide sequence (IGS), b) a ribozyme, c) a first portion of an extended anticodon arm (eACA) sequence, d) a gene of interest, and e) a second portion of the eACA sequence, wherein a nucleotide in the second portion of the eACA sequence forms a wobble base pair with a nucleotide in the IGS.
In some embodiments, recombinant nucleic acids described herein may comprise, in a 5’ to 3’ direction: a) a first EGS, b) a first loop sequence, c) an internal guide sequence (IGS), d) a ribozyme, e) a first portion of an extended anticodon arm (eACA) sequence, f) a gene of interest, g) a second portion of the eACA sequence, h) a P1 extension, i) a second loop sequence, and j) a second EGS.
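The 5’ to 3’ element order listed above can be sketched as a simple assembly check. This is a minimal illustration only: the element names are labels chosen for this sketch, and the placeholder sequences are hypothetical and are not taken from any SEQ ID NO in this document.

```python
# Element order (5' to 3') as listed in the paragraph above.
# The key names are labels invented for this sketch.
ELEMENT_ORDER = [
    "first_EGS", "first_loop", "IGS", "ribozyme", "eACA_first_portion",
    "gene_of_interest", "eACA_second_portion", "P1_extension",
    "second_loop", "second_EGS",
]


def assemble_construct(parts: dict[str, str]) -> str:
    """Concatenate the named elements in 5'->3' order; every element must be present."""
    missing = [name for name in ELEMENT_ORDER if name not in parts]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    return "".join(parts[name].upper() for name in ELEMENT_ORDER)


# Placeholder sequences only (hypothetical, NOT real construct sequences).
example_parts = {name: "NNN" for name in ELEMENT_ORDER}
example_parts["gene_of_interest"] = "ATGGCC"
construct = assemble_construct(example_parts)
```

Keeping the order in a single list makes it harder to swap two flanking elements by accident when a construct is assembled from separately designed parts.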
Where the gene of interest to be circularized is long (>500 nt), the first and second eACA stem portions may be 15 or 25 base pairs and the loop of the resulting stem-loop structure may be 7 nucleotides.
Further components
The recombinant nucleic acids described herein may further comprise elements to facilitate the transcription or circularization process. For example, the recombinant nucleic acids described herein may further comprise a T7 high efficiency sequence, one or more restriction enzyme cleavage sites, and/or a poly(A) tail. If the recombinant nucleic acid is a DNA template, it may further comprise a T7 promoter sequence. A recombinant nucleic acid may further comprise a nucleotide sequence encoding a self-cleaving peptide to ensure the production of monomeric protein during rolling circle translation. Such additional elements are known in the art.
In other embodiments, the recombinant nucleic acids described herein may lack stop codons. For example, the recombinant nucleic acids may be engineered or modified to remove stop codons in open reading frames. This may facilitate rolling circle translation.
Methods for producing a circular RNA
Generally, methods for producing a circular RNA comprise the provision of a linear recombinant nucleic acid molecule, such as those described herein, and the splicing of that molecule to generate a circular RNA.
Thus, methods for producing a circular RNA may comprise: a) providing a recombinant nucleic acid molecule as described herein, and b) circularizing the recombinant nucleic acid molecule.
It will be appreciated that the term “recombinant nucleic acid molecule” may refer to either a DNA template molecule or an RNA precursor.
Suitable protocols for circularization are known in the art, for example as described in Puttaraju & Been (1992) Nucleic Acids Res 20, 5357-5364; and Wesselhoeft, Kowalski & Anderson (2018) Nat Commun 9, 2629, the contents of which are incorporated herein by reference.
In some cases, the recombinant nucleic acid molecule is a linear DNA template molecule. In a first part of the method, in vitro transcription is performed using the DNA template molecule to obtain a linear RNA precursor molecule. Subsequently, a circularization/splicing step is performed with the RNA precursor, to generate a circular RNA molecule.
Thus, methods for producing a circular RNA may comprise: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor.
In the above methods, some splicing (and consequently circularization) may occur during transcription, known as co-transcriptional splicing. In some cases, it is preferable to suppress splicing during the transcription step, so that splicing and circularization occur only post-transcriptionally. Thus, the methods described herein may additionally comprise the suppression of splicing (and circularization) during step (b) (the transcription step).
Although magnesium ions are generally essential for in vitro transcription and splicing to occur, co-transcriptional splicing/circularization can be suppressed by performing the transcribing step at low concentrations of magnesium ions and with an excess of nucleoside triphosphates (NTPs). NTPs can act to chelate Mg2+ during transcription. Thus, the methods described herein may comprise performing step (b) in the presence of NTPs at a concentration of approximately 24 mM, and Mg2+ at a concentration less than 18 mM, less than 16 mM, less than 14 mM, or less than 12 mM. In some cases, the transcribing step is performed in the presence of NTPs at a concentration of approximately 24 mM and Mg2+ at a concentration of 16 mM or less. If the concentration of NTPs is more or less than 24 mM, the concentration of Mg2+ may also vary.
Methods for producing a circular RNA may therefore comprise: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor; wherein step (b) is performed in the presence of Mg2+ at a concentration of 16 mM or less and NTPs at a concentration of approximately 24 mM.
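The Mg2+/NTP condition in step (b) can be expressed as a small check. This is a sketch under a stated assumption: the 16 mM Mg2+ and 24 mM NTP figures come from the text, but the text only says that the Mg2+ concentration "may also vary" with the NTP concentration, so the linear scaling of the threshold used here is an assumption of this example and the function name is hypothetical.

```python
def splicing_suppressed(mg_mM: float, ntp_total_mM: float = 24.0) -> bool:
    """Return True if Mg2+ is at or below the suppression threshold.

    The reference pair (16 mM Mg2+ at 24 mM total NTPs) is from the text.
    Scaling the threshold linearly with the NTP concentration is an
    ASSUMPTION made for illustration only.
    """
    threshold_mM = 16.0 * (ntp_total_mM / 24.0)
    return mg_mM <= threshold_mM


# Conditions quoted in the Materials & Methods section of this document:
suppressed_ivt = splicing_suppressed(14.0)   # 14 mM MgCl2 buffer
permissive_ivt = splicing_suppressed(24.0)   # 24 mM MgCl2 buffer
```

With 6 mM of each of the four NTPs (24 mM in total), the 14 mM MgCl2 buffer falls below the threshold and the 24 mM MgCl2 buffer does not.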
In other cases, suppression of splicing during transcription is not necessary to produce circular RNAs. Thus, described herein are methods for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule as described herein, and b) transcribing and circularizing the recombinant nucleic acid molecule, wherein splicing occurs co-transcriptionally.
As described above, an advantage of the TRIC method is that it can be used to circularise genes of interest that naturally comprise a sequence that is capable of forming an eACA stem-loop. This enables a circular RNA to be created that does not contain exogenous sequence material (such as exon sequences), which is frequently immunogenic, and so provides clear benefits over existing methods such as PIE.
Thus, methods for producing a circular RNA may comprise: a) identifying a gene of interest comprising a sequence capable of forming an eACA stem-loop, b) preparing a recombinant nucleic acid molecule comprising, in a 5’ to 3’ direction: an internal guide sequence (IGS) - a ribozyme - a sequence encoding the gene of interest, and c) circularizing the recombinant nucleic acid molecule.
Step (b) above may comprise rearranging the gene of interest to place a first portion of the sequence capable of forming an eACA stem-loop at the 5’ end, and a second portion of the sequence capable of forming an eACA stem-loop at the 3’ end.
Suitable genes of interest comprising a sequence capable of forming an eACA stem-loop may be identified by the skilled person using approaches known in the art, including the use of software such as RNAfold (University of Vienna).
Also described herein is the use of any of the recombinant nucleic acid molecules described herein in methods for producing a circular RNA.
Methods for producing a circular RNA may also comprise the expression of the recombinant nucleic acid molecule in a cell, with subsequent circularization being performed in the cell.
Circular RNAs
Circular RNAs can be generated using the recombinant nucleic acid molecules described herein. The circular RNAs comprise a sequence encoding a gene of interest. One advantage of the described circular RNAs is that, in some cases, they do not comprise exogenous splicing sequences, unlike the circular RNAs obtained with the PIE method, which typically comprise exogenous exon sequences.
By “exogenous”, it is meant any sequence information that is not naturally present in the gene of interest. Circular RNAs described herein may not comprise RNA from a circularization agent. For example, circular RNAs described herein may not contain any RNA from a ribozyme, in particular from a group I intron. In some cases, the circular RNAs described herein may comprise only the gene of interest. This in turn reduces the immunogenicity of the circular RNAs compared to circular RNAs generated using methods of the art (such as the PIE method). The circular RNAs described herein may be less immunogenic compared to circular RNAs (e.g. those encoding the same GOI) which comprise exogenous exon sequences. By less immunogenic, it is meant that transfection with the circular RNA produces less of an immune response by the host cell, for example, reduced production of cytokines, chemokines or other immune signalling molecules. Suitable methods for determining immunogenicity are known in the art. For example, immunogenicity can be determined by measuring the production of immune factors (cytokines, chemokines, etc) in transfected cells for a period of time following transfection.
In the circular RNAs described herein, the sequence encoding the gene of interest may comprise a sequence capable of forming an eACA stem-loop structure as described herein. For example, a sequence capable of forming an eACA stem-loop structure with a stem of at least 5 base pairs, and a loop of 7 nucleotides, wherein the third nucleotide of the loop in the 5’ to 3’ direction is uracil. The sequence capable of forming an eACA stem-loop structure may be naturally occurring in the gene of interest, or may have been introduced prior to circularization.
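The stem-loop criteria above (a stem of at least 5 base pairs, a 7-nucleotide loop, and uracil as the third loop nucleotide) can be screened for with a naive scan. This is a rough sketch, not a substitute for a folding tool: it only finds perfectly paired stems directly flanking the loop, it treats G-U wobble pairs as allowed (an assumption of this example), and the function name and test sequence are hypothetical.

```python
# Watson-Crick pairs plus G-U wobble (allowing wobble is an assumption here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_eaca_candidates(seq: str, min_stem: int = 5, loop_len: int = 7):
    """Return (loop_start, stem_length) for each candidate stem-loop.

    A candidate has a loop of loop_len nucleotides whose third nucleotide is U,
    flanked by at least min_stem consecutive base pairs.
    """
    rna = seq.upper().replace("T", "U")
    hits = []
    for loop_start in range(min_stem, len(rna) - loop_len - min_stem + 1):
        if rna[loop_start + 2] != "U":
            continue
        stem = 0
        left, right = loop_start - 1, loop_start + loop_len
        # Extend the stem outwards from the loop while the bases pair.
        while left >= 0 and right < len(rna) and (rna[left], rna[right]) in PAIRS:
            stem += 1
            left -= 1
            right += 1
        if stem >= min_stem:
            hits.append((loop_start, stem))
    return hits


# Hypothetical test sequence: 5 bp stem, 7 nt loop with U (written T) at position 3.
candidates = find_eaca_candidates("GGGCCAATAAAAGGCCC")
```

Candidates found this way would still need to be checked with a thermodynamic folding program, since a stem that pairs on paper may not form in the context of the full transcript.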
Also described herein are circular RNAs obtainable by the methods disclosed herein. For example, described herein is a circular RNA obtainable by a method comprising: a) providing a recombinant nucleic acid molecule as described herein, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor.
Uses of circular RNAs/precursors
The recombinant nucleic acids described herein may be useful in therapeutic methods or as a medicament. Circular RNAs obtained from the recombinant nucleic acids described herein may be used in various ways to exert a therapeutic effect. For example, circular RNAs can act as a “sponge” for micro RNAs (miRNA), in turn preventing or reducing the degradation of a target mRNA by the miRNA. Circular RNAs may also influence protein trafficking and subcellular protein localisation. The specific therapeutic effect will be determined by the gene of interest, including whether the gene of interest is a coding or noncoding sequence.
In addition, circular RNAs can also be used to drive expression of a gene of interest in vivo or in vitro. Thus, described herein are methods for expressing a gene of interest in a cell, in which a recombinant nucleic acid molecule as described herein is circularized to provide a circular RNA comprising the gene of interest, and the circular RNA is administered to the cell.
Rolling Circle Translation
Circular RNAs obtained from the recombinant nucleic acid molecules as described herein may be used to drive expression of a gene of interest through rolling circle translation. Suitable circular RNAs may lack in-frame stop codons. For example, a circular RNA may comprise a modified viral IRES as described herein which lacks in-frame stop codons. In some embodiments, the circular RNA may be translated continuously multiple times by a ribosome to generate polyproteins. In other embodiments, a suitable circular RNA may further comprise a DNA sequence encoding a self-cleaving peptide. The self-cleaving peptide causes ribosomal skipping during translation of the circular RNA by a ribosome to generate monomeric proteins. Suitable self-cleaving peptides are well-known in the art and include 2A peptides, such as T2A, P2A, E2A and F2A.
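The requirement that a circular RNA lack in-frame stop codons can be checked computationally. The sketch below is an illustration under stated assumptions: it uses the standard stop codons (TAA, TAG, TGA), it additionally requires the circle length to be a multiple of three so that the reading frame is preserved on each pass (a condition not stated in the text above), and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def supports_rolling_circle(circ_seq: str, start: int = 0) -> bool:
    """Check whether a circular sequence has an endless reading frame from `start`.

    `start` is the index of the first base of the first codon and must be
    smaller than the sequence length.
    """
    seq = circ_seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        return False  # the frame would shift on every pass around the circle
    doubled = seq + seq  # lets codons spanning the junction be read linearly
    for offset in range(0, len(seq), 3):
        codon = doubled[start + offset : start + offset + 3]
        if codon in STOP_CODONS:
            return False
    return True


open_frame = supports_rolling_circle("ATGGCCAAA")                 # no stop in frame
junction_stop = supports_rolling_circle("AAATGGCCT", start=2)     # TAA spans the junction
```

Doubling the sequence is the simplest way to catch a stop codon that only appears across the circularization junction, which a scan of the linear precursor alone would miss.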
Methods of Treatment
Described herein are methods for treating a disease in a subject, the methods comprising (a) circularizing a recombinant nucleic acid molecule as described herein to provide a circular RNA, and (b) administering the circular RNA to the subject.
Also described herein are recombinant nucleic acid molecules such as those described herein for use as a medicament; recombinant nucleic acid molecules as described herein for use in a method of treating a disease in a subject; circular RNA obtained from a recombinant nucleic acid molecule as described herein for use as a medicament; and circular RNA obtained from a recombinant nucleic acid molecule as described herein for use in a method of treating a disease in a subject.
Also described herein are pharmaceutical compositions comprising a circular RNA or a recombinant nucleic acid molecule as described herein, and a pharmaceutically acceptable excipient.
The terms “comprising” or “comprises” may be substituted with the terms “consisting of”, “consists of”, “consisting essentially of” or “consists essentially of” and vice versa, wherever they occur herein.
It is to be understood that the application discloses all combinations of any of the aspects and embodiments described above with each other, unless the context demands otherwise. Similarly, the application discloses all combinations of the preferred and/or optional features either singly or together with any of the other aspects, unless the context demands otherwise.
Modifications of the above embodiments, further embodiments and modifications thereof will be apparent to the skilled person on reading this disclosure, and as such, these are within the scope of the present invention.
All documents and sequence database entries mentioned in this specification are incorporated herein by reference in their entirety for all purposes.
“and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. For example, “A and/or B” is to be taken as specific disclosure of each of (i) A, (ii) B and (iii) A and B, just as if each is set out individually herein.
EXAMPLES
The invention will now be described with reference to the following non-limiting examples.
Materials & Methods
In vitro transcription (IVT) protocol:
DNA templates were linearized and cleaned by phenol:chloroform:isoamyl alcohol extraction. IVT was performed with 50 ng/ul of DNA template, 14 ug/ul of homemade T7 polymerase, 0.04 U/ul of RNase inhibitor (Promega), 6 mM of each NTP and 1X IVT buffer. For IVT where co-transcriptional splicing is allowed, 1X IVT buffer contains 80 mM HEPES-K (pH 7.5), 2 mM spermidine, 40 mM DTT and 24 mM MgCl2. When co-transcriptional splicing is to be suppressed, the concentration of MgCl2 in 1X IVT buffer is 14 mM. IVT reactions were incubated at 37 °C for 3-5 h and then digested with RNase-free DNase I for 20 min. Afterwards, 100 mM EDTA was added to a final concentration of 25 mM to clear any precipitate. An equal volume of 7.5 M lithium chloride was then added to precipitate the RNAs for 30 min to overnight at -20 °C. The precipitates were spun down at 13000 rpm for at least 20 min. RNA pellets were washed with 75% alcohol, air dried and dissolved in DEPC-treated H2O.
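The concentrations in this protocol translate into pipetting volumes by simple dilution arithmetic. The sketch below is illustrative only: the final concentrations (6 mM of each NTP, 1X buffer, 25 mM EDTA from a 100 mM stock) come from the protocol, whereas the 100 ul reaction size, the 100 mM NTP stocks and the 10X buffer stock are assumptions of this example, as are the function names.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """C1*V1 = C2*V2: volume of stock needed inside a reaction of final_volume_ul."""
    return final_conc * final_volume_ul / stock_conc


def spike_volume_ul(target_conc: float, stock_conc: float, start_volume_ul: float) -> float:
    """Volume of stock to ADD to start_volume_ul so that the mixture reaches target_conc."""
    return start_volume_ul * target_conc / (stock_conc - target_conc)


reaction_ul = 100.0                                       # hypothetical reaction size
ntp_each_ul = stock_volume_ul(6.0, 100.0, reaction_ul)    # assumes 100 mM NTP stocks
buffer_ul = stock_volume_ul(1.0, 10.0, reaction_ul)       # assumes a 10X buffer stock
edta_ul = spike_volume_ul(25.0, 100.0, reaction_ul)       # 100 mM EDTA to 25 mM final
total_ntp_mM = 4 * 6.0                                    # four NTPs at 6 mM each
```

Because the EDTA is added on top of the finished reaction, reaching 25 mM from a 100 mM stock requires adding one third of the starting volume, not one quarter. The four NTPs at 6 mM each give the 24 mM total NTP concentration referred to elsewhere in this document.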
Circularization protocol:
3*Flag circular RNAs were circularized during IVT. For co-transcriptional circularization of EGFP, Fluc, Spike, Cas9 and Factor 8, IVT reactions in which the DNA template had been digested were supplemented with extra GTP to a concentration of 2 mM and heated at 55 °C for 20 min. For full-length precursors, 9 ul of RNA was first denatured at 95 °C for 2 min and then annealed on ice for 3 min. The annealed RNA was then supplemented with 1 ul of 10X circularization buffer (500 mM Tris-HCl, pH 7.4, 100 mM MgCl2, 10 mM DTT, 20 mM GTP) and heated at 55 °C for the designated time.
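The final concentrations in the circularization reaction follow from diluting 1 ul of the 10X buffer into 9 ul of RNA. The short sketch below works this out; the buffer composition is taken from the protocol, while the function and variable names are hypothetical.

```python
# 10X circularization buffer composition from the protocol (mM).
TEN_X_BUFFER_MM = {"Tris-HCl": 500.0, "MgCl2": 100.0, "DTT": 10.0, "GTP": 20.0}


def final_concentrations(stock_mM: dict[str, float], stock_ul: float, sample_ul: float) -> dict[str, float]:
    """Concentration of each buffer component after mixing stock with sample."""
    dilution = stock_ul / (stock_ul + sample_ul)
    return {name: conc * dilution for name, conc in stock_mM.items()}


final = final_concentrations(TEN_X_BUFFER_MM, stock_ul=1.0, sample_ul=9.0)
```

The result is approximately 50 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT and 2 mM GTP, so the GTP concentration matches the 2 mM used for the IVT reactions in the same protocol.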
Michaelis-Menten fitting:
The concentration of ribozyme was fixed at 1 uM and the concentration of GTP was varied from 1 uM to 2000 uM. At each GTP concentration, the time course of circularization of TRIC-V2 and PIE was monitored. The time to 50% circularization (t1/2) for each sample was then calculated and used to estimate the initial circularization rate of each construct at each GTP concentration (Vobs). Vobs was then plotted against GTP concentration to calculate the kinetic parameters of TRIC-V2.
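A fit of this kind can be sketched as follows. The text does not state how Vobs was derived from t1/2 or which fitting procedure was used, so both steps here are assumptions: the rate is taken as ln 2 / t1/2 (first-order kinetics), and the parameters are obtained with a Hanes-Woolf linearization (S/v = S/Vmax + Km/Vmax) solved by ordinary least squares. The data in the example are synthetic values generated from chosen parameters, not experimental results.

```python
import math


def rate_from_half_time(t_half: float) -> float:
    """First-order rate constant from a half-time (k = ln 2 / t_half). ASSUMED model."""
    return math.log(2) / t_half


def fit_michaelis_menten(substrate, v_obs):
    """Estimate (Vmax, Km) by a Hanes-Woolf fit: S/v = S/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, v_obs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    v_max = 1.0 / slope
    k_m = intercept * v_max
    return v_max, k_m


# Synthetic, noise-free data (NOT measurements from this document).
gtp_uM = [1, 10, 50, 200, 1000, 2000]
true_vmax, true_km = 0.5, 100.0
rates = [true_vmax * s / (true_km + s) for s in gtp_uM]
v_max, k_m = fit_michaelis_menten(gtp_uM, rates)
```

With real, noisy data a nonlinear least-squares fit of the Michaelis-Menten equation would usually be preferred, because linearizations weight the measurement errors unevenly.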
Calculation of circularization efficiency:
To measure efficiency, yield and nicking rate, samples were loaded onto native agarose gels, and the intensities of the full-length precursor, circular RNA and nicked RNA bands were measured in ImageJ. Efficiency equals the percentage of full-length precursor that has been converted to circular RNA. Yield is calculated by dividing circular RNA by total RNA. Nicking rate equals the ratio of nicked RNA to circular RNA.
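The three metrics can be computed from band intensities as sketched below. The definitions are only given in words, so this example commits to one reading of each, and these readings are assumptions: efficiency is taken as circular / (circular + remaining full-length precursor), yield as circular / total RNA, and nicking rate as nicked / circular. The intensity values are hypothetical, and no correction for band length or staining efficiency is applied.

```python
def circularization_metrics(full_length: float, circular: float,
                            nicked: float, total: float) -> dict[str, float]:
    """Compute efficiency (%), yield and nicking rate from gel band intensities.

    The formulas are one ASSUMED reading of the verbal definitions in the text.
    """
    return {
        "efficiency_pct": 100.0 * circular / (circular + full_length),
        "yield": circular / total,
        "nicking_rate": nicked / circular,
    }


# Hypothetical band intensities (arbitrary units), not measured data.
metrics = circularization_metrics(full_length=20.0, circular=70.0,
                                  nicked=10.0, total=200.0)
```

Passing the total as a separate argument keeps the choice of which bands count towards "total RNA" (for example, whether the spliced intron is included) explicit rather than hidden inside the function.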
RNase R digestion:
For RNase R digestion, the IVT product of TRIC-V1.0 was precipitated with 7.5 M LiCl and dissolved in DEPC-treated water. Then 5 ug of RNA was digested with 10 U of RNase R (antibodies-online) at 37 °C for 15 min. Digested RNAs were loaded onto a denaturing PAGE gel.
RT-PCR:
The reverse transcriptase and DNA polymerase used were SuperScript IV Reverse Transcriptase (Thermo Fisher) and Q5 High-Fidelity DNA Polymerase (NEB). The manufacturers’ manuals were followed for reverse transcription and PCR. The IVT sample and RNAs I and II were used as templates for reverse transcription and PCR using the primers indicated in Figure 2B.
Circular RNA purification:
A two-step strategy was used to purify circular RNAs. First, circular RNA samples were injected into an SRT-2000 SEC column (Sepax) operated on an AKTA pure system (Cytiva). The running buffer contained 10 mM Tris-HCl (pH 6.5) and 1 mM EDTA. Fractions containing circular RNAs were then pooled, digested with RNase R (antibodies-online, 0.5 U/ug) at 37 °C for 1 hour and cleaned using the RNA clean & concentrator kit (ZYMO RESEARCH).
Cell transfection, RT-qPCR, Nluc expression and siRNA knockdown:
A549 and HEK 293F cells were cultured in DMEM (with 10% FBS, High Glucose GlutaMAX, Life Technologies Ltd) and Freestyle (Gibco) media, respectively.
For immunogenicity studies, 200 ng of each RNA was transfected into 100,000 A549 cells in 24-well plates using the MessengerMax transfection reagent (Invitrogen). 24 hours later, cells were harvested, and total RNA was extracted with TRIzol (Invitrogen) and the RNA clean & concentrator kit. Then, 300 ng of each total RNA was reverse transcribed with SuperScript IV reverse transcriptase (Invitrogen) using random hexamers plus 20-mer oligo dT (Invitrogen). The reverse transcription products were diluted 20-fold and used as templates for qPCR (ViiA 7, Thermo Fisher); primers are listed in Table 1.
For Nluc expression and siRNA knockdown studies, 50 ng of circular Nluc and an equimolar amount of linear mRNA were transfected into 10,000 cells in 96-well plates using the MessengerMax transfection reagent. For protein expression, 5 ul of cells were taken for luciferase assay using the Nano-Glo® Luciferase Assay System (Promega). For siRNA knockdown, siRNAs were transfected into cells using RNAiMax (Invitrogen) on day 3. Five hours after siRNA transfection, expression of Nluc was measured using the above kit.
Statistical analysis
A two-sample t-test assuming unequal variances was used for statistical analysis. Details can be found in the Figure legends.
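For reference, the statistic behind a two-sample t-test with unequal variances (commonly called Welch's t-test) can be sketched with the Python standard library. The sketch returns only the t statistic and the Welch-Satterthwaite degrees of freedom; converting these to a p-value requires the t distribution, which the standard library does not provide. The sample values are hypothetical and are not data from this document.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical samples for illustration only.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Unlike the pooled-variance t-test, this form does not assume that the two groups have the same spread, at the cost of a non-integer number of degrees of freedom.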
Example 1 - TRIC
The inventors selected the tRNALeu intron from the cyanobacterium Anabaena (Ana) for the initial test of the TRIC approach. This first construct was designated TRIC-V0.
Ana is short (249 nt) but highly active. The Ana intron divides the leucine transfer RNA (tRNA) into a 34 nt left half and a 51 nt right half (L34/R51) in the anticodon arm (ACA) (Figure 2A-insert labelled “tRNA”). The inventors set out to determine whether using some of the leucine tRNA sequence would enable circularization. Of the Leu tRNA anticodon arm, a L15/R30 portion was reserved and joined on either side of a gene of interest (in this case a 3*Flag coding sequence) for circularization. The construct is shown in Figure 2B (SEQ ID NO: 14).
In vitro transcription was performed, and the samples were loaded on a 12% denaturing polyacrylamide gel (PAGE) as described in Puttaraju & Been (1992). The results of the gel are shown in Figure 2C (left column labelled IVT).
Consistent with the high activity of Ana, splicing occurred co-transcriptionally and multiple RNA species could be seen in the 12% gel. The full-length precursor (403 nt) and the spliced linear intron (257 nt) were identified, but not the circular gene of interest (3*Flag). The linear intron and two other major species (I and II) were electroeluted from the 12% gel and loaded onto a 6% PAGE (Figure 2C, right column labelled IVT). Both major species I and II contained a fast-moving minor species, and both ran faster than they had in the 12% PAGE, indicating that they are circular RNAs. Since the minor bands in I and II ran at the same positions as the linear intron and nicked 3*Flag, respectively, the inventors concluded that species I is circular intron and species II is circular 3*Flag (the gene of interest). To further confirm the circular identity of 3*Flag, the inventors performed reverse transcription followed by PCR (RT-PCR) on the IVT sample and species I and II (Figure 2D). As expected, a 109 bp DNA product was produced from IVT and II but not I. The inventors then cloned this PCR product into a sequencing vector and performed Sanger sequencing. The sequencing result clearly shows that circularization of the 3*Flag gene of interest happened as expected (Figure 2E). These results demonstrate that even with a short IGS, TRIC-V0 is able to circularize the gene of interest highly efficiently during the IVT process (co-transcriptional circularization).
Example 2 - TRIC V1
The internal guide sequence (IGS) of the Ana group I intron is short (5 nucleotides), and the inventors speculated that a short IGS might make it difficult to circularize long genes of interest. To compensate for the short IGS, the inventors introduced an extended guide sequence (EGS) at the 5’ end of the TRIC, which could form a 20 nucleotide base-paired structure with a corresponding region at the 3’ end (Figure 3A). This construct was designated TRIC-V1.
The sequence connecting the IGS and EGS constitutes an internal loop, which the inventors also optimised in TRIC-V1. Three constructs were created, each containing a different loop configuration. In TRIC-V1.0 (SEQ ID NO: 15), the 6 nt sequence upstream of the Ana IGS was retained in the loop sequence, while for V1.1 (SEQ ID NO: 16) and V1.2 (SEQ ID NO: 17) the length of the loop was reduced by introducing base pair interactions between 3 nt at the 3’ end of the construct and 3 nt at the 5’ end of the loop. This effectively extended the EGS to 23 nt in length (Figure 3A), and reduced the 5’ loop length to 3 nucleotides. The 3’ loop length was 3 nucleotides in V1.1 and V1.2, whilst the 3’ loop length was 5 nucleotides in V1.0.
In vitro transcription was performed for the three variants V1.0, V1.1 and V1.2, and the products were loaded onto a 6% PAGE together with purified circular 3*Flag (species II from the earlier experiments). All three variants were found to circularize the gene of interest (3*Flag) highly efficiently (Figure 3B). The inventors subsequently digested the in vitro transcription product from V1.0 with RNase R. As expected, only the circular gene of interest (3*Flag) was resistant to RNase R digestion, while the linear intron was digested (Figure 3C). Compared to the TRIC-V0 variant, the efficiency (that is, the ratio of full-length precursor to circular GOI) of the three V1 variants was similar, since full-length precursors were mostly converted to circular RNAs during in vitro transcription. One advantage of the V1 variants compared to V0 is that the amount of circular intron is reduced. Reduction of circular introns is beneficial because these circular introns cannot be removed by RNase R digestion. Of the three V1 variants, V1.1 gave the highest ratio of circular 3*Flag to linear intron, so this variant was taken forward for further investigation.
The 3*Flag sequence (SEQ ID NO: 18) is 141 nucleotides in length. The inventors also determined the capacity of TRIC-V1 for circularization of long genes of interest. Five new constructs were created with the aim of producing circular CVB3-EGFP (EGFP, 1638 nt) (SEQ ID NO: 19), CVB3-Firefly luciferase (Fluc, 2601 nt) (SEQ ID NO: 20), CVB3-Spike protein of SARS-CoV-2-EGFP (Spike, 5469 nt) (SEQ ID NO: 21), CVB3-spCas9-EGFP (Cas9, 5757 nt) (SEQ ID NO: 22) and CVB3-Factor 8-EGFP (Factor 8, 8706 nt) (SEQ ID NO: 23) (Figure 4A). A longer tRNA sequence (L15/R51), internal homology arms (iHRs) and spacer sequences (polyAC) were used in the TRIC V1.0 construct to produce a circular RNA identical to that obtained with the PIE Ana construct (Figure 5A, 5B). The sequences of each of the GOIs are shown below.
3*Flag: AAAATCCGTTGACCTTAAACGGTCGTGTGGTACACTCGATCTGGACTAAAGCTGCTCATGGATTACAAAGATCACGATGG TGATTATAAAGATCACGACATCGATTACAAGGATGATGATGATAAGAGACGCTACGGACTT (SEQ ID NO: 18)
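Sequence listings in this text are wrapped with spaces and may carry extraction damage, so it can be useful to normalize a listing and compare its length with the stated value. The sketch below does this for the 3*Flag listing above, which the text gives as 141 nucleotides; the function name is hypothetical.

```python
import re


def normalize_sequence(raw: str) -> str:
    """Remove whitespace, upper-case, and reject characters outside A/C/G/T/U."""
    seq = re.sub(r"\s+", "", raw).upper()
    unexpected = set(seq) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


# 3*Flag listing (SEQ ID NO: 18) as printed above, including the wrap space.
three_flag = normalize_sequence(
    "AAAATCCGTTGACCTTAAACGGTCGTGTGGTACACTCGATCTGGACTAAAGCTGCTCATGGATTACAAAGATCACGATGG "
    "TGATTATAAAGATCACGACATCGATTACAAGGATGATGATGATAAGAGACGCTACGGACTT"
)
three_flag_length = len(three_flag)
```

A length that disagrees with the stated value, or a character outside the nucleotide alphabet, would suggest that the listing was damaged during extraction and should be checked against the official sequence listing.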
EGFP:
AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA AAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACATTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCA TTGGGCGCTAGCACTCTGGTATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAA CACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATCAA TAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAG TTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCACGGGCGACCGTGG CGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTG AGCTAGTTGGTAGTCCTCCGGCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGT GTGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTGGCTGCTTA TGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTG TTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAAATGGTGAGCAA GGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCG TGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCC GTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAG CACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTAC AAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGA GGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAA GAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCA GAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAG ACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAG
CTGTACAAGTAAAAAAAACAAAAAACAAAACGGCTATTATGCGTTACCGGCGAGACGCTACGGACTT (SEQ ID NO: 24)
Firefly luciferase (Fluc):
AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA
AAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACATTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCA
TTGGGCGCTAGCACTCTGGTATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAA
CACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATCAA
TAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAG
TTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCACGGGCGACCGTGG
CGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTG
AGCTAGTTGGTAGTCCTCCGGCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGT
GTGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTGGCTGCTTA
TGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTG
TTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAAATGGAAGATGC
CAAAAACATTAAGAAGGGCCCAGCGCCATTCTACCCACTCGAAGACGGTACCGCCGGCGAGCAGCTGCACAAAGCCAT
GAAGCGCTACGCCCTGGTGCCCGGCACCATCGCCTTTACCGACGCACATATCGAGGTGGACATTACCTACGCCGAGTA
CTTCGAGATGAGCGTTCGGCTGGCAGAAGCTATGAAGCGCTATGGGCTGAATACAAACCATCGGATCGTGGTGTGCAG
CGAGAATAGCTTGCAGTTCTTCATGCCCGTGTTGGGTGCCCTGTTCATCGGTGTGGCTGTGGCCCCAGCTAACGACATC
TACAACGAGCGCGAGCTGCTGAACAGCATGGGCATCAGCCAGCCCACCGTCGTATTCGTGAGCAAGAAAGGGCTGCAA
AAGATCCTCAACGTGCAAAAGAAGCTACCGATCATACAAAAGATCATCATCATGGATAGCAAGACCGACTACCAGGGCT
TCCAAAGCATGTACACCTTCGTGACTTCCCATTTGCCACCCGGCTTCAACGAGTACGACTTCGTGCCCGAGAGCTTCGA
CCGGGACAAAACCATCGCCCTGATCATGAACAGTAGTGGCAGTACCGGATTGCCCAAGGGCGTAGCCCTACCGCACCG
CACCGCTTGTGTCCGATTCAGTCATGCCCGCGACCCCATCTTCGGCAACCAGATCATCCCCGACACCGCTATCCTCAGC
GTGGTGCCATTTCACCACGGCTTCGGCATGTTCACCACGCTGGGCTACTTGATCTGCGGCTTTCGGGTCGTGCTCATGT
ACCGCTTCGAGGAGGAGCTATTCTTGCGCAGCTTGCAAGACTATAAGATTCAATCTGCCCTGCTGGTGCCCACACTATT
TAGCTTCTTCGCTAAGAGCACCCTCATTGATAAGTACGACCTAAGCAACTTGCACGAGATCGCCAGCGGCGGGGCGCC
GCTCAGCAAGGAGGTAGGTGAGGCCGTGGCTAAACGCTTCCACCTACCAGGCATCCGCCAGGGCTACGGCCTGACAG
AAACAACCAGCGCCATTCTGATCACCCCCGAAGGGGACGACAAGCCTGGCGCAGTAGGCAAGGTGGTGCCCTTCTTCG
AGGCTAAGGTGGTGGACTTGGACACCGGCAAGACACTGGGTGTGAACCAGCGCGGCGAGCTGTGCGTCCGTGGCCCC
ATGATCATGAGCGGCTACGTTAACAACCCCGAGGCTACAAACGCACTCATCGACAAGGACGGCTGGCTGCACAGCGGC
GACATCGCCTACTGGGACGAGGACGAGCACTTCTTCATCGTGGACCGGCTGAAGAGCCTGATCAAATACAAGGGCTAC
CAGGTAGCCCCAGCCGAACTGGAGAGCATCCTGCTGCAACACCCCAACATCTTCGACGCCGGGGTCGCCGGCCTGCC
CGACGACGATGCCGGCGAGCTGCCCGCCGCAGTCGTCGTGCTGGAACACGGTAAAACCATGACCGAGAAGGAGATCG
TGGACTATGTGGCCAGCCAGGTTACAACCGCCAAGAAGCTGCGCGGTGGTGTTGTGTTCGTGGACGAGGTGCCTAAAG
GACTGACCGGTAAGTTGGACGCCCGCAAGATCCGCGAGATTCTCATTAAGGCCAAGAAGGGCGGCAAGATCGCCGTGA
GCAGTGATTACAAGGATGATGATGATAAGTAAAAAAAACAAAAAACAAAACGGCTATTATGCGTTACCGGCGAGACGCTA
CGGACTT (SEQ ID NO: 25)
Spike protein of SARS-CoV-2-EGFP:
AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA
AAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACATTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCA
TTGGGCGCTAGCACTCTGGTATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAA
CACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATCAA
TAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAG
TTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCACGGGCGACCGTGG
CGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTG
AGCTAGTTGGTAGTCCTCCGGCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGT
GTGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTGGCTGCTTA
TGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTG
TTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAAATGGGGATTCT
GCCTAGCCCCGGCATGCCCGCCCTCCTGAGCCTGGTGAGCCTGCTGAGCGTCCTGTTGATGGGGTGCGTGGCCGAGA
CCGGCACACAGTGCGTGAACCTGACAACTAGAACACAGCTGCCCCCAGCCTACACCAACAGCTTTACTAGGGGAGTGT
ACTACCCTGATAAGTGCTTCAGGAGCAGTGTGCTGCACAGCACCCAGGACCTCTTTTTGCCCTTTTTCAGTAACGTGAC
CTGGTTCCACGCTATCCACGTGAGCGGCACAAACGGCACAAAACGGTTCGATAACCCAGTGCTGCCCTTCAACGACGG
CGTCTACTTCGCCAGCACTGAGAAGTCCAACATTATCAGGGGGTGGATCTTCGGGACAACCCTGGACAGCAAGACACA
GAGCCTCCTGATTGTGAACAACGCCACCAACGTGGTCATTAAGGTCTGCGAGTTCCAGTTCTGCAACGACCCCTTTCTG
GGGGTGTACTACCACAAGAACAACAAGTCCTGGATGGAGTCCGAGTTTCGGGTGTACTCCAGCGCCAACAACTGCACA
TTTGAGTACGTGAGCCAGCCCTTCTTGATGGATCTGGAGGGCAAGCAGGGGAACTTTAAGAACCTTCGGGAGTTCGTCT
TTAAGAACATCGACGGGTACTTTAAGATCTACAGCAAGCACACCCCTATTAACCTGGTCCGCGATCTGCCCCAGGGCTT
CAGCGCCCTGGAGCCACTGGTGGACCTCCCCATCGGGATCAACATTACTAGGTTCCAGACACTCCTGGCCCTGCACCG
GAGCTACCTGACTCCCGGCGATAGCTCCAGCGGGTGGACTGCCGGGGCCGCCGCCTACTACGTCGGCTACCTGCAGC
CCAGGACATTCCTGCTCAAGTACAACGAGAACGGCACCATCACTGATGCCGTGGACTGCGCTCTCGATCCCCTGAGCG
AGACAAAGTGCACCCTGAAGTCCTTCACCGTGGAGAAGGGGATCTACCAGACATCTAACTTCAGGGTGCAGCCCACTG
AGTCCATTGTGAGATTTCCTAACATCACCAACCTGTGCCCCTTCGGCGAGGTCTTTAACGCCACACGGTTCGCCAGCGT
CTACGCTTGGAACAGGAAGCGGATCTCCAACTGCGTGGCCGACTACAGCGTGCTGTACAACAGCGCCTCCTTCAGCAC
CTTTAAGTGCTACGGCGTGAGCCCCACTAAGCTGAACGACCTGTGCTTCACTAACGTGTACGCTGATAGCTTCGTGATT
AGGGGCGATGAGGTCCGGCAGATTGCCCCCGGCCAGACAGGGAAGATCGCTGACTACAACTATAAGCTGCCCGATGA
CTTCACAGGGTGTGTGATCGCCTGGAACAGCAACAACCTGGATAGCAAGGTCGGGGGGAACTACAACTACCTGTACAG
GCTGTTCAGAAAGTCCAACCTGAAGCCCTTCGAGCGGGATATCAGCACTGAGATCTACCAGGCCGGGAGCACCCCCTG
CAACGGCGTCGAGGGCTTTAACTGCTACTTTCCCCTCCAGAGCTACGGCTTCCAACCCACCAACGGCGTGGGGTACCA
GCCCTACCGGGTCGTGGTGCTTAGCTTCGAGCTCCTGCACGCCCCTGCAACCGTGTGCGGCCCCAAGAAGAGCACAAA
CCTCGTGAAGAACAAGTGCGTGAACTTCAACTTCAACGGGCTGACCGGCACCGGCGTGCTGACTGAGAGTAACAAGAA
GTTTCTCCCCTTCCAGCAGTGCGGGAGGGATATTGCCGACACTACTGATGCCGTGCGCGACCCTCAGACCCTTGAGAT
TCTGGACATCACTCCCTGCTCCTTCGGCGGAGTGTCCGTGATCACACCAGGGACCAACACTAGCAACCAGGTCGCCGT
CTTGTACCAGGACGTGAACTGCACCGAGGTGCCCGTGGCCATCCACGCCGATCAGCTGACCCCAACTTGGCGGGTGTA
CTCCACCGGGAGCAACGTGTTTCAGACACGGGCCGGGTGCCTGATCGGGGCTGAGCATGTCAACAACAGCTACGAGT
GCGATATTCCTATCGGCGCCGGGATTTGCGCCAGCTACCAGACACAGACAAACAGCCCTGGCTCGGCCAGCTCCGTCG
CCAGCCAGAGCATCATCGCCTACACAATGTCCTTGGGGGCCGAGAACAGCGTCGCCTACAGCAACAACAGCATTGCCA
TCCCCACCAACTTCACAATCAGCGTCACAACCGAGATCCTGCCCGTGAGCATGACTAAGACATCCGTGGATTGCACAAT
GTACATCTGCGGCGACTCCACAGAGTGCAGTAACCTGCTGCTCCAGTACGGGTCCTTTTGTACCCAGCTGAACAGAGC
CTTGACCGGCATTGCCGTGGAGCAGGACAAGAACACCCAGGAGGTGTTTGCCCAGGTGAAGCAGATCTACAAGACACC
CCCCATAAAGGACTTCGGGGGCTTTAACTTCTCCCAGATCTTGCCCGACCCCAGCAAGCCCTCCAAGCGGTCATTCATT
GAGGACCTGCTGTTTAACAAGGTGACCCTCGCTGATGCAGGGTTTATCAAGCAGTACGGCGATTGCCTGGGGGACATT
GCCGCTAGGGACTTGATCTGCGCCCAGAAGTTCAACGGCCTGACCGTGCTCCCCCCCCTGTTGACAGACGAGATGATC
GCCCAGTACACTAGCGCTCTGCTGGCCGGCACTATCACCAGCGGCTGGACCTTTGGGGCCGGCGCCGCCCTCCAGAT
TCCATTCGCCATGCAGATGGCCTACCGCTTCAACGGCATCGGCGTGACTCAGAACGTGCTGTACGAGAACCAGAAGCT
GATTGCCAACCAGTTCAACTCCGCCATCGGCAAGATCCAGGACAGCCTGTCCAGCACAGCATCCGCCCTGGGGAAGTT
GCAGGATGTGGTGAACCAGAACGCCCAGGCCCTGAACACACTGGTGAAGCAGCTGAGCAGCAACTTTGGCGCCATCAG
TAGCGTCCTGAACGACATCCTGTCACGGCTTGATCCACCAGAGGCCGAGGTCCAGATCGATCGGCTGATTACAGGCCG
GCTGCAGAGTCTCCAGACCTACGTCACCCAGCAGCTGATCCGGGCTGCCGAGATTAGAGCCTCCGCCAATCTCGCCGC
CACAAAGATGAGTGAATGCGTCCTGGGCCAGAGCAAAAGGGTCGACTTTTGCGGGAAGGGGTACCACCTGATGTCCTT
TCCCCAGTCCGCCCCTCACGGAGTGGTGTTCCTCCATGTCACATACGTGCCTGCCCAGGAGAAGAACTTCACTACAGC
CCCCGCCATTTGCCACGATGGGAAGGCCCATTTCCCTAGAGAGGGCGTGTTCGTGTCCAACGGGACCCACTGGTTCGT
GACTCAGCGGAACTTTTATGAACCCCAGATTATCACAACCGATAACACATTTGTCTCCGGCAACTGTGATGTGGTCATCG
GCATCGTGAACAACACCGTGTACGACCCACTGCAGCCTGAGCTTGACAGCTTTAAGGAGGAGCTGGACAAGTACTTTAA
GAACCACACCAGCCCTGATGTGGACCTTGGCGACATCAGCGGCATTAACGCCAGCGTGGTGAACATTCAGAAGGAGAT
CGATAGACTGAACGAGGTGGCCAAGAACTTGAACGAGTCACTGATTGACCTGCAGGAGCTGGGGAAGTACGAGCAGTA
CATTAAGGGGTCCGGGAGGGAGAACCTGTACTTTCAGGGGGGGGGGGGGTCCGGCTACATTCCCGAGGCCCCAAGG
GATGGCCAGGCCTACGTGAGGAAGGATGGGGAGTGGGTGCTGCTGAGCACCTTCTTGGGCAGCGGCAGCAGTATGGT
GAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT
TCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAG
CTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATG
AAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGC
AACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTT
CAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAA
GCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTA
CCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGA
GCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATG
GACGAGCTGTACAAGTAAAAAAAACAAAAAACAAAACGGCTATTATGCGTTACCGGCGAGACGCTACGGACTT (SEQ
ID NO: 26)
spCas9-EGFP
AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA
AAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACATTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCA
TTGGGCGCTAGCACTCTGGTATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAA
CACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATCAA
TAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAG
TTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCACGGGCGACCGTGG
CGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTG
AGCTAGTTGGTAGTCCTCCGGCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGT
GTGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTGGCTGCTTA
TGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTG
TTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAAATGGATAAGAA
ATACTCAATAGGCTTAGATATCGGCACAAATAGCGTCGGATGGGCGGTGATCACTGATGAATATAAGGTTCCGTCTAAAA
AGTTCAAGGTTCTGGGAAATACAGACCGCCACAGTATCAAAAAAAATCTTATAGGGGCTCTTTTATTTGACAGTGGAGAG
ACAGCGGAAGCGACTCGTCTCAAACGGACAGCTCGTAGAAGGTATACACGTCGGAAGAATCGTATTTGTTATCTACAGG
AGATTTTTTCAAATGAGATGGCGAAAGTAGATGATAGTTTCTTTCATCGACTTGAAGAGTCTTTTTTGGTGGAAGAAGACA
AGAAGCATGAACGTCATCCTATTTTTGGAAATATAGTAGATGAAGTTGCTTATCATGAGAAATATCCAACTATCTATCATC
TGCGAAAAAAATTGGTAGATTCTACTGATAAAGCGGATTTGCGCTTAATCTATTTGGCCTTAGCGCATATGATTAAGTTTC
GTGGTCATTTTTTGATTGAGGGAGATTTAAATCCTGATAATAGTGATGTGGACAAACTATTTATCCAGTTGGTACAAACCT
ACAATCAATTATTTGAAGAAAACCCTATTAACGCAAGTGGAGTAGATGCTAAAGCGATTCTTTCTGCACGATTGAGTAAAT
CAAGACGATTAGAAAATCTCATTGCTCAGCTCCCCGGTGAGAAGAAAAATGGCTTATTTGGGAATCTCATTGCTTTGTCA
TTGGGTTTGACCCCTAATTTTAAATCAAATTTTGATTTGGCAGAAGATGCTAAATTACAGCTTTCAAAAGATACTTACGATG
ATGATTTAGATAATTTATTGGCGCAAATTGGAGATCAATATGCTGATTTGTTTTTGGCAGCTAAGAATTTATCAGATGCTAT
TTTACTTTCAGATATCCTAAGAGTAAATACTGAAATAACTAAGGCTCCCCTATCAGCTTCAATGATTAAACGCTACGATGA
ACATCATCAAGACTTGACTCTTTTAAAAGCTTTAGTTCGACAACAACTTCCAGAAAAGTATAAAGAAATCTTTTTTGATCAA
TCAAAAAACGGATATGCAGGTTATATTGATGGGGGAGCTAGCCAAGAAGAATTTTATAAATTTATCAAACCAATTTTAGAA
AAAATGGATGGTACTGAGGAATTATTGGTGAAACTAAATCGTGAAGATTTGCTGCGCAAGCAACGGACCTTTGACAACG
GCTCTATTCCCCATCAAATTCACTTGGGTGAGCTGCATGCTATTTTGAGAAGACAAGAAGACTTTTATCCATTTTTAAAAG
ACAATCGTGAGAAGATTGAAAAAATCTTGACTTTTCGAATTCCTTATTATGTTGGTCCATTGGCGCGTGGCAATAGTCGTT
TTGCATGGATGACTCGGAAGTCTGAAGAAACAATTACCCCATGGAATTTTGAAGAAGTTGTCGATAAAGGTGCTTCAGCT
CAATCATTTATTGAACGCATGACAAACTTTGATAAAAATCTTCCAAATGAAAAAGTACTACCAAAACATAGTTTGCTTTATG
AGTATTTTACGGTTTATAACGAATTGACAAAGGTCAAATATGTTACTGAAGGAATGCGAAAACCAGCATTTCTTTCAGGTG
AACAGAAGAAAGCCATTGTTGATTTACTCTTCAAAACAAATCGAAAAGTAACCGTTAAGCAATTAAAAGAAGATTATTTCA
AAAAAATAGAATGTTTTGATAGTGTTGAAATTTCAGGAGTTGAAGATAGATTTAATGCTTCATTAGGTACCTACCATGATTT
GCTAAAAATTATTAAAGATAAAGATTTTTTGGATAATGAAGAAAATGAAGATATCTTAGAGGATATTGTTTTAACATTGACC
TTATTTGAAGATAGGGAGATGATTGAGGAAAGACTTAAAACATATGCTCACCTCTTTGATGATAAGGTGATGAAACAGCTT
AAACGTCGCCGTTATACTGGTTGGGGACGTTTGTCTCGAAAATTGATTAATGGTATTAGGGATAAGCAATCTGGCAAAAC
AATATTAGATTTTTTGAAATCAGATGGTTTTGCCAATCGCAATTTTATGCAGCTGATCCATGATGATAGTTTGACATTTAAA
GAAGACATTCAAAAAGCACAAGTGTCTGGACAAGGCGATAGTTTACATGAACATATTGCAAATTTAGCTGGTAGCCCTGC
TATTAAAAAAGGTATTTTACAGACTGTAAAAGTTGTTGATGAATTGGTCAAAGTAATGGGGCGGCATAAGCCAGAAAATAT
CGTTATTGAAATGGCACGTGAAAATCAGACAACTCAAAAGGGCCAGAAAAATTCGCGAGAGCGTATGAAACGAATCGAA
GAAGGTATCAAAGAATTAGGAAGTCAGATTCTTAAAGAGCATCCTGTTGAAAATACTCAATTGCAAAATGAAAAGCTCTAT
CTCTATTATCTCCAAAATGGAAGAGACATGTATGTGGACCAAGAATTAGATATTAATCGTTTAAGTGATTATGATGTCGAT
CACATTGTTCCACAAAGTTTCCTTAAAGACGATTCAATAGACAATAAGGTCTTAACGCGTTCTGATAAAAATCGTGGTAAA
TCGGATAACGTTCCAAGTGAAGAAGTAGTCAAAAAGATGAAAAACTATTGGAGACAACTTCTAAACGCCAAGTTAATCAC
TCAACGTAAGTTTGATAATTTAACGAAAGCTGAACGTGGAGGTTTGAGTGAACTTGATAAAGCTGGTTTTATCAAACGCC
AATTGGTTGAAACTCGCCAAATCACTAAGCATGTGGCACAAATTTTGGATAGTCGCATGAATACTAAATACGATGAAAAT
GATAAACTTATTCGAGAGGTTAAAGTGATTACCTTAAAATCTAAATTAGTTTCTGACTTCCGAAAAGATTTCCAATTCTATA
AAGTACGTGAGATTAACAATTACCATCATGCCCATGATGCGTATCTAAATGCCGTCGTTGGAACTGCTTTGATTAAGAAAT
ATCCAAAACTTGAATCGGAGTTTGTCTATGGTGATTATAAAGTTTATGATGTTCGTAAAATGATTGCTAAGTCTGAGCAAG
AAATAGGCAAAGCAACCGCAAAATATTTCTTTTACTCTAATATCATGAACTTCTTCAAAACAGAAATTACACTTGCAAATGG
AGAGATTCGCAAACGCCCTCTAATCGAAACTAATGGGGAAACTGGAGAAATTGTCTGGGATAAAGGGCGAGATTTTGCC
ACAGTGCGCAAAGTATTGTCCATGCCCCAAGTCAATATTGTCAAGAAAACAGAAGTACAGACAGGCGGATTCTCCAAGG
AGTCAATTTTACCAAAAAGAAATTCGGACAAGCTTATTGCTCGTAAAAAAGACTGGGATCCAAAAAAATATGGTGGTTTTG
ATAGTCCAACGGTAGCTTATTCAGTCCTAGTGGTTGCTAAGGTGGAAAAAGGGAAATCGAAGAAGTTAAAATCCGTTAAA
GAGTTACTAGGGATCACAATTATGGAAAGAAGTTCCTTTGAAAAAAATCCGATTGACTTTTTAGAAGCTAAAGGATATAAG
GAAGTTAAAAAAGACTTAATCATTAAACTACCTAAATATAGTCTTTTTGAGTTAGAAAACGGTCGTAAACGGATGCTGGCT
AGTGCCGGAGAATTACAAAAAGGAAATGAGCTGGCTCTGCCAAGCAAATATGTGAATTTTTTATATTTAGCTAGTCATTAT
GAAAAGTTGAAGGGTAGTCCAGAAGATAACGAACAAAAACAATTGTTTGTGGAGCAGCATAAGCATTATTTAGATGAGAT
TATTGAGCAAATCAGTGAATTTTCTAAGCGTGTTATTTTAGCAGATGCCAATTTAGATAAAGTTCTTAGTGCATATAACAAA
CATAGAGACAAACCAATACGTGAACAAGCAGAAAATATTATTCATTTATTTACGTTGACGAATCTTGGAGCTCCCGCTGCT TTTAAATATTTTGATACAACAATTGATCGTAAACGATATACGTCTACAAAAGAAGTTTTAGATGCCACTCTTATCCATCAAT CCATCACTGGTCTTTATGAAACACGCATTGATTTGAGTCAGCTAGGAGGTGACGGCAGCGGCAGCAGTATGGTGAGCAA
GGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCG
TGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCC GTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAG CACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTAC
AAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGA
GGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAA GAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCA GAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAG
ACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAG CTGTACAAGTAAAAAAAACAAAAAACAAAACGGCTATTATGCGTTACCGGCGAGACGCTACGGACTT (SEQ ID NO: 27)
Factor 8-EGFP
AAAATCCGTTGACCTTAAACGGTCGTGTGGGTTCAAGTCCCTCCACCCCCACGCCGGAAACGCAATAGCCGAAAAACAA
AAAACAAAAAAAACAAAAAAAAAACCAAAAAAACAAAACACATTAAAACAGCCTGTGGGTTGATCCCACCCACAGGCCCA
TTGGGCGCTAGCACTCTGGTATCACGGTACCTTTGTGCGCCTGTTTTATACCCCCTCCCCCAACTGTAACTTAGAAGTAA
CACACACCGATCAACAGTCAGCGTGGCACACCAGCCACGTTTTGATCAAGCACTTCTGTTACCCCGGACTGAGTATCAA
TAGACTGCTCACGCGGTTGAAGGAGAAAGCGTTCGTTATCCGGCCAACTACTTCGAAAAACCTAGTAACACCGTGGAAG
TTGCAGAGTGTTTCGCTCAGCACTACCCCAGTGTAGATCAGGTCGATGAGTCACCGCATTCCCCACGGGCGACCGTGG CGGTGGCTGCGTTGGCGGCCTGCCCATGGGGAAACCCATGGGACGCTCTAATACAGACATGGTGCGAAGAGTCTATTG AGCTAGTTGGTAGTCCTCCGGCCCCTGAATGCGGCTAATCCTAACTGCGGAGCACACACCCTCAAGCCAGAGGGCAGT
GTGTCGTAACGGGCAACTCTGCAGCGGAACCGACTACTTTGGGTGTCCGTGTTTCATTTTATTCCTATACTGGCTGCTTA
TGGTGACAATTGAGAGATCGTTACCATATAGCTATTGGATTGGCCATCCGGTGACTAATAGAGCTATTATATATCCCTTTG
TTGGGTTTATACCACTTAGCTTGAAAGAGGTTAAAACATTACAATTCATTGTTAAGTTGAATACAGCAAAATGCAAATAGA
GCTCTCCACCTGCTTCTTTCTGTGCCTTTTGCGATTCTGCTTTAGTGCCACCAGAAGATACTACCTGGGTGCAGTGGAAC
TGTCATGGGACTATATGCAAAGTGATCTCGGTGAGCTGCCTGTGGACGCAAGATTTCCTCCTAGAGTGCCAAAATCTTTT
CCATTCAACACCTCAGTCGTGTACAAAAAGACTCTGTTTGTAGAATTCACGGATCACCTTTTCAACATCGCTAAGCCAAG GCCACCCTGGATGGGTCTGCTAGGTCCTACCATCCAGGCTGAGGTTTATGATACAGTGGTCATTACACTTAAGAACATG GCTTCCCATCCTGTCAGTCTTCATGCTGTTGGTGTATCCTACTGGAAAGCTTCTGAGGGAGCTGAATATGATGATCAGAC
CAGTCAAAGGGAGAAAGAAGATGATAAAGTCTTCCCTGGTGGAAGCCATACATATGTCTGGCAGGTCCTGAAAGAGAAT
GGTCCAATGGCCTCTGACCCACTGTGCCTTACCTACTCATATCTTTCTCATGTGGACCTGGTAAAAGACTTGAATTCAGG
CCTCATTGGAGCCCTACTAGTATGTAGAGAAGGGAGTCTGGCCAAGGAAAAGACACAGACCTTGCACAAATTTATACTA
CTTTTTGCTGTATTTGATGAAGGGAAAAGTTGGCACTCAGAAACAAAGAACTCCTTGATGCAGGATAGGGATGCTGCATC
TGCTCGGGCCTGGCCTAAAATGCACACAGTCAATGGTTATGTAAACAGGTCTCTGCCAGGTCTGATTGGATGCCACAGG
AAATCAGTCTATTGGCATGTGATTGGAATGGGCACCACTCCTGAAGTGCACTCAATATTCCTCGAAGGTCACACATTTCT
TGTGAGGAACCATCGCCAGGCGTCCTTGGAAATCTCGCCAATAACTTTCCTTACTGCTCAAACACTCTTGATGGACCTTG
GACAGTTTCTACTGTTTTGTCATATCTCTTCCCACCAACATGATGGCATGGAAGCTTATGTCAAAGTAGACAGCTGTCCA
GAGGAACCCCAACTACGAATGAAAAATAATGAAGAAGCGGAAGACTATGATGATGATCTTACTGATTCTGAAATGGATGT
GGTCAGGTTTGATGATGACAACTCTCCTTCCTTTATCCAAATTCGCTCAGTTGCCAAGAAGCATCCTAAAACTTGGGTAC
ATTACATTGCTGCTGAAGAGGAGGACTGGGACTATGCTCCCTTAGTCCTCGCCCCCGATGACAGAAGTTATAAAAGTCA
ATATTTGAACAATGGCCCTCAGCGGATTGGTAGGAAGTACAAAAAAGTCCGATTTATGGCATACACAGATGAAACCTTTA
AGACTCGTGAAGCTATTCAGCATGAATCAGGAATCTTGGGACCTTTACTTTATGGGGAAGTTGGAGACACACTGTTGATT
ATATTTAAGAATCAAGCAAGCAGACCATATAACATCTACCCTCACGGAATCACTGATGTCCGTCCTTTGTATTCAAGGAG
ATTACCAAAAGGTGTAAAACATTTGAAGGATTTTCCAATTCTGCCAGGAGAAATATTCAAATATAAATGGACAGTGACTGT
AGAAGATGGGCCAACTAAATCAGATCCTCGGTGCCTGACCCGCTATTACTCTAGTTTCGTTAATATGGAGAGAGATCTAG
CTTCAGGACTCATTGGCCCTCTCCTCATCTGCTACAAAGAATCTGTAGATCAAAGAGGAAACCAGATAATGTCAGACAAG
AGGAATGTCATCCTGTTTTCTGTATTTGATGAGAACCGAAGCTGGTACCTCACAGAGAATATACAACGCTTTCTCCCCAA
TCCAGCTGGAGTGCAGCTTGAGGATCCAGAGTTCCAAGCCTCCAACATCATGCACAGCATCAATGGCTATGTTTTTGATA
GTTTGCAGTTGTCAGTTTGTTTGCATGAGGTGGCATACTGGTACATTCTAAGCATTGGAGCACAGACTGACTTCCTTTCT
GTCTTCTTCTCTGGATATACCTTCAAACACAAAATGGTCTATGAAGACACACTCACCCTATTCCCATTCTCAGGAGAAACT
GTCTTCATGTCGATGGAAAACCCAGGTCTATGGATTCTGGGGTGCCACAACTCAGACTTTCGGAACAGAGGCATGACCG
CCTTACTGAAGGTTTCTAGTTGTGACAAGAACACTGGTGATTATTACGAGGACAGTTATGAAGATATTTCAGCATACTTGC
TGAGTAAAAACAATGCCATTGAACCAAGAAGCTTCTCCCAGAATTCAAGACACCCTAGCACTAGGCAAAAGCAATTTAAT
GCCACCACAATTCCAGAAAATGACATAGAGAAGACTGACCCTTGGTTTGCACACAGAACACCTATGCCTAAAATACAAAA
TGTCTCCTCTAGTGATTTGTTGATGCTCTTGCGACAGAGTCCTACTCCACATGGGCTATCCTTATCTGATCTCCAAGAAG
CCAAATATGAGACTTTTTCTGATGATCCATCACCTGGAGCAATAGACAGTAATAACAGCCTGTCTGAAATGACACACTTC
AGGCCACAGCTCCATCACAGTGGGGACATGGTATTTACCCCTGAGTCAGGCCTCCAATTAAGATTAAATGAGAAACTGG
GGACAACTGCAGCAACAGAGTTGAAGAAACTTGATTTCAAAGTTTCTAGTACATCAAATAATCTGATTTCAACAATTCCAT
CAGACAATTTGGCAGCAGGTACTGATAATACAAGTTCCTTAGGACCCCCAAGTATGCCAGTTCATTATGATAGTCAATTA
GATACCACTCTATTTGGCAAAAAGTCATCTCCCCTTACTGAGTCTGGTGGACCTCTGAGCTTGAGTGAAGAAAATAATGA
TTCAAAGTTGTTAGAATCAGGTTTAATGAATAGCCAAGAAAGTTCATGGGGAAAAAATGTATCGTCAACAGAGAGTGGTA
GGTTATTTAAAGGGAAAAGAGCTCATGGACCTGCTTTGTTGACTAAAGATAATGCCTTATTCAAAGTTAGCATCTCTTTGT
TAAAGACAAACAAAACTTCCAATAATTCAGCAACTAATAGAAAGACTCACATTGATGGCCCATCATTATTAATTGAGAATA
GTCCATCAGTCTGGCAAAATATATTAGAAAGTGACACTGAGTTTAAAAAAGTGACACCTTTGATTCATGACAGAATGCTTA
TGGACAAAAATGCTACAGCTTTGAGGCTAAATCATATGTCAAATAAAACTACTTCATCAAAAAACATGGAAATGGTCCAAC
AGAAAAAAGAGGGCCCCATTCCACCAGATGCACAAAATCCAGATATGTCGTTCTTTAAGATGCTATTCTTGCCAGAATCA
GCAAGGTGGATACAAAGGACTCATGGAAAGAACTCTCTGAACTCTGGGCAAGGCCCCAGTCCAAAGCAATTAGTATCCT
TAGGACCAGAAAAATCTGTGGAAGGTCAGAATTTCTTGTCTGAGAAAAACAAAGTGGTAGTAGGAAAGGGTGAATTTACA
AAGGACGTAGGACTCAAAGAGATGGTTTTTCCAAGCAGCAGAAACCTATTTCTTACTAACTTGGATAATTTACATGAAAAT
AATACACACAATCAAGAAAAAAAAATTCAGGAAGAAATAGAAAAGAAGGAAACATTAATCCAAGAGAATGTAGTTTTGCCT
CAGATACATACAGTGACTGGCACTAAGAATTTCATGAAGAACCTTTTCTTACTGAGCACTAGGCAAAATGTAGAAGGTTC
ATATGACGGGGCATATGCTCCAGTACTTCAAGATTTTAGGTCATTAAATGATTCAACAAATAGAACAAAGAAACACACAG
CTCATTTCTCAAAAAAAGGGGAGGAAGAAAACTTGGAAGGCTTGGGAAATCAAACCAAGCAAATTGTAGAGAAATATGCA
TGCACCACAAGGATATCTCCTAATACAAGCCAGCAGAATTTTGTCACGCAACGTAGTAAGAGAGCTTTGAAACAATTCAG
ACTCCCACTAGAAGAAACAGAACTTGAAAAAAGGATAATTGTGGATGACACCTCAACCCAGTGGTCCAAAAACATGAAAC
ATTTGACCCCGAGCACCCTCACACAGATAGACTACAATGAGAAGGAGAAAGGGGCCATTACTCAGTCTCCCTTATCAGA
TTGCCTTACGAGGAGTCATAGCATCCCTCAAGCAAATAGATCTCCATTACCCATTGCAAAGGTATCATCATTTCCATCTAT
TAGACCTATATATCTGACCAGGGTCCTATTCCAAGACAACTCTTCTCATCTTCCAGCAGCATCTTATAGAAAGAAAGATTC
TGGGGTCCAAGAAAGCAGTCATTTCTTACAAGGAGCCAAAAAAAATAACCTTTCTTTAGCCATTCTAACCTTGGAGATGA
CTGGTGATCAAAGAGAGGTTGGCTCCCTGGGGACAAGTGCCACAAATTCAGTCACATACAAGAAAGTTGAGAACACTGT
TCTCCCGAAACCAGACTTGCCCAAAACATCTGGCAAAGTTGAATTGCTTCCAAAAGTTCACATTTATCAGAAGGACCTAT
TCCCTACGGAAACTAGCAATGGGTCTCCTGGCCATCTGGATCTCGTGGAAGGGAGCCTTCTTCAGGGAACAGAGGGAG
CGATTAAGTGGAATGAAGCAAACAGACCTGGAAAAGTTCCCTTTCTGAGAGTAGCAACAGAAAGCTCTGCAAAGACTCC
CTCCAAGCTATTGGATCCTCTTGCTTGGGATAACCACTATGGTACTCAGATACCAAAAGAAGAGTGGAAATCCCAAGAGA
AGTCACCAGAAAAAACAGCTTTTAAGAAAAAGGATACCATTTTGTCCCTGAACGCTTGTGAAAGCAATCATGCAATAGCA
GCAATAAATGAGGGACAAAATAAGCCCGAAATAGAAGTCACCTGGGCAAAGCAAGGTAGGACTGAAAGGCTGTGCTCTC
AAAACCCACCAGTCTTGAAACGCCATCAACGGGAAATAACTCGTACTACTCTTCAGTCAGATCAAGAGGAAATTGACTAT
GATGATACCATATCAGTTGAAATGAAGAAGGAAGATTTTGACATTTATGATGAGGATGAAAATCAGAGCCCCCGCAGCTT
TCAAAAGAAAACACGACACTATTTTATTGCTGCAGTGGAGAGGCTCTGGGATTATGGGATGAGTAGCTCCCCACATGTTC
TAAGAAACAGGGCTCAGAGTGGCAGTGTCCCTCAGTTCAAGAAAGTTGTTTTCCAGGAATTTACTGATGGCTCCTTTACT
CAGCCCTTATACCGTGGAGAACTAAATGAACATTTGGGACTCCTGGGGCCATATATAAGAGCAGAAGTTGAAGATAATAT
CATGGTAACTTTCAGAAATCAGGCCTCTCGTCCCTATTCCTTCTATTCTAGCCTTATTTCTTATGAGGAAGATCAGAGGCA
AGGAGCAGAACCTAGAAAAAACTTTGTCAAGCCTAATGAAACCAAAACTTACTTTTGGAAAGTGCAACATCATATGGCAC
CCACTAAAGATGAGTTTGACTGCAAAGCCTGGGCTTATTTCTCTGATGTTGACCTGGAAAAAGATGTGCACTCAGGCCTG
ATTGGACCCCTTCTGGTCTGCCACACTAACACACTGAACCCTGCTCATGGGAGACAAGTGACAGTACAGGAATTTGCTC
TGTTTTTCACCATCTTTGATGAGACCAAAAGCTGGTACTTCACTGAAAATATGGAAAGAAACTGCAGGGCTCCCTGCAAT
ATCCAGATGGAAGATCCCACTTTTAAAGAGAATTATCGCTTCCATGCAATCAATGGCTACATAATGGATACACTACCTGG
CTTAGTAATGGCTCAGGATCAAAGGATTCGATGGTATCTGCTCAGCATGGGCAGCAATGAAAACATCCATTCTATTCATT
TCAGTGGACATGTGTTCACTGTACGAAAAAAAGAGGAGTATAAAATGGCACTGTACAATCTCTATCCAGGTGTTTTTGAG
ACAGTGGAAATGTTACCATCCAAAGCTGGAATTTGGCGGGTGGAATGCCTTATTGGCGAGCATCTACATGCTGGGATGA
GCACACTTTTTCTGGTGTACAGCAATAAGTGTCAGACTCCCCTGGGAATGGCTTCTGGACACATTAGAGATTTTCAGATT
ACAGCTTCAGGACAATATGGACAGTGGGCCCCAAAGCTGGCCAGACTTCATTATTCCGGATCAATCAATGCCTGGAGCA
CCAAGGAGCCCTTTTCTTGGATCAAGGTGGATCTGTTGGCACCAATGATTATTCACGGCATCAAGACCCAGGGTGCCCG
TCAGAAGTTCTCCAGCCTCTACATCTCTCAGTTTATCATCATGTATAGTCTTGATGGGAAGAAGTGGCAGACTTATCGAG
GAAATTCCACTGGAACCTTAATGGTCTTCTTTGGCAATGTGGATTCATCTGGGATAAAACACAATATTTTTAACCCTCCAA
TTATTGCTCGATACATCCGTTTGCACCCAACTCATTATAGCATTCGCAGCACTCTTCGCATGGAGTTGATGGGCTGTGAT
TTAAATAGTTGCAGCATGCCATTGGGAATGGAGAGTAAAGCAATATCAGATGCACAGATTACTGCTTCATCCTACTTTAC
CAATATGTTTGCCACCTGGTCTCCTTCAAAAGCTCGACTTCACCTCCAAGGGAGGAGTAATGCCTGGAGACCTCAGGTG
AATAATCCAAAAGAGTGGCTGCAAGTGGACTTCCAGAAGACAATGAAAGTCACAGGAGTAACTACTCAGGGAGTAAAAT
CTCTGCTTACCAGCATGTATGTGAAGGAGTTCCTCATCTCCAGCAGTCAAGATGGCCATCAGTGGACTCTCTTTTTTCAG
AATGGCAAAGTAAAGGTTTTTCAGGGAAATCAAGACTCCTTCACACCTGTGGTGAACTCTCTAGACCCACCGTTACTGAC
TCGCTACCTTCGAATTCACCCCCAGAGTTGGGTGCACCAGATTGCCCTGAGGATGGAGGTTCTGGGCTGCGAGGCACA GGACCTCTACGGCAGCGGCAGCAGTATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCG AGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTG ACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTG CAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAG GAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGT GAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAA CAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAG GACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAA CCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGT GACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAAAAAACAAAAAACAAAACGGCTATTATGCGTT ACCGGCGAGACGCTACGGACTT (SEQ ID NO: 28)
After in vitro transcription, some samples were subjected to an additional 20-minute post-transcriptional circularization protocol. Extra GTP was supplied to each reaction mixture at a final concentration of 2 mM. The reaction mixtures were heated at 55 °C for 20 minutes to initiate circularization as previously described (Wesselhoeft, R. A., Kowalski, P. S. & Anderson, D. G. Nat Commun 9, 2629, (2018)). All the samples were loaded onto a 0.8% agarose gel and run for 1 hour 45 minutes. Results are shown in Figure 4B, with expected band positions for circular products labelled with black circles.
TRIC-V1 can circularize long genes of interest co-transcriptionally, as indicated by the multiple RNA species observed for each construct in Figure 4B without the post-transcriptional circularization protocol. During in vitro transcription, TRIC-V1 produces more splicing products than the PIE approach for all the genes of interest, suggesting that the co-transcriptional circularization efficiency of TRIC-V1 is higher than that of PIE. However, co-transcriptionally produced circular RNAs are heavily nicked: more than 50% of the circular RNAs for EGFP and Fluci were nicked, and circular RNAs for Spike, Cas9 and Factor 8 were difficult to detect. The additional 20-minute post-transcriptional circularization did increase the amount of splicing product, but mainly by increasing the amount of nicking. Co-transcriptional nicking is tolerable for the short GOI 3*Flag, but fatal for the long GOIs used in this experiment. Thus, the inventors sought to identify a protocol that could be used to produce long circular GOIs.
Example 3 - Post-transcriptional circularization
Mg2+ is believed to be the major source of circular RNA nicking (Wesselhoeft et al. (2018)). Since Mg2+ is essential for both in vitro transcription and circularization, the inventors speculated that reasonable ways to reduce nicking of the circular RNAs would be to suppress co-transcriptional circularization and to increase the speed of post-transcriptional circularization.
The inventors tested whether an excess of nucleoside triphosphates (NTPs) could be used to chelate Mg2+ during transcription in order to suppress co-transcriptional splicing. The concentration of NTPs was fixed at 24 mM and the concentration of Mg2+ was varied from 6 mM to 24 mM. As shown in Figure 6A, when the concentration of Mg2+ was lower than 16 mM, the in vitro transcription products were mainly full length precursors, suggesting that co-transcriptional circularization was indeed suppressed. Consequently, for subsequent studies in vitro transcription was performed using 24 mM NTPs and 14 mM Mg2+. Following transcription, full length precursors were subjected to circularization for 0, 10, 20 or 40 minutes, and the reaction products were analysed on a 0.8% agarose gel.
As shown in Figure 6B, using the post-transcriptional circularization protocol, circular RNAs were now obtained for Spike, Cas9 and Factor 8 (as well as for EGFP and Fluci), and the splicing products were much less complex. Longer circularization times did increase the conversion from full length precursor to circular/nicked GOIs, but did not always increase the amount of circular RNA, particularly beyond 20 minutes of circularization. Consequently, the 20-minute circularization protocol was chosen to measure the efficiency of TRIC-V1 constructs. As shown in Figure 6C, the efficiency of TRIC-V1 was higher than 80% for the long GOIs, demonstrating that TRIC-V1 can efficiently circularize GOIs longer than 8000 nt. In addition, the efficiency of TRIC-V1 was at least as high as that observed for the PIE Ana approach (Figure 6D).
Example 4 - TRIC-V2
Both TRIC-V1 and PIE rely on native exon sequences (the tRNA sequence, in the case of Ana) to work. These native exon sequences will unavoidably be part of the resulting circular GOIs (as shown in Figures 5A and 5B), and their inclusion is associated with certain drawbacks.
Firstly, the native exon sequences are immunogenic (Liu, C. X. et al. Mol Cell 82, 420-434 e426 (2022)), which hinders the biomedical applications of circular RNAs produced by group I intron-based methods such as PIE. Secondly, the inclusion of the native exon sequences means that circularization sites must be positioned at untranslated regions (UTRs) of protein coding circular RNAs. Since UTRs are normally highly structured internal ribosome entry sites (IRESs), extra unwanted spacer sequences are needed to ensure the group I intron and remaining exons are separated from the IRESs. Finally, tens of thousands of natural circular RNAs exist whose functions are waiting to be deciphered, but group I intron-based methods are not favourable for this field because of the unwanted spacer and native exon sequences. To overcome these limitations, the inventors sought to provide an improved version of TRIC.
To reduce the dependence of circularization on tRNA sequences, the lengths of the left and right arms of the Ana tRNA in TRIC-V1.0 were systematically tested (Figure 7). The inventors first reduced the tRNA sequence of TRIC to just 5 nucleotides - 3 in the left arm and 2 in the right (L3/R2). This construct was designated TRIC-V1.30 (SEQ ID NO: 29) (see Figure 7B for an overview of the L/R configuration of the constructs used in this experiment). The aim of reducing the tRNA sequence to such a short length was to allow only the formation of the P1 and P10 structures. Following in vitro transcription, the reaction product was loaded onto a 6% denaturing PAGE gel. The intron band was strongly detected but the circular 3*Flag band was very weak, indicating that circularization was not the major reaction (Figure 8B). Since the function of the native tRNA exon sequences is to bring RNA sequences into proximity around the circularization site, the inventors sought to provide an interaction between the sequence 3’ of P10 and the EGS (V1.31, SEQ ID NO: 30). The circular 3*Flag band was clearly detected with V1.31, but the circularization efficiency was not comparable with that of TRIC-V1.0 (Figure 8B).
To improve the circularization efficiency, the inventors restored the left arm (L15/R2 - V1.32 (SEQ ID NO: 31)) or the right arm (L3/R30 - V1.33 (SEQ ID NO: 32)) of the tRNA sequence. Restoration of the right arm (V1.33) restored most of the circularization efficiency. Starting from this L3/R30 configuration in V1.33, the length of the right arm was reduced gradually in constructs V1.34 (L3/R25) (SEQ ID NO: 33), V1.35 (L3/R16) (SEQ ID NO: 34) and V1.36 (L3/R9) (SEQ ID NO: 35). The minimum length for the right arm was identified as 9 nucleotides (V1.36, L3/R9). Subsequently, the right arm was fixed at 9 nucleotides and an 8 nt left arm was tested (V1.39, L8/R9 (SEQ ID NO: 38)). The L8/R9 construct fully restored circularization efficiency to the V1.0 level. The requirement of the tRNA sequence in V1.39 has therefore been reduced to a 17-nucleotide structure which mimics the anticodon arm (ACA) found in the Ana tRNA.
The inventors sought to determine whether the group I intron activity could be replicated based on structure, rather than specific nucleotide sequences. To that end, the inventors reversed the sequence of the ACA stem, whilst keeping the sequence of the ACA loop the same (construct designated V2.0 (SEQ ID NO: 39)), and found that the circularization efficiency was at least as high as that of V1.0 and V1.39. With V2.0, the requirement for tRNA sequence is reduced to 5 nt (L-CTT/R-AA). A 5 nt sequence cannot be specific to any species; therefore, TRIC-V2 does not rely on the original bacterial tRNA sequence.
Subsequently, the inventors also identified that the stem length could be increased to 15 base pairs (V2.1 (SEQ ID NO: 40)) without abolishing circularization efficiency. In this construct, the only tRNA sequence present is the L-CTT/R-AA sequence, and the remaining base pairs were not derived from the tRNA sequence. The longer stem is advantageous since it may enable better circularization of longer GOIs.
The inventors subsequently reversed the sequence of the P1 and P10 regions (from L-CTT/R-AA to L-GAT/R-TT), except for the uracil which forms a wobble base pair with the internal guide sequence (construct designated V2.2 (SEQ ID NO: 41)). Again, circularization efficiency was found to be unaffected. In this way, the inventors have shown that the IGS requires only a GU wobble base pair for circularization and is not otherwise dependent on sequence.
Based on the above, the inventors have determined that if an anticodon arm-like structure (eACA), which consists of a 7 nt loop with a uracil as the third nucleotide and a stem of more than 5 base pairs, can be found in a gene of interest, this gene of interest can be circularized efficiently without introducing any unwanted sequence (Figure 8). If the circularization site is placed in a coding sequence, the choice of sites can be extended because of codon redundancy. Potentially, only a few nucleotide mutations would be needed, and these could be made without affecting the peptide sequence. If the circularization site is placed in a noncoding RNA or UTR of a protein coding circular RNA, the circularization site can be assembled by introducing as few as 5 extra nucleotides to create an eACA.
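The eACA criteria stated above (a 7 nt loop whose third nucleotide is a uracil, flanked by a stem of more than 5 base pairs) lend themselves to a simple sequence scan. The following is a minimal illustrative sketch only, not the inventors' method: the function name `find_eaca_candidates`, the `EacaCandidate` record, and the restriction to Watson-Crick pairs in the stem are all assumptions made for the example.

```python
from typing import Iterator, NamedTuple

# Watson-Crick complements only. Ignoring G-U wobble pairs in the stem is a
# simplifying assumption of this sketch, not a statement from the source.
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


class EacaCandidate(NamedTuple):
    start: int     # 0-based index of the outermost 5' stem nucleotide
    stem_len: int  # number of consecutive base pairs flanking the loop
    loop: str      # the 7 nt loop sequence (RNA alphabet)


def find_eaca_candidates(
    seq: str, min_stem: int = 6, loop_len: int = 7
) -> Iterator[EacaCandidate]:
    """Yield hairpins with a 7 nt loop (U at loop position 3) and a >5 bp stem."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    n = len(rna)
    # The loop must leave room for at least min_stem nucleotides on each side.
    for loop_start in range(min_stem, n - loop_len - min_stem + 1):
        loop = rna[loop_start:loop_start + loop_len]
        if loop[2] != "U":  # third loop nucleotide must be uracil
            continue
        # Extend the stem outward from the loop while the bases pair.
        left, right, stem = loop_start - 1, loop_start + loop_len, 0
        while left >= 0 and right < n and _COMPLEMENT.get(rna[left]) == rna[right]:
            stem += 1
            left -= 1
            right += 1
        if stem >= min_stem:
            yield EacaCandidate(left + 1, stem, loop)
```

For example, `list(find_eaca_candidates("GGCACCAATAAAAGGTGCC"))` reports a single candidate with a 6 bp stem. A scan of this kind only flags sequence-level candidates; whether a given site actually folds and circularizes would still need to be confirmed experimentally.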
Example 5 - TRIC-V2 and long GOIs
The TRIC-V2 construct was then tested on three protein coding circular RNAs: circular T2A-EGFP, T2A-Nano Luciferase, and circular Znf609. The circularization site was positioned in the CDS. For all the circular RNAs tested, multiple circularization sites in the CDS were found. Full-length precursors of those three RNAs were produced and circularized for 20 minutes as already described and shown in Figure 12. The sequences of the full length precursors for T2A-EGFP, T2A-Nano Luciferase and Znf609 are set forth in SEQ ID NOs: 42, 43 and 44 respectively. As shown in Figure 8D (urea-agarose gel), these long circular RNAs were all produced efficiently with TRIC-V2.
The results of this experiment suggested that longer eACA stems resulted in better circularization efficiency. To investigate this further, CVB3-EGFP was cloned into the TRIC-V2 construct and two stem lengths were tested: 15 nt (SEQ ID NO: 45) and 25 nt (SEQ ID NO: 46). Separately, the effect of increasing the length of the extended guide sequence (EGS) was determined by providing a TRIC-V1 construct (that is, one having L15/R51 of the tRNA sequence) with a 40 nt EGS (SEQ ID NO: 47). Full length precursors of all these constructs were produced and circularization was performed for 4 minutes.
As shown in Figure 9A, with 4 minutes of circularization, TRIC-V1.0 converts ~50% of full length precursor to circular RNA. Increasing the EGS from 20 to 40 nt did not substantially increase circularization efficiency.
On the other hand, increasing the stem length to either 15 or 25 nt in TRIC-V2 did increase circularization efficiency. Increasing the EGS of TRIC-V2 from 20 to 40 nt did not substantially increase circularization efficiency.
TRIC-V2 was then compared with the PIE method for circularization of the long GOIs EGFP, Spike and Cas9, using 1 or 3 minutes of circularization. At 1 minute of circularization, PIE converts a small proportion of EGFP precursors to circular RNAs, while TRIC-V2 converts many more. For example, when the eACA stem is 25 nt, more than 50% of TRIC-V2 precursors are converted to circular RNAs within 1 min (Figure 9B, 2% native agarose gel). At 3 minutes of circularization, PIE converts less than 50% of full-length EGFP precursor to circular RNAs, whereas TRIC-V2 converts most of the full-length precursors to circular RNAs. Moreover, PIE produces a visible amount of concatenated products, which are largely absent with TRIC-V2 (Figure 9B). These differences between PIE and TRIC-V2 are also seen for Spike and Cas9, suggesting that TRIC-V2 is much faster and cleaner than the PIE method for circularization of the long GOIs (Figure 9C, 0.8% native agarose gel).
Ribozyme kinetics were then measured using the Michaelis-Menten model. TRIC-V2 with a 25 nt stem is 3.7 times faster than PIE in producing the CVB3-EGFP circular RNA (Figure 9D).
Circularization efficiency, circular RNA yield (ratio between circular RNAs and total RNAs) and nicking ratio were then calculated using CVB3-EGFP as an example. As shown in Figure 9E, for TRIC-V1, 20 min of circularization gives an efficiency of 95.6% with yield and nicking rates of 58.2% and 26.7%. For TRIC-V2, 3 min of circularization gives an efficiency of 89.4% with yield and nicking rates of 70.6% and 4.6%. Extending circularization with TRIC-V2 to 8 min gives higher efficiency and yield (97.8% and 74.8%), but also increases nicking (7.7%). Of note, a yield of 74.8% is 90.3% of the limit yield (MW ratio between circular RNA and its precursor). For PIE, 8 min of circularization gives an efficiency of 86.7% with yield and nicking rates of 65.5% and 9.3%. Consequently, TRIC-V2 provides higher efficiency, higher yield, and lower nicking compared to PIE, all without the issues of native exon sequences in the resulting circular RNAs.
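The relationship between the reported yield and the limit yield can be checked with a short calculation. The sketch below is illustrative only and uses hypothetical function names; it takes the two figures stated above for TRIC-V2 at 8 min (a yield of 74.8%, said to be 90.3% of the limit yield) and back-calculates the limit yield they imply, which comes to roughly 83%.

```python
def fraction_of_limit_yield(observed_yield: float, limit_yield: float) -> float:
    """Return the observed circular RNA yield as a fraction of the limit yield.

    Both arguments are fractions between 0 and 1. The limit yield is the
    molecular weight ratio between the circular RNA and its precursor.
    """
    if not 0 < limit_yield <= 1:
        raise ValueError("limit_yield must be in (0, 1]")
    return observed_yield / limit_yield


def implied_limit_yield(observed_yield: float, fraction_of_limit: float) -> float:
    """Back-calculate the limit yield from a yield and its fraction of the limit."""
    if not 0 < fraction_of_limit <= 1:
        raise ValueError("fraction_of_limit must be in (0, 1]")
    return observed_yield / fraction_of_limit


# Figures reported in the text for TRIC-V2 after 8 min of circularization.
tric_v2_yield = 0.748
reported_fraction = 0.903

limit = implied_limit_yield(tric_v2_yield, reported_fraction)
print(f"Implied limit yield: {limit:.1%}")
```

Because the limit yield depends on the size of the precursor, the value derived here applies only to the TRIC-V2 CVB3-EGFP construct and should not be reused for the TRIC-V1 or PIE precursors.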
Overall, the post-transcriptional circularization protocol and the highly efficient TRIC-V2 allow circular RNAs to be produced in a clean, highly efficient manner with a high final yield.
Example 6 - Immunogenicity of TRIC-V2
It has been shown that bacterial sequences introduced to circular GOIs by the PIE method are immunogenic (Liu, C. X. et al. Mol Cell 82, 420-434 e426, (2022)). As described above, TRIC-V2 can produce circular genes of interest without any bacterial sequence. The inventors determined whether circular RNAs produced by TRIC-V2 were less immunogenic than those produced by PIE.
Constructs that would produce circular EGFP and circular Nano luciferase (Nluc) by PIE (SEQ ID NOs: 48 and 49, respectively) or TRIC-V2 (SEQ ID NOs: 50 and 51, respectively) were created. Circular RNAs were purified by gel filtration (using an SRT-2000 SEC column (Sepax)) and RNase R digestion.
As shown in Figure 11B, circular RNA can be separated from the spliced group I intron but not from the nicked RNA and full length precursor. To further remove the nicked RNA and full-length precursor, the circular RNA was digested with an exonuclease, RNase R, at 37 °C for 1 h. The circular RNAs were then cleaned using RNA Clean & Concentrator Kits (ZYMO RESEARCH). Purified circular RNAs were checked on a 6M-1.5% urea-agarose gel. As shown in Figure 11C, the intron and full-length precursor were not visible in purified circular RNAs. However, there were still some nicked products, which were likely produced during the gel running process.
A549 cells were then transfected with the circular RNAs, and the expression levels of immune factors (IL6, CCL5, and IFN beta) were monitored by RT-qPCR. Various controls were included in the experiment: mocks (Mock and Lipo), a positive control (poly I:C), unmodified linear mRNAs (lin. EGFP and lin. Nluc) and modified linear mRNAs (lin. EGFP-modi and lin. Nluc-modi). GAPDH was used as the internal reference. Figures 11D-F depict the results. Poly I:C and unmodified linear mRNAs elicited strong expression of immune factors, whereas the negative controls (Mock and Lipo) did not induce significant expression. Moreover, as expected, modified linear mRNAs did not induce a robust immune response. The circular RNAs generally exhibited lower immunogenicity than unmodified linear mRNAs.
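The text does not state how the RT-qPCR data were converted into expression levels. One common approach, shown here purely as an assumption, is the 2^-ΔΔCt calculation, which normalizes a target gene to an internal reference (GAPDH in this experiment) and then to a control sample such as the mock. The function name and the Ct values below are hypothetical, and the calculation presumes near-100% amplification efficiency.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of a target gene by the 2^-ddCt calculation.

    Each Ct is first normalized to the internal reference gene (dCt),
    then the treated sample is compared to the control sample (ddCt).
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only (not data from the source).
    fold = relative_expression(
        ct_target_sample=24.0,      # e.g. IL6 in transfected cells
        ct_reference_sample=18.0,   # GAPDH in transfected cells
        ct_target_control=28.0,     # IL6 in mock cells
        ct_reference_control=18.0,  # GAPDH in mock cells
    )
    print(f"Fold change relative to mock: {fold:.1f}")
```

A sample whose normalized Ct equals that of the control returns a fold change of 1.0, which is how a non-immunogenic result would appear under this calculation.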
Specifically, for circular Nluc, PIE-circ. Nluc triggered significantly stronger expression of immune factors than TRIC-circ. Nluc. Notably, TRIC-circ. Nluc behaved similarly to the non-immunogenic mocks, suggesting low immunogenicity. For circular EGFP, circular RNAs produced by both methods induced comparable immune responses. These findings demonstrate that the immunogenicity of circular RNA produced by TRIC-V2 is lower than that of circular RNA produced by PIE, particularly in the case of circular Nluc. Furthermore, unmodified TRIC circular RNA can exhibit a level of immunogenicity similar to that of the non-immunogenic mocks.
The expression of Nluc from circular and linear RNAs in HEK293 cells over 7 days was then tested (Figure 11G). The circular form of Nluc had increased and more persistent protein expression compared to linear mRNA. In some cases, expression from circular RNAs may need to be discontinued. One way to achieve this is by using siRNAs. Multiple siRNA target sites (msiTS) were introduced 5’ of the IRES sequence in circular RNAs produced by TRIC-V2 (constructs shown in Figure 11A). msiTS-circ. Nluc was transfected into HEK293 cells and siRNAs were applied to the cells on day 3 following transfection. As shown in Figure 11H, expression of Nluc was reduced 5 hours later, demonstrating that circular RNAs produced by TRIC-V2 can be effectively targeted with siRNAs.
Example 7 - Rolling Circle Replication
Circular RNAs (circRNAs) are more stable than mRNAs and are thus promising alternatives to mRNAs for therapeutics. Natural circRNAs can serve as sponges for microRNAs and proteins but are also templates for translation. In the presence of an internal ribosome entry site (IRES) for initiation, circRNAs can be translated efficiently. CircRNAs that lack an in-frame termination codon allow ribosomes to translate completely around the circRNA multiple times, resulting in a polyprotein, a process known as rolling circle translation (RCT) (Figure 15a). The polyprotein can in principle be cleaved by a protease or a self-cleaving sequence, thus allowing multiple copies of the GOI to be made in a single round of initiation. In principle, RCT can be 100-fold more efficient than single-shot translation. However, RCT encounters two challenges: low initiation efficiency and accessory sequences introduced by currently used in vitro circularization methods.
Unlike single-shot translation, RCT primarily produces a polyprotein. This can be converted into protein monomers when 2A skipping sequences are provided (Figure 15b). Rather than using circRNAs with only a CDS, which are inefficient in translation initiation, we selected a potent short IRES (OR4F17) for the RCT construct. As expected, circOR4F17-Nluc-RCT produced a significant amount of protein. However, translation was still orders of magnitude less efficient than with N1Ψ-modified mRNA.
We then asked whether potent viral IRESs could be used for RCT. However, viral IRESs are typically long and highly structured, which could block RCT either through in-frame stop codons or by causing ribosome stalling. Nevertheless, we tested the 373 nt CSFV IRES. As expected, multiple stop codons were identified in each frame. We therefore engineered frame 1 to eliminate stop codons and align with the CDS (Figure 15c). Notably, RCT using the CSFV IRES showed an over 10-fold increase in Nluc expression compared to its single-shot translation and was over 7,000-fold more efficient than RCT using the OR4F17 IRES (Figure 15d, e).
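The stop-codon screen described above can be illustrated with a minimal Python sketch. It scans the three forward reading frames of a sequence for stop codons and checks whether frame 1 could, in principle, be read around a circle without terminating. The function names and the toy sequences are hypothetical, and the requirement that the circle length be a multiple of three is an added assumption that follows from keeping the ribosome in frame on each pass; it is not stated in the text.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def stop_codons_by_frame(seq: str) -> dict:
    """Return 0-based start positions of stop codons in frames 1, 2 and 3.

    The sequence is treated as linear here, so codons that would span the
    circularization junction in frames 2 and 3 are not examined.
    """
    rna = seq.upper().replace("T", "U")
    hits = {1: [], 2: [], 3: []}
    for frame in (1, 2, 3):
        for pos in range(frame - 1, len(rna) - 2, 3):
            if rna[pos:pos + 3] in STOP_CODONS:
                hits[frame].append(pos)
    return hits


def frame1_supports_rolling_circle(seq: str) -> bool:
    """True if frame 1 has no stop codon and the length is a multiple of 3,
    so that repeated passes around the circle stay in the same frame."""
    return len(seq) % 3 == 0 and not stop_codons_by_frame(seq)[1]


if __name__ == "__main__":
    with_stop = "AUGGCCUGAAAA"   # toy sequence: UGA in frame 1
    engineered = "AUGGCCUGGAAA"  # same toy sequence with UGA changed to UGG
    print(stop_codons_by_frame(with_stop))
    print(frame1_supports_rolling_circle(with_stop))
    print(frame1_supports_rolling_circle(engineered))
```

A real IRES would additionally need its structure and activity preserved after such changes, which a sequence scan of this kind cannot assess.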
Example 8 - Loop Optimisation
As has been shown, the eACA structure is essential for circularization to happen. This eACA is a stem-loop structure which contains a stem >1 bp and a 7 nt loop with a U at the third position (Figure 16A a-0). This minimal structure enables circularization of GOIs with little restriction without introducing unwanted sequences.
To further relax the constraints on the eACA, we tested multiple versions of the loop (Figure 16A b1-m12; SEQ ID NOs: 79 to 90) in circularization of EGFP. In these loops, the loop length varied from 3 to 11 nt and the U was placed at the 1st to 5th positions. Circularization of EGFP was abolished in the case of L5_U2 and L3_U1, but worked efficiently in all other tested cases, including L5_U3 (Figure 16B). This indicates that the loop needs to be longer than 4 nt and the U cannot be at the 1st or 2nd position.
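The empirical rule drawn from these loop variants can be written as a small Python check. This is a sketch under stated assumptions: the variant names are read as L<loop length>_U<position of the U>, which is an interpretation of labels such as L5_U3, and the function names are hypothetical. The rule only summarizes the tested range (loops of 3 to 11 nt, U at positions 1 to 5) and should not be taken to predict behavior outside it.

```python
import re


def parse_loop_name(name: str) -> tuple:
    """Split a label such as 'L5_U3' into (loop_length, u_position).

    Assumes the label encodes loop length after 'L' and the 1-based
    position of the U after 'U'.
    """
    match = re.fullmatch(r"L(\d+)_U(\d+)", name)
    if match is None:
        raise ValueError(f"Unrecognized loop label: {name!r}")
    length, u_position = int(match.group(1)), int(match.group(2))
    if not 1 <= u_position <= length:
        raise ValueError("U position must lie within the loop")
    return length, u_position


def predicted_to_circularize(name: str) -> bool:
    """Apply the rule from the text: loop longer than 4 nt and the U not
    at the 1st or 2nd position."""
    length, u_position = parse_loop_name(name)
    return length > 4 and u_position >= 3


if __name__ == "__main__":
    for label in ("L5_U2", "L3_U1", "L5_U3"):
        print(label, predicted_to_circularize(label))
```

For the three variants named in the text, the check agrees with the reported outcomes: L5_U2 and L3_U1 fail, while L5_U3 passes.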
Example 9 - Wobble base pair
The G-U wobble base pair between the IGS (internal guide sequence) and the GOI is known to be important for ribozyme function. To explore whether other base pairs are functional, we mutated the G in the IGS to U or A and the U in the GOI to either A or C. In this way, both a new Watson-Crick base pair (U-A) and new wobble base pairs (A-C) were included.
We produced full-length precursors and performed circularization for 3 min and 30 min. The TRICv2(G-U)+ circularized EGFP efficiently (Figure 17a). Meanwhile, it is clear that both TRICv2(C-A) (SEQ ID NO: 92) and TRICv2(A-U) (SEQ ID NO: 91) can convert the full-length precursor to circular RNA. Circular RNAs from TRICv2(C-A) and TRICv2(A-U) were further confirmed on a urea-agarose gel (Figure 17b).
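The pair types discussed in this example can be distinguished with a minimal Python sketch. The grouping follows the usage in the text, where G-U and A-C are treated as wobble pairs and A-U (together with G-C) as Watson-Crick pairs; the function name and return labels are illustrative assumptions.

```python
WATSON_CRICK = {frozenset({"A", "U"}), frozenset({"G", "C"})}
WOBBLE = {frozenset({"G", "U"}), frozenset({"A", "C"})}


def classify_pair(igs_base: str, goi_base: str) -> str:
    """Classify an IGS/GOI base pair as 'Watson-Crick', 'wobble' or 'other'.

    The order of the two bases does not matter.
    """
    pair = frozenset({igs_base.upper(), goi_base.upper()})
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "other"


if __name__ == "__main__":
    for igs, goi in (("G", "U"), ("C", "A"), ("A", "U")):
        print(f"{igs}-{goi}: {classify_pair(igs, goi)}")
```

The classification is purely by base identity; whether a given pair supports circularization is an experimental result, as shown in Figure 17.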
Example 10 - Comparison of TRIC-V2
Recent studies have described Tetrahymena thermophila (Tetra) group I intron-derived constructs, the Tetra-STS (RZ construct, WO2022/191642) and the Tetra-Rzy, for circRNA synthesis. We compared these to V2 by cloning CVB3-EGFP into Tetra-STS (AU-rich no. 16) and Tetra-Rzy (CVB3IRES-GFP) constructs (SEQ ID Nos: 94 and 95). V2 outperforms both constructs (Figure 18). Additionally, the Tetra-V2 (SEQ ID NO: 93) also produced circCVB3-EGFP efficiently, demonstrating that optimizations from the Ana intron can be effectively applied to other group I introns.
SEQUENCES
Claims
1. A recombinant nucleic acid molecule for making a circular RNA, comprising in the 5’ to 3’ direction: a) an internal guide sequence (IGS), b) a ribozyme, c) a first portion of an extended anticodon arm (eACA) sequence, d) a gene of interest, and e) a second portion of the eACA sequence, wherein a nucleotide in the second portion of the eACA sequence forms a wobble base pair with a nucleotide in the IGS.
2. The recombinant nucleic acid molecule of claim 1, wherein the first portion of the eACA sequence comprises a first eACA stem portion and a first eACA loop portion.
3. The recombinant nucleic acid molecule of claim 1 or 2, wherein the second portion of the eACA sequence comprises a second eACA stem portion and a second eACA loop portion.
4. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are each at least 5 nucleotides in length.
5. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are each at least 10 nucleotides in length.
6. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are each at least 15 nucleotides in length.
7. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are each at least 20 nucleotides in length.
8. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are each at least 25 nucleotides in length.
9. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second eACA stem portions are between 1 and 50 nucleotides in length.
10. The recombinant nucleic acid molecule of any preceding claim, wherein the first eACA loop portion is 1-20 nucleotides in length.
11. The recombinant nucleic acid molecule of any preceding claim, wherein the second eACA loop portion is 1-20 nucleotides in length.
12. The recombinant nucleic acid molecule of any preceding claim, wherein the first eACA loop portion is 4 nucleotides in length and wherein the second eACA loop portion is 3 nucleotides in length.
13. The recombinant nucleic acid molecule of any preceding claim, further comprising a P1 extension 3’ to the second portion of the eACA sequence.
14. The recombinant nucleic acid molecule of claim 13, wherein the P1 extension is 2 nucleotides in length.
15. The recombinant nucleic acid molecule of claim 13 or 14, wherein the second portion of the eACA sequence and the P1 extension together are capable of forming a P1 region.
16. The recombinant nucleic acid molecule of claim 15, wherein the P1 region is complementary to the IGS.
17. The recombinant nucleic acid molecule of any preceding claim, wherein a 5’ portion of the first portion of the eACA sequence is capable of forming a P10 region.
18. The recombinant nucleic acid molecule of claim 17, wherein the 5’ portion of the first portion of the eACA sequence that forms a P10 region is complementary to the IGS.
19. The recombinant nucleic acid molecule of claim 17 or 18, wherein the 5’ portion of the first portion of the eACA sequence that forms a P10 region is 2 nucleotides in length.
20. The recombinant nucleic acid molecule of any preceding claim, wherein the nucleotide in the second portion of the eACA sequence that forms a wobble base pair is uracil or cytosine.
21. The recombinant nucleic acid molecule of any preceding claim, wherein the nucleotide in the second portion of the eACA sequence that forms a wobble base pair is the last nucleotide in the second portion of the eACA sequence.
22. The recombinant nucleic acid molecule of any preceding claim, wherein the second eACA loop portion comprises uracil or cytosine as a nucleotide other than the first or second nucleotide in the 5’ to 3’ direction.
23. The recombinant nucleic acid molecule of any preceding claim, wherein the second eACA loop portion comprises uracil or cytosine as the third nucleotide in the 5’ to 3’ direction.
24. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second portions of the eACA sequence together are capable of forming a stem loop structure.
25. The recombinant nucleic acid molecule of claim 24, wherein the loop of the stem loop structure is 3-40 nucleotides in length.
26. The recombinant nucleic acid molecule of claim 24 or 25, wherein the loop of the stem loop structure is greater than 4 nucleotides in length.
27. The recombinant nucleic acid molecule of any preceding claim, wherein the ribozyme is a group I intron.
28. The recombinant nucleic acid molecule of any preceding claim, wherein the wobble base pair comprises a GU or AC wobble base pair.
29. The recombinant nucleic acid molecule of any preceding claim, wherein the second portion of the eACA sequence is complementary to the IGS except for the wobble base pair.
30. The recombinant nucleic acid molecule of any preceding claim, further comprising a first extended guide sequence (EGS) 5’ to the IGS, and a second EGS 3’ to the second portion of the eACA sequence.
31. The recombinant nucleic acid molecule of claim 30, wherein the first EGS is at least 50% complementary to the second EGS.
32. The recombinant nucleic acid molecule of claim 31, wherein the first EGS is 100% complementary to the second EGS.
33. The recombinant nucleic acid molecule of any of claims 30-32, wherein the first and second EGS are each between 1 and 500 nucleotides in length.
34. The recombinant nucleic acid molecule of any preceding claim, wherein the first and second portions of the eACA sequence naturally occur in the gene of interest.
35. The recombinant nucleic acid molecule of any of claims 1-34, wherein the first and second portions of the eACA sequence are derived from human ribosomal RNA (rRNA).
36. The recombinant nucleic acid molecule of any of claims 1-33 or 35, wherein the nucleotide sequence of the gene of interest is modified to include the first and second portions of the eACA sequence.
37. The recombinant nucleic acid molecule of any of claims 15-36, further comprising a first loop sequence 5’ to the IGS, and a second loop sequence 3’ to the P1 extension.
38. The recombinant nucleic acid molecule of claim 37, wherein the first and second loop sequences are each between 1 and 10 nucleotides in length.
39. The recombinant nucleic acid molecule of claim 37 or 38, wherein the first loop sequence is 6 nucleotides in length.
40. The recombinant nucleic acid molecule of any of claims 37 to 39, wherein the second loop sequence is 5 nucleotides in length.
41. The recombinant nucleic acid molecule of any of claims 37 to 40, wherein the first loop sequence is 6 nucleotides in length and the second loop sequence is 5 nucleotides in length.
42. The recombinant nucleic acid molecule of any of claims 37 to 41, wherein the first loop sequence is non-complementary to the second loop sequence.
43. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is at least 500 nucleotides in length.
44. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is at least 2000 nucleotides in length.
45. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is at least 4000 nucleotides in length.
46. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is at least 6000 nucleotides in length.
47. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is at least 8000 nucleotides in length.
48. The recombinant nucleic acid molecule of any preceding claim, comprising in a 5’ to 3’ direction: a) a first EGS, b) a first loop sequence, c) an internal guide sequence (IGS), d) a ribozyme, e) a first portion of an extended anticodon arm (eACA) sequence, f) a gene of interest, g) a second portion of the eACA sequence, h) a P1 extension, i) a second loop sequence, and j) a second EGS.
49. The recombinant nucleic acid molecule of any preceding claim, further comprising a T7 high efficiency sequence.
50. The recombinant nucleic acid molecule of any preceding claim, further comprising at least one restriction enzyme cleavage site.
51. The recombinant nucleic acid molecule of any preceding claim, further comprising a poly(A) tail.
52. The recombinant nucleic acid molecule of any preceding claim, wherein the recombinant nucleic acid molecule is a DNA template molecule.
53. The recombinant nucleic acid molecule of claim 52, further comprising a T7 promoter sequence.
54. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest is a noncoding RNA.
55. The recombinant nucleic acid molecule of claims 1-53, wherein the gene of interest encodes a polypeptide.
56. The recombinant nucleic acid molecule of any preceding claim, wherein the recombinant nucleic acid molecule is devoid of in-frame stop codons.
57. The recombinant nucleic acid molecule of any preceding claim, wherein the gene of interest comprises a translation initiation element.
58. The recombinant nucleic acid molecule of claim 57, wherein the translation initiation element is a modified viral IRES that lacks in-frame stop codons.
59. Use of a recombinant nucleic acid molecule according to any preceding claim in a method of making a circular RNA.
60. A method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule according to any of claims 1-58, and b) circularizing the recombinant nucleic acid molecule.
61. A method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule according to any of claims 1-58, b) transcribing the recombinant nucleic acid molecule to produce an RNA precursor, and c) circularizing the RNA precursor.
62. The method of claim 61, wherein splicing is suppressed during step (b).
63. The method of claim 61 or 62, wherein step (b) is performed in the presence of nucleoside triphosphates (NTPs) at a concentration of approximately 24 mM and Mg2+ at a concentration of 16 mM or less.
64. A method for producing a circular RNA, comprising: a) identifying a gene of interest comprising a sequence capable of forming an eACA stem-loop, b) preparing a recombinant nucleic acid molecule comprising, in a 5’ to 3’ direction: an internal guide sequence (IGS) - a ribozyme - a sequence encoding the gene of interest, and c) circularizing the recombinant nucleic acid molecule.
65. A method for producing a circular RNA, comprising: a) providing a recombinant nucleic acid molecule according to any of claims 1-58, and b) transcribing and circularizing the recombinant nucleic acid molecule, wherein circularization occurs co-transcriptionally.
66. A circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA does not comprise exogenous splicing sequences.
67. A circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA does not comprise RNA from a circularization agent.
68. The circular RNA of claim 66 or 67, wherein the circularization agent is a ribozyme.
69. The circular RNA of any of claims 66-68, wherein the circularization agent is a group I intron.
70. The circular RNA of any of claims 66-69, wherein the sequence encoding a gene of interest comprises a sequence capable of forming an eACA stem loop structure.
71. The circular RNA of any of claims 66-70, wherein the eACA stem loop structure comprises a stem of at least 5 base pairs.
72. The circular RNA of any of claims 66 to 71, wherein the eACA stem loop structure comprises a loop at least 7 nucleotides in length.
73. The circular RNA of any of claims 66 to 72, wherein the circular RNA has reduced immunogenicity compared to a circular RNA which comprises exogenous splicing sequences.
74. The circular RNA of any of claims 66 to 73, wherein the circular RNA has reduced immunogenicity compared to a circular RNA produced using the permuted intron-exon (PIE) approach.
75. The circular RNA of any of claims 66 to 74, wherein the sequence encoding a gene of interest is at least 2000, at least 3000, at least 4000, at least 5000, or at least 6000 nucleotides in length.
76. A circular RNA obtained or obtainable by the method of any of claims 60 to 65, wherein the circular RNA is less immunogenic compared to a circular RNA which comprises exogenous exon sequences.
77. A circular RNA obtained or obtainable by the method of any of claims 60 to 65, wherein the circular RNA does not comprise exogenous exon sequences.
78. Use of a circular RNA according to any of claims 66-77 in an in vitro method of expressing a protein in a cell.
79. A method for expressing a gene of interest in a cell, the method comprising (a) circularizing a recombinant nucleic acid molecule according to any of claims 1-58 to provide a circular RNA comprising the gene of interest, and (b) administering the circular RNA to the cell.
80. A method of treating a disease in a subject, the method comprising (a) circularizing a recombinant nucleic acid molecule according to any of claims 1-58 to provide a circular RNA, and (b) administering the circular RNA to the subject.
81. A recombinant nucleic acid molecule according to any of claims 1-58, for use as a medicament.
82. A recombinant nucleic acid molecule according to any of claims 1-58, for use in a method of treating a disease in a subject.
83. A circular RNA obtainable by the method of any of claims 60 to 65.
84. A circular RNA comprising a sequence encoding a gene of interest, wherein the circular RNA comprises an eACA sequence.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2025/065903 WO2025252998A2 (en) | 2024-06-07 | 2025-06-06 | Materials and methods for making circular rnas |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB2308675.4A GB202308675D0 (en) | 2023-06-09 | 2023-06-09 | Circular rnas and methods for making the same |
| GB2308675.4 | 2023-06-09 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2024252011A2 (en) | 2024-12-12 |
| WO2024252011A3 (en) | 2025-02-06 |
Family
ID=87291639
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2024/065837 Pending WO2024252011A2 (en) | 2023-06-09 | 2024-06-07 | Circular rnas and methods for making the same |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB202308675D0 (en) |
| WO (1) | WO2024252011A2 (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022191642A1 (en) | 2021-03-10 | 2022-09-15 | 알지노믹스 주식회사 | Self-circularized rna structure |
| WO2023046153A1 (en) | 2021-09-26 | 2023-03-30 | Center For Excellence In Molecular Cell Science, Chinese Academy Of Sciences | Circular rna and preparation method thereof |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3642342A4 (en) * | 2017-06-23 | 2021-03-17 | Cornell University | RNA MOLECULES, METHODS FOR PRODUCING CIRCULAR RNA AND TREATMENT METHODS |
| EP4392560A4 (en) * | 2021-08-27 | 2024-11-13 | Peking University | CONSTRUCTS AND METHODS FOR PREPARING CIRCULAR RNA |
| EP4271818B1 (en) * | 2021-09-17 | 2024-07-31 | Flagship Pioneering Innovations VI, LLC | Compositions and methods for producing circular polyribonucleotides |
| CN120187856A (en) * | 2022-09-06 | 2025-06-20 | 尔知渃米斯股份有限公司 | Self-circularizing RNA constructs |
| KR20240034676A (en) * | 2022-09-06 | 2024-03-14 | 알지노믹스 주식회사 | Construct of self-circularization RNA |
- 2023-06-09 GB GBGB2308675.4A patent/GB202308675D0/en not_active Ceased
- 2024-06-07 WO PCT/EP2024/065837 patent/WO2024252011A2/en active Pending
Non-Patent Citations (13)
| Title |
|---|
| BEEN, MD ET AL.: "One binding site determines sequence specificity of Tetrahymena pre-rRNA self-splicing trans-splicing and RNA enzyme activity", CELL, vol. 47, no. 2, 1986, XP023883128, DOI: 10.1016/0092-8674(86)90443-5 |
| INOUE T ET AL.: "Intermolecular exon ligation of the rRNA precursor of Tetrahymena: Oligonucleotides can function as 5' exons", CELL, vol. 43, no. 2, 1985, XP023911634, DOI: 10.1016/0092-8674(85)90173-4 |
| JONES JT ET AL.: "Tagging ribozyme reaction sites to follow trans-splicing in mammalian cells", NAT MED, vol. 2, no. 6, 1996, XP000652816, DOI: 10.1038/nm0696-643 |
| KOHLER U ET AL.: "Trans-splicing Ribozymes for Targeted Gene Delivery", J MOL BIOL, vol. 285, no. 5, 1999, XP004464306, DOI: 10.1006/jmbi.1998.2447 |
| LAN, N.; HOWREY, R. P.; LEE, S. W.; SMITH, C. A.; SULLENGER, B. A.: "Ribozyme-mediated repair of sickle beta-globin mRNAs in erythrocyte precursors", SCIENCE, vol. 280, 1998, pages 1593 - 1596 |
| LIU, C. X. ET AL., MOL CELL, vol. 82, 2022, pages 420 - 434 |
| OLSON; MULLER, RNA, vol. 18, 2012, pages 581 - 589 |
| PETKOVIC, S.; MULLER, S.: "RNA circularization strategies in vivo and in vitro", NUCLEIC ACIDS RES, vol. 43, 2015, pages 2454 - 2465, XP055488942, DOI: 10.1093/nar/gkv045 |
| PUTTARAJU, M.; BEEN, M. D.: "Group I permuted intron-exon (PIE) sequences self-splice to produce circular exons", NUCLEIC ACIDS RES, vol. 20, 1992, pages 5357 - 5364, XP055622176, DOI: 10.1093/nar/20.20.5357 |
| PUTTARAJU; BEEN, NUCLEIC ACIDS RES, vol. 20, 1992, pages 5357 - 5364 |
| SULLENGER BA ET AL.: "Ribozyme-mediated repair of defective mRNA by targeted trans-splicing", NATURE, vol. 371, 1994, XP002033257, DOI: 10.1038/371619a0 |
| WESSELHOEFT, R. A.; KOWALSKI, P. S.; ANDERSON, D. G., NAT COMMUN, vol. 9, 2018, pages 2629 |
| WESSELHOEFT, R. A.; KOWALSKI, P. S.; ANDERSON, D. G.: "Engineering circular RNA for potent and stable translation in eukaryotic cells", NAT COMMUN, vol. 9, 2018, pages 2629, XP055622155, DOI: 10.1038/s41467-018-05096-6 |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024252011A3 (en) | 2025-02-06 |
| GB202308675D0 (en) | 2023-07-26 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24732591; Country of ref document: EP; Kind code of ref document: A2 |