WO2023196738A1 - Nanopore sequencing of rna using reverse transcription - Google Patents
Nanopore sequencing of RNA using reverse transcription
- Publication number
- WO2023196738A1 (PCT/US2023/064680)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- rna
- nanopore
- sequence
- sequencing
- bmrt
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/43504—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
- C07K14/43563—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from insects
- C07K14/43586—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from insects from silkworms
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1276—RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
Definitions
- the structure of RNA is vital to its biological function3, and therefore methods to assign structure have been developed using the standard Illumina sequencing platform4,5.
- these methods indirectly sequence cDNA generated from reverse transcription and can therefore only infer RNA structure from altered fidelity or termination of the sequenced DNA.
- the invention provides systems, methods and compositions for direct nanopore sequencing of RNA using a cellular reverse transcriptase to capture and thread single-stranded RNA through an MspA nanopore without requirement of ligation or prior conversion to cDNA.
- the invention can determine RNA secondary structures simultaneously with direct sequencing. RNA sequence and structure can be determined at the single-molecule level in scalable nanopore sequencing, enabling the determination of the relation between RNA function and structure for any RNA even in a complex RNA pool.
- the invention can be utilized to directly probe and sequence RNA from biological, clinical, or research samples without the need of prior amplification, ligation, or conversion to cDNA, therefore reducing potential biases that these extra steps introduce to the original sample.
- the invention can be used to detect RNA secondary structures in the samples during sequencing.
- the invention is used to directly capture and sequence full-length RNA extracted from biological samples, such as tissue or cell culture, or to assay the sequence homogeneity of supposedly identical RNAs generated in vitro or in vivo from a single type of DNA template.
- the invention outputs the sequence and secondary structure information of the RNA sample.
- Some embodiments of the invention deploy primers that are tagged with cholesterol to be hybridized to target RNA.
- Other variations utilize the “template jumping” activity of the particular cellular reverse transcriptase employed, whereby the enzyme can directly bind to the 3’ end of RNA regardless of sequence and initiate reverse transcription.
- the reverse transcriptase will be immobilized on the lipid membrane via a lipid anchor, and free RNA inside the sample well can be captured and threaded through the nanopore for direct RNA sequencing.
- the invention may also be practiced in alternative embodiments, including alternative RTs, such as cellular selfish element RTs, alternative RNA inputs, which may be entirely or partly non-single-stranded, and alternative nanopores.
- the invention provides nanopore sequencing using a biological nanopore, MspA, which holds a shorter length of RNA in the constricted region of the pore and therefore needs fewer, better-resolved current signatures to be defined in order to relate current to sequence.
- the invention uses a physiological motor that tracks on the RNA template with high processivity and is operably little impeded by RNA structure, modified nucleotides, or chemical damage.
- the invention uses a membrane-tethered DNA primer to recruit RNA by base-pairing, which in one example can be visualized as depicted in Fig.1A.
- RNA 3’ can enter pore but RT-bound duplex cannot.
- as RNA is copied into cDNA by the RT, the template that the RT has already copied is pulled away through the pore. (RT back side is butted up to the pore; the pore gets fed the already-copied RNA.)
- RNA is sequenced 3’ to 5’.
- the invention uses a membrane-tethered RT to recruit RNA template base-paired to DNA.
- RNA 3’ can enter pore but duplex cannot.
- as RNA is copied into cDNA by the RT, the template that the RT has already copied is pulled away through the pore.
- RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig. 1A, by changing the tether to be on the RT rather than the DNA primer.
- the invention uses the RT in terminal transferase buffer to 3’ extend molecules in the input RNA pool with a nucleotide or nucleotide analog or combination thereof, either in one step or in two separate steps to add two sequential tail sequences.
- RNA is sequenced 3’ to 5’.
- the invention uses membrane-tethered DNA primer to base pair with the RNA 3’ end. RNA 5’ can enter pore but duplex cannot. Add RT to initiate synthesis. As RT uses template for cDNA synthesis, RNA is pulled out of the pore. (RT front side is butted up to the pore). RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig.
- RNA template + annealed portion of primer relative to the pore (RT on same side of pore).
- the invention uses membrane-tethered RT bound to primer to catch an RNA 3’ end near the pore, not necessarily by base-pairing.
- the RNA 5’ end can pass through the pore until it is halted by resistance from RT grip on the 3’ end. Initiate synthesis to pull the template RNA out of the pore.
- RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig.
- the invention is applied to sequencing of DNA or other nucleic acid-like molecules or chimeric nucleic acid-like molecules;
- the invention uses RT motors additionally engineered for desired performance;
- the invention is used with no need for prior knowledge of the RNA or DNA 3’ end sequence;
- the invention is used with no need for RNA ligation (although ligation could be used to add handle(s) to input RNA);
- the RNA structure does not induce motor dissociation or impose a lasting barrier;
- the MspA nanopore quadromer map enables high accuracy of sequencing;
- the invention provides information about RNA structure simultaneously with sequencing;
- the invention can develop an RNA quadromer map that includes discrimination of modified nucleotides; and/or
- the invention is configured for automation and high-throughput.
- the invention provides an engineered cellular reverse transcriptase as a potent motor protein that can processively thread ssRNA through the MspA biological nanopore in single nucleotide steps while it is synthesizing cDNA. Threading ssRNA through the MspA nanopore in discrete steps, and ssRNA sequencing with the MspA nanopore, are novel aspects of the invention. Using ssRNAs of known sequences, we constructed the “quadromer map” for ssRNA in the MspA nanopore, a table that can convert measured nanopore ion current to RNA sequences. In addition, we demonstrate that the single-molecule kinetic rates of the reverse transcriptase are affected by the presence of stable RNA secondary structures.
- the invention enables commercial RNA sequencers that can directly sequence RNA extracted from any biological sample.
- the technology can be used to identify expression levels, mutations, secondary structures, and chemical modifications of RNA.
- the long read nature of nanopore sequencers enables sequencing full length RNA without the need of fragmentation.
- the invention provides an RNA sequencing system as shown in Fig.1, and/or as described herein.
- the invention provides a method of nanopore sequencing of RNA using reverse transcription comprising: using an engineered cellular reverse transcriptase as a motor to directly capture and processively thread single-stranded RNA through a MspA porin nanopore in single nucleotide steps while it is synthesizing cDNA, without requirement of ligation or prior conversion to cDNA.
- the reverse transcriptase is a truncated, modified Bombyx mori R2 non-LTR retroelement RT (BoMoC without or with amino acid substitution(s)), e.g.
- the nanopore porin comprises MspA of Mycobacterium smegmatis.
- the invention encompasses all combinations of the particular embodiments recited herein, as if each combination had been laboriously recited.
- Fig. 1A-B Sequencing RNA with a eukaryotic RT and MspA nanopore.
- A The instrument setup of the MspA nanopore sequencer.
- a lipid bilayer was generated to separate two wells containing buffer solutions, a single MspA nanopore was inserted into the lipid membrane, and a bias of 140 mV was applied to the system during data acquisition.
- Template RNA is shown as a blue line and the DNA primer as a red line.
- After introduction of the RT, it forms an elongation complex that can be captured by the nanopore.
- the RT comes to rest on top of the nanopore and, by cDNA synthesis, continuously threads RNA into the pore in discrete steps.
- the positions of the polyA sequences are not drawn to scale, for illustration purposes.
- Top and middle panels: ion current signal from the translocation of RNA1 and RNA1_polyA.
- RNA1_polyA has two polyA sequences inserted close to the 5’ of the RNA about 80 nt apart (top panel, highlighted in orange) while RNA1 does not have polyA inserted (middle panel).
- the orange rectangles highlight the regions where polyA is inserted, and as seen in the top panel, polyA insertion gave rise to high ion current signal (top panel) that does not exist in the data without polyA insertion (middle panel, orange rectangles).
- Bottom panel: the blue line represents a segment of the normalized ion current levels from the middle panel, along with RNA sequence aligned to the ion current signals.
- Fig. 2 RNA secondary structures by analyzing the single molecule kinetics of bmRT.
- A. Top panel: overlay of a segment of raw nanopore ion current from RNA translocation (blue) and the steps found via a point of change algorithm8 (grey). Single steps can be detected, and their individual dwell times can be quantified by fitting the cumulative distribution function (CDF) of the dwell time of the same step, obtained from different RNA translocation traces, to a single exponential (bottom panel).
- Bottom panel: dwell time distributions of the first sequence A repeat and the second sequence A repeat overlaid; the sequence underneath represents the sequence in the enzyme’s catalytic site at every step. Sequence A is highlighted in magenta and the remainder of the terminal hairpin is in black.
- C. Top panel: a 32 nt RNA oligonucleotide (short black line) was hybridized to the RNA template.
- Bottom panel: dwell time distribution comparison between the same RNA sequence with (red line) and without (blue line) hybridization of the RNA oligonucleotide. Error bars are 95% confidence intervals.
- Fig. 3A-C An active helicase model to describe the helicase activity of bmRT.
- A. translocation by one step would require that the –3 nucleotide becomes unpaired.
- the helicase can sense RNA structures both at the –3 position and at further downstream nucleotides up to position –13 or –14 (total length of 11-12 nt), possibly due to preferential binding of the helicase to ssRNA.
- C Model prediction for the dependence of overall translocation rate on the average base pair stability in downstream RNA. The sigmoidal curve becomes sharper with increased length (m) of the downstream sensing region following position –3.
- Fig. 4A-B Detecting Broccoli RNA-BI ligand binding using direct RNA nanopore sequencing.
- A sequence design of RNA3_Broccoli, the G bases involved in GQ formation are highlighted in red.
- a 6nt RNA duplex that precedes GQs is highlighted in blue. Information about which G bases are involved in GQ formation is obtained from ref. 25.
- B Average dwell time of bmRT along the Broccoli RNA sequence.
- the Broccoli RNA sequence is highlighted in orange, and its upper and lower GQs are marked in magenta and grey, respectively.
- the BI binding site is located on top of the upper GQ.
- a strong pause was observed when the bmRT’s catalytic site is at nt # 32, which is 12 nt from the start of the lower GQ and 2 nt from the start of the dsRNA duplex.
- Fig. 5 The MspA nanopore instrument setup. Two wells (cis and trans wells) are separated by an insulating lipid bilayer. A single MspA nanopore protein (yellow object, shown in cross section) is inserted into the bilayer in the “backwards” orientation.
- Fig. 6A-B The longest ion current traces obtained by using Eubacterium rectale RT (A left panel) and Bombyx mori RT (B left panel) and RNA1, shown with different time scales (X axis).
- A and B right panels: alignment of the ion current traces to the consensus ion current sequence for RNA1 (construction of the consensus ion current is described in Supplementary Note 1).
- The consensus ion current sequence of RNA1 is shown in blue, and ion current levels from A and B are shown in orange. Alignment of the ion current levels to the consensus ion current sequence is achieved by methods developed by the Gundlach lab8. The length of the individual read in (A left panel) and (B left panel) is shown in orange traces and numbers.
- Fig. 7. Examples of the end of RNA translocation events. Events consistently end after the final tall peak for this particular template, and the current measured upon no further RNA translocation is consistently about 45 pA. RNA1 was used to generate this data.
- Fig. 8A-E. Construction of RNA ion current consensus sequence. A. Ion current vs. RNA sequence.
- RNA2 was used to generate this data.
- Fig. 10A-D Stopping of RNA translocation in the MspA nanopore as bmRT reaches the 5’ end of the template.
- A. In the raw traces acquired (an example from RNA1 is shown), we observed that bmRT fluctuates between two states when it reaches the end of the template. The origin of this fluctuation is unclear.
- B nucleotide position where bmRT stops (we used the first of the two states bmRT fluctuates between at the stop position) at the end of RNA1.
- bmRT was observed to stop at positions ranging from nt 359 to nt 364 in the nanopore, with stopping at nt 360 being the major product.
- bmRT extends its cDNA product by synthesis of a variable length of 3’ overhang21 , and this non-templated addition (NTA) of zero to five nt could allow additional ssRNA entry to the nanopore.
- Fig. 11 Dwell time distribution of the RT along an RNA template with two barriers. The enzyme exhibits long dwell times when it encounters stable dsRNA.
- For the Broccoli RNA, after the initial canonical duplex (CGCCUC), an energy of –3 kcal/mol is used for each of the next five nucleotides (GGGUG).
- This segment corresponds to nucleotides stacked on the duplex, part of the bottom mixed tetrad, and the first G of the lower G quadruplex29. (A good fit in panel D is not sensitive to the exact energy value used for this segment, which can be between –2.5 and –3.5 kcal/mol per nucleotide, and the length of this segment can be four to six nucleotides.)
- the grey box highlights sites that have high dsRNA % but fast translocation rate.
- the dots are color coded to show GC content.
- C. mFold predicted structures of RNA3 in regions 1 and 2. In region 1, sites 67 to 69 have high dsRNA % ahead but a fast translocation rate. From the structure we can see that these sites are in an internal loop of a hairpin; the hairpin should already be unfolded when bmRT arrives at these sites, so the opened hairpin no longer poses a barrier to translocation.
- after the initial invasion, the hairpin no longer slows the bmRT translocation rate.
- Fig. 15A-C bmRT kinetics on bare RNA3.
- A. The average dwell time (red) of bmRT on a segment of RNA3 containing the 5’ terminal hairpin. The minimum rate is indicated by a straight blue line.
- B. Total base pairing energy (black curve, left axis) and individual base pairing energy (dashed red curve, right axis) as in Figure 13A-D, assuming full complementarity.
- Calculated per-nucleotide total base pairing probability in the thermodynamic ensemble (extracted from probability matrix depicted in Fig. 18) is shown as dotted lines for interactions with the 5’ side (P5’, magenta) or 3’ side (P3’, blue).
- the solid red line is the probability-scaled base pairing free energy, defined as $\Delta G_{bp} \times (w_1 + w_2 P_{5'} + w_3 P_{3'})$, where $w_1$ to $w_3$ are empirical weights and the term inside the parentheses is capped at one.
- C Model prediction for the translocation profile using the probability-scaled base pairing free energy given in panel B, with the parameter values indicated.
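The probability-scaled base pairing free energy described for panel B can be sketched in a few lines. This is only an illustration of the stated definition: the function name and the example weights are hypothetical and are not values from the source.

```python
def probability_scaled_energy(dg_bp, p5, p3, w1, w2, w3):
    """Scale a base-pairing free energy by pairing probabilities.

    dg_bp : base-pairing free energy assuming full complementarity (kcal/mol)
    p5    : probability of pairing with the 5' side (0 to 1)
    p3    : probability of pairing with the 3' side (0 to 1)
    w1-w3 : empirical weights (hypothetical values in the example below)
    """
    # The weighted term is capped at one, as stated in the figure description.
    scale = min(1.0, w1 + w2 * p5 + w3 * p3)
    return dg_bp * scale


# Hypothetical weights, chosen only to show the cap in action.
partly_paired = probability_scaled_energy(-2.0, 0.5, 0.0, 0.1, 0.8, 0.4)
fully_paired = probability_scaled_energy(-2.0, 1.0, 1.0, 0.1, 0.8, 0.4)
```

With these made-up weights, the second call hits the cap, so the scaled energy equals the unscaled value.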
- Fig. 16 Binding of Broccoli RNA aptamer to ligand BI. Top panel: image of ligand titration in test tube. Fluorescence intensity increases until saturation as the ratio of ligand increases from 0 to 10-fold. Bottom panel: quantification of fluorescence intensity by ImageJ. Fluorescence intensity reaches saturation when ligand is in 5-fold excess. Measurement was repeated three times and plotted with error bars.
- Fig. 17. Predicted bmRT helicase power as a function of average base pair stability in the downstream region (of length m). The output power curves for three values of m are shown.
- Fig. 18 Structure dot plot showing possible base pairs in RNA3 predicted by mFold. See Supplementary Note 3.
- the terms “a” and “an” mean one or more, the term “or” means and/or. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein, including citations therein, are hereby incorporated by reference in their entirety for all purposes.
- RNAs are hybridized to a short DNA primer that is tagged with cholesterol to direct the RNA/DNA complex to the nanopore membrane (Figure 1A). Addition of the RT results in binding of the enzyme to the RNA/DNA junction and consequent initiation of cDNA synthesis.
- the RNA 3’ end of an elongation complex is drawn through a backwards-inserted MspA nanopore (Figure 1A, Figure 5) and the RT comes to rest on top of the pore, preventing additional RNA transit through it.
- the rate of cDNA synthesis by RT dictates the 3’-to-5’ passage of the RNA template chain through the pore in discrete steps until synthesis is complete.
- the MspA nanopore technique is also a powerful tool to dissect the biophysical function of molecular motors8 as the exact position of the motor protein on its template can be determined, and its single-molecule biophysical parameters (such as dwell time, pauses, backtracking activity) at every single step on the template can be detected and analyzed.
- Nanopore sequencing requires two main components: 1) a processive motor to either pull or feed single-stranded (ss) DNA or RNA through the nanopore in discrete steps, and 2) a priori knowledge of the ion currents corresponding to all the possible sequences that partially block the nanopore.
- MspA nanopore which has been previously exploited for DNA9 and peptide10 sequencing.
- a major challenge to our goal was to identify a motor protein that could translocate the RNA template through the MspA nanopore in a processive and controlled manner.
- RNA sequences used in this study are summarized in Supplementary Table 1.
- Construction and Validation of the MspA RNA Quadromer Map. A nanopore sequencing approach requires a library of currents corresponding to all the possible sequences that can be found inside the nanopore. For the MspA nanopore, this correlation is referred to as the “quadromer map”8 since the ion current is determined by the 4 nt that span the constriction site of the pore. Because RNA sequencing with the MspA nanopore had not yet been achieved, we first proceeded to obtain the RNA quadromer map for the MspA nanopore.
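To make the idea of a quadromer map concrete, the following minimal sketch treats it as a lookup table from each 4 nt window to a current level. The current values are arbitrary placeholders rather than measured data, and the function names are hypothetical. The sketch also ignores the 3’-to-5’ read direction and the offset between the enzyme and the pore constriction.

```python
from itertools import product

BASES = "ACGU"


def build_placeholder_map():
    """Return a table with one entry per RNA quadromer (4**4 = 256 keys).

    The values are evenly spaced placeholders, NOT measured ion currents.
    """
    quadromers = ["".join(p) for p in product(BASES, repeat=4)]
    last = len(quadromers) - 1
    return {q: i / last for i, q in enumerate(quadromers)}


def predict_current_levels(rna, quadromer_map):
    """Slide a 4 nt window along the sequence and look up each level."""
    rna = rna.upper()
    return [quadromer_map[rna[i:i + 4]] for i in range(len(rna) - 3)]


qmap = build_placeholder_map()
levels = predict_current_levels("GGGAAAACCCUU", qmap)
```

A sequence of length N gives N - 3 predicted levels, one per single-nucleotide step. Sequencing then amounts to inverting this lookup from a measured series of levels.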
- RNA1 (Supplementary Table 1)
- RNA sequencing traces consistently ended with a very similar signature followed by a static RNA signal in the pore (Figure 7), which coincides with bulk biochemistry observations that bmRT does not readily dissociate from its RNA template upon completion of cDNA synthesis14. Therefore, we could assume that the ion current signals obtained close to the end of a translocation event originate from sequences close to the 5’ end of the RNA.
- the fact that each nucleotide in the RNA template can be assigned to a single step in the consensus ion current series confirms that the bmRT takes single-nucleotide steps on its RNA template and sequentially releases a single nt of RNA at a time to enter the nanopore.
- the nanopore reports on the RNA sequence partially blocking the current through the constriction site of the pore. In order to relate the dwell times of the RT with the presence of RNA structures in front of the enzyme (next sections), we need to know the exact location of the enzyme on the RNA template when a particular sequence is in the pore.
- Detection of RNA structure via nanopore sequencing kinetics.
- Stable RNA secondary structures have been shown to affect the kinetic rates of molecular motors such as the ribosome16, RNA helicases17, and retroviral RTs18,19.
- To extract kinetic information corresponding to the RNA sequence we determined the average dwell time before each RT step on the RNA template. This procedure involved pooling data obtained for that step from multiple sequencing traces of the same sequence and fitting them to a single exponential function (Figure 2A).
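The pooling and single-exponential fit can be sketched as follows. The text does not specify the fitting routine, so this example assumes a linearised least-squares fit of the empirical CDF to 1 - exp(-t/tau); the function name is hypothetical.

```python
import math


def fit_mean_dwell_time(dwell_times):
    """Estimate tau by fitting the empirical CDF to 1 - exp(-t / tau).

    Assumption: uses the linearised form -ln(1 - F) = t / tau and a
    least-squares line through the origin. Other fitting routines
    (e.g. nonlinear least squares) would serve the same purpose.
    """
    times = sorted(dwell_times)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two dwell times")
    # Mid-point plotting positions keep the empirical CDF strictly below 1.
    y = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]
    slope = sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)
    return 1.0 / slope


# Synthetic check: dwell times placed exactly on the quantiles of an
# exponential distribution with tau = 0.05 s (made-up value).
true_tau = 0.05
synthetic = [-true_tau * math.log(1.0 - (i + 0.5) / 200) for i in range(200)]
tau = fit_mean_dwell_time(synthetic)
```

In use, `dwell_times` would hold the dwell times for one step pooled across many traces of the same sequence, and the fit would be repeated for every step on the template.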
- RT kinetic profiles obtained in the presence and absence of the hybridized oligonucleotide showed pauses in translocation at distinct positions within the dsRNA region (Figure 2C bottom panel, and Figure 11).
- dwell times with and without the dsRNA barrier remained similar for most translocation steps.
- bmRT dwell times at most positions do not appear to be changed by the presence of the secondary structures in the RNA either in the form of a hairpin or in the form of a duplex. Rather, the enzyme seems to pause at certain particular positions in this region and be unaffected in the regions of secondary structure that surround them. This behavior suggests that bmRT functions as an active helicase capable of destabilizing RNA structures 20 . To explain why the enzyme slows down at certain specific locations within the secondary structures, we constructed an active helicase model to quantitatively describe the kinetic profile of bmRT as a function of barrier stability.
- the translocation cycle of bmRT consists of a residence phase during which events such as dNTP binding and catalysis occur, followed by a stepping phase in which the motor attempts to move along its track.
- the overall observed dwell time at each position would equal $k_{resid}^{-1} + k_{step}^{-1}$, where $k_{resid}$ and $k_{step}$ are the rate of completing a residence and the rate of stepping of the enzyme, respectively.
- the observed dwell times in the presence of barriers can be well explained if the stepping rate depends not only on the base pairing stability of the nucleotide that is stepped over (at the helicase site of the enzyme), but also on the stability of several downstream nucleotides: $k_{step} = P_u P_{mu} k_{ss}$ (1), where $P_u$ is the probability that the stepped-over nucleotide is in its unpaired state, $P_{mu}$ is the probability that the following downstream segment of length $m$ is in its unpaired state, and $k_{ss}$ is the stepping rate over single-stranded RNA (in the absence of barrier).
- $P_u$ is a function of the Gibbs free energy difference between the unpaired and paired states of the nucleotide (Eq. 2), where $\beta$ is $(k_B T)^{-1}$, $\Delta G_{bp}$ is the free energy of base pairing for the nucleotide, and $\Delta G_d$ is the destabilization energy due to the helicase. A large negative value of $\Delta G_d$ would represent a more “active” helicase 20 .
- in the expression for $P_{mu}$ (Eq. 3), $m$ is the length of the downstream segment following the stepped-over nucleotide, the summed energy term is the total base-pairing free energy of the downstream segment, and $\Delta G_d$ is the same as above (per nucleotide).
- the value of $\Delta G_{bp}$ at each nucleotide position can be estimated precisely using the nearest neighbor rules 22 (as the difference in the $\Delta G$ of the barrier before and after opening of the given nucleotide). Additionally, $k_{resid}$ can be determined from the observed translocation rates in the absence of the barrier. This leaves $k_{ss}$, $\Delta G_d$, and $m$ as the only free parameters in the model. After fitting these parameters, dwell times predicted using Eqs. 1 to 3 are in excellent agreement with the kinetic profiles obtained in the presence of different barriers, with the major and minor points of slowdown properly reproduced (Figure 3A and Figure 13D).
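The structure of the model can be illustrated with a short sketch. The explicit expressions for the two unpaired-state probabilities are not legible in this text, so the code assumes a simple two-state Boltzmann form for both. That form, the temperature, the function names, and all parameter values are assumptions made for illustration, not the fitted model.

```python
import math

KT = 0.593  # kcal/mol, k_B*T near 25 degrees C (assumed temperature)


def p_unpaired(dg_pair, dg_d, n=1):
    """Assumed two-state probability that n nucleotides are unpaired.

    dg_pair : total base-pairing free energy of the n nucleotides
              (negative = stable), kcal/mol
    dg_d    : helicase destabilization energy per nucleotide
              (more negative = more "active" helicase), kcal/mol
    """
    return 1.0 / (1.0 + math.exp(-(dg_pair - n * dg_d) / KT))


def predicted_dwell_times(dg_bp, k_resid, k_ss, dg_d, m):
    """Dwell time 1/k_resid + 1/k_step at each position of a barrier.

    dg_bp[i] is the pairing energy of the nucleotide stepped over at
    position i; the next m entries form the downstream segment (shorter
    near the end of the list).
    """
    dwell = []
    for i, g in enumerate(dg_bp):
        downstream = dg_bp[i + 1:i + 1 + m]
        p_u = p_unpaired(g, dg_d)
        p_mu = p_unpaired(sum(downstream), dg_d, n=len(downstream))
        k_step = p_u * p_mu * k_ss  # Eq. 1 in the text
        dwell.append(1.0 / k_resid + 1.0 / k_step)
    return dwell


# Hypothetical parameters: uniform weak and strong barriers of 10 nt.
weak = predicted_dwell_times([-1.0] * 10, k_resid=20.0, k_ss=100.0, dg_d=-1.0, m=3)
strong = predicted_dwell_times([-3.0] * 10, k_resid=20.0, k_ss=100.0, dg_d=-1.0, m=3)
```

Under this assumed form, a more stable barrier lengthens the predicted dwell time at every position, and a more negative `dg_d` shortens it. This matches the qualitative behavior described above.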
- Downstream sensing could be mediated by direct interaction of the helicase with the RNA to destabilize its folding 17,19,23 , or by a mechanism in which the kinetic stability of the junction arises from a long-range allosteric coupling through the double helix 24 .
- RNA is known to spontaneously form secondary structures of short polymer lengths 25
- longer dwell times are only observed in front of downstream RNA regions that have high probability of being double-stranded, most of which have high GC content (Figure 14A-C).
- by incorporating the predicted base pairing probabilities into our active helicase model, we can qualitatively reproduce the observed pattern of dwell times with an overall correlation coefficient of ~0.6 (Figure 15A-C).
- the RNA template RNA3_Broccoli (Supplementary Table 1) contains a single Broccoli RNA aptamer 27 at its 5’ end. This aptamer has two G-quadruplexes (GQ) and can bind the fluorescent ligand BI, which stabilizes the folding of the aptamer RNA 27,28 .
- the Broccoli GQs are preceded by a short RNA duplex (Figure 4A) which was shown to be important in folding of the aptamer based on sequence truncation experiments 27 .
- We first showed that the aptamer binds to BI under our experimental conditions (Figure 16). Using our assay, we then compared the single-molecule kinetic profiles of bmRT on Broccoli RNA with and without BI (Figure 4B). The results indicate that binding of BI and the resulting stabilization of the Broccoli RNA structure led to a significant pause of the bmRT when the helicase site of the enzyme is still 1 nt away from the start of the Broccoli RNA duplex.
- RNA template preparation. RNA template sequences were ordered as dsDNA gBlocks that contain the T7 promoter from Integrated DNA Technologies (IDT) and inserted into a linearized pRZ plasmid using the infusion cloning kit (Thermo Fisher) and transformed into Sure2 cells (Agilent) following manufacturer’s instructions. Positive colonies were screened with Sanger sequencing.
- RNA oligonucleotide and DNA primer with 5’ cholesterol modification were ordered from IDT.
- RNA template was mixed with DNA primer (and when relevant a 10-fold excess of RNA oligonucleotide for dsRNA barrier experiments) to a final concentration of 0.8 µM and 2 µM respectively, in buffer containing 20 mM Tris pH 8.0 and 20 mM NaCl, heated to 75°C for 90 seconds, and immediately placed on ice until further use.
- the open reading frame of the enzymes was codon optimized and ordered from GenScript, and inserted with an N-terminal maltose binding protein tag into the MacroLab vector 2bct that contains a C-terminal 6xHis tag (https://qb3.berkeley.edu/facility/qb3-macrolab/).
- the enzymes were expressed in Rosetta2(DE3)pLysS cells in 2xYT medium and induced with isopropylthio-β-galactoside.
- the MspA nanopore instrument is a custom-built instrument based on the design from the Gundlach lab8.
- Two wells of about 120 µl in volume were drilled into a Teflon block and the two wells were connected with Teflon tubing.
- One end of the tube was heat-shrunk and a small hole (about 20 µm in diameter) was created using a fine surgical needle.
- Electrodes were prepared by inserting an Ag/AgCl pellet in heat shrink tubing.
- the Teflon block was mounted onto a custom-made aluminum block. Under the aluminum block is a Peltier that is connected to a temperature control unit (TED200C, Thorlabs).
- An Axopatch 200b (Molecular Devices) was connected to the electrodes and used to apply voltage and measure ion current.
- the Axopatch 200b is connected to a PC using a National Instruments data acquisition card (DAQ) and controlled with custom LabVIEW code.
- the well that contains the 20 µm hole is referred to as the cis well, and is where all the biochemical components are introduced during sequencing data acquisition.
- the other well is referred to as the trans well.
- Nanopore Experiments. The two wells and tubing were first filled with standard experiment buffer (40 mM HEPES pH 7.5, 400 mM KCl). 180 mV was applied to the system.
- Broccoli RNA template sequences were ordered as dsDNA gBlock as above and inserted into a linearized pRZ plasmid using infusion cloning kit (Takara Bio) and transformed into Stellar cells following manufacturer’s instructions. Positive colonies were screened with Sanger sequencing. PCR was used to amplify templates for in vitro transcription with T7 RNA polymerase (NEB). The RNA product was extracted with phenol and concentration was measured by Nanodrop spectrophotometer (Thermo Fisher). Ligand for Broccoli RNA aptamer BI (LuceRNA) was prepared in 50mM DMSO and further diluted in water.
- Binding of the ligand to the RNA template was tested by varying the ratio of ligand to RNA in buffer containing 20 mM Tris pH 8.0 and 20 mM NaCl and heated to 75°C for 90 seconds and immediately placed on ice. The fluorescence intensity was quantified using ImageJ. In the nanopore experiment using BI ligand, 1:15 ratio of RNA to ligand was used. [080] Data Processing. The data processing pipeline is based on methods described previously8. In short: raw data (collected at 50 kHz) was down sampled to 2 kHz, and RNA translocation events were identified by using a custom GUI written in MATLAB. A point of change algorithm8 was used to identify steps within a continuous series of RNA translocation events.
- RNA ion current consensus sequence Our goal was to construct the consensus of ion current states observed with the RT for the RNA sequence listed in Supplementary table 1. Because RNA threaded through MspA pore in the 3 ⁇ backwards pore orientation has not been observed previously, we devised an experiment in which ‘bookends’ of poly-adenine sequence flanked a sequence of interest.
- The $\Delta G_{bp}$ for each base pair can be estimated using the nearest neighbor parameters22 (Figure 13B).
- The calculated dwell times in the presence of the barrier (Figure 13C) simply mirror the base-pair stabilities, and no combination of parameters $k_{ss}$ and $\Delta G_d$ can qualitatively reproduce the measured rates.
- The stepping rate would depend not only on the probability that the immediate nucleotide is unpaired, but also on the probability that the downstream segment is unpaired; in the simplest form, we can write $k_{\text{step}} = P_u \, P_{mu} \, k_{ss}$, where $k_{ss}$ and $P_u$ are the same as above, and $P_{mu}$ is the probability that the downstream segment is in the unpaired state. $P_{mu}$ depends on m, the length of the downstream segment following the immediate nucleotide; on the sum of the base-pairing free energies of the m nucleotides in this downstream segment; and on $\Delta G_d$, which is the same as above (per nucleotide).
- The output power of the bmRT helicase can be approximated as the unwinding work divided by unwinding time, i.e., $-\Delta G_{bp} \, k_{\text{step}}$.
- The output power as a function of average $\Delta G_{bp}$ as predicted by this model is shown in Figure 17. It can be seen that for barriers with moderate $\Delta G_{bp}$, higher values of m result in higher output power.
- $\Delta G_{dNTP}$: the free energy released upon dNTP incorporation (and PPi hydrolysis).
- The model presented here has a minimal number of assumptions and free parameters to avoid overfitting.
- The model may be tuned or include additional terms.
- The $\Delta G_d$ may be fine grained along the downstream region to better capture the physical reality of the downstream interaction.
- RNA molecules in equilibrium are partitioned among various folded structures.
- Using mFold, we can obtain the probability that each base in the RNA molecule is double stranded across all alternative predicted RNA structures.
- mFold predicted five different structures (shown in structure dot plot in Figure 18) and the percentage of dsRNA in the next 3-13 nt downstream to the bmRT catalytic site is calculated per predicted structure. Note that as RT progresses on a single RNA template, upstream RNA will enter the pore and is no longer able to pair with downstream regions.
- RNA molecules that are anchored to the nanopore membrane have a high local concentration and it is possible that our technology is detecting both intra- and inter-molecular RNA structures.
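The stepping-rate model summarized in the list above (stepping rate as the product of the single-strand stepping rate and the probabilities that the immediate nucleotide and the downstream segment are unpaired) can be sketched numerically. The following is a minimal illustration only, not the fitted model from the specification: the two-state Boltzmann form used for both probabilities, the value of kT, and the function names are assumptions, since the source expressions for these probabilities are not reproduced here. The default values of 200 nt/s and –2.6 kcal/mol are the ones quoted for Figure 3C.

```python
import math

KT = 0.593  # kcal/mol near 25 °C; the temperature is an assumption


def p_unpaired(dg_pair, dg_d, n=1):
    """Assumed two-state Boltzmann probability that n nucleotides are unpaired.

    dg_pair: summed base-pairing free energy in kcal/mol (negative = stable).
    dg_d:    helicase destabilization energy per nucleotide (negative).
    """
    dg_eff = dg_pair - n * dg_d  # pairing energy left after destabilization
    return 1.0 / (1.0 + math.exp(-dg_eff / KT))


def k_step(dg_bp_immediate, dg_bp_downstream, k_ss=200.0, dg_d=-2.6):
    """Stepping rate k_step = P_u * P_mu * k_ss (nt/s).

    dg_bp_downstream is a list of per-nucleotide pairing energies for the
    m downstream nucleotides; an empty list means no downstream sensing.
    """
    p_u = p_unpaired(dg_bp_immediate, dg_d)
    m = len(dg_bp_downstream)
    p_mu = p_unpaired(sum(dg_bp_downstream), dg_d, n=m) if m else 1.0
    return p_u * p_mu * k_ss


def output_power(dg_bp, rate):
    """Unwinding work divided by unwinding time: -dG_bp * k_step."""
    return -dg_bp * rate


if __name__ == "__main__":
    free = k_step(0.0, [0.0] * 11)         # unstructured template
    blocked = k_step(-2.0, [-3.0] * 11)    # stable downstream duplex
    print(f"unstructured: {free:.1f} nt/s, structured: {blocked:.3f} nt/s")
```

Under these assumptions the rate on unstructured RNA stays close to the single-strand stepping rate, while a stable downstream duplex lowers it sharply, which is the qualitative behavior the model is meant to capture.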
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Systems, methods and compositions for direct nanopore sequencing of RNA using a cellular reverse transcriptase to directly capture and thread single-stranded RNA through an MspA nanopore without requirement of ligation or prior conversion to cDNA. Furthermore, generating the RNA sequence by referencing the ion currents to a nanopore RNA quadromer map that connects each detected ion current with a unique 4-nucleotide sequence spanning the nanopore.
Description
Direct Nanopore sequencing of RNA using reverse transcription. This invention was made with government support under National Institutes of Health, grant numbers GM032543 and HL156819, and under National Science Foundation, award number 1616591. The government has certain rights in the invention. [001] Introduction [002] RNA plays critical roles as a messenger for protein production and has direct function in a wide diversity of biological processes including transcription regulation, protein synthesis, and nuclear organization1,2. The structure of RNA is vital to its biological function3, and therefore methods to assign structure have been developed using the standard Illumina sequencing platform4,5. However, these methods indirectly sequence cDNA generated from reverse transcription and can therefore only infer RNA structure from altered fidelity or termination of the sequenced DNA. The Oxford Nanopore platform can directly sequence RNA and detect modified bases6; however, this technology lacks the fidelity conferred by cDNA synthesis prior to sequencing7. Furthermore, the multiple hybridization and ligation steps in the sample preparation procedure bias sequencing outcome8. [003] Relevant Literature [004] Nanopore DNA sequencing with MspA has been reported by Derrington et al., Proc Natl Acad Sci USA. 2010 Sep 14;107(37):16060-5. [005] Highly parallel direct polyadenosine-tailed RNA sequencing on an array of nanopores was reported by Garalde, D. et al. Nature Methods, 15, p.201–206 (2018), reviewed by Hussain S. Trends Biochem Sci 2018 Apr;43(4):225-227, see, e.g. Fig. 1. [006] Summary of the Invention [007] In aspects the invention provides systems, methods and compositions for direct nanopore sequencing of RNA using a cellular reverse transcriptase to capture and thread single-stranded RNA through an MspA nanopore without requirement of ligation or prior conversion to cDNA. 
The invention can determine RNA secondary structures simultaneously with direct sequencing. RNA sequence and structure can be determined at the single-molecule level in scalable nanopore sequencing, enabling the determination of the relation between RNA function and structure for any RNA even in a complex RNA pool. [008] The invention can be utilized to directly probe and sequence RNA from biological, clinical, or research samples without the need of prior amplification, ligation, or conversion to cDNA, therefore reducing potential biases that these extra steps introduce to the original sample.
In addition, the invention can be used to detect RNA secondary structures in the samples during sequencing. [009] In embodiments we directly sequence RNA molecules, without the need of amplification, ligation, or prior conversion to cDNA, therefore removing the bias that these steps can potentially introduce to the data; we can use the biophysical kinetic behavior of the reverse transcriptase to infer RNA secondary structure information during sequencing. [010] In embodiments the invention is used to directly capture and sequence full-length RNA extracted from biological samples, such as tissue or cell culture, or to assay the sequence homogeneity of supposedly identical RNAs generated in vitro or in vivo from a single type of DNA template. The invention outputs the sequence and secondary structure information of the RNA sample. Some embodiments of the invention deploy primers that are tagged with cholesterol to be hybridized to target RNA. Other variations utilize the “template jumping” activity of the particular cellular reverse transcriptase employed, whereby the enzyme can directly bind to the 3’ end of RNA regardless of sequence and initiate reverse transcription. Using this unique behavior, the reverse transcriptase will be immobilized on the lipid membrane via a lipid anchor, and free RNA inside the sample well can be captured and threaded through the nanopore for direct RNA sequencing. The invention may also be practiced in alternative embodiments, including alternative RTs, such as cellular selfish element RTs, alternative RNA inputs, which may be entirely or partly non-single-stranded, and alternative nanopores. [011] In aspects the invention provides RNA nanopore sequencing using a biological nanopore, MspA, which has a smaller length of RNA in the constricted region of the pore and therefore fewer and more resolved current signatures needed to relate current to sequence. 
The invention uses a physiological motor with high processivity tracking on RNA template and operably low impedance by RNA structure or modified nucleotides or chemical damage. [012] In embodiments the invention uses a membrane-tethered DNA primer to recruit RNA by base-pairing, which in one example can be visualized as depicted in Fig.1A. RNA 3’ can enter pore but RT-bound duplex cannot. As RNA is copied into cDNA by the RT, the template that the RT has copied over is pulled away through the pore. (RT back side is butted up to the pore; the pore gets fed the already-copied RNA.) RNA is sequenced 3’ to 5’. [013] In embodiments the invention uses a membrane-tethered RT to recruit RNA template base-paired to DNA. RNA 3’ can enter pore but duplex cannot. As RNA is copied into cDNA by the RT, the template that the RT has copied over is pulled away through the pore. RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig. 1A, by changing the tether to be on the RT rather than the DNA primer.
[014] In embodiments the invention uses the RT in terminal transferase buffer to 3’ extend molecules in the input RNA pool with a nucleotide or nucleotide analog or combination thereof, either in one step or in two separate steps to add two sequential tail sequences. Anneal a primer to an internal tail location using either a junction-sequence-anchored primer or a primer complementary to the internal part of the added tail. Continue with the protocol as demonstrated. RT will gate 3’-5’ threading of template into the pore. RNA is sequenced 3’ to 5’. [015] In embodiments the invention uses membrane-tethered DNA primer to base pair with the RNA 3’ end. RNA 5’ can enter pore but duplex cannot. Add RT to initiate synthesis. As RT uses template for cDNA synthesis, RNA is pulled out of the pore. (RT front side is butted up to the pore). RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig. 1, as vertical flipping of the RT + RNA template + annealed portion of primer relative to the pore (RT on same side of pore). [016] In embodiments the invention uses membrane-tethered RT bound to primer to catch an RNA 3’ end near the pore, not necessarily by base-pairing. The RNA 5’ end can pass through the pore until it is halted by resistance from RT grip on the 3’ end. Initiate synthesis to pull the template RNA out of the pore. RNA is sequenced 3’ to 5’. This can be visualized with reference to Fig. 1, as vertical flipping of the RT + template RNA + annealed portion of primer relative to the pore (RT still on same side of pore), with tether changed from RNA to RT, and with RT-bound primer and RNA template that are not base paired. 
[017] In embodiments: [018] the invention is applied to sequencing of DNA or other nucleic acid like molecules or chimeric nucleic acid like molecules; [019] the invention uses RT motors additionally engineered for desired performance; [020] the invention is used with no need for prior knowledge of RNA or DNA 3’ end sequence; [021] the invention is used with no need for RNA ligation (although ligation could be used to add handle(s) to input RNA); [022] the RNA structure does not induce motor dissociation or impose a lasting barrier; [023] the MspA nanopore quadromer map enables high accuracy of sequencing; [024] the invention provides information about RNA structure simultaneously with sequencing; [025] the invention can develop an RNA quadromer map that includes discrimination of modified nucleotides; and/or [026] the invention is configured for automation and high-throughput.
[027] The invention provides an engineered cellular reverse transcriptase as a potent motor protein that can processively thread ssRNA through the MspA biological nanopore in single nucleotide steps while it is synthesizing cDNA. Threading ssRNA through the MspA nanopore in discrete steps, and ssRNA sequencing with the MspA nanopore, are novel aspects of the invention. Using ssRNAs of known sequences, we constructed the “quadromer map” for ssRNA in the MspA nanopore, a table that can convert measured nanopore ion current to RNA sequences. In addition, we demonstrate that the single-molecule kinetic rates of the reverse transcriptase are affected by the presence of stable RNA secondary structures. Monitoring this biophysical behavior can be used to determine RNA structures during nanopore sequencing. [028] The invention enables commercial RNA sequencers that can directly sequence RNA extracted from any biological sample. The technology can be used to identify expression levels, mutations, secondary structures, and chemical modifications of RNA. The long read nature of nanopore sequencers enables sequencing full length RNA without the need of fragmentation. [029] In an aspect the invention provides an RNA sequencing system as shown in Fig.1, and/or as described herein. [030] In an aspect the invention provides a method of nanopore sequencing of RNA using reverse transcription comprising: using an engineered cellular reverse transcriptase as a motor to directly capture and processively thread single-stranded RNA through a MspA porin nanopore in single nucleotide steps while it is synthesizing cDNA, without requirement of ligation or prior conversion to cDNA. [031] In embodiments: [032] the reverse transcriptase (RT) is a truncated, modified Bombyx mori R2 non-LTR retroelement RT (BoMoC without or with amino acid substitution(s)), e.g. WO2022/076759; WO2020033777; and/or [033] the nanopore porin comprises MspA of Mycobacterium smegmatis. 
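The quadromer map of paragraph [027] is described as a table relating each 4-nucleotide sequence spanning the pore to an ion current. A minimal sketch of how such a table could be used to predict the sequence of current levels for a known RNA is given below. The map values here are made-up placeholders, not measured currents; the per-base weights, the function names, and the normalization are all hypothetical, and the only features taken from the specification are the 4-nucleotide window, the 3’-to-5’ reading direction, and the observation that polyA gives a high current.

```python
from itertools import product

BASES = "ACGU"


def toy_quadromer_map():
    """Build a placeholder table: 256 quadromers -> made-up normalized current.

    The per-base weights are hypothetical; A is given the largest weight only
    to mimic the high polyA current mentioned in the specification.
    """
    weights = {"A": 1.0, "C": 0.4, "G": 0.7, "U": 0.2}
    return {
        "".join(q): sum(weights[b] for b in q) / 4.0
        for q in product(BASES, repeat=4)
    }


def predict_levels(rna_5_to_3, qmap):
    """Predicted current levels in the order the pore would see them.

    The RNA is written 5'->3' but read 3'->5', so the 4-nt windows are
    visited starting from the 3' end.
    """
    n_windows = len(rna_5_to_3) - 3
    windows = [rna_5_to_3[i:i + 4] for i in range(n_windows)]
    return [qmap[w] for w in reversed(windows)]


if __name__ == "__main__":
    qmap = toy_quadromer_map()
    for level in predict_levels("AAAAUUUU", qmap):
        print(f"{level:.2f}")
```

Going from measured levels back to sequence is the harder direction, because consecutive quadromers overlap by three nucleotides and several quadromers can share similar currents; this sketch only covers the forward prediction.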
[034] The invention encompasses all combinations of the particular embodiments recited herein, as if each combination had been laboriously recited. [035] Brief Description of the Drawings [036] Fig. 1A-B. Sequencing RNA with a eukaryotic RT and MspA nanopore. A. The instrument setup of the MspA nanopore sequencer. A lipid bilayer was generated to separate two wells containing buffer solutions, a single MspA nanopore was inserted into the lipid membrane, and a bias of 140 mV was applied to the system during data acquisition. Template RNA (blue line) is hybridized with a DNA primer (red line) that is tagged with cholesterol (grey oval) and anchored to the bilayer. After introduction of the RT, it forms an elongation complex that can
get captured by the nanopore. The RT will come to rest on top of the nanopore and by cDNA synthesis will continuously thread RNA into the pore in discrete steps. The positions of the polyA sequences are not to scale for illustration purposes. B. Top and middle panel: Ion current signal from the translocation of RNA1 and RNA1_polyA. RNA1_polyA has two polyA sequences inserted close to the 5’ of the RNA about 80 nt apart (top panel, highlighted in orange) while RNA1 does not have polyA inserted (middle panel). The orange rectangles highlight the regions where polyA is inserted, and as seen in the top panel, polyA insertion gave rise to high ion current signal (top panel) that does not exist in the data without polyA insertion (middle panel, orange rectangles). Bottom panel: The blue line represents a segment of the normalized ion current levels from the middle panel, along with RNA sequence aligned to the ion current signals. [037] Fig. 2A-C. Detecting RNA secondary structures by analyzing the single molecule kinetics of bmRT. A. Top panel: Overlay of a segment of raw nanopore ion current from RNA translocation (blue) and the steps found via a point of change algorithm8 (grey). Single steps can be detected, and their individual dwell times can be quantified by fitting the cumulative distribution function (CDF) of the dwell time of the same step obtained from different RNA translocation traces to a single exponential (bottom panel). B. Top panel: RNA template that contains two repeats of sequence A (highlighted in magenta). The second repeat base pairs to form a stable 5’ terminal hairpin. Bottom panel: Dwell time distribution of the first sequence A repeat and second sequence A repeat overlaid; the sequence underneath represents the sequence in the enzyme’s catalytic site at every step. Sequence A is highlighted in magenta and the remainder of the terminal hairpin is in black. C. 
Top panel: a 32 nt RNA oligonucleotide (short black line) was hybridized to the RNA template. Bottom panel: Dwell time distribution comparison between the same RNA sequence with (red line) and without (blue line) hybridization of the RNA oligonucleotide. Error bars are 95% confidence interval. [038] Fig. 3A-C. An active helicase model to describe the helicase activity of bmRT. A. Agreement of model predictions with experimental kinetic profiles of bmRT over RNA segments that can form a hairpin (left, data from Figure 2B) or hybridize to an oligonucleotide (right, data from Figure 2C). The major pauses in the presence of barriers (Figure 2B-C) are reproduced by the model. See Figure 13A-D for details. B. Schematic drawing of the bmRT elongation complex showing the expected relative positions of the polymerase catalytic site (–1) and the closest helicase site (–3) during the dwell time of the enzyme. With this arrangement, the –1 and –2 RNA nucleotides are both unpaired. After incorporation of an incoming dNTP at position –1, translocation by one step would require that the –3 nucleotide becomes unpaired. In this model, the helicase can sense RNA structures both at the –3 position and at further
downstream nucleotides up to position –13 or –14 (total length of 11-12 nt), possibly due to preferential binding of the helicase to ssRNA. C. Model prediction for the dependence of overall translocation rate as a function of the average base pair stability in downstream RNA. The sigmoidal curve becomes sharper with increased length (m) of the downstream sensing region following position –3. This plot uses helicase destabilization energy (∆Gd) of –2.6 kcal/mol, single-strand stepping rate of 200 nt/s, and mean residence rate of 34 nt/s, as obtained from the fit to the measurements (Figure 13D). [039] Fig. 4A-B. Detecting Broccoli RNA-BI ligand binding using direct RNA nanopore sequencing. A. Sequence design of RNA3_Broccoli; the G bases involved in GQ formation are highlighted in red. A 6 nt RNA duplex that precedes GQs is highlighted in blue. Information about which G bases are involved in GQ formation is obtained from ref. 25. B. Average dwell time of bmRT along the Broccoli RNA sequence. The Broccoli RNA sequence is highlighted in orange and its upper and lower GQ are marked in magenta and grey respectively. The BI binding site is located on top of the upper GQ. A strong pause was observed when the bmRT’s catalytic site is at nt # 32, which is 12 nt from the start of the lower GQ and 2 nt from the start of the dsRNA duplex. [040] Fig. 5. The MspA nanopore instrument setup. Two wells (cis and trans wells) are separated by an insulating lipid bilayer. A single MspA nanopore protein (yellow object cross section) is inserted into the nanopore in the “backwards” fashion. Pore insertion was done at 180 mV and sequencing experiments were done at 140 mV voltage bias. During initial phases of the project, we noticed that using the backwards inserted pore resulted in more and longer bmRT-RNA translocation events, so for this study only the backwards inserted MspA was used. [041] Fig. 6A-B. 
The longest ion current traces obtained using Eubacterium rectale RT (A, left panel) and Bombyx mori RT (B, left panel) with RNA1, shown with different time scales (X axis). A and B, right panel: Alignment of the ion current traces to the consensus ion current sequence for RNA1 (construction of the consensus ion current is described in Supplementary Note 1). The consensus ion current sequence of RNA1 is shown in blue, and ion current levels from A and B are shown in orange. Alignment of the ion current levels to the consensus ion current sequence is achieved by methods developed by the Gundlach lab8. The length of the individual read in (A, left panel) and (B, left panel) is shown in orange traces and numbers. [042] Fig. 7. Examples of the end of RNA translocation events. Events consistently end after the final tall peak for this particular template, and the current measured upon no further RNA translocation is consistently about 45 pA. RNA1 was used to generate this data.
[043] Fig. 8A-E. Construction of RNA ion current consensus sequence. A. Ion current vs. RNA sequence. (Black) Prediction based on 5′ feeding forwards pore DNA quadromer map. (Red) Hand-built RNA consensus. B. Consensus generation. (Blue) Prediction based on dimer map extracted from the hand-built RNA consensus. (Green) Consensus generated from simulated annealing algorithm. C. Mutual information between RNA sequence and ion current states (red). The black line is the null-hypothesis. D. Mutual information between DNA sequence for 5′ feeding forwards pore and ion current states (red). E. Broccoli sequence consensus generation. (Blue) Prediction based on a hybrid dimer/quadromer model. (Green) Consensus generated from simulated annealing algorithm. [044] Fig. 9. The overlay of ion current predicted by the quadromer map and actual ion current patterns collected. RNA2 was used to generate this data. [045] Fig. 10A-D. Stopping of RNA translocation in the MspA nanopore as bmRT reaches the 5’ end of the template. A: In the raw traces acquired (an example from RNA1 is shown), we observed that bmRT fluctuates between two states when it reaches the end of the template; the origin of this fluctuation is unclear. B: Nucleotide position where bmRT stops (we used the first of the two states bmRT fluctuates between at the stop position) at the end of RNA1. The bmRT was observed to stop at positions ranging from nt 359 to nt 364 in the nanopore, with stopping at nt 360 being the major product. bmRT extends its cDNA product by synthesis of a variable length of 3’ overhang21, and this non-templated addition (NTA) of zero to five nt could allow additional ssRNA entry to the nanopore. C. Based on observations in A and B, we hypothesize that the most frequent stop position (nt 360) is where the bmRT does not perform any NTA, and the distance between the constriction site of the nanopore and the catalytic site of the bmRT is 17 nt. D. 
An illustration of bmRT and RNA1 position at the end of transcription, with the first base in the quadromer in the constriction site and the catalytic site of the enzyme labeled. [046] Fig. 11. Dwell time distribution of the RT along an RNA template with two barriers. The enzyme exhibits long dwell times when it encounters stable dsRNA. [047] Fig. 12. Average dwell time of the RT along RNA3 that is either hybridized with a RNA oligo barrier that contains no 5’ ssRNA overhang, or a RNA oligo that contains a polyA 5’ ssRNA overhang. Error bars are 95% confidence interval. These two oligos hybridize at the same site as the 32 nt RNA oligo used in Figure 2C with a shorter hybridization region (15-17 nt). We observed that at nt # 13, the dwell time of bmRT is similar between hybridization with the long and short blocker, indicating that the presence of ssRNA overhang has no significant impact on the strand displacement activity of bmRT. [048] Fig. 13A-D. Modeling of the bmRT kinetics. See Supplementary Note 2. A. Measured bmRT dwell times in the absence (blue) and presence (red) of encountered double-stranded
barriers for (from left to right) the annealed oligonucleotide, the hairpin, the broccoli RNA, and long and short blocker oligonucleotides, with the barrier region in each case demarcated by vertical dashed lines. B. The overall free energy barrier (black) and the free energy contribution of individual base pairs (red, right axis) for the segments in panel A, calculated using the nearest neighbor parameters22. For the hairpin RNA, an energy of –3.45 kcal/mol for the hairpin tetraloop29 is applied at its first nucleotide. For the broccoli RNA, after the initial canonical duplex (CGCCUC), an energy of –3 kcal/mol is used for each of the next five nucleotides (GGGUG). This segment corresponds to nucleotides stacked on the duplex, part of the bottom mixed tetrad, and the first G of the lower G quadruplex29. (A good fit in panel D is not sensitive to the exact energy value used for this segment, which can be between –2.5 and –3.5 kcal/mol per nucleotide, and the length of this segment can be four to six nucleotides. A value of –1 kcal/mol is assigned to each of the remaining nucleotides, but any value between 0 and –1.5 kcal/mol per nucleotide would produce a good fit.) C. Best fit (thicker brown curve) to observed dwell times in the presence of barriers assuming an enzyme without downstream interactions, for which the translocation rate depends only on the probability that the immediate base pair is unpaired. Under this scheme, the rates mirror the base-pair stabilities shown in panel B, and no combination of parameters kss and ∆Gd can qualitatively reproduce the measured rates (dashed curves replicated from panel A). D. Best fit (thicker brown curve) to observed dwell times in the presence of barriers assuming an enzyme with downstream interactions, for which translocation rate depends not only on the probability that the immediate nucleotide is unpaired, but also on the probability that the downstream segment is unpaired. 
The major and minor points of slowdown due to the barrier in each case are properly reproduced by this model. These fits are obtained when the proximal helicase active site of bmRT is considered to be at position –3. The five data sets (three annealed oligonucleotides, the hairpin, and the broccoli RNA) were fit independently but yielded similar values for parameters kss, m, and ∆Gd. [049] Fig. 14A-C. Investigation of sites on RNA3 that have high dsRNA % and either fast or slow bmRT translocation rate. A (top panel). The average dwell time of bmRT on a segment of RNA3, containing the 5’ terminal hairpin (starts at nt # 136). A (middle and bottom panel). The % of dsRNA in the –3 to –13 position downstream of the enzyme (positions explained in Figure 3B). In A, two regions of interest that have high dsRNA % are highlighted in grey. B. Plotting dwell time against % of dsRNA in the next downstream –3 to –13 nt. The black box indicates that steps with longer dwell times are only observed when the region has a high probability of being double-stranded. The grey box highlights sites that have high dsRNA % but fast translocation rate. The dots are color coded to show GC content. C. mFold predicted structures of RNA3 in region 1 and 2. In region 1, sites 67 to 69 have high dsRNA % ahead but
have fast translocation rate. From the structure we can see that these sites are in an internal loop of a hairpin, and the hairpin should be unfolded when bmRT arrives at these sites, so the opened hairpin no longer poses a barrier to translocation. In region 2, sites 135 and 136 are right after the initial invasion of bmRT into the terminal hairpin, and based on our observation in Figure 2B and our helicase model (Figure 3), the hairpin no longer slows down the enzyme after the initial invasion. [050] Fig. 15A-C. bmRT kinetics on bare RNA3. A. The average dwell time (red) of bmRT on a segment of RNA3, containing the 5’ terminal hairpin. The minimum rate is indicated by a straight blue line. B. Total base pairing energy (black curve, left axis) and individual base pairing energy (dashed red curve, right axis) as in Figure 13A-D, assuming full complementarity. Calculated per-nucleotide total base pairing probability in the thermodynamic ensemble (extracted from probability matrix depicted in Fig. 18) is shown as dotted lines for interactions with the 5’ side (P5’, magenta) or 3’ side (P3’, blue). The solid red line is the probability-scaled base pairing free energy, defined as ∆Gbp × (w1 + w2·P5’ + w3·P3’), where w1 to w3 are empirical weights and the term inside the parentheses is capped at one. C. Model prediction for the translocation profile using the probability-scaled base pairing free energy given in panel B, with the parameter values indicated. Several points of slowdown in the data are reproduced by the model; the overall correlation coefficient between model and data for this segment is 0.58. [051] Fig. 16. Binding of Broccoli RNA aptamer to ligand BI. Top panel: image of ligand titration in test tube. Fluorescence intensity increases until saturation as the ratio of ligand increases from 0 to 10-fold. Bottom panel: quantification of fluorescence intensity by ImageJ. 
Fluorescence intensity reaches saturation when ligand is in 5-fold excess. Measurement was repeated three times and plotted with error bars. [052] Fig. 17. Predicted bmRT helicase power as a function of average base pair stability in the downstream region (of length m). The output power curves for three values of m are shown. [053] Fig. 18. Structure dot plot showing possible base pairs in RNA3 predicted by mFold. See Supplementary Note 3. [054] Description of Particular Embodiments of the Invention [055] Unless contraindicated or noted otherwise, in these descriptions and throughout this specification, the terms “a” and “an” mean one or more, the term “or” means and/or. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and
scope of the appended claims. All publications, patents, and patent applications cited herein, including citations therein, are hereby incorporated by reference in their entirety for all purposes. [056] We disclose a nanopore-based technique that directly sequences RNA and detects RNA structure from the sequencing translocation kinetics, without the need of prior cDNA conversion or RNA modifications. This advancement was achieved using a processive eukaryotic cellular RT to thread RNA into a MspA nanopore one base at a time, as it synthesizes cDNA (Figure 1A). The MspA nanopore instrument is based on the setup developed previously8. This setup comprises two wells filled with electrolyte solutions that are separated by an insulating lipid bilayer in which a single MspA nanopore has been inserted (Figure 5). When voltage is applied across the membrane, ion current flows through the pore. When biological polymers such as DNA, RNA, or peptides enter the pore, the measured current drops to an extent determined by the sequence of the biopolymer spanning the constriction of the MspA nanopore8–10. [057] In the first application strategy, target RNAs are hybridized to a short DNA primer that is tagged with cholesterol to direct the RNA/DNA complex to the nanopore membrane (Figure 1A). Addition of the RT results in binding of the enzyme to the RNA/DNA junction and consequent initiation of cDNA synthesis. Upon application of a voltage bias across the membrane, the RNA 3’ end of an elongation complex is drawn through a backwards inserted MspA nanopore (Figure 1A , Figure 5) and the RT comes to rest on top of the pore, preventing additional RNA transit through it. The rate of cDNA synthesis by RT dictates the 3’-to-5’ passage of the RNA template chain through the pore in discrete steps until synthesis is complete. 
Using this setup, we established the first MspA nanopore RNA quadromer map that connects each detected ion current with a unique 4-nucleotide sequence spanning the pore, and enables RNA sequencing with the MspA nanopore. [058] The MspA nanopore technique is also a powerful tool to dissect the biophysical function of molecular motors8 as the exact position of the motor protein on its template can be determined, and its single-molecule biophysical parameters (such as dwell time, pauses, backtracking activity) at every single step on the template can be detected and analyzed. We challenged the RT with different RNA structural barriers and quantified its translocation kinetics, which revealed RT’s ability to sense secondary structures ahead of its front boundary. Combining RNA sequencing and monitoring of the single molecule kinetics of the RT, we demonstrated simultaneous RNA sequencing and structure detection without the need of prior conversion to cDNA or chemical modifications of RNA. [059] Establishment of MspA nanopore sequencing of RNA. Nanopore sequencing requires two main components: 1) a processive motor to either pull or feed single-stranded (ss) DNA or RNA through the nanopore in discrete steps, and 2) an a priori knowledge of the ion currents
corresponding to all the possible sequences that partially block the nanopore. In this study, we used the MspA nanopore, which has been previously exploited for DNA9 and peptide10 sequencing. A major challenge to our goal was to identify a motor protein that could translocate the RNA template through the MspA nanopore in a processive and controlled manner. We initially tested two classes of enzymes: RNA helicase and RT. No readily available enzymes tested in either category, including the NS3 helicase from Hepatitis C virus (HCV)11 and retroviral RT12 , could retain and processively thread ssRNA through the nanopore under the necessary cross-membrane voltage bias. With the DNA primer and ssRNA capture strategy of our experimental design, we determined that a bacterial self-splicing intron RT from Eubacterium rectale13 was able to thread ssRNA through the nanopore, but the processivity of the enzyme under nanopore translocation conditions was not optimal and resulted in short RNA translocation events (Figure 6A, left panel). In comparison, we found that a truncated and modified form of a retroelement RT from Bombyx mori14 had the necessary processivity under nanopore experimental conditions and consistently generated long enough RNA translocation events (Figure 6B, left panel). Upon further analysis, as described in the following section, we determined that the B. mori RT was able to generate read lengths up to 400 nt on a 550 nt RNA template (Figure 6B, right panel) while the RT from E. rectale was only able to generate reads up to 60 nt on the same RNA template (Figure 6A, right panel). Therefore, all further experiments were conducted with the RT from B. mori, referred to as bmRT below. RNA sequences used in this study are summarized in Supplementary Table 1. [060] Construction and Validation of the MspA RNA Quadromer Map. 
A nanopore sequencing approach requires a library of currents corresponding to all the possible sequences that can be found inside the nanopore. For the MspA nanopore, this correlation is referred to as the “quadromer map”8 since the ion current is determined by the 4 nt that span the constriction site of the pore. Because RNA sequencing with the MspA nanopore had not yet been achieved, we first proceeded to obtain the RNA quadromer map for the MspA nanopore. Comparing RNA translocation events we collected with the first RNA template (RNA1) (Supplementary Table 1), we noticed that RNA sequencing traces consistently ended with a very similar signature followed by RNA signal stillness in the pore (Figure 7), which coincides with bulk biochemistry observations that bmRT does not readily dissociate from its RNA template upon completion of cDNA synthesis14. Therefore, we could assume that the ion current signals obtained close to the end of a translocation event originate from sequences close to the 5’ end of the RNA. Based on this observation, we performed bmRT-directed nanopore translocation of RNAs with and without two eight-nucleotide polyadenosine (polyA) tract insertions that are about 80 nt apart near RNA1’s 5’ end (RNA1 and RNA1_PolyA, Figure 1B, Supplementary Table 1). Because
polyA generates a signature of high ion current9, we could roughly assign ion currents to the 80 nt RNA region flanked by the two polyA regions (Figure 1B, top and middle panels), and after removing instrument noise or erratic enzyme behavior, we constructed the corresponding sequence of consensus nanopore ion currents (Figure 1B, bottom panel, shows the consensus ion currents of part of this region). Next, having the precedent that for DNA sequencing using the MspA nanopore the sequence “TT” often correlates with a local ion current minimum9, we were able to match the consensus ion currents to the known sequence of RNA1 with single nucleotide accuracy (Figure 1B bottom panel, Supplementary Note 1). This analysis allowed us to generate the RNA quadromer map (Supplementary Table 2), whose information content is comparable to the published DNA quadromer map data8 (Supplementary Note 1). Comparison of the ion currents predicted by the resulting RNA quadromer map with those from existing DNA quadromer maps (Figure 8A) revealed significant differences between the two, highlighting the importance of a newly derived RNA quadromer map for reliable RNA sequencing. A representative segment of consensus ion current related to sequence is shown in Figure 1B (bottom panel). To further verify the quadromer map, we used it to predict the ion current pattern for a different RNA sequence (RNA2, Supplementary Table 1). The predicted ion currents matched well with the experimentally determined ones (Figure 9). The quality of the match is similar to that of previously reported MspA nanopore sequencing of ssDNA9 (Supplementary Note 1), demonstrating the technique’s suitability for RNA sequencing. Importantly, the observation that each nucleotide in the RNA template can be assigned to a single step in the consensus ion current series confirms that the bmRT takes single-nucleotide steps on its RNA template and sequentially releases a single nt of RNA at a time to enter the nanopore. 
[061] We note that the nanopore reports on the RNA sequence partially blocking the current through the constriction site of the pore. In order to relate the dwell times of the RT with the presence of RNA structures in front of the enzyme (next sections), we need to know the exact location of the enzyme on the RNA template when a particular sequence is in the pore. In other words, we need to establish the offset between the constriction site of the pore and the catalytic site of bmRT. To this end, we exploited a particular feature of this enzyme when it reaches the 5’ end of RNA template: it extends its cDNA product via non-templated addition generating up to five nt of 3’ overhang14,15. Based on the range of positions at which the enzyme stops threading RNA into the nanopore, we estimated that the distance between the enzyme’s catalytic site and the constriction site of the nanopore is 17 nt (Figure 10A-D). This offset allowed us to define the position of the bmRT catalytic site in nanopore sequencing ion-current traces.
[062] Detection of RNA structure via nanopore sequencing kinetics. Stable RNA secondary structures have been shown to affect the kinetic rates of molecular motors such as the ribosome16, RNA helicases17, and retroviral RTs18,19. We aimed to challenge the bmRT with RNA structures during nanopore sequencing and characterize the changes in kinetic behavior of bmRT as it encounters these barriers. To extract kinetic information corresponding to the RNA sequence, we determined the average dwell time before each RT step on the RNA template. This procedure involved pooling data obtained for that step from multiple sequencing traces of the same sequence and fitting them to a single exponential function (Figure 2A). As shown in Figure 2A, the dwell time distribution of bmRT can be described by a single exponential function, which suggests that bmRT has a single dominant rate-limiting step between successive translocation steps. [063] To examine the effect of RNA secondary structures on bmRT dwell times, we designed an RNA template that contains two repeats of the same sequence (RNA3 in Supplementary Table 1) in which one repeat is partially base paired to form the stem of a stable RNA hairpin and the other is not (Figure 2B top panel). The RT kinetic profiles derived from the average dwell times of the individual steps of the enzyme obtained for the repeats in the presence and absence of the hairpin structural barrier were compared (Figure 2B bottom panel). This analysis revealed a major pause when the catalytic site of the enzyme is 2 nt away from the start of the RNA hairpin. This pausing indicates that the hairpin duplex represents a barrier that slows bmRT translocation along the RNA template, and makes it possible to use this enzyme to detect RNA structures concurrently with sequencing. 
Interestingly, the two kinetic profiles were indistinguishable in the remainder of the hairpin sequence, suggesting that the invasion by the enzyme of the hairpin is sufficient to greatly destabilize it. [064] As a second test, we hybridized an RNA oligonucleotide to a region of the same RNA3 template sequence to create a double-stranded (ds) barrier for the enzyme (Figure 2C top panel). RT kinetic profiles obtained in the presence and absence of the hybridized oligonucleotide showed pauses in translocation at distinct positions within the dsRNA region (Figure 2C bottom panel, and Figure 11). As in the case of the hairpin, dwell times with and without the dsRNA barrier remained similar for most translocation steps. To rule out the possibility that direct contact with the 5’ phosphate of the RNA may have caused this pause, we designed a pair of RNA oligonucleotides that hybridize to RNA3, one with a 5’ ssRNA polyA overhang and another without it. We found that the kinetic profile of bmRT was similar for both oligos (Figure 12), suggesting that the pause we observed before the terminal RNA hairpin (Figure 2B) is most likely due to the presence of a stable RNA secondary structure and not the presence of a 5’ phosphate.
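The dwell-time analysis described for Figure 2A (pooling the dwell times observed at each template position across traces, then describing them with a single exponential) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the maximum-likelihood estimate for an exponential distribution (rate = 1 / mean dwell) in place of a curve fit, and the function names and the trace layout (a dict from template position to dwell time in seconds) are hypothetical.

```python
import statistics


def fit_single_exponential(dwell_times):
    """Return (rate, mean_dwell) for dwell times assumed to be exponentially distributed.

    For an exponential distribution the maximum-likelihood rate is 1 / mean.
    This stands in for the curve fit mentioned in the text (an assumption).
    """
    mean_dwell = statistics.fmean(dwell_times)
    return 1.0 / mean_dwell, mean_dwell


def pooled_dwell_profile(traces):
    """Pool dwell times per template position across traces.

    traces: list of dicts mapping template position -> dwell time in seconds
            (hypothetical layout chosen for this sketch).
    Returns a dict mapping position -> average dwell time, ordered by position.
    """
    pooled = {}
    for trace in traces:
        for position, dwell in trace.items():
            pooled.setdefault(position, []).append(dwell)
    return {
        position: fit_single_exponential(values)[1]
        for position, values in sorted(pooled.items())
    }


if __name__ == "__main__":
    example_traces = [{1: 0.1, 2: 0.4}, {1: 0.3, 2: 0.6}]
    print(pooled_dwell_profile(example_traces))
```

Comparing two such profiles, one with and one without a structural barrier, position by position is what exposes a pause site.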
[065] Modeling of the impact of RNA structures on bmRT translocation kinetics.
Surprisingly, bmRT dwell times at most positions do not appear to be changed by the presence of the secondary structures in the RNA either in the form of a hairpin or in the form of a duplex. Rather, the enzyme seems to pause at certain particular positions in this region and be unaffected in the regions of secondary structure that surround them. This behavior suggests that bmRT functions as an active helicase capable of destabilizing RNA structures20. To explain why the enzyme slows down at certain specific locations within the secondary structures, we constructed an active helicase model to quantitatively describe the kinetic profile of bmRT as a function of barrier stability.
[066] As a general description, the translocation cycle of bmRT consists of a residence phase during which events such as dNTP binding and catalysis occur, followed by a stepping phase in which the motor attempts to move along its track. The overall observed dwell time at each position would equal $k_{\mathrm{resid}}^{-1} + k_{\mathrm{step}}^{-1}$, where $k_{\mathrm{resid}}$ and $k_{\mathrm{step}}$ are the rate of completing a residence and the rate of stepping of the enzyme, respectively. As we show in Supplementary Note 2, the observed dwell times in the presence of barriers can be well explained if the stepping rate depends not only on the base pairing stability of the nucleotide that is stepped over (at the helicase site of the enzyme), but also on the stability of several downstream nucleotides:

$$k_{\mathrm{step}} \sim P_u \, P_{mu} \, k_{ss} \qquad (1)$$

where $P_u$ is the probability that the stepped-over nucleotide is in its unpaired state, $P_{mu}$ is the probability that the following downstream segment of length $m$ is in its unpaired state, and $k_{ss}$ is the stepping rate over single-stranded RNA (in the absence of barrier). $P_u$ is a function of the Gibbs free energy difference between the unpaired and paired states of the nucleotide:
where $\beta$ is $(k_B T)^{-1}$, $\Delta G_{bp}$ is the free energy of base pairing for the nucleotide, and $\Delta G_a$ is the destabilization energy due to the helicase. A large negative value of $\Delta G_a$ would represent a more “active” helicase20. Similarly,
where m is the length of the downstream segment following the stepped-over nucleotide, is the total base-pairing free energy of the downstream segment, and ΔGa is the
same as above (per nucleotide).
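For orientation, one two-state Boltzmann form that is consistent with the definitions given for Eqs. 2 and 3 is sketched below. The exact functional form, and the symbol $\Delta G_{bp}^{(m)}$ used here for the total base-pairing free energy of the downstream segment, are assumptions made for illustration and are not taken from the source.

```latex
\begin{align}
P_u    &= \frac{1}{1 + \exp\!\left[-\beta\left(\Delta G_{bp} - \Delta G_a\right)\right]} \\
P_{mu} &= \frac{1}{1 + \exp\!\left[-\beta\left(\Delta G_{bp}^{(m)} - m\,\Delta G_a\right)\right]}
\end{align}
```

Under this assumed form, with both free energies negative, $P_u$ approaches 1 for an unpaired nucleotide ($\Delta G_{bp} = 0$) and drops below 1/2 once the base pair is more stable than the helicase destabilization energy, which matches the threshold behavior described in the text.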
[067] Knowing the sequence of the dsRNA barrier, the value of $\Delta G_{bp}$ at each nucleotide position can be estimated precisely using the nearest neighbor rules22 (as the difference in the ΔG of the barrier before and after opening of the given nucleotide). Additionally, $k_{\mathrm{resid}}$ can be determined from the observed translocation rates in the absence of the barrier. This leaves $k_{ss}$, $\Delta G_a$, and $m$ as the only free parameters in the model. After fitting these parameters, dwell times predicted using Eqs. 1 to 3 are in excellent agreement with the kinetic profiles obtained in the presence of different barriers, with the major and minor points of slowdown properly reproduced (Figure 3A and Figure 13D). We fit the model independently to five data sets, and the parameters converged to similar values in all cases: $k_{ss} \approx 200\ \mathrm{s}^{-1}$, $\Delta G_a \approx -2.6$ kcal/mol, $m \approx 10$-11 (Figure 13D). Furthermore, the best fit is obtained if the bmRT catalytic site nucleotide (-1 position) and the next nucleotide (-2) are assumed to be always unpaired, indicating that the helicase site of bmRT is at position -3 (Figure 3B). Indeed, structure prediction based on homology modeling of bmRT suggests that position -2 cannot accommodate dsRNA15, in agreement with this assumption.
[068] Our model suggests that bmRT interacts with 11-12 nt of the downstream template (including the stepped-over nucleotide itself, Figure 3B). The formation of a stable complex between bmRT and its RNA template prior to target-priming and cDNA synthesis in the cell14,21 suggests the existence of an extensive bmRT-RNA binding interface. Significantly, previous single molecule optical tweezers studies on the kinetics of two other RNA motors, the NS3 helicase from HCV17 and the RT from the murine leukemia virus19, have revealed that these motors can also sense and slow down in response to RNA secondary structures 6 to 8 nt downstream of the enzymes, suggesting that downstream sensing of structured regions in RNA is not uncommon in RNA helicases. A downstream sensing range of at least 3 nt was similarly inferred from the helicase kinetics and structural analysis of the bacterial ribosome16,22,23. Downstream sensing could be mediated by direct interaction of the helicase with the RNA to destabilize its folding17,19,23, or by a mechanism in which the kinetic stability of the junction arises from a long-range allosteric coupling through the double helix24.
[069] Using the parameters obtained from bmRT kinetics, we deduced the motor’s characteristic curve for overall translocation speed as a function of $\Delta G_{bp}$ (red curve in Figure 3C). The sigmoidal shape of the curve becomes sharper for larger values of $m$, i.e. for motors displaying longer ranges of downstream RNA structure sensing (Figure 3C). Due to this shape, translocation is barely affected by isolated base pairs, and slows down significantly only if the average stability of the entire downstream segment exceeds $\Delta G_a$ in magnitude (-2.6 kcal/mol per nucleotide in the case of bmRT).
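The shape of such a characteristic curve can be explored numerically. The sketch below combines Eq. 1 with the stated dwell time (the sum of the inverse residence and stepping rates) and uses the fitted values quoted in the text (stepping rate over single-stranded RNA of about 200 per second, helicase destabilization energy of about -2.6 kcal/mol per nucleotide). Several parts are assumptions and not from the source: the two-state Boltzmann form of the unpaired probabilities, the simplification that every downstream nucleotide has the same base-pairing free energy, the residence rate `k_resid` (which the text does not report), and the use of 36 °C for the thermal energy.

```python
import math

# Thermal energy in kcal/mol at 36 degrees C (temperature assumed from the
# nanopore experiment conditions; gas constant in kcal/(mol*K)).
KT_KCAL_PER_MOL = 0.0019872 * (273.15 + 36.0)


def unpaired_probability(dg_pairing, dg_helicase, kt=KT_KCAL_PER_MOL):
    """Assumed two-state form: 1 / (1 + exp(-(dG_pairing - dG_helicase) / kT))."""
    return 1.0 / (1.0 + math.exp(-(dg_pairing - dg_helicase) / kt))


def translocation_rate(mean_dg_bp, m, k_resid, k_ss=200.0, dg_a=-2.6):
    """Overall rate = 1 / (1/k_resid + 1/k_step), with k_step = P_u * P_mu * k_ss.

    mean_dg_bp: base-pairing free energy per nucleotide (kcal/mol, <= 0),
                taken to be uniform over the downstream segment (assumption).
    m:          length of the downstream sensing segment.
    k_resid:    residence-phase rate in 1/s (hypothetical input).
    """
    p_u = unpaired_probability(mean_dg_bp, dg_a)
    p_mu = unpaired_probability(m * mean_dg_bp, m * dg_a)
    k_step = p_u * p_mu * k_ss
    return 1.0 / (1.0 / k_resid + 1.0 / k_step)


if __name__ == "__main__":
    for m in (1, 5, 10):
        curve = [translocation_rate(dg, m, k_resid=50.0) for dg in (0.0, -2.0, -2.6, -3.0)]
        print(m, [round(rate, 3) for rate in curve])
```

Under these assumptions the rate stays near its barrier-free value for weak pairing and falls steeply once the per-nucleotide stability passes the destabilization energy, more abruptly for larger `m`.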
[070] Since RNA is known to spontaneously form secondary structures even at short polymer lengths25, we also quantified the average dwell time at each nt on the RNA3 template alone (without the hybridizing RNA oligonucleotide) and analyzed the correlation between dwell times and the presence of dsRNA predicted by mfold26 (Supplementary Note 3). As expected, longer dwell times are only observed in front of downstream RNA regions that have a high probability of being double-stranded, most of which have high GC content (Figure 14A-C).
Indeed, by incorporating the predicted base pairing probabilities into our active helicase model, we can qualitatively reproduce the observed pattern of dwell times with an overall correlation coefficient of ~0.6 (Figure 15A-C). [071] Detection of RNA aptamer-ligand complex formation. Finally, we explored the possibility to utilize our experimental assay to detect the binding of an RNA aptamer to its ligand that stabilizes its tertiary structure. We designed an RNA template that contains a single Broccoli RNA aptamer27 at its 5’ end (RNA3_Broccoli in Supplementary Table 1). This aptamer has two G-quadruplexes (GQ) and can bind the fluorescent ligand BI, which stabilizes the folding of the aptamer RNA27,28. The Broccoli GQs are preceded by a short RNA duplex (Figure 4A) which was shown to be important in folding of the aptamer based on sequence truncation experiments27. [072] We first showed that the aptamer binds to BI under our experimental conditions (Figure 16). Using our assay, we then compared the single-molecule kinetic profiles of bmRT on Broccoli RNA with and without the presence of BI (Figure 4B). Results indicate that binding to BI and stabilization of the Broccoli RNA structure led to a significant pause of the bmRT when the helicase site of the enzyme is still 1 nt away from the start of the Broccoli RNA duplex. Based on our active helicase model, at this position the downstream sensing range (Figure 3B) covers the Broccoli duplex and its continuous stack of nucleotides up to the first GQ, and a slowdown is indeed expected (Figure 13A-D, middle panel). Broccoli mutation experiments show that replacement of either of the Gs in the GQs for another base results in a significant loss of BI fluorescence, and that GQ formation is critical to the formation of stable Broccoli RNA structure28. 
This observation, in combination with our single-molecule kinetics data, indicate that both the short RNA duplex that precedes the GQ and the GQ in BI-bound Broccoli RNA represent stable barriers that slow down bmRT, and that the nanopore-based RNA sequencing approach described here can be used to identify stabilized secondary structures such as those of ligand-bound RNA aptamers. [073] We disclose for the first time an RNA nanopore sequencing method that provides both sequence and structure information simultaneously. To this end, we established an assay to follow the translocation of RNA through the MspA nanopore with single nucleotide resolution, gated by an engineered eukaryotic RT, and we generated the RNA quadromer map that allows us to reliably assign ion current signals to RNA sequences. We also showed that kinetics of individual bmRT enzymes can reveal, with single-nucleotide resolution, how translocation rate varies with the sequence-dependent stability of encountered structural barriers. We have found that the pausing of the bmRT indicates when a particularly extensive, stable secondary structure is encountered by the enzyme. Our results indicate that bmRT functions as a processive helicase
that actively unfolds incoming dsRNA and senses secondary structures 11-12 bp downstream of its front boundary. In addition, we showed that slow-down in kinetic rates can be used to detect the presence of RNA aptamer-ligand complexes. [074] Our method enables interrogating RNA sequence and structure directly without the need of prior RNA modifications. Furthermore, our technique directly provides biophysical information on how RNA structure barriers impact the biophysical behavior of a eukaryotic RNA molecular motor protein. Finally, the quadromer map for RNA in the MspA nanopore enables the MspA nanopore tweezers to investigate the activity and dynamics of other processive RNA translocases such as the ribosome or synthetases such as RNA polymerases, and do so with single nucleotide resolution and in a sequence dependent manner. [075] RNA template preparation. RNA template sequences were ordered as dsDNA gBlocks that contain the T7 promoter from Integrated DNA Technologies (IDT) and inserted into a linearized pRZ plasmid using the infusion cloning kit (Thermo Fisher) and transformed into Sure2 cells (Agilent) following manufacturer’s instructions. Positive colonies were screened with Sanger sequencing. PCR was used to amplify templates for in vitro transcription using the MEGAscript kit following manufacturer’s instructions (Thermo Fisher). The RNA templates were purified with the MEGAclear kit (Thermo Fisher), and concentration determined with a Nanodrop spectrophotometer (Thermo Fisher). RNA oligonucleotide and DNA primer with 5’ cholesterol modification were ordered from IDT. The RNA template was mixed with DNA primer (and when relevant a 10-fold excess of RNA oligonucleotide for dsRNA barrier experiments) to a final concentration of 0.8 µM and 2 µM respectively, in buffer containing 20 mM Tris pH 8.0 and 20 mM NaCl and heated to 75°C for 90 seconds and immediately placed on ice until further use. [076] Preparation of RNA motor enzymes. E. 
rectale RT and N-terminally truncated B. mori RT were expressed and purified as described previously16. In short, the open reading frame of the enzymes was codon optimized and ordered from GenScript, and inserted with an N-terminal maltose binding protein tag into the MacroLab vector 2bct that contains a C-terminal 6xHis tag (https://qb3.berkeley.edu/facility/qb3-macrolab/). The enzymes were expressed in Rosetta2(DE3)pLysS cells in 2xYT medium and induced with isopropylthio-β-galactoside. Cells were lysed by sonication on ice and a three-step purification process (nickel-agarose column, heparin-Sepharose column, HiPrep 16/60 Sephacryl S-200HR size exclusion column) was used to purify the enzymes. The purified enzymes were stored at -80°C in 25 mM HEPES pH 7.4, 800 mM KCl, 10% glycerol, and 1 mM DTT. Working stocks were stored at -20°C after RT dilution to a final concentration of 20 μM in 25 mM HEPES pH 7.5, 800 mM KCl, and 50% glycerol.
[077] MspA nanopore instrumentation. The MspA nanopore instrument is a custom-built instrument based on the design from the Gundlach lab8. In more detail, two wells of about 120 μl in volume were drilled into a Teflon block and the two wells were connected with Teflon tubing. One end of the tube was heat-shrunk and a small hole (about 20 μm in diameter) was created using a fine surgical needle. Electrodes were prepared by inserting an Ag/AgCl pellet in heat shrink tubing. The Teflon block was mounted onto a custom-made aluminum block. Under the aluminum block is a Peltier that is connected to a temperature control unit (TED200C, Thorlabs). An Axopatch 200b (Molecular Devices) was connected to the electrodes and used to apply voltage and measure ion current. The Axopatch 200b is connected to a PC using a National Instruments data acquisition card (DAQ) and controlled with a custom LabVIEW code. The well that contains the 20 μm hole is referred to as the cis well, and is where all the biochemical components are introduced during sequencing data acquisition. The other well is referred to as the trans well. [078] Nanopore Experiments. The two wells and tubing were first filled with standard experiment buffer (40 mM HEPES pH 7.5, 400 mM KCl). 180 mV was applied to the system. Dry lipid (4ME 16:0 DIETHER PC 10MG, Avanti polar lipids) was mixed with hexadecane (Sigma-Aldrich) until the consistency resembled that of glue, followed by application of the lipid-hexadecane mixture to the tip of the Teflon tubing in the cis well. A lipid bilayer was generated by introducing an air bubble via a pipette to the surface of the tubing. Afterwards, MspA protein (the M2-NNN MspA mutant8) was added to the well to a final concentration of about 0.02 μg/ml. 
After successful insertion of a single backwards pore, we reduced the system’s voltage to 140 mV and buffer-exchanged the cis well to RT experiment buffer (40 mM HEPES pH 7.5, 320 mM KCl, 3 mM MgCl2, 5 mM DTT, 24 μM dNTP), heated up the system to 36 °C, and added the RNA/DNA primer complex to the well to a final concentration of about 15 nM RNA. Afterwards, we added the RT to a final concentration of about 150 nM and started data acquisition. [079] Nanopore Broccoli ligand binding experiment. Broccoli RNA template sequences were ordered as dsDNA gBlock as above and inserted into a linearized pRZ plasmid using infusion cloning kit (Takara Bio) and transformed into Stellar cells following manufacturer’s instructions. Positive colonies were screened with Sanger sequencing. PCR was used to amplify templates for in vitro transcription with T7 RNA polymerase (NEB). The RNA product was extracted with phenol and concentration was measured by Nanodrop spectrophotometer (Thermo Fisher). Ligand for Broccoli RNA aptamer BI (LuceRNA) was prepared in 50mM DMSO and further diluted in water. Binding of the ligand to the RNA template was tested by varying the ratio of ligand to RNA in buffer containing 20 mM Tris pH 8.0 and 20 mM NaCl
and heated to 75°C for 90 seconds and immediately placed on ice. The fluorescence intensity was quantified using ImageJ. In the nanopore experiment using BI ligand, 1:15 ratio of RNA to ligand was used. [080] Data Processing. The data processing pipeline is based on methods described previously8. In short: raw data (collected at 50 kHz) was down sampled to 2 kHz, and RNA translocation events were identified by using a custom GUI written in MATLAB. A point of change algorithm8 was used to identify steps within a continuous series of RNA translocation events. The steps identified and their corresponding dwell times were then used for additional data processing as described in the main article. [081] References 1. Yao, R.-W., Wang, Y. & Chen, L.-L. Cellular functions of long noncoding RNAs. Nat. Cell Biol.21, 542–551 (2019). 2. Batista, P. J. & Chang, H. Y. Long noncoding RNAs: cellular address codes in development and disease. Cell 152, 1298–1307 (2013). 3. Mortimer, S. A., Kidwell, M. A. & Doudna, J. A. Insights into RNA structure and function from genome-wide studies. Nat Rev Genet 15, 469–479 (2014). 4. Loughrey, D., Watters, K. E., Settle, A. H. & Lucks, J. B. SHAPE-Seq 2.0: systematic optimization and extension of high-throughput chemical probing of RNA secondary structure with next generation sequencing. Nucleic Acids Res 42, (2014). 5. Umeyama, T. & Ito, T. DMS-seq for In Vivo Genome-Wide Mapping of Protein-DNA Interactions and Nucleosome Centers. Curr Protoc Mol Biol 123, e60 (2018). 6. Parker, M. T. et al. Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification. Elife 9, e49658 (2020). 7. Stephenson, W. et al. Direct detection of RNA modifications and structure using single- molecule nanopore sequencing. Cell Genom 2, 100097 (2022). 8. Laszlo, A. H., Derrington, I. M. & Gundlach, J. H. MspA nanopore as a single-molecule tool: From sequencing to SPRNT. Methods 105, 75–89 (2016). 9. Laszlo, A. H. et al. 
Decoding long nanopore sequencing reads of natural DNA. Nat Biotechnol 32, 829–833 (2014). 10. Brinkerhoff, H., Kang, A. S. W., Liu, J., Aksimentiev, A. & Dekker, C. Multiple rereads of single proteins at single-amino acid resolution using nanopores. Science 374, 1509–1513 (2021). 11. Cheng, W., Arunajadai, S. G., Moffitt, J. R., Tinoco, I. & Bustamante, C. Single-base pair unwinding and asynchronous RNA release by the hepatitis C virus NS3 helicase. Science 333, 1746–1749 (2011). 12. Herschhorn, A. & Hizi, A. Retroviral reverse transcriptases. Cell Mol Life Sci 67, 2717– 2747 (2010).
13. Zhao, C., Liu, F. & Pyle, A. M. An ultraprocessive, accurate reverse transcriptase encoded by a metazoan group II intron. RNA 24, 183–195 (2018). 14. Upton, H. E. et al. Low-bias ncRNA libraries using ordered two-template relay: Serial template jumping by a modified retroelement reverse transcriptase. Proc Natl Acad Sci U S A 118, e2107900118 (2021). 15. Pimentel, S. C., Upton, H. E. & Collins, K. Separable structural requirements for cDNA synthesis, nontemplated extension, and template jumping by a non-LTR retroelement reverse transcriptase. J Biol Chem 298, 101624 (2022). 16. Qu, X. et al. The Ribosome Uses Two Active Mechanisms to Unwind mRNA During Translation. Nature 475, 118–121 (2011). 17. Cheng, W., Dumont, S., Tinoco, I. & Bustamante, C. NS3 helicase actively separates RNA strands and senses sequence barriers ahead of the opening fork. Proc Natl Acad Sci U S A 104, 13954–13959 (2007). 18. Vilfan, I. D. et al. Analysis of RNA base modification and structural rearrangement by single-molecule real-time detection of reverse transcription. J Nanobiotechnology 11, 8 (2013). 19. Malik, O., Khamis, H., Rudnizky, S., Marx, A. & Kaplan, A. Pausing kinetics dominates strand-displacement polymerization by reverse transcriptase. Nucleic Acids Res.45, 10190– 10205 (2017). 20. Manosas, M., Xi, X. G., Bensimon, D. & Croquette, V. Active and passive mechanisms of helicases. Nucleic Acids Res 38, 5518–5526 (2010). 21. Eickbush, T. H. & Eickbush, D. G. Integration, Regulation, and Long-Term Stability of R2 Retrotransposons. Microbiol Spectr 3, MDNA3-0011–2014 (2015). 22. Amiri, H. & Noller, H. F. A tandem active site model for the ribosomal helicase. FEBS Lett 593, 1009–1019 (2019). 23. Amiri, H. & Noller, H. F. Structural evidence for product stabilization by the ribosomal mRNA helicase. RNA 25, 364–375 (2019). 24. Kim, S. et al. Probing allostery through DNA. Science 339, 816–819 (2013). 25. Doty, P., Boedtker, H., Fresco, J. R., Haselkorn, R. & Litt, M. 
Secondary structure in ribonucleic acids*. Proceedings of the National Academy of Sciences 45, 482–499 (1959). 26. Zuker, M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31, 3406–3415 (2003). 27. Filonov, G. S., Moon, J. D., Svensen, N. & Jaffrey, S. R. Broccoli: rapid selection of an RNA mimic of green fluorescent protein by fluorescence-based selection and directed evolution. J Am Chem Soc 136, 16299–16308 (2014). 28. Puchta, O. et al. Genotype-phenotype map of an RNA-ligand complex. 2020.12.17.423258 Preprint at https://doi.org/10.1101/2020.12.17.423258 (2020). 29. Vanegas, P. L., Horwitz, T. S. & Znosko, B. M. Effects of non-nearest neighbors on the thermodynamic stability of RNA GNRA hairpin tetraloops. Biochemistry 51, 2192–2198 (2012).
30. Ross, B. C. Mutual Information between Discrete and Continuous Data Sets. PLoS One 9, e87357 (2014).
[082] Supplementary Note 1
[083] Construction of RNA ion current consensus sequence. Our goal was to construct the consensus of ion current states observed with the RT for the RNA sequence listed in Supplementary table 1. Because RNA threaded through the MspA pore in the 3′ backwards pore orientation had not been observed previously, we devised an experiment in which ‘bookends’ of poly-adenine sequence flanked a sequence of interest. PolyA sequences produce large ion currents in MspA sequencing with DNA (ref. 8). These large ion current motifs were indeed observed in the data (Figure 1).
[084] We next hypothesized that a DNA prediction based on the 5′ end of the DNA threaded through the forwards pore could be used to create a rough prediction of the ion current states that would be observed in the 3′ RNA backwards pore map. We then generated a hand-built consensus of ion current states between the two polyA sequences and found that it compared reasonably well with the predicted sequence, enabling us to align our pattern of ion current states to the consensus (Figure 8A).
[085] The hand-built consensus that we generated contained every possible RNA dimer (e.g. AA, AC, AG, AU, etc.). For each dimer we collected the mean ion current from our hand-built consensus to form a ‘dimer map’. We then used this dimer map to predict the ion current states for the entire 538 base RNA sequence. For the RNA sequence between the bookends (corresponding to RNA nucleotides 436 to 480), we used our hand-built consensus rather than the dimer map prediction. We then used an iterative alignment / update procedure and simulated annealing to generate a final consensus. The steps for this procedure are:
1. Generate the predicted ion current sequence P0 as above.
   a. Each state in the prediction is assigned an error T0 = 8 pA. This is the initial temperature for the annealing procedure.
2. For each measured read m, align m to P0 (ref. 9).
3. Generate a new consensus P1.
   a. For each level i ∈ P0, calculate the median μi and standard deviation σi of all measured states that aligned to level i.
   b. Define the new consensus P1 by:
      i. Replace each level i in P0 with μi.
      ii. Update the temperature according to the cooling rate c. We use c = 0.25 pA / iteration, so T1 = T0 − c = 7.75 pA.
      iii. Replace each error in P0 with an updated error that combines the temperature T1 and σi. This ensures a smooth transition between the annealing parameter and the true errors σi.
4. Repeat steps 2 and 3 until the temperature equals 0 and a final consensus Pn is constructed with median current μi and errors σi (32 iterations, in the case above).
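The bookkeeping in steps 1 to 4 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `cooling_schedule` and `update_levels` are hypothetical, the alignment of reads to the consensus (step 2) is assumed to have been done elsewhere, the rule that combines the temperature with σi is not reproduced, and the use of the population standard deviation is an assumption.

```python
from statistics import median, pstdev

def cooling_schedule(t0=8.0, c=0.25):
    """Return the temperatures T0, T1, ..., 0 for a linear cooling rate c (pA per iteration)."""
    n_iterations = round(t0 / c)
    return [t0 - k * c for k in range(n_iterations + 1)]

def update_levels(aligned_states):
    """Step 3a: median and standard deviation of the states aligned to each level.

    aligned_states[i] is a list of measured currents (pA) that aligned to level i.
    pstdev is used (an assumption) so that a level with a single state yields 0.0.
    """
    mu = [median(states) for states in aligned_states]
    sigma = [pstdev(states) for states in aligned_states]
    return mu, sigma

temperatures = cooling_schedule()
print(len(temperatures) - 1)   # 32 iterations from 8 pA down to 0 pA
print(temperatures[1])         # 7.75
```

With T0 = 8 pA and c = 0.25 pA per iteration, the schedule reaches zero after 8 / 0.25 = 32 iterations, which matches the iteration count quoted in step 4.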
[086] To validate this procedure, we compared the mutual information between the RNA sequence and the ion current consensus to the mutual information content in the 5′ forwards pore DNA quadromer map (Figure 5C-D; Ross 2014, ref. 30). The two maps contain comparable information about the DNA/RNA sequence, suggesting that this procedure produced a valid consensus.
[087] Lastly, to generate the consensus for the Broccoli hairpin sequence, we generated a partial RNA quadromer map from the ‘bookends’ RNA sequence. From this partial map we used the following procedure to make a prediction for the Broccoli sequence:
1. If the quadromer in the Broccoli sequence was present in the bookends sequence, we used the measured quadromer from the bookends sequence.
2. If the quadromer in the Broccoli sequence was not present in the bookends sequence, we used the mean value of the central dimer in the quadromer taken over all other quadromers.
3. We then used the simulated annealing algorithm above to create the Broccoli consensus (Figure 4).
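Steps 1 and 2 amount to a lookup with a fallback, which can be sketched as follows. The function names, the toy current values, and the convention that one predicted level corresponds to each 4-nucleotide window are assumptions made for illustration only.

```python
from statistics import mean

def central_dimer_means(quadromer_map):
    """Average the measured level over all quadromers that share a central dimer."""
    grouped = {}
    for quadromer, level in quadromer_map.items():
        grouped.setdefault(quadromer[1:3], []).append(level)
    return {dimer: mean(levels) for dimer, levels in grouped.items()}

def predict_levels(target, quadromer_map):
    """Predict one level per 4-nt window of target.

    Uses the measured quadromer when available (step 1), otherwise the
    mean over measured quadromers with the same central dimer (step 2).
    Raises KeyError if the central dimer was never measured.
    """
    dimer_means = central_dimer_means(quadromer_map)
    predicted = []
    for i in range(len(target) - 3):
        quadromer = target[i:i + 4]
        if quadromer in quadromer_map:
            predicted.append(quadromer_map[quadromer])
        else:
            predicted.append(dimer_means[quadromer[1:3]])
    return predicted

# Toy partial map (hypothetical values, in pA)
partial_map = {"GACU": 40.0, "CACG": 50.0, "AAAA": 60.0}
print(predict_levels("UACG", partial_map))   # [45.0], fallback on central dimer "AC"
```

The resulting prediction would then serve as the starting sequence P0 for the annealing procedure described above.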
[088] Supplementary Note 2
[089] Modeling of bmRT Kinetics. The activity cycle of a motor such as bmRT can be modeled as consisting of a residence phase, during which events such as dNTP binding and catalysis occur, followed by a stepping phase, in which the motor attempts to move along its track. With kresid and kstep as the rates of these two phases, respectively, the overall translocation rate is:

$$k_{\text{transloc}} = \left(k_{\text{resid}}^{-1} + k_{\text{step}}^{-1}\right)^{-1}$$

For a helicase that only interacts with the stepped-over nucleotide (with no further downstream interactions), the stepping rate would depend only on the probability that this nucleotide is unpaired, and we can write:

$$k_{\text{step}} = P_u \, k_{ss}$$

where Pu is the probability that the stepped-over nucleotide is in the unpaired state, and kss is the stepping rate in the absence of a barrier. The probability Pu depends on the Gibbs free energy difference between the unpaired and folded states of the immediate nucleotide, which consists of the free energy of base pairing (ΔGbp, in this case only of the immediate nucleotide) and a destabilization energy due to the helicase (ΔGd):

$$P_u = \left(1 + \exp\left(-\beta\left(\Delta G_{bp} - \Delta G_d\right)\right)\right)^{-1}$$

where β is (kBT)^-1.
[090] Knowing the sequence of the dsRNA barrier, the ΔGbp for each base pair can be estimated using the nearest neighbor parameters (ref. 22) (Figure 13B). For this scheme, the calculated dwell times in the presence of the barrier (Figure 13C) simply mirror the base-pair stabilities, and no combination of parameters kss and ΔGd can qualitatively reproduce the measured rates. However, assuming a motor protein with downstream interactions, the stepping rate would depend not only on the probability that the immediate nucleotide is unpaired, but also on the probability that the downstream segment is unpaired; in the simplest form, we can write:

$$k_{\text{step}} = P_u \, P_{mu} \, k_{ss}$$

where kss and Pu are the same as above, and Pmu is the probability that the downstream segment is in the unpaired state:

$$P_{mu} = \left(1 + \exp\left(-\beta\left(\sum_{j=1}^{m} \Delta G_{bp,j} - m\,\Delta G_d\right)\right)\right)^{-1}$$

where m is the length of the downstream segment following the immediate nucleotide, ΣΔGbp is the sum of the base-pairing free energies of the m nucleotides in this downstream segment, and ΔGd is the same as above (per nucleotide). With this scheme, the major and minor points of slowdown due to the barrier can be properly reproduced by the model (Figure 13D). The best fit is obtained if the bmRT catalytic site nucleotide (-1 position) and the next nucleotide (-2) are always unpaired, indicating that the helicase site of bmRT is at position -3. The five data sets shown in Figure 13A-D (annealed oligonucleotides, hairpin, and broccoli) were fit independently but converged to similar values for the three parameters, kss ≈ 200 s^-1, m ≈ 10 to 11, and ΔGd ≈ −2.6 kcal/mol (Figure 13D), reflecting the robustness of the model fit. The fitted value for parameter m suggests that bmRT interacts with 11-12 nt of the downstream template (including the immediate nucleotide itself).
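A minimal numerical sketch of the downstream-sensing scheme is given below. It assumes that the per-nucleotide ΔGbp values are supplied as a list ordered in the direction of motor travel, that kBT ≈ 0.593 kcal/mol (roughly 25 °C, a temperature not stated in the text), and that m = 10 and ΔGd = −2.6 kcal/mol; the function names and the kresid value in the example are hypothetical.

```python
import math

KT = 0.593  # kcal/mol, assumed thermal energy near 25 C

def p_unpaired(dg_bp_total, dg_d_total, kt=KT):
    """Two-state probability of being unpaired: (1 + exp(-beta * (dG_bp - dG_d)))^-1."""
    return 1.0 / (1.0 + math.exp(-(dg_bp_total - dg_d_total) / kt))

def k_step(dg_bp, i, k_ss=200.0, m=10, dg_d=-2.6, kt=KT):
    """Stepping rate (1/s) at position i for per-nucleotide energies dg_bp (kcal/mol)."""
    p_u = p_unpaired(dg_bp[i], dg_d, kt)
    segment = dg_bp[i + 1:i + 1 + m]          # up to m downstream nucleotides
    if segment:
        p_mu = p_unpaired(sum(segment), len(segment) * dg_d, kt)
    else:
        p_mu = 1.0                            # nothing downstream left to unwind
    return p_u * p_mu * k_ss

def k_transloc(k_resid, k_step_value):
    """Overall rate for sequential residence and stepping phases."""
    return 1.0 / (1.0 / k_resid + 1.0 / k_step_value)

unstructured = [0.0] * 12      # no base pairing anywhere
stable_stem = [-3.5] * 12      # uniformly strong base pairs
print(k_step(unstructured, 0) > k_step(stable_stem, 0))   # True
print(k_transloc(100.0, k_step(unstructured, 0)))          # k_resid = 100 1/s is hypothetical
```

Because the downstream term multiplies ΔGbp − ΔGd by the segment length, a uniformly stable stem suppresses kstep far more strongly than the single-nucleotide scheme would.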
[091] A corollary of downstream interactions is the sharpening of the change in translocation rate as a function of average ΔGbp (Figure 3C). At the same time, translocation trajectories themselves become smoother: for a helicase with a larger value of m, the average stability of the downstream segment is less affected by individual stable base pairs, and as a result the helicase would show fewer abrupt changes in the rate of stepping along a given barrier.
[092] Another consequence of a larger m is the enhancement of motor power at moderate barrier strengths. Power represents the amount of unwinding work per unit time delivered by the motor during its duty cycle. The output power of the bmRT helicase can be approximated as the unwinding work divided by unwinding time, i.e., −ΔGbp · kstep. The output power as a function of average ΔGbp as predicted by this model is shown in Figure 17. It can be seen that for barriers with moderate ΔGbp, higher values of m result in higher output power. For bmRT, maximum helicase output power (~290 kcal/mol/s) is achieved when ΔGbp ≈ −2 kcal/mol. Overall, our modeling suggests that downstream sensing may help helicases maintain their speed and power despite structural barriers of various strengths.
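The power curve can be explored with a short sweep over a uniform barrier. This is a sketch under stated assumptions: every nucleotide is given the same ΔGbp, and kss = 200 s^-1, m = 10, ΔGd = −2.6 kcal/mol and kBT = 0.593 kcal/mol are taken as illustrative parameter values; `output_power` is a hypothetical helper name.

```python
import math

def output_power(dg_bp, k_ss=200.0, m=10, dg_d=-2.6, kt=0.593):
    """Output power (kcal/mol/s) = -dG_bp * k_step for a uniform barrier."""
    p_u = 1.0 / (1.0 + math.exp(-(dg_bp - dg_d) / kt))
    p_mu = 1.0 / (1.0 + math.exp(-m * (dg_bp - dg_d) / kt))
    return -dg_bp * p_u * p_mu * k_ss

# Sweep average dG_bp from -0.1 to -4.0 kcal/mol in 0.1 steps
grid = [-0.1 * n for n in range(1, 41)]
best = max(grid, key=output_power)
print(round(best, 1), round(output_power(best)))
```

Under these assumed values the maximum falls close to −2 kcal/mol at roughly 290 kcal/mol/s; setting `m=0` in the same function removes the downstream term and lets the effect of m on the curve be compared directly.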
[093] For bmRT to translocate over a barrier, the unwinding work cannot exceed its energy source, ΔGdNTP, which is the free energy released upon dNTP incorporation (and PPi hydrolysis). Within this limit, the ratio ΔGbp/ΔGdNTP describes the helicase energy efficiency. Taking the approximate value of ΔG°′dNTP = −8.3 kcal/mol, and a ΔGbp value of −3.5 kcal/mol (for some G:C base pairs), the bmRT helicase efficiency can therefore be as high as ~0.42 for canonical base pairs. In comparison, the efficiency of the helicase at its peak output power (when ΔGbp is about −2 kcal/mol) is ~0.24.
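As a quick check of the two efficiency figures, the ratio can be evaluated directly; the symbol η is introduced here only as shorthand for the efficiency ΔGbp/ΔGdNTP:

$$\eta = \frac{\Delta G_{bp}}{\Delta G_{dNTP}}, \qquad \frac{-3.5}{-8.3} \approx 0.42, \qquad \frac{-2}{-8.3} \approx 0.24$$

Both free energies are negative, so the efficiency is a positive number between 0 and 1 whenever the unwinding work stays within the energy released by dNTP incorporation.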
[094] The model presented here has a minimal number of assumptions and free parameters to avoid overfitting. With an expanded dataset of kinetic data, especially in the high-load regime, the model may be tuned or extended with additional terms. For instance, ΔGd may be fine-grained along the downstream region to better capture the physical reality of the downstream interaction.
[095] Supplementary Note 3
[096] Estimation of dsRNA content in RNA3 using mFold. RNA molecules in equilibrium are partitioned among various folded structures. Using mFold we can obtain, for every base in the RNA molecule, the probability that it is double stranded across all alternative predicted RNA structures. In the case of RNA3, mFold predicted five different structures (shown in the structure dot plot in Figure 18), and the percentage of dsRNA in the next 3-13 nt downstream of the bmRT catalytic site was calculated per predicted structure. Note that as the RT progresses on a single RNA template, upstream RNA enters the pore and is no longer able to pair with downstream regions. However, we reason that by using the mFold prediction we are mapping regions that have a high probability of being double stranded given a particular RNA sequence. In addition, RNA molecules that are anchored to the nanopore membrane have a high local concentration, and it is possible that our technology is detecting both intra- and inter-molecular RNA structures.
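The per-structure window calculation can be sketched as below. The sketch assumes each predicted structure has been converted to a dot-bracket string written in the direction of RT travel, so that downstream positions have larger indices; this convention, the function name `percent_paired`, and the toy structure are all hypothetical and are not taken from mFold itself.

```python
def percent_paired(dot_bracket, site, start=3, end=13):
    """Percentage of paired bases in positions site+start .. site+end (inclusive).

    dot_bracket: structure string where '(' or ')' marks a paired base and '.' an unpaired one.
    site: index of the catalytic-site nucleotide within the string.
    """
    window = dot_bracket[site + start:site + end + 1]
    if not window:
        return 0.0
    paired = sum(symbol in "()" for symbol in window)
    return 100.0 * paired / len(window)

# Toy structure: a single hairpin (hypothetical)
structure = "....((((....))))...."
print(round(percent_paired(structure, 0), 1))   # 54.5 (6 of 11 bases paired)
```

Calling the function once per predicted structure, at each RT position, gives the per-structure dsRNA percentages that can then be compared with the measured dwell times.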
Claims
1. A method of nanopore sequencing of RNA, comprising using a cellular reverse transcriptase (RT) to directly capture and then as a motor to processively thread single-stranded RNA through a nanopore in single nucleotide steps, and detecting resultant ion currents through the nanopore, wherein the ion currents correlate with the RNA sequence.
2. The method of claim 1 further comprising generating the RNA sequence by referencing the ion currents to a nanopore RNA quadromer map that connects each detected ion current with a unique 4-nucleotide sequence spanning the nanopore.
3. The method of claim 1 further comprising determining RNA secondary structures simultaneously with the direct sequencing.
4. The method of claim 1 performed without amplification, ligation, or prior conversion to cDNA, therefore avoiding the bias that these steps can potentially introduce to the data.
5. The method of claim 1, wherein the RNA is extracted from biological samples, such as tissue or cell culture, or to assay the sequence homogeneity of supposedly identical RNAs generated in vitro or in vivo from a single type of DNA template.
6. The method of claim 1, comprising deploying primers that are tagged with cholesterol to be hybridized to the RNA.
7. The method of claim 1, comprising utilizing template jumping activity of the cellular reverse transcriptase, whereby the enzyme can directly bind to the 3’ end of the RNA regardless of sequence and initiate reverse transcription, wherein, the reverse transcriptase will be immobilized on the lipid membrane via a lipid anchor, and free RNA inside the sample well can be captured and thread through the nanopore for direct RNA sequencing.
8. The method of claim 1, comprising utilizing a membrane-tethered DNA primer to recruit the RNA by base-pairing.
9. The method of claim 1, comprising utilizing a membrane-tethered RT to recruit the RNA template base-paired to DNA.
10. The method of claim 1, comprising utilizing the RT in terminal transferase buffer to 3’ extend molecules in an input RNA pool with a nucleotide or nucleotide analog or combination thereof, either in one step or in two separate steps to add two sequential tail sequences, and anneal a primer to an internal tail location using either a junction-sequence-anchored primer or a primer complementary to the internal part of the added tail.
11. The method of claim 1, comprising utilizing a membrane-tethered DNA primer to base pair with the RNA 3’ end.
12. The method of claim 1, comprising utilizing a membrane-tethered RT bound to primer to catch an RNA 3’ end near the pore, not necessarily by base-pairing.
13. The method of claim 1 wherein the nanopore comprises a Mycobacterium smegmatis porin A (MspA) nanopore.
14. The method of claim 1, wherein the RT comprises an engineered Bombyx mori R2 RT comprising a truncated N-terminal region, an RNA binding domain, an RT domain, and an endonuclease domain, wherein the endonuclease domain comprises a mutation that abolishes endonuclease function, and the mutation is a substitution mutation at amino acid residue D996, D1009, or K1026, e.g. D1009A or K1026A or K1026D or K1026E.
15. A system configured to effect the nanopore sequencing of RNA according to claim 1, and comprising the RNA, the RT and the nanopore.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263327730P | 2022-04-05 | 2022-04-05 | |
| US63/327,730 | 2022-04-05 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023196738A1 (en) | 2023-10-12 |
Family
ID=88243555
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/064680 Ceased WO2023196738A1 (en) | 2022-04-05 | 2023-03-19 | Nanopore sequencing of rna using reverse transcription |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2023196738A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2025169569A1 (en) * | 2024-02-08 | 2025-08-14 | コニカミノルタ株式会社 | Method for determining sequence of nucleic acid aptamer |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2002103054A1 (en) * | 2001-05-02 | 2002-12-27 | Rubicon Genomics Inc. | Genome walking by selective amplification of nick-translate dna library and amplification from complex mixtures of templates |
| US8617817B2 (en) * | 2010-02-12 | 2013-12-31 | Genisphere, Llc | Whole transciptome sequencing |
| US20200149101A1 (en) * | 2018-11-08 | 2020-05-14 | Siemens Healthcare Gmbh | Direct rna nanopore sequencing with help of a stem-loop reverse polynucleotide |
| US20210261944A1 (en) * | 2018-08-08 | 2021-08-26 | The Regents Of The University Of California | Compositions and methods for ordered and continuous complementary DNA (cDNA) synthesis across non-continuous templates |
Non-Patent Citations (3)
| Title |
|---|
| "autm Innovation marketplace", 23 December 2021, article CARLOS BUSTAMANTE ET AL.: "Nanopore Sequencing of RNA Using Reverse Transcription", pages: 1 - 3, XP009553132 * |
| ARKADIUSZ BIBILLO ET AL.: "End-to-End Template Jumping by the Reverse Transcriptase Encoded by the R2 Retrotransposon", THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 279, no. 15, 28 January 2004 (2004-01-28), pages 14945 - 14953, XP055092375, DOI: 10.1074/jbc.M310450200 * |
| VERMEULEN JOËLLE, DE PRETER KATLEEN, LEFEVER STEVE, NUYTENS JUSTINE, DE VLOED FANNY, DERVEAUX STEFAAN, HELLEMANS JAN, SPELEMAN FRA: "Measurable impact of RNA quality on gene expression results from quantitative PCR", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, GB, vol. 39, no. 9, 1 May 2011 (2011-05-01), GB , pages e63 - e63, XP093101094, ISSN: 0305-1048, DOI: 10.1093/nar/gkr065 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23785526; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23785526; Country of ref document: EP; Kind code of ref document: A1 |