
WO2025000136A1 - Method for preparing strand-specific library for rapid detection of various types of RNAs, and high-throughput sequencing technique - Google Patents

Info

Publication number
WO2025000136A1
Authority
WO
WIPO (PCT)
Prior art keywords
rna
reagent
reaction
preparation
base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/102155
Other languages
French (fr)
Chinese (zh)
Inventor
欧日晶
张艳
王新新
雷常贵
王莹莹
王业钦
殷建华
金鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to PCT/CN2023/102155
Publication of WO2025000136A1
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • the present invention relates to the field of biotechnology, and in particular to a method for preparing a strand-specific library for rapidly detecting multiple types of RNA and a high-throughput sequencing technology.
  • cell-free RNA (cfRNA) is RNA found in body fluids such as blood, urine, cerebrospinal fluid, saliva, milk, pleural effusion and ascites.
  • the RNA in cells includes ribosomal RNA (rRNA), messenger RNA (mRNA), transfer RNA (tRNA), microRNA (miRNA), long non-coding RNA (lncRNA), etc.
  • the free RNA in body fluid samples carries real-time gene expression information from different tissues and organs, which can provide more solutions for early screening of diseases, auxiliary diagnosis, recurrence monitoring, disease prognosis, etc. Due to the non-invasive or minimally invasive sampling method and the advantages of real-time and comprehensive monitoring, free RNA may become an important detection method for precision medicine in the future.
  • RNA detection technology based on high-throughput sequencing has the advantages of single-base resolution, wide detection range and high throughput.
  • the preparation of high-throughput sequencing libraries from free RNA faces challenges including, but not limited to, low sample starting amounts, easy RNA degradation, long experimental cycles, low efficiency, and a single or limited range of target RNA types.
  • the scientific research field mainly uses kits suitable for sequencing library preparation of conventional cell tissue sample RNA to study free RNA.
  • the commercial RNA sequencing library preparation kits on the market cover only a single or limited range of target RNA types, so the information that can be obtained is limited and systematic research on all types of RNA is not possible.
  • RNA sequencing library preparation kits are mainly divided into kits suitable for mRNA and miRNA, which cannot capture multiple types of RNA in total RNA at the same time.
  • Various types of RNA include mRNA, lncRNA, tRNA, miRNA, etc.
  • Most commercial RNA sequencing library preparation kits have high requirements for the starting amount of total RNA.
  • commercial RNA sequencing libraries have more experimental steps, more complicated processes, and longer experimental cycles, which are not conducive to rapid and stable detection in the clinical field.
  • kits suitable for mRNA detection include Illumina's TruSeq RNA Library Prep Kit v2, TruSeq Stranded mRNA and Total RNA Library Prep Kits, NEB's NEBNext Ultra™ II RNA Library Prep Kit for Illumina, TAKARA's SMARTer Stranded Total RNA-Seq Kit, etc.
  • the above kit scheme uses polydeoxythymidine ribonucleotide (oligo dT) magnetic beads or primers to selectively enrich mRNA containing polyadenylic acid (polyA) tails and specifically prepare mRNA sequencing libraries.
  • oligo dT: polydeoxythymidine ribonucleotide
  • polyA: polyadenylic acid
  • the above kits can also capture lncRNA together with mRNA by adopting an rRNA removal step and then using primers containing multiple random bases for reverse transcription, but this approach has difficulty capturing short RNAs such as miRNA.
  • the above kits all first reverse transcribe RNA into single-stranded cDNA, then synthesize the complementary strand to form double-stranded DNA, and only then prepare the sequencing library, so the experimental steps are relatively complicated and the experimental cycle is relatively long.
  • kits suitable for miRNA or small RNA include, for example, Illumina's TruSeq small RNA Library Prep Kit, Bioo Scientific's Nextflex small RNA-Seq Kit v3, NEB's NEBNext small RNA Library Prep Set for Illumina, MGI's MGIEasy small RNA library preparation kit, and QIAGEN's QIAseq miRNA Library Kit.
  • the above kits all use RNA ligase to ligate adapters containing specific sequences directly to the 3' and 5' ends of RNA molecules, then reverse transcribe the products into cDNA, and then prepare sequencing libraries through PCR amplification.
  • these kits are not suitable for capturing long RNAs such as mRNA and lncRNA, and they rely on RNA ligase, which is relatively costly and inefficient, requires long ligation times, and adds steps, resulting in a long experimental cycle.
  • RNA: ribonucleic acid
  • PCR is performed using the adapter sequences added at both ends to complete the preparation of the sequencing library.
  • RNA library preparation methods are not suitable for low starting amounts or degraded samples, and have a low adaptability to plasma samples.
  • Scientific research programs generally require mL-level plasma inputs, and this higher starting amount of plasma input will reduce its widespread use.
  • the technical problem to be solved by the present invention is to provide a method for preparing a strand-specific library for rapidly detecting multiple types of RNA and a high-throughput sequencing technology.
  • the present invention provides a method for preparing a strand-specific library for rapidly detecting multiple types of RNA and a high-throughput sequencing technology, which have high sensitivity, a wide range of applications, strong anti-interference ability and a simple, quick preparation procedure, and which are suitable for high-throughput sequencing.
  • the present invention provides a method for preparing various types of RNA libraries, including: adding polyA to the end of an RNA sample, performing reverse transcription and U base digestion to prepare a cDNA library;
  • the RNA sample contains at least one of total RNA, rRNA, mRNA, tRNA, miRNA and lncRNA.
  • the present invention adds polyadenylic acid (poly A) tails to the ends of various types of RNA molecules and combines the reverse transcription method using polydeoxythymidine ribonucleotide (oligo dT) primers, so that various types of RNA molecules including non-coding RNA (lncRNA, miRNA, etc.) can be reverse transcribed in subsequent steps, thereby realizing the construction of libraries of various types of RNA.
  • the preparation method specifically includes: DNA digestion, RNA end modification, poly A tail addition, reverse transcription, U base digestion, cDNA end modification, denaturation, linker addition and library construction.
  • the present invention combines two or more of the above-mentioned specific reaction steps by optimizing the reaction reagents and other conditions. For example, in some embodiments, the steps of DNA digestion, RNA end modification and polyA tailing are combined. In some embodiments, the steps of DNA digestion and RNA end modification are combined, and then the steps of polyA tailing and reverse transcription are combined. In some embodiments, the steps of U base digestion, cDNA end modification and denaturation are combined. In some embodiments, the steps of U base digestion and cDNA end modification are combined. By combining the reaction steps, the reaction steps are reduced and the required time is shortened, while ensuring the efficiency of the reaction and the quality of the obtained library.
  • the DNA digestion and RNA end modification are performed in a first system
  • the reverse transcription is performed in a second system
  • the step of adding the polyA tail is performed in the first system or the second system.
  • the step of adding polyA tail is carried out in a first system, wherein:
  • the first system includes: RNA sample, PolyA polymerase reaction buffer, ATP, BSA, DNase I, T4 polynucleotide kinase, PolyA polymerase, RNase inhibitor and nuclease-free water;
  • the second system includes: the reaction product of the first system, reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix, RNase inhibitor and nuclease-free water.
  • the present invention optimizes the reaction system.
  • the components in the reaction system are properly coordinated to jointly ensure the progress of the reaction.
  • the present invention optimizes the concentration of the components in each system.
  • the concentrations of the components in the first system are as follows: 14 μL RNA sample, 3 μL 10× PolyA polymerase reaction buffer, 1 μL 10 mM ATP, 4 μL 10 mg/mL BSA, 2 μL DNase I, 0.5 μL 10 U/μL T4 polynucleotide kinase, 1 μL 5 U/μL PolyA polymerase, 0.5 μL 40 U/μL RNase inhibitor, and 4 μL nuclease-free water;
  • the concentrations of the components of the second system include: 30 μL of the reaction product of the first system, 2 μL of 5 μM reverse transcription primer, 4 μL of 5× HiScript III reaction buffer, 1 μL of 200 U/μL HiScript III reverse transcriptase, 2 μL of 5 mM dNTP Mix, 0.5 μL of 40 U/μL RNase inhibitor and 10.5 μL of nuclease-free water.
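As a quick consistency check on the recipe above, the following minimal Python sketch sums the listed volumes. The dictionary keys and the helper name `total_volume_ul` are illustrative choices, not terms from the patent; only the volumes are taken from the text.

```python
# Volumes in microlitres, copied from the first and second systems listed above
# (the variant in which polyA tailing is carried out in the first system).
FIRST_SYSTEM_UL = {
    "RNA sample": 14.0,
    "10x PolyA polymerase reaction buffer": 3.0,
    "10 mM ATP": 1.0,
    "10 mg/mL BSA": 4.0,
    "DNase I": 2.0,
    "10 U/uL T4 polynucleotide kinase": 0.5,
    "5 U/uL PolyA polymerase": 1.0,
    "40 U/uL RNase inhibitor": 0.5,
    "nuclease-free water": 4.0,
}

SECOND_SYSTEM_UL = {
    "reaction product of the first system": 30.0,
    "5 uM reverse transcription primer": 2.0,
    "5x HiScript III reaction buffer": 4.0,
    "200 U/uL HiScript III reverse transcriptase": 1.0,
    "5 mM dNTP Mix": 2.0,
    "40 U/uL RNase inhibitor": 0.5,
    "nuclease-free water": 10.5,
}


def total_volume_ul(system):
    """Sum the component volumes of one reaction system (hypothetical helper)."""
    return round(sum(system.values()), 2)


first_total = total_volume_ul(FIRST_SYSTEM_UL)
second_total = total_volume_ul(SECOND_SYSTEM_UL)

# The whole first-system product is carried into the second system.
carry_over_matches = (
    SECOND_SYSTEM_UL["reaction product of the first system"] == first_total
)
```

Under these figures the first system comes to 30 μL, which matches the 30 μL of first-system product carried into the 50 μL second system.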
  • the step of adding polyA tail is carried out in the second system, wherein:
  • the first system includes: RNA sample, DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase and RNase inhibitor;
  • the second system includes: the reaction product of the first system, HiScript III reaction buffer, BSA, PEG8000, dNTP Mix, reverse transcription primer, PolyA polymerase, RNase inhibitor, HiScript III reverse transcriptase and nuclease-free water.
  • the concentrations of the components in the first system are: 14 μL RNA sample, 2 μL 10× DNase I reaction buffer, 1 μL 10 mM ATP, 2 μL DNase I, 0.5 μL 10 U/μL T4 polynucleotide kinase, and 0.5 μL 40 U/μL RNase inhibitor.
  • the concentrations of the components of the second system are: 20 μL of the reaction product of the first system, 6 μL of 5× HiScript III reaction buffer, 1 μL of 10 mg/mL BSA, 10 μL of 50% PEG8000, 2 μL of 5 mM dNTP Mix, 2 μL of 5 μM reverse transcription primer, 1 μL of 5 U/μL Poly A polymerase, 0.5 μL of 40 U/μL RNase inhibitor, 1 μL of 200 U/μL HiScript III reverse transcriptase and 6.5 μL of nuclease-free water.
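The stock concentrations and volumes above imply final working concentrations through the usual dilution relation C_final = C_stock × V_added / V_total. The sketch below applies that relation to the second system of this variant. The function name is illustrative, and the calculation ignores anything carried over in the 20 μL of first-system product, which is a simplifying assumption.

```python
def final_concentration(stock, volume_ul, total_ul):
    """Dilution relation C_final = C_stock * V_added / V_total (same unit as stock)."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock * volume_ul / total_ul


# Second-system volumes (uL) for the variant in which polyA tailing and
# reverse transcription share one reaction, in the order listed above.
second_system_volumes_ul = [20, 6, 1, 10, 2, 2, 1, 0.5, 1, 6.5]
total_ul = sum(second_system_volumes_ul)

peg_percent = final_concentration(50.0, 10.0, total_ul)   # 50% PEG8000 stock
buffer_fold = final_concentration(5.0, 6.0, total_ul)     # 5x HiScript III buffer
dntp_mm = final_concentration(5.0, 2.0, total_ul)         # 5 mM dNTP Mix
primer_um = final_concentration(5.0, 2.0, total_ul)       # 5 uM RT primer
bsa_mg_per_ml = final_concentration(10.0, 1.0, total_ul)  # 10 mg/mL BSA
```

With a 50 μL total, this gives 10% PEG8000, 0.6× reaction buffer, 0.2 mM dNTP Mix, 0.2 μM primer and 0.2 mg/mL BSA. The text does not say whether the dNTP figure refers to each nucleotide or to the mix as a whole.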
  • the cDNA end modification and denaturation are performed in a third system.
  • the third system comprises: the reaction product after U base digestion, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer with pH 8.0, ultra-thermostable single-stranded binding protein and nuclease-free water.
  • the concentrations of the components of the third system are: the reaction product after U base digestion, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer, pH 8.0, 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, and 21.4 μL of nuclease-free water.
  • cDNA end modification, denaturation and U base digestion are performed in the same system, and therefore, the third system also includes reagents for U base digestion.
  • the third system includes: the reaction product of polyA tailing and reverse transcription, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, ultra-thermostable single-stranded binding protein, uracil-specific excision reagent USER enzyme and nuclease-free water.
  • the concentrations of the components of the third system are: the reaction product after polyA tailing and reverse transcription, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer, pH 8.0, 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, 2 μL of 10 U/μL uracil-specific excision reagent USER enzyme, and 19.4 μL of nuclease-free water.
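The two third-system recipes differ only in that 2 μL of USER enzyme replaces 2 μL of water, so the added reagent volume is the same in both. The sketch below checks this. The text does not state the volume of the reaction product, so `implied_total_ul` is purely an assumption that the 10× buffer is meant to end up at 1×.

```python
# Reagent volumes in microlitres, copied from the two third-system variants above.
# The reaction-product volume is not given in the text and is left out here.
THIRD_SYSTEM_UL = {
    "10x polynucleotide kinase reaction buffer": 5.0,
    "10 U/uL T4 polynucleotide kinase": 1.0,
    "220 mM Tris buffer, pH 8.0": 2.0,
    "500 ng/uL single-stranded binding protein": 0.6,
    "nuclease-free water": 21.4,
}

# Variant with U base digestion in the same tube: USER enzyme displaces water.
THIRD_SYSTEM_WITH_USER_UL = {
    **THIRD_SYSTEM_UL,
    "10 U/uL USER enzyme": 2.0,
    "nuclease-free water": 19.4,
}


def reagent_volume_ul(system):
    """Total volume of the added reagents, excluding the reaction product."""
    return round(sum(system.values()), 2)


def implied_total_ul(buffer_volume_ul, buffer_fold=10.0):
    """ASSUMPTION (not stated in the source): the buffer should end up at 1x."""
    return buffer_volume_ul * buffer_fold


reagents_plain = reagent_volume_ul(THIRD_SYSTEM_UL)
reagents_user = reagent_volume_ul(THIRD_SYSTEM_WITH_USER_UL)
implied_product_ul = implied_total_ul(5.0) - reagents_plain
```

Both variants add 30 μL of reagents. If the 1× buffer assumption holds, the reaction product would occupy about 20 μL of a 50 μL reaction, but that figure is an inference rather than a value given in the text.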
  • the present invention flexibly combines multiple reaction steps into a single experimental operation and optimizes the experimental conditions, which reduces the operation steps and time required for library preparation while maintaining reaction efficiency and yielding better experimental results.
  • reverse transcription primer has the following nucleotide sequence: poly(T)n-UVNm;
  • n represents the number of bases T
  • m represents the number of bases N
  • n is an integer of 8 to 50
  • m is an integer of 1 to 4
  • At least one T in the poly(T)n is replaced by U; V is selected from any one of base A, base C and base G; and N is selected from any one of base A, base T, base C and base G.
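The primer formula can be written out with IUPAC degenerate codes. The minimal sketch below reads poly(T)n-UVNm as n thymidines followed by U, V and m copies of N, which is one interpretation of the formula. The function names and the choice of which T positions become U are hypothetical, since SEQ ID NO: 1 is not reproduced in this text.

```python
def build_rt_primer(n, m, u_positions=(0,)):
    """Write a poly(T)n-UVNm primer with IUPAC codes (V = A/C/G, N = A/C/G/T).

    u_positions lists the 0-based positions inside the poly(T) stretch that are
    replaced by U; these positions are illustrative, not taken from the patent.
    """
    if not 8 <= n <= 50:
        raise ValueError("n must be an integer from 8 to 50")
    if not 1 <= m <= 4:
        raise ValueError("m must be an integer from 1 to 4")
    if not u_positions:
        raise ValueError("at least one T in poly(T)n must be replaced by U")
    tail = ["T"] * n
    for pos in u_positions:
        if not 0 <= pos < n:
            raise ValueError("U position lies outside the poly(T) stretch")
        tail[pos] = "U"
    return "".join(tail) + "U" + "V" + "N" * m


def count_variants(m):
    """Number of concrete sequences behind the degenerate V(N)m end: 3 * 4**m."""
    return 3 * 4 ** m


example = build_rt_primer(8, 1, u_positions=(3,))
```

For n = 8, m = 1 and a single U at position 3, this yields `TTTUTTTTUVN`, which stands for 12 concrete primer sequences.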
  • the present invention provides a polydeoxythymidine ribonucleotide (oligo dT) reverse transcription primer containing deoxyuracil (dU) for synthesizing a single-stranded cDNA molecule by reverse transcription reaction.
  • the primer sequence comprises deoxyuracil ribonucleotide (dU), polydeoxythymidine ribonucleotide (oligo dT), and other random bases (V and N).
  • a single-stranded cDNA molecule containing a polydeoxythymidine ribonucleotide sequence of deoxyuracil (dU) can be obtained, so that a uracil-specific excision reagent is used to perform a digestion reaction on the single-stranded cDNA in the U base digestion step, thereby excising the polydeoxythymidine ribonucleotide sequence fragment containing deoxyuracil (dU) in the single-stranded cDNA, removing the influence of the low-complexity sequence artificially introduced in the previous step on the subsequent sequencing and analysis, and better realizing the preparation of the cDNA library.
  • the solution provided by the present invention is more conducive to increasing the amount of effective data in sequencing.
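The effect of U base digestion on the cDNA can be illustrated with a simple string model. In this sketch the single-stranded cDNA is written 5' to 3', every dU is treated as a cut site that is itself lost, and the 3'-most fragment is taken as the part copied from the RNA. The sketch assumes the cDNA body contains no dU, and the example sequence is invented for illustration.

```python
def user_digest(cdna):
    """Split a single-stranded cDNA (5'->3') at every dU; the U itself is removed."""
    return [fragment for fragment in cdna.split("U") if fragment]


def strip_primer_tail(cdna):
    """Return the 3'-most fragment, i.e. the cDNA without the dU-containing oligo dT part."""
    fragments = user_digest(cdna)
    return fragments[-1] if fragments else ""


# Hypothetical cDNA: primer-derived 5' end (TTTUTTTTU + V=G + N=A) followed by
# an invented insert sequence.
cdna = "TTTUTTTTU" + "GA" + "CCGTAAGC"
fragments = user_digest(cdna)
insert = strip_primer_tail(cdna)
```

Here the digest yields the fragments `TTT`, `TTTT` and `GACCGTAAGC`, and only the last one, free of the low-complexity poly(T) stretch, would go on to linker addition in this simplified picture.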
  • the reverse transcription primer has a nucleotide sequence as shown in SEQ ID NO: 1.
  • the reverse transcription primer shown in SEQ ID NO: 1 can achieve a better reverse transcription reaction, thereby obtaining a better experimental effect.
  • the linker includes a 5' end linker and a 3' end linker, the 5' end linker sequence has a nucleotide sequence as shown in SEQ ID NO 2 and SEQ ID NO 3, and the 3' end linker sequence has a nucleotide sequence as shown in SEQ ID NO 4 and SEQ ID NO 5.
  • the present invention provides linkers for introducing specific target sequences at the 3' and 5' ends of a single-stranded cDNA molecule, together with a corresponding DNA ligation scheme.
  • Both linkers contain a double-stranded region and at least one protruding single-stranded region, wherein the double-stranded region contains a universal structural sequence for PCR or sequencing, and the protruding single-stranded region contains one or more (1 to 10) random bases, which pair complementarily with the end of the single-stranded DNA molecule, so that the linker can be splint-ligated to the single-stranded DNA.
  • Each linker is formed from two polynucleotide sequences that contain complementary regions; after the two are mixed in solution and left to stand, these regions pair to form the specific double-stranded structure and a protruding single-stranded region containing random bases.
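The role of the random overhang can be sketched in a few lines. An overhang of k random bases represents 4^k different sequences, and a given cDNA end pairs with the one that is its reverse complement. The function names and example sequences below are illustrative assumptions, and the model ignores which end (3' or 5') is involved and any mismatch tolerance.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overhang_pairs_with_end(overhang, cdna_end):
    """True when the overhang pairs perfectly (antiparallel) with the cDNA end."""
    return overhang == reverse_complement(cdna_end)


def overhang_pool_size(k):
    """Number of distinct overhangs for k random bases (1 to 10 in the text)."""
    if not 1 <= k <= 10:
        raise ValueError("k must be between 1 and 10")
    return 4 ** k


pairs = overhang_pairs_with_end("CGTT", "AACG")
pool = overhang_pool_size(6)
```

In this toy model, a pool of 6-base random overhangs contains 4096 sequences, so some linker molecule in the pool can pair with any cDNA end.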
  • PCR amplification can be directly performed to obtain the final library, so as to better realize the preparation of cDNA library.
  • the present invention can connect the DNA adapters at the 3’ and 5’ ends simultaneously in one step, thereby shortening the steps and time while ensuring the connection effect.
  • the reaction system for adding a linker includes: a 5' end linker solution, a 3' end linker solution, a ligation reaction buffer, a ligation reaction enhancer and T4 DNA ligase.
  • a DNA ligase scheme is used in place of RNA ligase, which greatly shortens the ligation time. In schemes that use RNA ligase to add linkers to both ends of RNA, the relatively low ligation efficiency of RNA ligase means that ligating a single-end linker usually takes 1 to 2 hours, the stepwise ligation of linkers to the two ends is relatively complicated, and the total ligation time for both linkers is 2 to 3 hours.
  • the present invention optimizes other components and their concentrations in the reaction system, thereby further improving the connection efficiency of the linker.
  • the reaction system for adding a linker includes: a 5' end linker solution, a 3' end linker solution, a ligation reaction buffer, a hexamminecobalt chloride solution and T4 DNA ligase.
  • the present invention also provides a sequencing method for various types of RNA libraries, in which the cDNA library prepared by the above preparation method is used as the sample loaded onto the sequencer.
  • the cDNA library is sequenced after PCR amplification, purification, sample mixing, and single-stranded circularization.
  • the upstream primer of the PCR amplification has a nucleotide sequence as shown in SEQ ID NO 6, and the downstream primer of the PCR amplification has a nucleotide sequence as shown in SEQ ID NO 7.
  • the downstream primers may contain a label (Barcode) sequence for sample identification.
  • By using the primer sequence to amplify the library sample, library samples with different Barcode sequence labels can be obtained.
  • the Barcode sequence carried by the primer can be matched with the Barcode of the universal sequencing library of the sequencing platform.
  • the structure of the PCR amplification product can be consistent with the universal library structure of the sequencing platform, so that the source of the sample library can be accurately separated by the Barcode.
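Separating pooled libraries by Barcode amounts to a lookup from barcode sequence to sample. The sketch below shows the idea with exact matching only. The barcode sequences, sample names and the `demultiplex` function are all hypothetical and do not correspond to any sequencing platform's actual barcodes or software.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group (barcode, sequence) pairs by sample using exact barcode matches."""
    bins = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcode_to_sample = {"ACGTACGTAC": "plasma_1", "TGCATGCATG": "UHRR_10ng"}
reads = [
    ("ACGTACGTAC", "GACCGTAAGC"),
    ("TGCATGCATG", "TTAGGCCAAT"),
    ("ACGTACGTAC", "CCATGGAATC"),
    ("GGGGGGGGGG", "ACACACACAC"),  # unknown barcode
]
by_sample = demultiplex(reads, barcode_to_sample)
```

Reads whose barcode is not in the table land in an `undetermined` bin. Production demultiplexers usually also tolerate a small number of barcode mismatches, which this sketch does not attempt.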
  • the present invention provides library construction reagents, including reagent I, reagent II and reagent III;
  • the reagent I comprises: DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase, and RNase inhibitor;
  • the reagent II includes: reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix, and RNase inhibitor;
  • the reagent III comprises: polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, and ultra-thermostable single-stranded binding protein.
  • the reagent I further comprises: PolyA polymerase reaction buffer, BSA, PolyA polymerase;
  • the reagent II also includes: BSA, PEG8000, and PolyA polymerase.
  • the reagent III also includes a U base excision reagent, and the U base excision reagent includes a uracil-specific excision reagent USER enzyme.
  • a linker-adding reagent, which includes: Tris-HCl buffer, sodium chloride, EDTA, linkers as shown in SEQ ID NO 2 to 5, ligation reaction buffer, T4 ligase and hexamminecobalt chloride.
  • a purification reagent, which includes Agencourt AMPure XP magnetic beads.
  • PCR amplification reagents, which include PCR enzyme reaction solution, upstream primers of the nucleotide sequence shown in SEQ ID NO 6, and downstream primers of the nucleotide sequence shown in SEQ ID NO 7.
  • the library construction reagents also include RNA extraction reagents.
  • the present invention also provides the application of the construction reagent in preparing various types of RNA libraries.
  • the present invention provides a method for preparing a strand-specific library for rapid detection of multiple types of RNA and a high-throughput sequencing technology.
  • the present invention uses polyadenylic acid polymerase to artificially add polyadenylic acid (poly A) tails to the 3' ends of multiple RNAs, and simultaneously uses polydeoxythymidine ribonucleotide primers with deoxyuracil to synthesize single-stranded cDNA under the action of reverse transcriptase.
  • the obtained single-stranded cDNA molecules are subjected to a series of reactions and finally amplified by PCR to obtain a strand-specific library of multiple types of RNA.
  • the present invention uses polyadenylic acid polymerase to artificially add polyadenylic acid (poly A) tails to the 3’ ends of various RNAs, so that various types of RNA molecules including non-coding RNA (lncRNA, miRNA, etc.) can be reverse transcribed in subsequent steps, thereby achieving library construction and sequencing of various types of RNAs;
  • Single-stranded cDNA is synthesized using poly-deoxythymidine ribonucleotide primers containing deoxyuracil under the action of reverse transcriptase.
  • the obtained single-stranded cDNA molecules can remove the artificially added polynucleotide sequences through the U base digestion reaction, avoiding the addition of low-complexity nucleotide sequences in the sequencing library, so that the sequencing quality and data volume are guaranteed;
  • the present invention adopts a single-stranded cDNA molecule library preparation method, which does not require the synthesis of complementary double-stranded DNA of cDNA molecules, thereby reducing the steps and time required for library preparation.
  • because the present invention adopts a single-stranded cDNA molecule library preparation method, the strand-specific information of the RNA is retained, which is more conducive to RNA analysis such as gene annotation;
  • the present invention combines some reaction steps by optimizing the reaction reagents and other conditions.
  • the three reaction steps of DNA digestion, RNA end modification and polyadenylic acid tailing reaction are combined into the same operation step; the two reaction steps of cDNA end modification and denaturation reaction are combined into one operation step.
  • the two reaction steps of DNA digestion and RNA end modification reaction are combined into one operation step; the two reaction steps of polyadenylic acid tailing and reverse transcription reaction are combined into one operation step; the three reaction steps of U base digestion, cDNA end modification and denaturation reaction are combined into the same operation step, etc.
  • the overall solution reduces the number of operation steps and time required for library preparation, and ensures the efficiency of the reaction;
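The effect of merging reaction steps can be counted directly. The sketch below lists the nine reaction steps named in the preparation method and applies the two combination schemes described above; the function name `operations` and the labels "scheme A" and "scheme B" are illustrative, not the patent's own terms.

```python
STEPS = [
    "DNA digestion",
    "RNA end modification",
    "polyA tailing",
    "reverse transcription",
    "U base digestion",
    "cDNA end modification",
    "denaturation",
    "linker addition",
    "library construction",
]

# Scheme A: three-step merge at the start, two-step merge before linker addition.
SCHEME_A = [
    ["DNA digestion", "RNA end modification", "polyA tailing"],
    ["cDNA end modification", "denaturation"],
]

# Scheme B: tailing moves into the reverse transcription reaction, and U base
# digestion joins end modification and denaturation.
SCHEME_B = [
    ["DNA digestion", "RNA end modification"],
    ["polyA tailing", "reverse transcription"],
    ["U base digestion", "cDNA end modification", "denaturation"],
]


def operations(steps, merged_groups):
    """Collapse consecutive steps that share a merged group into one operation."""
    group_of = {step: tuple(group) for group in merged_groups for step in group}
    ops = []
    for step in steps:
        op = group_of.get(step, (step,))
        if not ops or ops[-1] != op:
            ops.append(op)
    return ops


ops_a = operations(STEPS, SCHEME_A)
ops_b = operations(STEPS, SCHEME_B)
```

Counted this way, nine reaction steps become six bench operations under scheme A and five under scheme B. The count does not include any purification steps between reactions, which the text does not enumerate here.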
  • a ligation reaction enhancer, such as the chemical reagent hexamminecobalt chloride
  • the present invention ligates the reverse-transcribed cDNA to the adapters and thereby adds the sequencing structure sequence. Because a DNA ligase scheme is used instead of an RNA ligase scheme, and DNA ligase is superior to RNA ligase in both cost and efficiency, the total ligation time is only 30 minutes or less. The overall experimental process has fewer operating steps and takes less time.
  • FIG. 1 is a schematic diagram showing the RNA library preparation and detection principle
  • FIG. 2 is a schematic diagram of linker preparation, wherein P represents a phosphorylation modification group and B represents a modification group that blocks ligation;
  • FIG. 3 is a schematic diagram showing a method for preparing an RNA library
  • FIG. 4 shows the distribution of sequencing quality over the sequencing cycles for RNA library sequencing
  • FIG. 5 shows the number of various RNA genes detected in the samples, wherein the RNA types in FIG. 5a include mRNA, lncRNA and pseudogene RNA, and the RNA types in FIG. 5b include miRNA, tRNA, mt-tRNA, mt-rRNA, snoRNA and snRNA;
  • FIG. 6 shows the number and percentage of various RNA genes detected in the samples, wherein FIG. 6a is a free RNA sample from 200 μL of starting plasma, FIG. 6b is a UHRR sample with a starting amount of 10 ng, and FIG. 6c is a UHRR sample with a starting amount of 2 ng;
  • FIG. 7 shows scatter plots of gene expression between samples and the consistency analysis results, wherein the Log2(TPM+1) value of gene expression is used for analysis, FIG. 7a is two technical replicates of a free RNA sample from 200 μL of starting plasma, FIG. 7b is UHRR samples with starting amounts of 10 ng and 100 ng, and FIG. 7c is UHRR samples with starting amounts of 2 ng and 10 ng;
  • FIG. 8 shows the Pearson correlation coefficients of gene expression between samples, wherein FIG. 8a is the Pearson correlation coefficient calculated using the TPM values of gene expression, and FIG. 8b is the Pearson correlation coefficient calculated using the Log2(TPM+1) values obtained by transforming the gene expression.
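The two correlation measures named for FIG. 8 can be sketched with the standard library alone. The code below computes the Pearson coefficient on raw TPM values and on Log2(TPM+1) values. The TPM vectors are invented placeholders, not data from the patent, and the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log2_tpm_plus_one(tpm_values):
    """Apply the Log2(TPM+1) transform to a list of TPM values."""
    return [math.log2(value + 1) for value in tpm_values]


# Invented TPM values for two samples over the same five genes.
sample_1 = [0.0, 3.0, 15.0, 120.0, 1023.0]
sample_2 = [1.0, 2.0, 20.0, 100.0, 900.0]

r_tpm = pearson(sample_1, sample_2)
r_log = pearson(log2_tpm_plus_one(sample_1), log2_tpm_plus_one(sample_2))
```

The log transform compresses highly expressed genes, so the two coefficients generally differ: the raw-TPM value tends to be dominated by a few abundant transcripts, while the Log2(TPM+1) value weights genes more evenly.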
  • the present invention provides a method for preparing a strand-specific library and a high-throughput sequencing technology for rapid detection of multiple types of RNA.
  • Those skilled in the art can refer to the content of this document and appropriately adjust the process parameters to implement it. It should be particularly noted that all similar substitutions and modifications are obvious to those skilled in the art, and they are all considered to be included in the present invention.
  • the methods and applications of the present invention have been described through preferred embodiments, and relevant personnel can modify or appropriately change and combine the methods and applications described herein, without departing from the content, spirit and scope of the present invention, to implement and apply the technology of the present invention.
  • the present invention provides a strand-specific library preparation and high-throughput gene sequencing technology for rapid detection of various types of RNA molecules.
  • This technology can read various types of RNA information with a relatively low starting amount of RNA input, and is suitable for various types of usage scenarios such as tissue cells and body fluid samples.
  • RNA information can be read and analyzed using high-throughput sequencing technology.
  • the present invention provides a strand-specific library preparation method and high-throughput sequencing technology for rapid detection of multiple types of RNA, which can simultaneously detect multiple types of RNA including mRNA, lncRNA, tRNA, miRNA, etc.
  • the method provided by the present invention is suitable for high-throughput sequencing detection of multiple sample types such as tissue cell RNA and free RNA (cell-free RNA, abbreviated as cfRNA), and can be applicable to RNA samples with low starting amounts at the ng level or even the pg level.
  • the present invention can be used in the fields of free RNA molecular diagnosis in the field of liquid biopsy, RNA molecular detection in tissue cells, etc., and has broad application prospects in the directions of early disease screening, auxiliary diagnosis, recurrence monitoring, disease prognosis, etc., among which potential clinical application scenarios include but are not limited to complex diseases such as pregnancy diseases, tumors, cardiovascular and cerebrovascular diseases, infectious diseases, genetic diseases, nervous system diseases, and mental diseases.
  • the present invention can be extended to research in the fields of animals, plants, and microorganisms, and simultaneously detect various types of RNA; it can be used to study the growth and development characteristics, tissue and cell functions, gene functions, environmental adaptability, species interactions, etc.
  • the technical principle of the present invention may also be extended to single-cell or spatiotemporal omics technology, which can be used to simultaneously capture different types of RNA, including non-coding RNA, and characterize the expression of multiple types of RNA in single cells or at different spatial locations.
  • the technical principle of the present invention, combined with long-fragment PCR and single-molecule sequencing, may also be applied to research on the full-length transcriptome.
  • an optional step is to perform DNA digestion on the RNA sample to remove the influence of residual DNA molecules on subsequent steps.
  • an end modification enzyme is used to modify the ends of the RNA molecules so that the 3' end carries a hydroxyl group, allowing more native RNA molecules to undergo the polyadenylic acid tailing reaction at the 3' end.
  • polyadenylic acid polymerase is used to complete polyadenylic acid tailing at the 3' end of the RNA molecule.
  • oligo dT primers containing deoxyuracil (dU) pair complementarily with the RNA molecules, and single-stranded cDNA is synthesized from the RNA template by reverse transcriptase, yielding single-stranded cDNA molecules that carry deoxyuracil (dU) and polydeoxythymidine ribonucleotide (oligo dT) sequences.
  • a uracil-specific excision reagent is used to digest the cDNA, thereby excising the polydeoxythymidine ribonucleotide sequence fragment containing deoxyuracil (dU) in the cDNA, removing the influence of the low-complexity sequence artificially introduced in the previous step on subsequent sequencing and analysis. Then, the cDNA library is prepared.
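The effect of this excision step can be illustrated with a minimal Python sketch. It assumes, as a simplification, that cleavage at every dU leaves only the cDNA portion downstream of the last dU. The function name `excise_du_primer` and the example insert sequence are hypothetical and serve only as an illustration.

```python
def excise_du_primer(cdna: str) -> str:
    """Return the part of a 5'->3' cDNA that lies downstream of the last dU.

    Simplified model (an assumption, not the full enzyme chemistry): every dU
    is a cleavage site, so everything up to and including the last 'U' is lost.
    """
    last_u = cdna.rfind("U")
    # No dU present: nothing is excised.
    return cdna if last_u == -1 else cdna[last_u + 1:]


# One concrete instance of the dU-containing oligo dT primer (V=C, N=A),
# followed by a hypothetical cDNA insert.
primer = "TTTTTTTUTTTTTTTUCA"
cdna = primer + "GGCTAACGT"
trimmed = excise_du_primer(cdna)
print(trimmed)
```

In this model the low-complexity oligo dT stretch is removed entirely, and only the two anchoring bases plus the insert remain for library preparation.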
  • the cDNA library preparation method provided by the present invention is to first perform high temperature denaturation treatment on cDNA to unwind the double-stranded structure that may appear in the local area of cDNA.
  • the denaturation reaction system can add single-stranded binding protein to assist in maintaining the linear structure of single-stranded cDNA molecules to avoid annealing and renaturing into complex hairpin-like structures after denaturation.
  • after high-temperature denaturation, the cDNA library preparation is completed using special double-stranded adapters and a matching ligation technique.
  • the present invention provides a single-stranded DNA library preparation technology.
  • adapters can be added to both ends of the single-stranded cDNA molecule at the same time, and a universal structural sequence for PCR amplification reaction can be introduced, and then a sequencing library can be obtained through PCR amplification.
  • the sequencing library product is subsequently processed and sequenced according to the requirements of the gene sequencer, and the purpose of detecting RNA is achieved by reading and analyzing the sequence information.
  • the label (Barcode) sequence for sample identification can be introduced through PCR primers, and the library samples with different Barcode sequences can be mixed into a sequencing sample according to the proportion requirements of the sequencing data.
  • the sequencing samples are subsequently processed and sequenced according to the requirements of the gene sequencer.
  • Each sequencing read is assigned to its sample after Barcode sequence alignment and splitting, and the sequencing results of each sample are then analyzed to achieve the purpose of detecting RNA for each sample.
  • the Barcode sequence is used to achieve the mixing of multiple library samples for sequencing detection, improve the detection throughput, and reduce the detection cost.
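The Barcode-based splitting described above can be sketched as follows. This is a minimal illustration that assumes exact barcode matching; the barcode sequences, sample names, and the function `demultiplex` are hypothetical, and production demultiplexing tools typically also tolerate sequencing errors in the barcode.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group (sequence, barcode) reads by sample.

    Reads whose barcode is not in the table are collected under 'undetermined'.
    """
    by_sample = defaultdict(list)
    for sequence, barcode in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


# Hypothetical 10-base barcodes and reads, for illustration only.
barcodes = {"ACGTACGTAC": "sample_1", "TGCATGCATG": "sample_2"}
reads = [
    ("GGCTAACGT", "ACGTACGTAC"),
    ("TTAGGCCAA", "TGCATGCATG"),
    ("CCGGAATTC", "ACGTACGTAC"),
    ("AAAAAAAAA", "GGGGGGGGGG"),
]
split = demultiplex(reads, barcodes)
for sample, sequences in split.items():
    print(sample, len(sequences))
```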
  • the present invention completes polyadenylic acid tailing at the 3' end of various types of RNA molecules, and combines the reverse transcription method using polydeoxythymidine ribonucleotide (oligo dT) primers to achieve effective reverse transcription of various types of RNA, thereby achieving the capture of various types of RNA molecules and the construction of sequencing libraries.
  • Various types of RNA include but are not limited to mRNA, lncRNA, miRNA, tRNA, etc.
  • the polyadenylic acid tail is Poly (A) n (n represents the number of adenine A, and n is an integer between 8 and 200).
  • the present invention designs a polydeoxythymidine ribonucleotide (oligo dT) primer containing deoxyuracil (dU) for synthesizing a single-stranded cDNA molecule by reverse transcription reaction, and the primer sequence is named as the first nucleotide sequence.
  • the primer sequence contains deoxyuracil ribonucleotide (dU), polydeoxythymidine ribonucleotide (oligo dT), and other random bases (V and N), and its sequence is the nucleotide composition shown in Table 1 below.
  • the present invention can combine the multiple reaction steps mentioned in the above experimental principles into the same experimental operation step, thereby reducing the operation steps and time required for library preparation and ensuring the efficiency of the reaction.
  • the preparation method of the library of the present invention specifically includes: DNA digestion, RNA end modification, adding polyA tail, reverse transcription, U base digestion, cDNA end modification, denaturation, adding linker and library construction.
  • the DNA digestion and RNA end modification are performed in a first system; the reverse transcription is performed in a second system; and the cDNA end modification and denaturation are performed in a third system.
  • the step of adding the polyA tail is performed in the first system or the second system.
  • the step of adding the polyA tail is performed in a first system, wherein the first system is: 14 μL of RNA sample, 3 μL of 10× PolyA polymerase reaction buffer, 1 μL of 10 mM ATP, 4 μL of 10 mg/mL BSA, 2 μL of DNase I, 0.5 μL of 10 U/μL T4 polynucleotide kinase, 1 μL of 5 U/μL PolyA polymerase, 0.5 μL of 40 U/μL RNase inhibitor, and 4 μL of nuclease-free water;
  • the components of the second system are: 30 μL of the reaction product of the first system, 2 μL of 5 μM reverse transcription primer, 4 μL of 5× HiScript III reaction buffer, 1 μL of 200 U/μL HiScript III reverse transcriptase, 2 μL of 5 mM dNTP Mix, 0.5 μL of 40 U/μL RNase inhibitor and 10.5 μL of nuclease-free water.
  • the step of adding polyA tail is performed in a second system, wherein the first system is: 14 μL RNA sample, 2 μL 10× DNase I reaction buffer, 1 μL 10 mM ATP, 2 μL DNase I, 0.5 μL 10 U/μL T4 polynucleotide kinase and 0.5 μL 40 U/μL RNase inhibitor;
  • the second system consists of: 20 μL of the reaction product of the first system, 6 μL of 5× HiScript III reaction buffer, 1 μL of 10 mg/mL BSA, 10 μL of 50% PEG8000, 2 μL of 5 mM dNTP Mix, 2 μL of 5 μM reverse transcription primer, 1 μL of 5 U/μL Poly A polymerase, 0.5 μL of 40 U/μL RNase inhibitor, 1 μL of 200 U/μL HiScript III reverse transcriptase and 6.5 μL of nuclease-free water.
  • the third system is: the reaction product after U base digestion, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer, pH 8.0, 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, and 21.4 μL of nuclease-free water.
  • the third system also includes a reagent for U base digestion.
  • the third system is: the reaction product of polyA tailing and reverse transcription, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer at pH 8.0, 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, 2 μL of 10 U/μL uracil specific excision reagent USER enzyme, and 19.4 μL of nuclease-free water.
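As a quick consistency check on the first and second systems listed above, the component volumes can be summed to confirm that each first system yields exactly the volume carried into the matching second system. The sketch below uses only the volumes stated in the text; the variable and function names are illustrative.

```python
# Component volumes in microliters, copied from the two variants described above.
# Variant A: polyA tailing is performed in the first system.
FIRST_SYSTEM_A = [14, 3, 1, 4, 2, 0.5, 1, 0.5, 4]
SECOND_SYSTEM_A = [30, 2, 4, 1, 2, 0.5, 10.5]   # first entry: product of first system
# Variant B: polyA tailing is performed in the second system.
FIRST_SYSTEM_B = [14, 2, 1, 2, 0.5, 0.5]
SECOND_SYSTEM_B = [20, 6, 1, 10, 2, 2, 1, 0.5, 1, 6.5]  # first entry: product of first system


def total_volume(volumes_ul):
    """Total reaction volume in microliters."""
    return sum(volumes_ul)


for name, first, second in [
    ("A", FIRST_SYSTEM_A, SECOND_SYSTEM_A),
    ("B", FIRST_SYSTEM_B, SECOND_SYSTEM_B),
]:
    carried_over = second[0]
    print(
        f"variant {name}: first system {total_volume(first)} uL, "
        f"carried over {carried_over} uL, second system {total_volume(second)} uL"
    )
```

Both variants end with the same second-system volume, which is consistent with the downstream steps being shared between them.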
  • reaction steps can be selectively combined into the same experimental operation step, so as to flexibly modify the experimental steps and control the experimental time.
  • the combination of reaction steps can be adjusted according to the specific experiment, and the present invention is not limited to this.
  • the present invention provides a connector for introducing a specific target sequence at the 5' and 3' ends of a single-stranded cDNA molecule and a corresponding DNA connection scheme.
  • This scheme involves two connectors, a first connector and a second connector.
  • Both connectors contain a double-stranded region and at least one protruding single-stranded region, wherein the double-stranded region contains a universal structural sequence for PCR or sequencing, and the protruding single-stranded region contains one or more (1-10) random base sequences, which are used for complementary pairing with the end of the single-stranded DNA molecule, so that the connector and the single-stranded DNA can be splinted.
  • the protruding single-stranded region of the first connector is at the 5' end, and the protruding single-stranded region of the second connector is at the 3' end.
  • the first connector is used for complementary pairing at the 5' end of the single-stranded DNA molecule
  • the second connector is used for complementary pairing and splint connection at the 3' end of the single-stranded DNA molecule.
  • One embodiment of the first connector is that two polynucleotide sequences interact to form its special structure, and the two sequences contain complementary pairing regions, which will be complementary paired to form a specific double-stranded structure and a protruding single-stranded region containing random bases after solution mixing and static treatment.
  • One implementation of the second linker is that two polynucleotide sequences interact to form its special structure, and the two sequences contain complementary paired regions, which will complement each other to form a specific double-stranded structure and a protruding single-stranded region containing random bases after solution mixing and static treatment.
  • the schematic diagram of the preparation principle of the first linker and the second linker solution is shown in Figure 2.
  • the present invention provides a sequence scheme of a first joint and a second joint, which consists of four nucleotide sequences, and the sequences are shown in Table 2.
  • the first joint consists of two sequences, namely, the second nucleotide sequence and the third nucleotide sequence in Table 2.
  • the 5' end of the third nucleotide sequence contains a random base sequence, the number of random bases is 1-10, and the random base is base A, base T, base C or base G; the random base sequence can be complementary to the 5' end of the single-stranded cDNA molecule in the first joint structure, and is used for splint connection to improve the connection efficiency.
  • the two sequences of the second joint are the fourth nucleotide sequence and the fifth nucleotide sequence in Table 2.
  • the 3' end of the fifth nucleotide sequence contains a random base sequence, the number of random bases is 1-10, and the random base is base A, base T, base C or base G; the random base sequence can be complementary to the 3' end of the single-stranded cDNA molecule in the second joint structure, and is used for splint connection to improve the connection efficiency.
  • the 5' end of the second nucleotide sequence needs to be specially modified to block it from performing a connection reaction.
  • the 5' end of the fourth nucleotide sequence is phosphorylated; the 3' end is specially modified to block the ligation reaction.
  • the 5' end and 3' end of the third nucleotide sequence and the fifth nucleotide sequence are specially modified to block the ligation reaction.
  • both the 5' and 3' ends of the single-stranded DNA molecule can be directly connected to the linker, and then the primers designed based on the linker sequence can be used for PCR amplification reaction.
  • the present invention uses a DNA ligation reaction enhancing reagent, for example, a chemical reagent hexaamminecobalt chloride with a suitable concentration is used to improve the DNA ligation efficiency, realize the connection of the two end adapters at the 5' end and the 3' end of the single-stranded cDNA molecule in a one-step experimental operation, and shorten the reaction time of the connection.
  • PCR amplification can be directly performed to obtain the final library.
  • through the PCR primers, a complete sequencing structure sequence can be introduced, including a barcode sequence that can distinguish samples, to adapt to the needs of the sequencing platform.
  • the present invention does not require an additional separate step to synthesize the complementary double-stranded DNA of the cDNA molecule, thereby reducing the steps and time required for library preparation.
  • because the present invention adopts a library preparation method based on the single-stranded cDNA molecule, the strand-specific information of the RNA is retained, which is more conducive to RNA analysis such as gene annotation.
  • the present invention provides a set of universal primer sequences for PCR amplification of library samples, wherein the forward universal primer is the sixth nucleic acid sequence, and the reverse primer is the seventh nucleic acid sequence.
  • the reverse primer may contain a label (Barcode) sequence for sample identification, which is named the seventh nucleic acid sequence-N, where N represents the Barcode number, and different Barcode sequences have different numbers.
  • the universal primer sequence is composed of the nucleotides in Table 3 below.
  • the Barcode sequence in the seventh nucleic acid sequence-N can be matched with the Barcode of the universal sequencing library of the sequencing platform, the structure of the PCR amplification product can be kept consistent with the universal library structure of the sequencing platform, and the source of the sample library can be accurately split by the Barcode.
  • the adapter and PCR primer sequence schemes (Table 2 and Table 3) provided by the present invention are suitable for preparing high-throughput sequencing libraries for the DNBSEQ and MGISEQ series of MGI sequencing platforms.
  • the design principle provided by the present invention can also be used to design sequence schemes suitable for other sequencing platforms to prepare compatible sequencing libraries.
  • test materials used in the present invention are all common commercial products and can be purchased on the market.
  • the present invention is further described below in conjunction with the embodiments:
  • UHRR Universal Human Reference RNA
  • Agilent Agilent, 740000
  • One tube of commercial standard 200 μg RNA, 70% ethanol and 0.1 M sodium acetate solution
  • the precipitate was washed with 70% ethanol, centrifuged again at 4°C, 12,000 × g for 15 minutes, the supernatant was carefully aspirated, the precipitate was dried at room temperature for 30 minutes, and the precipitate was re-dissolved with 1 mL of nuclease-free H2O to an RNA concentration of about 200 ng/μL, and the accurate concentration of the re-dissolved UHRR solution was determined using a Qubit 3.0 fluorometer (Invitrogen, Q33216).
  • the standard solution was diluted stepwise with nuclease-free H2O to prepare 4 samples with different RNA starting amounts, namely 100 ng, 10 ng, 2 ng and 0.2 ng, with a total volume of 14 μL in each tube.
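The arithmetic behind this dilution series can be sketched as follows. The calculation uses the nominal stock concentration (200 μg re-dissolved in 1 mL); in practice the Qubit-measured concentration would replace it. The function and constant names are illustrative.

```python
STOCK_MASS_UG = 200.0          # one tube of UHRR standard
REDISSOLVE_VOLUME_ML = 1.0     # nuclease-free water used for re-dissolution
TUBE_VOLUME_UL = 14.0          # final volume of each diluted sample
STARTING_AMOUNTS_NG = [100.0, 10.0, 2.0, 0.2]


def stock_concentration_ng_per_ul(mass_ug, volume_ml):
    """Convert ug and mL to ng/uL (1 ug = 1000 ng, 1 mL = 1000 uL)."""
    return (mass_ug * 1000.0) / (volume_ml * 1000.0)


def working_concentration_ng_per_ul(amount_ng, volume_ul):
    """Concentration needed so that volume_ul contains amount_ng of RNA."""
    return amount_ng / volume_ul


stock = stock_concentration_ng_per_ul(STOCK_MASS_UG, REDISSOLVE_VOLUME_ML)
for amount in STARTING_AMOUNTS_NG:
    conc = working_concentration_ng_per_ul(amount, TUBE_VOLUME_UL)
    fold = stock / conc
    print(f"{amount} ng in {TUBE_VOLUME_UL} uL: {conc:.4f} ng/uL, about {fold:.0f}-fold dilution")
```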
  • a human plasma sample with a volume of 400 μL was used and divided into two portions of 200 μL plasma.
  • A serum/plasma miRNA extraction and separation kit (spin column type, TIANGEN, DP503) was used for extraction, and the operation was performed strictly according to the instructions.
  • According to the method described in Example 1, human universal RNA standard UHRR solutions with starting amounts of 2 ng, 10 ng, and 100 ng were prepared, and the samples were named UHRR-2ng, UHRR-10ng, and UHRR-100ng.
  • DNase I (RNase-free) (NEB, M0303L)
  • T4 polynucleotide kinase (T4 PNK, NEB, M0201L)
  • polyadenylic acid tailing was performed at the 3' end of RNA molecules using polyadenylic acid polymerase (NEB, E. coli Poly (A) Polymerase, M0276L)
  • the reaction system is shown in Table 7. The reaction system was incubated at 37°C for 15 minutes, inactivated at high temperature (95°C) for 5 minutes, and after the reaction, the sample was placed on ice for 2 minutes.
  • the first nucleic acid sequence primer i.e., reverse transcription primer
  • the first nucleotide sequence in this embodiment is specifically 5'-TTTTTTTUTTTTTTTUVN-3' (SEQ ID NO: 1). After adding the primer solution, shake and mix, centrifuge briefly, denature at 65°C for 5 minutes, and incubate at 30°C for 1 minute.
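The degenerate positions at the 3' end of this primer can be made concrete by enumerating every sequence it represents. The sketch below assumes that V and N follow the standard IUPAC nucleotide codes (V = A, C, or G; N = any base) and that U denotes dU; the function name `expand_degenerate` is illustrative.

```python
from itertools import product

# IUPAC codes used by the primer; 'U' is kept as-is and stands for dU here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "U": "U", "V": "ACG", "N": "ACGT"}


def expand_degenerate(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate("TTTTTTTUTTTTTTTUVN")
print(len(variants))
for variant in variants:
    print(variant)
```

Because V excludes T, the base immediately after the oligo dT stretch is never T, which anchors the primer at the junction between the polyA tail and the RNA body.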
  • the U base digestion reaction was performed using uracil specific excision reagent (USER) enzyme (NEB, USER Enzyme, M5505L), and the digested single-stranded cDNA molecule product was obtained. 4 ⁇ L of U base digestion reaction mixture was added to the reaction system, and its composition is shown in Table 10. The reaction system was placed on a PCR instrument (Bori, TC-96) and the program in Table 11 was run.
  • T4 polynucleotide kinase (T4 PNK, NEB, M0201L)
  • high temperature is used to denature the cDNA molecule to unwind the double-stranded structure that may appear in the local area of the cDNA.
  • ET SSB (NEB, M2401S)
  • the reaction system is placed on a PCR instrument (Bori, TC-96) and the program in Table 13 is run.
  • the present invention uses a DNA ligation reaction enhancing reagent.
  • a chemical reagent hexaamminecobalt chloride with a suitable concentration is used to improve the DNA ligation efficiency, realize the connection of the two end adapters of the 5' end and the 3' end of the single-stranded cDNA molecule in a one-step experimental operation, and shorten the reaction time of the connection.
  • the primer of the sixth nucleic acid sequence (40 ⁇ M) is used as the forward primer
  • the primer of the seventh nucleic acid sequence-N (40 ⁇ M) is used as the reverse primer for universal PCR amplification reaction.
  • the reverse primer contains a Barcode sequence for sample identification. Each sample uses a unique Barcode sequence, and the different Barcode sequences in the reverse primers are used to distinguish the samples in the sequencer output data.
  • the PCR enzyme reaction solution used in this embodiment is KAPA HiFi HotStart ReadyMix (2X) (Kapa Biosystems, KK2602), and the reaction composition is shown in Table 16.
  • the reaction system is placed on a PCR instrument (Bori, TC-96) to run the program in Table 17.
  • The PCR product obtained by universal PCR amplification was purified using 50 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881). The concentration of the 35 μL of purified DNA was measured using a Qubit 3.0 fluorometer (Invitrogen, Q33216). The library samples were then pooled at the same final concentration into a sequencing library sample and mixed by shaking before use.
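One way to carry out such pooling is to take the same DNA mass from every library, so that each contributes equally to the pool. The sketch below is an assumption about how the pooling could be calculated, not a procedure stated in the text; the concentrations and the target mass are hypothetical, and `pooling_volumes` is an illustrative name.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_library_ng):
    """Volume (uL) to take from each library so that all contribute equal mass."""
    return {
        name: mass_per_library_ng / conc
        for name, conc in concentrations_ng_per_ul.items()
    }


# Hypothetical Qubit readings in ng/uL, for illustration only.
qubit = {"UHRR-2ng": 10.0, "UHRR-10ng": 20.0, "UHRR-100ng": 40.0}
volumes = pooling_volumes(qubit, mass_per_library_ng=100.0)
for name, volume in volumes.items():
    print(f"{name}: {volume} uL")
```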
  • Single-chain cyclization was performed using the MGIEasy cyclization module V2.0 (MGI, 1000005260) of Shenzhen MGI Intelligent Manufacturing Technology Co., Ltd., and sequencing was performed using the MGISEQ-2000RS high-throughput sequencing reagent kit (FCL PE100) (MGI, 1000012554) of Shenzhen MGI Intelligent Manufacturing Technology Co., Ltd. All operations were performed strictly in accordance with the instructions of the kit.
  • the PE100+10 (Paired end 100+10) sequencing type was used to obtain reliable base sequence information.
  • the sequencer output data was split and screened according to the Barcode sequence to obtain the sequencing data of each sample.
  • the human universal RNA standard UHRR was prepared with starting amounts of 0.2 ng, 2 ng, and 10 ng, respectively, and the samples were named UHRR-F-0.2 ng, UHRR-F-2 ng, and UHRR-F-10 ng.
  • Plasma-F-200uL-1 and Plasma-F-200uL-2.
  • DNase I (RNase-free) (NEB, M0303L)
  • T4 polynucleotide kinase (T4 PNK, NEB, M0201L)
  • the reaction system is shown in Table 18. The reaction system was incubated at 37°C for 15 minutes and inactivated at high temperature (95°C) for 5 minutes. After the reaction, the sample was placed on ice for 2 minutes.
  • Polyadenylic acid tailing was performed on the 3' end of the RNA molecule using polyadenylic acid polymerase (NEB, E. coli Poly (A) Polymerase, M0276L); reverse transcription was performed using the primer of the first nucleic acid sequence and 200 U/μL HiScript III Reverse transcriptase (Vazyme, R302-01).
  • the first nucleic acid sequence in this embodiment is specifically 5'-TTTTTTTUTTTTTTTUVN-3'.
  • 30 ⁇ L of polyadenylic acid tailing and reverse transcription reaction mixture was added to the reaction system, and its composition is shown in Table 19. The reaction system was placed on a PCR instrument (Bori, TC-96) and the program in Table 20 was run.
  • Table 19 Compositions of the polyadenylation tailing and reverse transcription reaction mixture.
  • the U base digestion reaction was performed using uracil specific excision reagent (USER) enzyme (NEB, USER Enzyme, M5505L), and the digested single-stranded cDNA molecule product was obtained.
  • the 5' end of the digested single-stranded cDNA molecule was phosphorylated using T4 polynucleotide kinase (T4 PNK, NEB, M0201L).
  • T4 PNK T4 polynucleotide kinase
  • ET SSB (NEB, M2401S) was added to the denaturation reaction system to assist in maintaining the linear structure of the single-stranded cDNA molecules and avoid annealing and renaturing into complex hairpin-like structures after denaturation.
  • 30 ⁇ L of the U base digestion reaction mixture was added to the reaction system, and its composition is shown in Table 21.
  • the reaction system was placed on a PCR instrument (Bori, TC-96) and the program in Table 22 was run.
  • The sequencing quality of the raw data from Example 3 and Example 4 was analyzed, and more than 90% of the bases had a quality score above Q30.
  • the quality of the sequencing chip was analyzed, and the sequencing quality did not decrease significantly during the entire sequencing cycle, and always remained at a high level, as shown in Figure 4. The above results show that high-quality sequencing data can be obtained using the method of the present invention.
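A Q30 proportion of this kind can be computed directly from per-base quality strings. The sketch below assumes FASTQ-style Phred+33 encoding, which the text does not specify; the quality strings are hypothetical and `fraction_q30` is an illustrative name.

```python
def fraction_q30(quality_strings, offset=33):
    """Fraction of bases with Phred quality >= 30 (Phred+33 encoding assumed)."""
    total = 0
    passing = 0
    for quality in quality_strings:
        for symbol in quality:
            total += 1
            if ord(symbol) - offset >= 30:
                passing += 1
    return passing / total if total else 0.0


# Hypothetical quality strings: 'I' is Q40, '?' is Q30, '5' is Q20, '#' is Q2.
example = ["IIII?", "II5#?"]
print(f"{fraction_q30(example):.0%} of bases are at or above Q30")
```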
  • the original sequencing data is split and screened by the Barcode sequence to obtain the sequencing data of each sample.
  • the present invention can realize the simultaneous detection of multiple samples in the same sequencing chip.
  • the raw sequencing data (raw reads) of each sample were filtered for low-quality bases, short sequences, linkers, and microbial sequences, and then ribosomal RNA (rRNA) was removed.
  • rRNA ribosomal RNA
  • the alignment rate of UHRR samples with low starting amount is relatively low, for example, the total alignment rate of UHRR-F-0.2ng is 43.63%, and the unique alignment rate is 35.27%; the total alignment rate of UHRR-F-2ng is 86.36%, and the unique alignment rate is 61.15%.
  • the total alignment rate of the plasma free RNA samples starting from 200 μL is about 35%, and the unique alignment rate is about 28%.
  • the strand-specific ratios calculated for UHRR samples were all above 80%, and most of them reached around 90% or above; the strand-specific ratios calculated for plasma samples were around 70%.
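The text does not state how this ratio was computed. A common definition, assumed here, is the share of gene-mapped reads whose orientation matches the annotated strand; the read counts below are hypothetical and `strand_specific_ratio` is an illustrative name.

```python
def strand_specific_ratio(sense_reads, antisense_reads):
    """Share of stranded reads that map in the expected (sense) orientation."""
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no stranded reads to evaluate")
    return sense_reads / total


# Hypothetical counts, for illustration only.
ratio = strand_specific_ratio(sense_reads=900, antisense_reads=100)
print(f"strand-specific ratio: {ratio:.1%}")
```

Under this definition, an unstranded library would give a ratio near 50%, so values of 70% to 90% indicate that strand information is largely retained.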
  • RNA types analyzed in this embodiment include mRNA (messenger RNA), lncRNA (long non-coding RNA), pseudogene RNA, miRNA (microRNA), tRNA (transfer RNA), mt-tRNA (mitochondrial transfer RNA), mt-rRNA (mitochondrial ribosomal RNA), snoRNA (small nucleolar RNA), and snRNA (small nuclear RNA).
  • mRNA: messenger RNA
  • lncRNA: long non-coding RNA
  • pseudogene RNA
  • miRNA: microRNA
  • tRNA: transfer RNA
  • mt-tRNA: mitochondrial transfer RNA
  • mt-rRNA: mitochondrial ribosomal RNA
  • snoRNA: small nucleolar RNA
  • snRNA: small nuclear RNA
  • All UHRR samples can detect more than 10,000 protein-coding genes; can also detect about 4,000 lncRNA genes; can detect about 3,000 to 10,000 pseudogene RNA; can detect about 200-900 miRNA genes; can detect about 200-300 tRNA genes; can detect about 300-800 snoRNA genes and about 40-90 snRNA genes; 22 tRNAs and 2 rRNAs of mitochondria, namely mt-tRNA and mt-rRNA, can all be detected.
  • the number of genes that can be detected in free RNA is relatively small compared to UHRR. But it can also detect nearly 10,000 protein-coding genes, and can detect other types of RNA. Attached Figure 6 of this embodiment also shows the percentage of the number of various RNA genes detected in some samples.
  • RNA expression was quantified using the TPM (transcripts per million) calculation method, based on the whole-genome GTF (gene transfer format) file, to achieve qualitative and quantitative analysis of different types of RNA and obtain RNA expression profiles.
  • TPM Transcript per million
  • gtf gene transfer format
  • Quantitative analysis of RNA expression was performed according to the log2(TPM+1) value.
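The standard TPM calculation and the log2(TPM+1) transform can be sketched as follows. Read counts are first normalized by gene length in kilobases, then scaled so that all values sum to one million. The gene names, counts, and lengths are hypothetical.

```python
import math


def tpm(counts, lengths_bp):
    """Transcripts per million from read counts and gene lengths in base pairs."""
    # Length-normalized rate: reads per kilobase of gene.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    scale = sum(rates.values())
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / scale * 1_000_000 for gene, rate in rates.items()}


def log2_tpm_plus_one(tpm_values):
    """log2(TPM + 1); the pseudocount keeps undetected genes at 0."""
    return {gene: math.log2(value + 1) for gene, value in tpm_values.items()}


# Hypothetical counts and gene lengths, for illustration only.
counts = {"geneA": 100, "geneB": 300, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
values = tpm(counts, lengths)
logged = log2_tpm_plus_one(values)
print(values)
print(logged)
```

In this example geneA and geneB receive the same TPM despite different read counts, because geneB is three times longer.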
  • the fluctuation of low-expression genes had a certain negative impact on the results, and the calculated correlation coefficient was lower than the value calculated by the original TPM value.
  • the correlation coefficient of UHRR samples was at least 0.67, and most of them were above 0.70; the correlation coefficient of two technical replicates of plasma free RNA samples was 0.83.
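A correlation of this kind can be computed as a Pearson coefficient over the log2(TPM+1) values of two samples. The text does not name the coefficient used, so Pearson is an assumption here, and the TPM values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical TPM values of the same five genes in two replicates.
replicate_1 = [0.0, 5.0, 20.0, 150.0, 1200.0]
replicate_2 = [1.0, 4.0, 25.0, 130.0, 1500.0]
log_1 = [math.log2(v + 1) for v in replicate_1]
log_2 = [math.log2(v + 1) for v in replicate_2]
print(f"Pearson r on log2(TPM+1): {pearson(log_1, log_2):.3f}")
```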
  • the range of UHRR sample starting amounts implemented in the present invention is relatively wide, including four different starting amounts of 0.2 ng, 2 ng, 10 ng, and 100 ng. The lowest UHRR starting amount is as low as 0.2 ng, and the starting amount of plasma samples is as low as 200 μL. These starting amounts are much lower than the RNA starting amounts used in most current commercial RNA kits and in the literature. Even at such low starting amounts, and across two batches using two different library preparation processes (Example 3 and Example 4), a high correlation coefficient is still obtained, indicating that the method of the present invention offers high stability and works with low starting amounts.
  • the present invention is a strand-specific library preparation method and high-throughput sequencing technology for rapid detection of multiple types of RNA, and the RNA types that can be detected simultaneously include but are not limited to mRNA, lncRNA, tRNA, miRNA, etc. It solves the problems of the traditional RNA library preparation method that the RNA types captured are relatively single or limited, the experimental steps are complicated, and the time period is long.
  • the RNA sequencing library prepared by the present invention does not contain artificially added low-complexity sequences, so that the sequencing quality and data volume are guaranteed.
  • the sequencing library prepared by the present invention contains RNA strand-specific information, which is beneficial to RNA analysis such as gene annotation.
  • the present invention combines multiple reaction steps mentioned in the experimental principle into the same experimental operation step by optimizing the reaction conditions. For example, polyadenylation tailing and reverse transcription are combined into one step, and the separate ligations of the two end adapters of the cDNA are combined into one step, thereby reducing the operation steps and time of library preparation and making the experimental operation simpler and faster.
  • the present invention uses a DNA ligase scheme to replace the RNA ligase scheme in the miRNA library preparation method to add a sequencing adapter sequence through experimental design, thereby improving the efficiency of connection, shortening the time required for connection, and reducing the cost.
  • the present invention adopts PCR amplification technology to amplify the library sequence, introduces the Barcode sequence for library identification and the structural sequence required for cyclization reaction and sequencing through PCR technology, and can realize the mixing of multiple library samples for sequencing together, thereby improving the detection throughput and reducing the detection cost.
  • the method of the present invention is applicable to high-throughput sequencing detection of various sample types such as tissue cell RNA and free RNA, and can be applied to RNA samples with low starting amounts at the ng level or even the pg level.
  • the present invention can be used for free RNA detection in the field of liquid biopsy.
  • the above is only a preferred embodiment of the present invention. It should be pointed out that for ordinary technicians in this technical field, several improvements and modifications can be made without departing from the principles of the present invention. These improvements and modifications should also be regarded as the protection scope of the present invention.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided in the present invention are a method for preparing a strand-specific library for the rapid detection of various types of RNAs, and a high-throughput sequencing method. A polyadenylic acid (poly A) tail is artificially added at the 3' terminal of various RNAs by using a poly A polymerase, and a single-stranded cDNA is synthesized using a poly-deoxythymidine ribonucleotide primer with deoxyuridine under the action of a reverse transcriptase. The obtained single-stranded cDNA molecule is subjected to a series of reactions, and the reaction product is finally amplified by means of PCR to obtain a strand-specific library of various types of RNAs. The library sample can be subjected to on-machine sequencing.

Description

A strand-specific library preparation method and high-throughput sequencing technology for rapid detection of multiple types of RNA

Technical Field

The present invention relates to the field of biotechnology, and in particular to a strand-specific library preparation method and high-throughput sequencing technology for rapidly detecting multiple types of RNA.

Background Art

With the development of basic disciplines such as molecular biology and genomics, liquid biopsy has become a frontier hot spot in the field of precision medicine. Cell-free RNA (cfRNA) exists in various body fluids, such as blood, urine, cerebrospinal fluid, saliva, milk, pleural effusion, and ascites, and can indicate and monitor the physiological state of the body. The types of RNA in cells include ribosomal RNA (rRNA), messenger RNA (mRNA), transfer RNA (tRNA), microRNA (miRNA), long non-coding RNA (lncRNA), etc. The free RNA in body fluid samples carries real-time gene expression information from different tissues and organs, which can provide more solutions for early disease screening, auxiliary diagnosis, recurrence monitoring, disease prognosis, etc. Because sampling is non-invasive or minimally invasive and allows real-time, comprehensive monitoring, free RNA may become an important detection method for precision medicine in the future.

To fully explore the biomarker potential of cell-free RNA, an efficient, rapid, low-cost and stable cell-free RNA detection technology is needed. RNA detection based on high-throughput sequencing offers single-base resolution, a wide detection range and high throughput. However, preparing high-throughput sequencing libraries from cell-free RNA still faces many challenges, including but not limited to low sample input, easily degraded RNA, long experimental cycles, low efficiency, and target RNA types that are single or limited. There is currently no commercial sequencing library preparation kit on the market designed specifically for cell-free RNA; researchers mainly study cell-free RNA with kits designed for RNA from conventional cell and tissue samples. Commercial RNA sequencing library preparation kits target a single or limited set of RNA types, so the information they yield is limited and they cannot support a systematic study of all RNA types.

There is currently no commercial library preparation kit on the market that comprehensively captures multiple types of RNA and is designed specifically for cell-free RNA. The workaround is to build cell-free RNA sequencing libraries with conventional RNA sequencing library preparation kits.

Conventional RNA sequencing library preparation kits fall mainly into kits for mRNA and kits for miRNA, and cannot comprehensively capture the multiple types of RNA in total RNA (mRNA, lncRNA, tRNA, miRNA and so on) at the same time. Most commercial RNA sequencing library preparation kits require a high input of total RNA. In addition, commercial RNA sequencing library workflows involve many steps, complicated procedures and long experimental cycles, which hinders rapid and stable detection in clinical settings. Commercial library preparation kits suitable for mRNA detection currently include Illumina's TruSeq RNA Library Prep Kit v2 and TruSeq Stranded mRNA and Total RNA Library Prep kits, NEB's NEBNext Ultra II RNA Library Prep Kit for Illumina, and TAKARA's SMARTer Stranded Total RNA-Seq Kit. These kits use poly-deoxythymidine (oligo dT) magnetic beads or primers to selectively enrich mRNA carrying a polyadenylic acid (polyA) tail and prepare mRNA-specific sequencing libraries. With an added rRNA removal step followed by reverse transcription from primers containing several random bases, these kits can capture lncRNA along with mRNA, but they capture few short RNAs such as miRNA. All of these kits first reverse transcribe RNA into single-stranded cDNA and then synthesize the complementary second strand to form double-stranded DNA, from which the sequencing library is prepared, so the procedure is relatively complicated and the experimental cycle relatively long.

Many commercial kits for miRNA or small RNA are also available, for example Illumina's TruSeq small RNA Library Prep Kit, Bioo Scientific's Nextflex small RNA-Seq Kit v3, NEB's NEBNext small RNA Library Prep Set for Illumina, MGI's MGIEasy small RNA library preparation kit, and QIAGEN's QIAseq miRNA Library Kit. These kits all use RNA ligase to ligate adapters containing specific sequences directly to the 3' and 5' ends of RNA molecules, reverse transcribe the product into cDNA, and then prepare the sequencing library by PCR amplification. They are not suitable for capturing long RNAs such as mRNA and lncRNA. Because they depend on RNA ligase, they are relatively costly and inefficient, ligation takes a long time, and the procedure has many steps and a long experimental cycle.

One commercial small RNA kit, the SMARTer smRNA-Seq Kit for Illumina, works on a different principle. The 3' end of the small RNA is first polyadenylated; reverse transcription is then primed with a primer containing an adapter sequence and oligo dT, and template switching during reverse transcription introduces the adapter sequence for the other end at the end of the cDNA. Finally, PCR using the adapter sequences added at both ends completes the sequencing library. However, this method introduces artificial poly-deoxythymidine sequence, which lowers the sequence complexity of the library and seriously affects sequencing quality. To limit the effect of low-complexity sequence, a base-balanced library has to be added during sequencing, which reduces the amount of effective data.

Commonly used RNA library preparation methods are not suitable for low-input or degraded samples and are poorly adapted to plasma samples. Research protocols generally require millilitre-scale plasma input, and this high input limits how widely they can be used.

Summary of the Invention

In view of this, the technical problem to be solved by the present invention is to provide a strand-specific library preparation method for rapidly detecting multiple types of RNA and a high-throughput sequencing technology. The method provided has high sensitivity, a wide range of application and strong resistance to interference, is simple and fast to perform, and is suitable for high-throughput sequencing.

The present invention provides a method for preparing a library of multiple types of RNA, comprising: adding polyA to the end of an RNA sample, then performing reverse transcription and U base digestion to prepare a cDNA library;

the RNA sample contains at least one of total RNA, rRNA, mRNA, tRNA, miRNA and lncRNA.

The present invention adds a polyadenylic acid (poly A) tail to the end of each type of RNA molecule and combines this with reverse transcription from a poly-deoxythymidine (oligo dT) primer, so that multiple types of RNA molecules, including non-coding RNA (lncRNA, miRNA, etc.), can be reverse transcribed in the subsequent steps and built into a library.

Further, the preparation method specifically includes: DNA digestion, RNA end modification, polyA tailing, reverse transcription, U base digestion, cDNA end modification, denaturation, adapter addition and library construction.

By optimizing the choice of reaction reagents and other conditions, the present invention combines two or more of these reaction steps. For example, in some embodiments, DNA digestion, RNA end modification and polyA tailing are combined. In some embodiments, DNA digestion and RNA end modification are combined, and polyA tailing and reverse transcription are then combined. In some embodiments, U base digestion, cDNA end modification and denaturation are combined. In some embodiments, U base digestion and cDNA end modification are combined. Combining reaction steps reduces the number of steps and the time required while preserving reaction efficiency and the quality of the resulting library.
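As a rough illustration of how combining steps shortens the workflow, the sketch below models the nine reaction steps as an ordered list and two groupings of them into bench operations. The groupings follow the combinations the description later attributes to Examples 3 and 4; treating adapter addition and library construction as separate operations, and all names in the code, are assumptions made for illustration only.

```python
# The nine reaction steps, in the order the description lists them.
STEPS = [
    "DNA digestion", "RNA end modification", "polyA tailing",
    "reverse transcription", "U base digestion", "cDNA end modification",
    "denaturation", "adapter addition", "library construction",
]

# Hypothetical grouping modelled on the combination attributed to Example 3.
GROUPING_A = [
    ["DNA digestion", "RNA end modification", "polyA tailing"],
    ["reverse transcription"],
    ["U base digestion"],
    ["cDNA end modification", "denaturation"],
    ["adapter addition"],
    ["library construction"],
]

# Hypothetical grouping modelled on the combination attributed to Example 4.
GROUPING_B = [
    ["DNA digestion", "RNA end modification"],
    ["polyA tailing", "reverse transcription"],
    ["U base digestion", "cDNA end modification", "denaturation"],
    ["adapter addition"],
    ["library construction"],
]


def count_operations(grouping):
    """Return the number of bench operations in a grouping.

    Raises ValueError unless the grouping covers every step exactly once
    and in the original order.
    """
    flattened = [step for operation in grouping for step in operation]
    if flattened != STEPS:
        raise ValueError("grouping must cover all steps once, in order")
    return len(grouping)


print(count_operations(GROUPING_A), count_operations(GROUPING_B))
```

Under these assumptions the nine reaction steps collapse into six or five operations, which is the kind of reduction the paragraph describes.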

In some embodiments, the DNA digestion and RNA end modification are performed in a first system;

the reverse transcription is performed in a second system;

the polyA tailing step is performed in either the first system or the second system.

Further, in one specific embodiment,

the polyA tailing step is performed in the first system, wherein:

the first system includes: RNA sample, PolyA polymerase reaction buffer, ATP, BSA, DNase I, T4 polynucleotide kinase, PolyA polymerase, RNase inhibitor and nuclease-free water;

the second system includes: the reaction product of the first system, reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix, RNase inhibitor and nuclease-free water.

To ensure that RNA end modification, polyA tailing and reverse transcription proceed smoothly in the same system, the present invention optimizes the reaction system. The components of the reaction system are matched to one another so that together they keep the reactions running. The present invention further optimizes the component concentrations in each system.

The components of the first system and their amounts are: 14 μL RNA sample, 3 μL 10× PolyA polymerase reaction buffer, 1 μL 10 mM ATP, 4 μL 10 mg/mL BSA, 2 μL DNase I, 0.5 μL 10 U/μL T4 polynucleotide kinase, 1 μL 5 U/μL PolyA polymerase, 0.5 μL 40 U/μL RNase inhibitor and 4 μL nuclease-free water;

The components of the second system and their amounts are: 30 μL reaction product of the first system, 2 μL 5 μM reverse transcription primer, 4 μL 5× HiScript III reaction buffer, 1 μL 200 U/μL HiScript III reverse transcriptase, 2 μL 5 mM dNTP Mix, 0.5 μL 40 U/μL RNase inhibitor and 10.5 μL nuclease-free water.
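The volumes listed for this embodiment can be checked with simple arithmetic. The sketch below, a minimal illustration and not part of the patent, sums the listed volumes of each system and confirms that the second system takes up the entire first-system product. The dictionary keys are descriptive labels chosen here, and "uL" stands in for μL.

```python
# Volumes in microlitres, copied from the first-system recipe above.
FIRST_SYSTEM_UL = {
    "RNA sample": 14.0,
    "10x PolyA polymerase reaction buffer": 3.0,
    "10 mM ATP": 1.0,
    "10 mg/mL BSA": 4.0,
    "DNase I": 2.0,
    "10 U/uL T4 polynucleotide kinase": 0.5,
    "5 U/uL PolyA polymerase": 1.0,
    "40 U/uL RNase inhibitor": 0.5,
    "nuclease-free water": 4.0,
}

# Volumes in microlitres, copied from the second-system recipe above.
SECOND_SYSTEM_UL = {
    "first-system reaction product": 30.0,
    "5 uM reverse transcription primer": 2.0,
    "5x HiScript III reaction buffer": 4.0,
    "200 U/uL HiScript III reverse transcriptase": 1.0,
    "5 mM dNTP Mix": 2.0,
    "40 U/uL RNase inhibitor": 0.5,
    "nuclease-free water": 10.5,
}


def total_volume(mix):
    """Sum the component volumes of a reaction mix, in microlitres."""
    return sum(mix.values())


first_total = total_volume(FIRST_SYSTEM_UL)    # 30.0 uL
second_total = total_volume(SECOND_SYSTEM_UL)  # 50.0 uL

# The whole first reaction is carried into the second system.
carried_over = SECOND_SYSTEM_UL["first-system reaction product"]
print(first_total, second_total, carried_over == first_total)
```

The first system totals 30 μL and the second 50 μL, so no intermediate purification or aliquoting is implied between the two reactions.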

Further, in another specific embodiment,

the polyA tailing step is performed in the second system, wherein:

the first system includes: RNA sample, DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase and RNase inhibitor;

the second system includes: the reaction product of the first system, HiScript III reaction buffer, BSA, PEG8000, dNTP Mix, reverse transcription primer, PolyA polymerase, RNase inhibitor, HiScript III reverse transcriptase and nuclease-free water.

To ensure that DNA digestion and RNA end modification proceed smoothly in one system, and that polyA tailing and reverse transcription proceed smoothly in another, the present invention optimizes the reaction systems. The components of each reaction system are matched to one another so that together they keep the reactions running. The present invention further optimizes the component concentrations in each system.

The components of the first system and their amounts are: 14 μL RNA sample, 2 μL 10× DNase I reaction buffer, 1 μL 10 mM ATP, 2 μL DNase I, 0.5 μL 10 U/μL T4 polynucleotide kinase and 0.5 μL 40 U/μL RNase inhibitor;

The components of the second system and their amounts are: 20 μL reaction product of the first system, 6 μL 5× HiScript III reaction buffer, 1 μL 10 mg/mL BSA, 10 μL 50% PEG8000, 2 μL 5 mM dNTP Mix, 2 μL 5 μM reverse transcription primer, 1 μL 5 U/μL PolyA polymerase, 0.5 μL 40 U/μL RNase inhibitor, 1 μL 200 U/μL HiScript III reverse transcriptase and 6.5 μL nuclease-free water.
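The recipe gives stock concentrations and volumes rather than final concentrations. As a hedged sketch, the final concentration of a component added only in the second system can be estimated from the dilution rule C1·V1 = C2·V2. The function and variable names are illustrative, and the calculation ignores anything carried over in the first-system product (for example the RNase inhibitor), so it applies only to components first introduced here.

```python
def final_concentration(stock, volume_ul, total_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2.

    The result is in the same unit as `stock`.
    """
    return stock * volume_ul / total_ul


# Listed volumes (uL) of the second system in this embodiment.
SECOND_SYSTEM_VOLUMES_UL = [20, 6, 1, 10, 2, 2, 1, 0.5, 1, 6.5]
TOTAL_UL = sum(SECOND_SYSTEM_VOLUMES_UL)  # 50 uL

# Components added for the first time in the second system.
FINAL = {
    "PEG8000 (%)": final_concentration(50, 10, TOTAL_UL),
    "dNTP Mix (mM)": final_concentration(5, 2, TOTAL_UL),
    "RT primer (uM)": final_concentration(5, 2, TOTAL_UL),
    "BSA (mg/mL)": final_concentration(10, 1, TOTAL_UL),
    "PolyA polymerase (U/uL)": final_concentration(5, 1, TOTAL_UL),
    "HiScript III reverse transcriptase (U/uL)": final_concentration(200, 1, TOTAL_UL),
}

for name, value in FINAL.items():
    print(f"{name}: {value:g}")
```

With a 50 μL total, this gives, for instance, 10% PEG8000 and 0.2 μM reverse transcription primer in the combined polyA tailing and reverse transcription reaction.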

In some embodiments, the cDNA end modification and denaturation are performed in a third system.

The third system includes: the reaction product after U base digestion, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, ultra-thermostable single-stranded binding protein and nuclease-free water.

To ensure that cDNA end modification and denaturation proceed smoothly in the same system, the present invention optimizes the reaction system. The components of the reaction system are matched to one another so that together they keep the reactions running. The present invention further optimizes the component concentrations in each system.

The components of the third system and their amounts are: the reaction product after U base digestion, 5 μL 10× polynucleotide kinase reaction buffer, 1 μL 10 U/μL T4 polynucleotide kinase, 2 μL 220 mM Tris buffer at pH 8.0, 0.6 μL 500 ng/μL ultra-thermostable single-stranded binding protein and 21.4 μL nuclease-free water.

In other embodiments, cDNA end modification, denaturation and U base digestion are performed in the same system, so the third system also includes the reagent for U base digestion.

In this embodiment, the third system includes: the reaction product of polyA tailing and reverse transcription, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, ultra-thermostable single-stranded binding protein, the uracil-specific excision reagent USER enzyme and nuclease-free water.

To ensure that cDNA end modification and denaturation proceed smoothly in the same system, the present invention optimizes the reaction system. The components of the reaction system are matched to one another so that together they keep the reactions running. The present invention further optimizes the component concentrations in each system.

The components of the third system and their amounts are: the reaction product of polyA tailing and reverse transcription, 5 μL 10× polynucleotide kinase reaction buffer, 1 μL 10 U/μL T4 polynucleotide kinase, 2 μL 220 mM Tris buffer at pH 8.0, 0.6 μL 500 ng/μL ultra-thermostable single-stranded binding protein, 2 μL 10 U/μL uracil-specific excision reagent USER enzyme and 19.4 μL nuclease-free water.

Compared with the prior art, the present invention flexibly combines multiple reaction steps into a single experimental operation and optimizes the experimental conditions, which reduces the number of operations and the time needed for library preparation while preserving reaction efficiency, and thereby gives better experimental results.

Further, the reverse transcription primer has the following nucleotide sequence: poly(T)n-UVNm;

where n is the number of T bases and m is the number of N bases;

n is an integer from 8 to 50, and m is an integer from 1 to 4;

at least one T in the poly(T)n is replaced by U; V is any one of bases A, C and G; and N is any one of bases A, T, C and G.
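The primer pattern can be expressed as a small validity check. The sketch below is one reading of poly(T)n-UVNm, assumed here for illustration: a poly(T) run of length n in which at least one T is replaced by U, followed by a U, one V position and m N positions, with V and N written as literal degenerate symbols. It does not reproduce SEQ ID NO: 1, and the function names are hypothetical.

```python
import re

# Poly(T/U) run, then the fixed U, one degenerate V, and a run of degenerate N.
_PRIMER_PATTERN = re.compile(r"^([TU]+)U(V)(N+)$")


def make_rt_primer(n, m, u_index):
    """Build a degenerate primer string with the U substitution at `u_index`."""
    poly_t = ["T"] * n
    poly_t[u_index] = "U"  # replace one T of the poly(T) run with U
    return "".join(poly_t) + "UV" + "N" * m


def is_valid_rt_primer(primer):
    """Check a degenerate primer string against the stated constraints."""
    match = _PRIMER_PATTERN.match(primer)
    if match is None:
        return False
    poly_t, _, n_run = match.groups()
    return (
        8 <= len(poly_t) <= 50   # n is an integer from 8 to 50
        and "U" in poly_t        # at least one T replaced by U
        and 1 <= len(n_run) <= 4  # m is an integer from 1 to 4
    )


example = make_rt_primer(n=10, m=1, u_index=4)
print(example, is_valid_rt_primer(example))
```

A primer whose poly(T) run contains no internal U, such as eight plain T bases followed by UVN, is rejected under this reading because the substitution requirement is not met.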

The present invention provides a poly-deoxythymidine (oligo dT) reverse transcription primer containing deoxyuridine (dU) for synthesizing single-stranded cDNA by reverse transcription. The primer sequence comprises dU, oligo dT and additional degenerate bases (V and N). Using this primer yields single-stranded cDNA that carries a dU-containing poly-deoxythymidine sequence. In the subsequent U base digestion step, a uracil-specific excision reagent digests the single-stranded cDNA and excises the dU-containing poly-deoxythymidine fragment, which removes the effect of the artificially introduced low-complexity sequence on later sequencing and analysis and improves cDNA library preparation. Prior-art approaches artificially introduce low-complexity sequence in order to capture miRNA or single-stranded DNA and then add a base-balanced library during sequencing (for example, the SMARTer smRNA-Seq Kit for Illumina); compared with these, the scheme provided by the present invention is better suited to increasing the amount of effective sequencing data.
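The effect of U base digestion on the cDNA can be sketched with a deliberately simplified string model. It assumes the cDNA is written 5'→3' with the primer at its 5' end, that dU occurs only in the primer, and that the excision reagent cuts at every dU and removes that base; end chemistry and fragment clean-up are ignored. The function names and the example sequence are hypothetical.

```python
def digest_at_du(cdna):
    """Split a 5'->3' cDNA string at every dU, dropping the dU itself."""
    return cdna.split("U")


def insert_after_digestion(cdna):
    """Return the fragment 3' of the last dU, i.e. the retained cDNA."""
    return digest_at_du(cdna)[-1]


# Hypothetical cDNA: dU-containing oligo dT primer, then V = C and N = A
# resolved to concrete bases, then a made-up insert.
primer_part = "TTTTUTTTTTU"
cdna = primer_part + "CA" + "GATTACA"

print(digest_at_du(cdna))
print(insert_after_digestion(cdna))
```

In this model the poly-deoxythymidine stretch ends up in short fragments separated from the insert, while the V and N positions and the transcript-derived sequence stay together in the retained fragment.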

In some specific embodiments, the reverse transcription primer has the nucleotide sequence shown in SEQ ID NO: 1. Experiments show that, compared with other reverse transcription primers, for example primers with 2, 3 or 4 N bases, the primer shown in SEQ ID NO: 1 gives a better reverse transcription reaction and therefore better experimental results.

Further, in the adapter addition step, the adapters include a 5' adapter and a 3' adapter; the 5' adapter has the nucleotide sequences shown in SEQ ID NO: 2 and SEQ ID NO: 3, and the 3' adapter has the nucleotide sequences shown in SEQ ID NO: 4 and SEQ ID NO: 5.

The present invention provides adapters that introduce specific target sequences at the 3' and 5' ends of a single-stranded cDNA molecule, together with the corresponding DNA ligation scheme. Each adapter contains a double-stranded region and at least one protruding single-stranded region. The double-stranded region carries the universal structural sequence used for PCR or sequencing, and the protruding single-stranded region contains one or more (1 to 10) random bases that pair with the end of the single-stranded DNA molecule, so that the adapter and the single-stranded DNA can be joined by splint ligation. Each adapter is formed from two polynucleotide strands with complementary regions; after the strands are mixed in solution and left to stand, they anneal into the specific double-stranded structure with a protruding single-stranded region of random bases. Once adapters have been added to both ends of the single-stranded cDNA, it can be PCR amplified directly to give the final library, which improves cDNA library preparation. Prior-art schemes such as SPLAT (splinted ligation adapter tagging) ligate the two adapter sequences to the 3' and 5' ends of the DNA molecule in separate steps; by contrast, the present invention ligates the 3' and 5' DNA adapters simultaneously in one step, which shortens the procedure while preserving ligation performance.
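Two properties of the random-base overhang can be illustrated in code: the number of distinct overhangs in a pool with k random positions, and whether one particular overhang pairs with a given cDNA end. This is a simplified sketch that assumes perfect Watson-Crick pairing, both sequences written 5'→3', and made-up example sequences; it says nothing about the actual adapter sequences in SEQ ID NO: 2 to 5.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhang_pool_size(k):
    """Number of distinct overhangs when k positions are random (A/C/G/T)."""
    if not 1 <= k <= 10:
        raise ValueError("the description gives 1 to 10 random bases")
    return 4 ** k


def overhang_pairs_with(cdna_end, overhang):
    """True if the overhang is fully complementary to the cDNA end.

    Both arguments are written 5'->3'; antiparallel pairing means the
    overhang must equal the reverse complement of the cDNA end.
    """
    return overhang == reverse_complement(cdna_end)


print(overhang_pool_size(6))
print(overhang_pairs_with("GATTAC", "GTAATC"))
```

Because the pool contains every possible overhang, some adapter in the mixture can pair with any cDNA end, which is what allows one adapter preparation to serve an entire mixed RNA-derived sample.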

In the present invention, the adapter addition reaction system includes: 5' adapter solution, 3' adapter solution, ligation reaction buffer, ligation reaction enhancer and T4 DNA ligase.

The adapter addition reaction system of the present invention uses DNA ligase in place of RNA ligase, which greatly shortens ligation time. (In schemes that add adapters to both ends of RNA with RNA ligase, the relatively low efficiency of RNA ligase means that ligating a single adapter usually takes 1 to 2 hours, the stepwise ligation of the two ends is relatively complicated, and total ligation time for both ends is 2 to 3 hours.) To help the DNA ligase perform well, the present invention optimizes the other components of the reaction system and their concentrations, further improving adapter ligation efficiency.

In some embodiments, the adapter addition reaction system includes: 5' adapter solution, 3' adapter solution, ligation reaction buffer, hexaamminecobalt chloride solution and T4 DNA ligase.

The present invention also provides a sequencing method for libraries of multiple types of RNA, in which the cDNA library produced by the above preparation method is used as the sample loaded onto the sequencer.

Further, the cDNA library is sequenced after PCR amplification, purification, sample pooling and single-strand circularization.

In some specific embodiments, the upstream primer for PCR amplification has the nucleotide sequence shown in SEQ ID NO: 6, and the downstream primer has the nucleotide sequence shown in SEQ ID NO: 7.

The downstream primer may contain a barcode sequence for sample identification. Amplifying library samples with such primers yields libraries tagged with different barcodes. The primer barcodes can be matched to the barcodes of the sequencing platform's universal sequencing library, and the structure of the PCR product can be kept consistent with the platform's universal library structure, so that reads can be assigned accurately to their source sample library by barcode.
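Assigning reads to samples by barcode can be sketched as a dictionary lookup. The barcodes, sample names and read representation below are invented for illustration, and only exact barcode matches are handled; sequencing platforms typically ship their own demultiplexing tools, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group read inserts by sample using exact barcode matches.

    `reads` is an iterable of (barcode, insert) pairs. Reads whose barcode
    is not in `barcode_to_sample` are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for barcode, insert in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical barcodes and reads, for illustration only.
BARCODES = {"ACGTACGTAC": "plasma_01", "TGCATGCATG": "plasma_02"}
READS = [
    ("ACGTACGTAC", "GATTACA"),
    ("TGCATGCATG", "CCGGTTAA"),
    ("ACGTACGTAC", "TTAGGC"),
    ("NNNNNNNNNN", "ACGT"),
]

print(demultiplex(READS, BARCODES))
```

Giving each sample's downstream primer a distinct barcode is what lets pooled libraries be sequenced together and separated afterwards.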

The present invention provides library construction reagents, including reagent I, reagent II and reagent III;

reagent I includes: DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase and RNase inhibitor;

reagent II includes: reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix and RNase inhibitor;

reagent III includes: polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0 and ultra-thermostable single-stranded binding protein.

In some embodiments, reagent I further includes: PolyA polymerase reaction buffer, BSA and PolyA polymerase;

In some embodiments, reagent II further includes: BSA, PEG8000 and PolyA polymerase.

In some embodiments, reagent III further includes a U base excision reagent, which includes the uracil-specific excision reagent USER enzyme.

Further, the reagents also include an adapter addition reagent, which includes: Tris-HCl buffer, sodium chloride, EDTA, the adapters shown in SEQ ID NO: 2 to 5, ligation reaction buffer, T4 ligase and hexaamminecobalt chloride.

Further, the reagents also include a purification reagent, which includes Agencourt AMPure XP magnetic beads.

Further, the reagents also include PCR amplification reagents, which include a PCR enzyme reaction mix, an upstream primer with the nucleotide sequence shown in SEQ ID NO: 6 and a downstream primer with the nucleotide sequence shown in SEQ ID NO: 7.

Further, the reagents also include an RNA extraction reagent.

The present invention also provides the use of the construction reagents in preparing libraries of multiple types of RNA.

本发明提供的一种快速检测多种类型RNA的链特异性文库制备方法与高通量测序技术,本发明利用多聚腺苷酸聚合酶在多种RNA的3’末端人工添加多聚腺苷酸(poly A)尾巴,同时利用带有脱氧尿嘧啶的多聚脱氧胸腺嘧啶核糖核苷酸引物在逆转录酶的作用下合成单链cDNA,所获得的单链cDNA分子通过一系列反应并最终通过PCR扩增获得多种类型RNA的链特异性文库的技术方案。与现有技术相比:The present invention provides a method for preparing a strand-specific library for rapid detection of multiple types of RNA and a high-throughput sequencing technology. The present invention uses polyadenylic acid polymerase to artificially add polyadenylic acid (poly A) tails to the 3' ends of multiple RNAs, and simultaneously uses polydeoxythymidine ribonucleotide primers with deoxyuracil to synthesize single-stranded cDNA under the action of reverse transcriptase. The obtained single-stranded cDNA molecules are subjected to a series of reactions and finally amplified by PCR to obtain a strand-specific library of multiple types of RNA. Compared with the prior art:

(1)、本发明利用多聚腺苷酸聚合酶在多种RNA的3’末端人工添加多聚腺苷酸(poly A)尾巴,使得包括非编码RNA(lncRNA、miRNA等)在内的多种类型的RNA分子可以在后续的步骤中完成逆转录,从而实现多种类型的RNA被建库和测序;(1) The present invention uses polyadenylic acid polymerase to artificially add polyadenylic acid (poly A) tails to the 3’ ends of various RNAs, so that various types of RNA molecules including non-coding RNA (lncRNA, miRNA, etc.) can be reverse transcribed in subsequent steps, thereby achieving library construction and sequencing of various types of RNAs;

(2)利用带有脱氧尿嘧啶的多聚脱氧胸腺嘧啶核糖核苷酸引物在逆转录酶的作用下合成单链cDNA,所获得的单链cDNA分子在后续反应中,可通过U碱基消化反应清除人为添加的多聚核苷酸序列,避免了在测序文库中增加复杂度低的核苷酸序列,使得测序质量和数据量有所保障;(2) Single-stranded cDNA is synthesized using poly-deoxythymidine ribonucleotide primers containing deoxyuracil under the action of reverse transcriptase. In the subsequent reaction, the obtained single-stranded cDNA molecules can remove the artificially added polynucleotide sequences through the U base digestion reaction, avoiding the addition of low-complexity nucleotide sequences in the sequencing library, so that the sequencing quality and data volume are guaranteed;

(3)、本发明采用了单链cDNA分子的文库制备方法,不需要合成cDNA分子的互补双链DNA,减少了文库制备所需要的步骤和时间。同时,由于本发明采用了单链cDNA分子的文库制备方法,保留了RNA的链特异性信息,更有利于基因注释等RNA分析;(3) The present invention adopts a single-stranded cDNA molecule library preparation method, which does not require the synthesis of complementary double-stranded DNA of cDNA molecules, thereby reducing the steps and time required for library preparation. At the same time, since the present invention adopts a single-stranded cDNA molecule library preparation method, the strand-specific information of RNA is retained, which is more conducive to RNA analysis such as gene annotation;

(4)、本发明通过反应试剂等条件的优化选择,将一些反应步骤合并。例如,在实施例3中,将DNA消化、RNA末端修饰及多聚腺苷酸加尾反应三个反应步骤合并在同一操作步骤进行;将cDNA末端修饰和变性反应两个反应步骤合并为一个操作步骤进行。在实施例4中,将DNA消化和RNA末端修饰反应两个反应步骤合并为一个操作步骤进行;将多聚腺苷酸加尾与逆转录反应两个反应步骤合并为一个操作步骤进行;将U碱基消化、cDNA末端修饰和变性反应三个反应步骤合并在同一操作步骤进行等。整体的解决方案减少了文库制备所需要的操作步骤和时间,且保证了反应的效率;(4) The present invention combines some reaction steps by optimizing the reaction reagents and other conditions. For example, in Example 3, the three reaction steps of DNA digestion, RNA end modification and polyadenylic acid tailing reaction are combined into the same operation step; the two reaction steps of cDNA end modification and denaturation reaction are combined into one operation step. In Example 4, the two reaction steps of DNA digestion and RNA end modification reaction are combined into one operation step; the two reaction steps of polyadenylic acid tailing and reverse transcription reaction are combined into one operation step; the three reaction steps of U base digestion, cDNA end modification and denaturation reaction are combined into the same operation step, etc. The overall solution reduces the number of operation steps and time required for library preparation, and ensures the efficiency of the reaction;

(5) Adapters are added to both ends of the single-stranded DNA by a one-step ligation that attaches the 3' and 5' DNA adapters simultaneously, reducing experimental steps and time. A ligation enhancer, for example the chemical reagent hexaamminecobalt chloride, is added to the ligation reaction; it significantly improves ligation efficiency and further shortens the ligation time;

(6) The present invention ligates adapters to the reverse-transcribed cDNA to add the sequences required for sequencing. A DNA ligase scheme replaces the RNA ligase scheme. Because DNA ligase is superior to RNA ligase in both cost and efficiency, the total ligation time is 30 minutes or less, and the overall workflow has fewer handling steps and takes less time.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the principle of RNA library preparation and detection;

FIG. 2 is a schematic diagram of adapter preparation, in which P denotes a phosphate modification and B denotes a modification that blocks ligation;

FIG. 3 is a schematic diagram of the RNA library preparation method;

FIG. 4 shows the distribution of sequencing quality across sequencing cycles for the RNA libraries;

FIG. 5 shows the number of genes of each RNA type detected in the samples, in which FIG. 5a covers mRNA, lncRNA and pseudogene RNA, and FIG. 5b covers miRNA, tRNA, mt-tRNA, mt-rRNA, snoRNA and snRNA;

FIG. 6 shows the number and percentage of genes of each RNA type detected in the samples, in which FIG. 6a is a cell-free RNA sample from 200 μL of starting plasma, FIG. 6b is a UHRR sample with 10 ng input, and FIG. 6c is a UHRR sample with 2 ng input;

FIG. 7 shows scatter plots of gene expression between samples together with the concordance analysis, using Log2(TPM+1) expression values, in which FIG. 7a compares two technical replicates of cell-free RNA from 200 μL of starting plasma, FIG. 7b compares UHRR samples with 10 ng and 100 ng input, and FIG. 7c compares UHRR samples with 2 ng and 10 ng input;

FIG. 8 shows the Pearson correlation coefficients of gene expression between samples, in which FIG. 8a is calculated from TPM values and FIG. 8b is calculated from Log2(TPM+1)-transformed values.
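The comparison described for FIG. 7 and FIG. 8 can be illustrated with a minimal sketch that transforms TPM values to Log2(TPM+1) and computes a Pearson correlation coefficient. The TPM values below are made up for illustration and are not data from this document; the function names are hypothetical.

```python
import math


def log2_tpm(tpm_values):
    """Apply the Log2(TPM+1) transform to a list of TPM values."""
    return [math.log2(v + 1.0) for v in tpm_values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Illustrative (invented) TPM values for the same four genes in two libraries.
sample_a = [0.0, 3.0, 15.0, 255.0]
sample_b = [0.0, 7.0, 31.0, 127.0]

r_tpm = pearson(sample_a, sample_b)                      # as in FIG. 8a
r_log = pearson(log2_tpm(sample_a), log2_tpm(sample_b))  # as in FIG. 8b
print(f"Pearson r (TPM): {r_tpm:.3f}")
print(f"Pearson r (Log2(TPM+1)): {r_log:.3f}")
```

The log transform reduces the influence of a few highly expressed genes, which is one common reason to report both coefficients.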

DETAILED DESCRIPTION

The present invention provides a strand-specific library preparation method and a high-throughput sequencing technique for the rapid detection of multiple types of RNA. Those skilled in the art can draw on this disclosure and adjust the process parameters as appropriate to practice it. All similar substitutions and modifications are obvious to those skilled in the art and are deemed to fall within the present invention. The methods and applications of the present invention are described through preferred embodiments; those skilled in the art can clearly modify, or suitably change and combine, the methods and applications described herein to implement and apply the technology without departing from the content, spirit and scope of the present invention.

The present invention provides strand-specific library preparation and high-throughput sequencing for the rapid detection of various types of RNA molecules. The technique reads information from multiple RNA types with a low RNA input and is suitable for a range of sample types, including tissue, cell and body-fluid samples. A specially designed reverse transcription primer, combined with a digestion step, removes the artificially added low-complexity sequence that would otherwise interfere with sequencing, and single-stranded library preparation makes the RNA library strand-specific. By introducing a Barcode sequence for sample identification and adapter sequences for the sequencing reaction, multiple types of RNA information can be read and analyzed by high-throughput sequencing.

The present invention provides a strand-specific library preparation method and a high-throughput sequencing technique for the rapid detection of multiple types of RNA, and can simultaneously detect RNA types including mRNA, lncRNA, tRNA and miRNA. The method is suitable for high-throughput sequencing of several sample types, such as tissue and cell RNA and cell-free RNA (cfRNA), and can be applied to low-input RNA samples at the ng or even pg level. The present invention can be used for cell-free RNA molecular diagnosis in liquid biopsy and for RNA detection in tissues and cells, and has broad prospects in early disease screening, auxiliary diagnosis, recurrence monitoring and prognosis; potential clinical scenarios include, but are not limited to, complex diseases such as pregnancy-related diseases, tumors, cardiovascular and cerebrovascular diseases, infectious diseases, genetic diseases, nervous system diseases and mental disorders. The present invention can also be extended to research on animals, plants and microorganisms, detecting different types of RNA simultaneously in order to study the growth and development, tissue and cell function, gene function, environmental adaptability and inter-species interactions of different species, and it has application prospects in agriculture, food and biosafety. The technical principle may further be extended to single-cell or spatiotemporal omics, to capture different types of RNA, including non-coding RNA, simultaneously and to characterize the expression of multiple RNA types in single cells or at different spatial locations. Combined with long-range PCR and single-molecule sequencing, the principle may also be applied to full-length transcriptome studies.

As an optional first step, the RNA sample is subjected to DNA digestion to remove residual DNA that could affect subsequent steps. An end-modifying enzyme is then used to modify the ends of the RNA molecules, converting the 3' end to a hydroxyl group so that more native RNA molecules can undergo polyadenylation tailing at the 3' end. Next, poly(A) polymerase adds a poly(A) tail to the 3' end of the RNA molecules. A poly-deoxythymidine (oligo dT) primer containing deoxyuracil (dU) anneals to the RNA, and reverse transcriptase synthesizes single-stranded cDNA from the RNA template, yielding single-stranded cDNA that carries the dU-containing oligo dT sequence. A uracil-specific excision reagent then digests the cDNA, excising the dU-containing oligo dT segment and removing the low-complexity sequence that was artificially introduced in the preceding steps, so that it does not affect subsequent sequencing and analysis. The cDNA library is then prepared.

In the cDNA library preparation method provided by the present invention, the cDNA is first denatured at high temperature to open any double-stranded structure that may have formed in local regions. A single-stranded binding protein can be added to the denaturation reaction to help keep the single-stranded cDNA linear and to prevent it from reannealing into complex structures such as hairpins. After denaturation, the cDNA library is completed using special double-stranded adapters and a matching ligation technique.

The present invention provides a single-stranded DNA library preparation technique. With special double-stranded adapters and one-step ligation, adapters are added to both ends of the single-stranded cDNA simultaneously, introducing universal sequences for PCR amplification; the sequencing library is then obtained by PCR. The library is subsequently processed and sequenced according to the requirements of the sequencer, and RNA is detected by reading and analyzing the sequence information.

Schematic diagrams of the above library preparation are shown in FIG. 1 and FIG. 3.

In the PCR amplification step of library preparation, a Barcode sequence for sample identification can be introduced through the PCR primers. Libraries carrying different Barcode sequences can be pooled into one sequencing sample in proportion to the amount of data required for each, and the pooled sample is processed and sequenced according to the requirements of the sequencer. Each read is assigned to its sample of origin by matching its Barcode sequence, and the reads of each sample are then analyzed to detect RNA in that sample. Pooling multiple libraries by Barcode increases throughput and lowers the cost of detection.
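The Barcode-based splitting of pooled reads can be sketched as follows. This is a minimal illustration under stated assumptions, not the demultiplexing software of any sequencing platform: the Barcode sequences are invented, each read is assumed to arrive as an (index read, insert read) pair, and one mismatch is tolerated when the assignment is unambiguous.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcode_to_sample, max_mismatches=1):
    """Return the sample whose Barcode matches uniquely, else None."""
    hits = [
        sample
        for barcode, sample in barcode_to_sample.items()
        if len(barcode) == len(index_read)
        and hamming(barcode, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def demultiplex(reads, barcode_to_sample, max_mismatches=1):
    """Group insert reads by sample; unmatched reads go to 'undetermined'."""
    bins = {sample: [] for sample in barcode_to_sample.values()}
    bins["undetermined"] = []
    for index_read, insert in reads:
        sample = assign_sample(index_read, barcode_to_sample, max_mismatches)
        bins[sample if sample is not None else "undetermined"].append(insert)
    return bins


# Hypothetical Barcodes (not the sequences of Table 3).
BARCODES = {"ACGTACGTAC": "UHRR-2ng", "TGCATGCATG": "UHRR-10ng"}
pooled = [
    ("ACGTACGTAC", "read_1"),  # exact match
    ("TGCATGCATC", "read_2"),  # one mismatch
    ("GGGGGGGGGG", "read_3"),  # no Barcode within one mismatch
]
print(demultiplex(pooled, BARCODES))
```

Tolerating mismatches is only safe when the Barcodes in a pool differ from one another at more than twice the allowed number of positions.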

The present invention adds a poly(A) tail to the 3' end of multiple types of RNA molecules and combines this with reverse transcription primed by a poly-deoxythymidine (oligo dT) primer, so that multiple RNA types are reverse transcribed efficiently, captured, and built into sequencing libraries. The RNA types include, but are not limited to, mRNA, lncRNA, miRNA and tRNA. The poly(A) tail is Poly(A)n, where n is the number of adenine residues and is an integer from 8 to 200.

The present invention provides a poly-deoxythymidine (oligo dT) primer containing deoxyuracil (dU) for synthesizing single-stranded cDNA by reverse transcription; this primer is designated the first nucleotide sequence. The primer comprises deoxyuracil (dU) residues, a poly-deoxythymidine (oligo dT) stretch, and degenerate bases (V and N), with the composition shown in Table 1 below. Reverse transcription with this primer yields cDNA molecules that carry the dU-containing oligo dT sequence.

Table 1 Sequence structure of the dU-containing poly-deoxythymidine (oligo dT) primer (5'-3' direction)

By optimizing the experimental conditions, the present invention can carry out several of the reaction steps described above in a single operation, reducing the handling steps and time required for library preparation while maintaining reaction efficiency.

The library preparation method of the present invention comprises: DNA digestion, RNA end modification, poly(A) tailing, reverse transcription, U-base digestion, cDNA end modification, denaturation, adapter ligation and library construction.

In some embodiments of the present invention, the DNA digestion and RNA end modification are performed in a first system; the reverse transcription is performed in a second system; and the cDNA end modification and denaturation are performed in a third system.

The poly(A) tailing step is performed in either the first system or the second system.

In some specific embodiments, the poly(A) tailing step is performed in the first system, and the first system is: 14 μL of RNA sample, 3 μL of 10× Poly(A) polymerase reaction buffer, 1 μL of 10 mM ATP, 4 μL of 10 mg/mL BSA, 2 μL of DNase I, 0.5 μL of 10 U/μL T4 polynucleotide kinase, 1 μL of 5 U/μL Poly(A) polymerase, 0.5 μL of 40 U/μL RNase inhibitor, and 4 μL of nuclease-free water;

the second system is: 30 μL of the reaction product of the first system, 2 μL of 5 μM reverse transcription primer, 4 μL of 5× HiScript III reaction buffer, 1 μL of 200 U/μL HiScript III reverse transcriptase, 2 μL of 5 mM dNTP Mix, 0.5 μL of 40 U/μL RNase inhibitor, and 10.5 μL of nuclease-free water.

In other specific embodiments, the poly(A) tailing step is performed in the second system, and the first system is: 14 μL of RNA sample, 2 μL of 10× DNase I reaction buffer, 1 μL of 10 mM ATP, 2 μL of DNase I, 0.5 μL of 10 U/μL T4 polynucleotide kinase, and 0.5 μL of 40 U/μL RNase inhibitor;

the second system is: 20 μL of the reaction product of the first system, 6 μL of 5× HiScript III reaction buffer, 1 μL of 10 mg/mL BSA, 10 μL of 50% PEG8000, 2 μL of 5 mM dNTP Mix, 2 μL of 5 μM reverse transcription primer, 1 μL of 5 U/μL Poly(A) polymerase, 0.5 μL of 40 U/μL RNase inhibitor, 1 μL of 200 U/μL HiScript III reverse transcriptase, and 6.5 μL of nuclease-free water.

In some specific embodiments, the third system is: the reaction product after U-base digestion, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer (pH 8.0), 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, and 21.4 μL of nuclease-free water.

In other specific embodiments, the third system also includes the reagent for U-base digestion. The third system is: the reaction product of poly(A) tailing and reverse transcription, 5 μL of 10× polynucleotide kinase reaction buffer, 1 μL of 10 U/μL T4 polynucleotide kinase, 2 μL of 220 mM Tris buffer (pH 8.0), 0.6 μL of 500 ng/μL ultra-thermostable single-stranded binding protein, 2 μL of 10 U/μL uracil-specific excision reagent (USER enzyme), and 19.4 μL of nuclease-free water.
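The component volumes listed for the first and second systems can be checked with a short script. The sketch below totals each system and computes the final concentration of a component from its stock concentration; the dictionary keys and function names are illustrative labels chosen here, and `Decimal` is used only to avoid floating-point rounding.

```python
from decimal import Decimal

# Volumes in microlitres, transcribed from the two embodiments in the text.
FIRST_SYSTEM_A = {  # poly(A) tailing performed in the first system
    "RNA sample": "14", "10x Poly(A) polymerase buffer": "3",
    "10 mM ATP": "1", "10 mg/mL BSA": "4", "DNase I": "2",
    "T4 polynucleotide kinase": "0.5", "Poly(A) polymerase": "1",
    "RNase inhibitor": "0.5", "nuclease-free water": "4",
}
SECOND_SYSTEM_A = {
    "first-system product": "30", "5 uM RT primer": "2",
    "5x HiScript III buffer": "4", "HiScript III RT": "1",
    "5 mM dNTP Mix": "2", "RNase inhibitor": "0.5",
    "nuclease-free water": "10.5",
}
FIRST_SYSTEM_B = {  # poly(A) tailing performed in the second system
    "RNA sample": "14", "10x DNase I buffer": "2", "10 mM ATP": "1",
    "DNase I": "2", "T4 polynucleotide kinase": "0.5",
    "RNase inhibitor": "0.5",
}
SECOND_SYSTEM_B = {
    "first-system product": "20", "5x HiScript III buffer": "6",
    "10 mg/mL BSA": "1", "50% PEG8000": "10", "5 mM dNTP Mix": "2",
    "5 uM RT primer": "2", "Poly(A) polymerase": "1",
    "RNase inhibitor": "0.5", "HiScript III RT": "1",
    "nuclease-free water": "6.5",
}


def total_volume(components):
    """Sum the component volumes of one reaction system (in microlitres)."""
    return sum((Decimal(v) for v in components.values()), Decimal("0"))


def final_concentration(stock, volume, total):
    """Final concentration after dilution, in the same unit as the stock."""
    return Decimal(stock) * Decimal(volume) / Decimal(total)


for name, system in [("first (A)", FIRST_SYSTEM_A), ("second (A)", SECOND_SYSTEM_A),
                     ("first (B)", FIRST_SYSTEM_B), ("second (B)", SECOND_SYSTEM_B)]:
    print(name, total_volume(system), "uL")

# Reverse transcription primer: 2 uL of 5 uM stock in the second system.
print(final_concentration("5", "2", total_volume(SECOND_SYSTEM_A)), "uM")
```

In both embodiments the total of the first system equals the volume of "reaction product of the first system" carried into the second, which is a quick consistency check when adapting the volumes.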

In the above library preparation method, different reaction steps can optionally be merged into a single operation, provided that library quality is maintained, so that the workflow and its duration can be adjusted flexibly. Which steps are merged can be decided according to the specific experiment, and the present invention places no limitation on this.

The present invention provides adapters that introduce specific target sequences at the 5' and 3' ends of a single-stranded cDNA molecule, together with the corresponding DNA ligation scheme. The scheme uses two adapters, a first adapter and a second adapter. Each adapter contains a double-stranded region and at least one protruding single-stranded region. The double-stranded region contains a universal sequence for PCR or sequencing; the protruding single-stranded region contains a random sequence of 1 to 10 bases that pairs with the end of the single-stranded DNA molecule, so that the adapter and the single-stranded DNA can be joined by splint ligation. The protruding single-stranded region of the first adapter is at its 5' end, and that of the second adapter is at its 3' end. The first adapter is used for complementary pairing and splint ligation at the 5' end of the single-stranded DNA molecule, and the second adapter is used for complementary pairing and splint ligation at the 3' end. In one embodiment, each of the first adapter and the second adapter is formed from two polynucleotide sequences that contain complementary regions; after the two sequences are mixed in solution and left to stand, they anneal to form the specific double-stranded structure with a protruding single-stranded region containing random bases. A schematic diagram of the preparation of the first and second adapter solutions is shown in FIG. 2.

The present invention provides a sequence scheme for the first adapter and the second adapter consisting of four nucleotide sequences, shown in Table 2. The first adapter is formed from the second nucleotide sequence and the third nucleotide sequence in Table 2. The 5' end of the third nucleotide sequence contains a random sequence of 1 to 10 bases, each of which is A, T, C or G; in the first adapter, this random sequence can pair with the 5' end of the single-stranded cDNA molecule for splint ligation, improving ligation efficiency. The second adapter is formed from the fourth nucleotide sequence and the fifth nucleotide sequence in Table 2. The 3' end of the fifth nucleotide sequence contains a random sequence of 1 to 10 bases, each of which is A, T, C or G; in the second adapter, this random sequence can pair with the 3' end of the single-stranded cDNA molecule for splint ligation, improving ligation efficiency. The 5' end of the second nucleotide sequence carries a special modification that blocks it from ligation. The 5' end of the fourth nucleotide sequence is phosphorylated, and its 3' end carries a special modification that blocks it from ligation. Both the 5' and 3' ends of the third and fifth nucleotide sequences carry special modifications that block them from ligation. With DNA ligase under suitable reaction conditions, adapters are ligated directly to both the 5' and 3' ends of the single-stranded DNA molecule, which can then be amplified by PCR with primers designed against the adapter sequences.
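To give a sense of scale for the random overhang, the sketch below counts the distinct overhang sequences for a given length and derives the overhang that would pair perfectly with a given cDNA terminus. It assumes an equimolar pool of all 4^n overhangs and strict Watson-Crick pairing, which is a simplification; the document does not state how many mismatches the splint ligation tolerates. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence that pairs antiparallel with `seq` (both written 5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhang_pool_size(n):
    """Number of distinct random overhangs of length n (1 to 10 bases)."""
    if not 1 <= n <= 10:
        raise ValueError("overhang length must be between 1 and 10")
    return 4 ** n


def perfect_match_fraction(n):
    """Fraction of an equimolar pool that pairs perfectly with one terminus."""
    return 1 / overhang_pool_size(n)


# Hypothetical 6-nt cDNA terminus and the overhang that would pair with it.
terminus = "GATTAC"
print(reverse_complement(terminus))
for n in (1, 6, 10):
    print(n, overhang_pool_size(n), perfect_match_fraction(n))
```

Longer overhangs pair more stably but make any single perfectly matching overhang rarer in the pool, which is one reason the overhang length is left as a range.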

The present invention uses a DNA ligation enhancer, for example hexaamminecobalt chloride at a suitable concentration, to improve ligation efficiency. This allows adapters to be ligated to both the 5' and 3' ends of the single-stranded cDNA in a single operation and shortens the ligation time.

Once adapters have been added to both ends, the single-stranded cDNA can be amplified directly by PCR to obtain the final library. The PCR primers introduce the complete set of sequences required for sequencing, including the Barcode sequence that distinguishes samples, to meet the requirements of the sequencing platform.

The present invention does not require a separate step to synthesize a complementary second strand of the cDNA, which reduces the steps and time needed for library preparation. Because the library is prepared from single-stranded cDNA, the strand-specific information of the RNA is retained, which benefits RNA analyses such as gene annotation.

Table 2 Specially modified double-stranded DNA adapter sequences containing random bases (5'-3' direction)

The present invention provides a set of universal primers for PCR amplification of the library: the forward universal primer is the sixth nucleic acid sequence and the reverse primer is the seventh nucleic acid sequence. The reverse primer may contain a Barcode sequence for sample identification, in which case it is designated the seventh nucleic acid sequence-N, where N is the Barcode number; each Barcode sequence has its own number. The universal primer sequences are given in Table 3 below. Amplifying the library with these primers yields libraries tagged with different Barcode sequences.

Following this principle, the Barcode sequence in the seventh nucleic acid sequence-N can be matched to the Barcodes of the sequencing platform's standard libraries, so that the PCR product has the same structure as the platform's standard library and each library can be traced accurately to its sample by Barcode.

Table 3 Universal primer sequences (5'-3' direction)

The adapter and PCR primer sequences provided by the present invention (Table 2 and Table 3) are suitable for preparing high-throughput sequencing libraries for the DNBSEQ and MGISEQ series of the MGI Tech sequencing platform. The same design principle can also be used to design sequences for other sequencing platforms in order to prepare compatible libraries.

All materials used in the present invention are ordinary commercial products available on the market. The present invention is further described below with reference to the examples:

Example 1 Extraction and preparation of RNA samples

Universal Human Reference RNA (UHRR; Agilent, 740000) was recovered as follows. One tube of the commercial standard (200 μg of RNA in 70% ethanol and 0.1 M sodium acetate) was centrifuged at 12,000 × g and 4°C for 15 minutes. The supernatant was discarded and the pellet was washed with 70% ethanol, then centrifuged again at 12,000 × g and 4°C for 15 minutes. The supernatant was carefully removed and the pellet was dried at room temperature for 30 minutes, then redissolved in 1 mL of nuclease-free H2O to give an RNA concentration of about 200 ng/μL. The exact concentration of the redissolved UHRR solution was measured with a Qubit 3.0 fluorometer (Invitrogen, Q33216). Based on the measured concentration, the solution was serially diluted with nuclease-free H2O to prepare samples with four different RNA inputs, 100 ng, 10 ng, 2 ng and 0.2 ng, each in a total volume of 14 μL.
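The dilution arithmetic behind these inputs follows from C1·V1 = C2·V2. The sketch below uses the nominal stock concentration of 200 ng/μL (200 μg in 1 mL); in practice the Qubit measurement would replace that value. The function names are illustrative.

```python
def nominal_concentration(mass_ng, volume_ul):
    """Concentration in ng/uL for a given mass and volume."""
    return mass_ng / volume_ul


def fold_dilution(stock_ng_per_ul, target_ng_per_ul):
    """Overall dilution factor needed to go from stock to target."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    return stock_ng_per_ul / target_ng_per_ul


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Volumes of stock and diluent for one dilution step (C1*V1 = C2*V2)."""
    fold = fold_dilution(stock_ng_per_ul, target_ng_per_ul)
    stock_volume = final_volume_ul / fold
    return stock_volume, final_volume_ul - stock_volume


stock = nominal_concentration(200_000, 1000)  # 200 ug in 1 mL
for input_ng in (100, 10, 2, 0.2):
    target = nominal_concentration(input_ng, 14)  # input delivered in 14 uL
    print(f"{input_ng} ng input: {target:.4f} ng/uL, "
          f"{fold_dilution(stock, target):.0f}-fold below stock")
```

The lowest input corresponds to an overall dilution of several thousand-fold, which is why the text dilutes stepwise rather than in a single step.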

One human plasma sample of 400 μL was divided into two 200 μL aliquots. Each aliquot was extracted with a serum/plasma miRNA extraction kit (spin-column type; TIANGEN, DP503) strictly according to the manufacturer's instructions.

Example 2 Preparation of adapters

2.1 Preparation of 5× STE buffer

5× STE buffer was prepared according to the composition in Table 4 and mixed thoroughly by vortexing.

Table 4 5× STE buffer

2.2 Preparation of the adapters

Adapters were prepared in 5× STE buffer. Following the compositions in Table 5 and Table 6, a 25 μM first adapter solution and a 25 μM second adapter solution were prepared. Each solution was mixed thoroughly by vortexing and left at room temperature for 30 minutes so that the adapter oligonucleotides could anneal fully. The 25 μM first and second adapter solutions were then each diluted to 1 μM with TE buffer, pH 8.0 (AMBION, AM9849), and stored at -18°C to -22°C. Before use, the adapter solution was thawed at room temperature, mixed thoroughly by vortexing, and kept at 4°C until needed. A schematic diagram of the preparation of the first and second adapter solutions is shown in FIG. 2.

Table 5 Composition of the first adapter solution (25 μM)

Table 6 Composition of the second adapter solution (25 μM)

Example 3 RNA library preparation and high-throughput sequencing

Schematic diagrams of library preparation are shown in FIG. 1 and FIG. 3. The specific experimental steps are described below.

3.1 RNA sample description

Following the method described in Example 1, UHRR solutions with inputs of 2 ng, 10 ng and 100 ng were prepared and named UHRR-2ng, UHRR-10ng and UHRR-100ng, respectively.

3.2 DNA digestion, RNA end modification, and polyadenylation

DNase I (RNase-free) (NEB, M0303L) was used to digest any DNA remaining from RNA preparation; T4 polynucleotide kinase (T4 PNK; NEB, M0201L) was used to modify the ends of the RNA molecules; and E. coli Poly(A) Polymerase (NEB, M0276L) was used to add a poly(A) tail to the 3' end of the RNA molecules. The reaction system is given in Table 7. The reaction was incubated at 37°C for 15 minutes and heat-inactivated at 95°C for 5 minutes, after which the sample was placed on ice for 2 minutes.

Table 7 DNA digestion, end modification, and polyadenylation reaction

3.3 Reverse transcription

2 μL of the first nucleic acid sequence primer (the reverse transcription primer) at 5 μM was added to the product of the previous step. In this example, the first nucleic acid sequence is 5'-TTTTTTTUTTTTTTTUVN-3' (SEQ ID NO: 1). After the primer was added, the mixture was vortexed, centrifuged briefly, denatured at 65°C for 5 minutes, and incubated at 30°C for 1 minute.
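The reverse transcription primer ends in the degenerate positions V and N. Under the standard IUPAC convention (V = A, C, or G; N = A, C, G, or T), which matches the definitions given in the claims, the primer is a mixture of 12 distinct oligonucleotides. The sketch below enumerates them; the function and variable names are illustrative only.

```python
from itertools import product

# IUPAC ambiguity codes used in the primer; all other letters are literal bases.
DEGENERATE = {"V": "ACG", "N": "ACGT"}


def expand_degenerate(seq: str) -> list[str]:
    """Return every concrete sequence encoded by a primer containing V/N codes."""
    pools = [DEGENERATE.get(base, base) for base in seq]
    return ["".join(combo) for combo in product(*pools)]


primer = "TTTTTTTUTTTTTTTUVN"  # SEQ ID NO: 1 as written in the text
variants = expand_degenerate(primer)
print(len(variants))
for v in variants:
    print(v)
```

The V position excludes T so that the primer anchors at the junction between the poly(A) tail and the RNA body rather than sliding along the tail.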

Reverse transcription was performed with HiScript III Reverse Transcriptase, 200 U/μL (Vazyme, R302-01). 18 μL of reverse transcription reaction mix, whose composition is given in Table 8, was added to the reaction. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 9.

Table 8 Composition of the reverse transcription reaction mix

Table 9 Reverse transcription program

3.4 Purification of the reverse transcription product

100 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881) was added to the reaction tube to purify the reverse transcription product. The product was eluted in 22 μL of TE buffer, pH 8.0 (Ambion, AM9849), and 20 μL of the purified reverse transcription product was transferred to a new PCR tube and kept for library preparation.

3.5 U base digestion

U bases were digested with USER (Uracil-Specific Excision Reagent) Enzyme (NEB, M5505L), yielding digested single-stranded cDNA molecules. 4 μL of U base digestion reaction mix, whose composition is given in Table 10, was added to the reaction. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 11.

Table 10 Composition of the U base digestion reaction mix

Table 11 U base digestion program

3.6 cDNA end modification and denaturation

The end modification and denaturation reaction mix, whose composition is given in Table 12, was added to the reaction. In this reaction, T4 polynucleotide kinase (T4 PNK; NEB, M0201L) phosphorylates the 5' end of the single-stranded cDNA molecules. The cDNA is then heat-denatured to open any double-stranded structure that may form locally within the cDNA. ET SSB (NEB, M2401S) is included in the denaturation reaction to keep the single-stranded cDNA linear and to prevent it from re-annealing into hairpins or other complex structures after denaturation. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 13.

Table 12 Composition of the end modification and denaturation reaction mix

Table 13 End modification and denaturation program

3.7 Ligation

2 μL of the first adapter solution (1 μM) and 2 μL of the second adapter solution (1 μM), both prepared in Example 2, were added to the reaction, followed by 28 μL of ligation reaction mix, whose composition is given in Table 14. T4 DNA Ligase (NEB, M0202L) was used for adapter ligation, and the ligation enhancer was a 0.5 mM hexaamminecobalt chloride solution. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 15.

The present invention uses a DNA ligation enhancing reagent. In this example, hexaamminecobalt chloride at a suitable concentration was used to improve DNA ligation efficiency, so that adapters were ligated to both the 5' end and the 3' end of the single-stranded cDNA molecules in a single experimental step, and the ligation time was shortened.

Table 14 Composition of the ligation reaction mix

Table 15 Ligation program

3.8 Purification of the ligation product

160 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881) was added to the reaction tube to purify the ligation product. The product was eluted in 23 μL of TE buffer, pH 8.0 (Invitrogen, AM9849), and 21 μL of the purified ligation product was transferred to a new PCR tube and kept for universal PCR amplification.

3.9 Universal PCR amplification

A primer with the sixth nucleic acid sequence (40 μM) was used as the forward primer and a primer with the seventh nucleic acid sequence-N (40 μM) as the reverse primer in the universal PCR amplification. The reverse primer contains a barcode sequence for sample identification. Each sample is assigned a unique barcode, and the different barcodes in the reverse primers are used to separate the samples in the sequencer output. The PCR enzyme mix used in this example was KAPA HiFi HotStart ReadyMix (2X) (Kapa Biosystems, KK2602); the reaction composition is given in Table 16. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 17.

Table 16 Composition of the universal PCR amplification reaction

Table 17 Universal PCR amplification program

3.10 Purification and pooling of the universal PCR products

The 50 μL of PCR product from universal PCR amplification was purified with 50 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881), giving 35 μL of purified DNA per sample. The DNA concentration of each purified library was measured with a Qubit 3.0 fluorometer (Invitrogen, Q33216), and the libraries were pooled in equal mass to form the sequencing library sample, which was vortexed and kept for use.
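Equal-mass pooling amounts to taking, from each library, the volume that contains the same mass of DNA. The sketch below shows the arithmetic; the library names, concentrations, and target mass are hypothetical values for illustration, and the 35 μL limit is taken from the purified volume stated above (the volume consumed by the Qubit measurement is ignored).

```python
def equal_mass_pool(conc_ng_per_ul: dict[str, float], mass_ng: float,
                    available_ul: float = 35.0) -> dict[str, float]:
    """Return the volume (uL) of each library that contributes mass_ng of DNA."""
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"{name}: concentration must be positive")
        volume = mass_ng / conc
        if volume > available_ul:
            raise ValueError(f"{name}: needs {volume:.1f} uL but only {available_ul} uL exists")
        volumes[name] = volume
    return volumes


# Hypothetical Qubit readings in ng/uL (not from the source).
readings = {"lib_A": 10.0, "lib_B": 20.0, "lib_C": 8.0}
for name, vol in equal_mass_pool(readings, mass_ng=100.0).items():
    print(f"{name}: {vol:.2f} uL")
```

If one library is too dilute to supply the target mass, the function raises an error rather than silently under-pooling it; lowering `mass_ng` is the usual remedy.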

3.11 Single-strand circularization and sequencing

Single-strand circularization was performed with the MGIEasy Circularization Module V2.0 (MGI, 1000005260), and sequencing was performed with the MGISEQ-2000RS High-throughput Sequencing Set (FCL PE100) (MGI, 1000012554), both from MGI Tech Co., Ltd. (Shenzhen), strictly following the kit instructions. The PE100+10 (paired-end 100+10) sequencing mode was used to obtain reliable base sequence information. The sequencer output was demultiplexed by barcode sequence to obtain the sequencing data for each sample.
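Demultiplexing assigns each read to a sample by comparing its 10-base barcode read with the known barcodes. The sketch below is a simplified illustration, not the software used in the source: the barcode sequences are hypothetical, reads are represented as `(barcode_read, insert_sequence)` tuples, and a tolerance of one mismatch is an assumption.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcode_to_sample, max_mismatches=1):
    """Group insert sequences by sample; ambiguous or unmatched reads go to 'undetermined'."""
    bins = defaultdict(list)
    for barcode_read, insert_seq in reads:
        hits = [sample for barcode, sample in barcode_to_sample.items()
                if hamming(barcode_read, barcode) <= max_mismatches]
        key = hits[0] if len(hits) == 1 else "undetermined"
        bins[key].append(insert_seq)
    return dict(bins)


# Hypothetical 10-base barcodes; the real barcode sequences are not given here.
barcodes = {"ACGTACGTAC": "UHRR-2ng", "TGCATGCATG": "UHRR-10ng"}
reads = [("ACGTACGTAC", "AAA"), ("ACGTACGTAA", "CCC"),
         ("TGCATGCATG", "GGG"), ("GGGGGGGGGG", "TTT")]
print(demultiplex(reads, barcodes))
```

Allowing a mismatch recovers reads with a single sequencing error in the barcode, provided the barcode set is designed so that no two barcodes are within two mismatches of each other.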

Example 4 Rapid RNA library preparation and high-throughput sequencing

4.1 RNA sample description

Following the method described in Example 1, the human universal reference RNA standard (UHRR) was prepared at input amounts of 0.2 ng, 2 ng, and 10 ng; the samples were named UHRR-F-0.2ng, UHRR-F-2ng, and UHRR-F-10ng.

Following the method described in Example 1, plasma cell-free RNA samples were extracted and named Plasma-F-200uL-1 and Plasma-F-200uL-2.

4.2 DNA digestion and RNA end modification

DNase I (RNase-free) (NEB, M0303L) was used to digest any DNA remaining after RNA preparation, and T4 polynucleotide kinase (T4 PNK; NEB, M0201L) was used to modify the 3' end of the RNA molecules. The reaction system is given in Table 18. The reaction was incubated at 37°C for 15 minutes and heat-inactivated at 95°C for 5 minutes, after which the sample was placed on ice for 2 minutes.

Table 18 DNA digestion and RNA end modification reaction

4.3 Polyadenylation and reverse transcription

E. coli Poly(A) Polymerase (NEB, M0276L) was used to add a poly(A) tail to the 3' end of the RNA molecules, and reverse transcription was performed with the first nucleic acid sequence primer and HiScript III Reverse Transcriptase, 200 U/μL (Vazyme, R302-01). In this example, the first nucleic acid sequence is 5'-TTTTTTTUTTTTTTTUVN-3'. 30 μL of polyadenylation and reverse transcription reaction mix, whose composition is given in Table 19, was added to the reaction. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 20.

Table 19 Composition of the polyadenylation and reverse transcription reaction mix

Table 20 Polyadenylation and reverse transcription program

4.4 Purification of the reverse transcription product

100 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881) was added to the reaction tube to purify the reverse transcription product. The product was eluted in 22 μL of TE buffer, pH 8.0 (Invitrogen, AM9849), and 20 μL of the purified reverse transcription product was transferred to a new PCR tube and kept for library preparation.

4.5 U base digestion, cDNA end modification, and denaturation

U bases were digested with USER (Uracil-Specific Excision Reagent) Enzyme (NEB, M5505L), yielding digested single-stranded cDNA molecules. T4 polynucleotide kinase (T4 PNK; NEB, M0201L) was used to phosphorylate the 5' end of the digested single-stranded cDNA molecules. The cDNA is then heat-denatured to open any double-stranded structure that may form locally within the cDNA. ET SSB (NEB, M2401S) is included in the denaturation reaction to help keep the single-stranded cDNA linear and to prevent it from re-annealing into hairpins or other complex structures after denaturation. 30 μL of the U base digestion reaction mix, whose composition is given in Table 21, was added to the reaction. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 22.

Table 21 Composition of the U base digestion, end modification, and denaturation reaction mix

Table 22 U base digestion, end modification, and denaturation program

4.6 Ligation

2 μL of the first adapter solution (1 μM) and 2 μL of the second adapter solution (1 μM), both prepared in Example 2, were added to the reaction, followed by 28 μL of ligation reaction mix, whose composition is given in Table 14. T4 DNA Ligase (NEB, M0202L) was used for adapter ligation, and the ligation enhancer was a 0.5 mM hexaamminecobalt chloride solution. The reaction was run on a thermal cycler (Bioer, TC-96) using the program in Table 15.

4.7 Purification of the ligation product

80 μL of Agencourt AMPure XP magnetic beads (Beckman Coulter, A63881) was added to the reaction tube to purify the ligation product. The product was eluted in 23 μL of TE buffer (Invitrogen, AM9849), and 21 μL of the purified ligation product was transferred to a new PCR tube and kept for universal PCR amplification.

4.8 Universal PCR amplification, purification and pooling of the universal PCR products, and single-strand circularization and sequencing were carried out as described in steps 3.9, 3.10, and 3.11, respectively.

Example 5 RNA sequencing data analysis and results

The sequencing quality of the raw data from Example 3 and Example 4 was analyzed: more than 90% of the sequences had a quality score of Q30 or higher. The quality of the sequencing chip run was also analyzed: sequencing quality did not drop noticeably over the sequencing cycles and remained high throughout, as shown in Figure 4. These results indicate that the method of the present invention yields high-quality sequencing data.
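A Q30 statistic of the kind reported above can be computed from FASTQ quality strings. The sketch below assumes Phred+33 encoding and counts individual bases with a score of 30 or higher; the source does not state exactly how its percentage was computed, so both choices are assumptions, and the quality strings are made-up examples.

```python
def q30_fraction(quality_strings, offset: int = 33) -> float:
    """Fraction of bases whose Phred score (ASCII code minus offset) is at least 30."""
    total = 0
    high = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= 30:
                high += 1
    return high / total if total else 0.0


# Made-up quality strings: 'I' = Q40, '?' = Q30, '5' = Q20, '#' = Q2 under Phred+33.
example = ["II?5", "I#??"]
print(f"Q30 fraction: {q30_fraction(example):.2%}")
```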

The raw sequencing data were demultiplexed by barcode sequence to obtain the data for each sample. The present invention thus allows multiple samples to be sequenced simultaneously on the same sequencing chip.

For each sample, the raw reads were filtered to remove low-quality bases, short reads, adapter sequences, and microbial sequences. Ribosomal RNA (rRNA) reads were then removed using human rRNA sequences as the reference, giving the filtered reads (clean reads).
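The first three filtering steps (adapter removal, short-read removal, low-quality filtering) can be sketched per read as follows. This is a simplified illustration, not the pipeline used in the source: the adapter string is a placeholder, the thresholds are assumptions, and the microbial and rRNA removal steps, which require alignment against reference sequences, are not shown.

```python
PLACEHOLDER_ADAPTER = "CCCCGGGG"  # hypothetical; the real adapter sequence is not given here


def clean_read(seq: str, qual: str, adapter: str = PLACEHOLDER_ADAPTER,
               min_len: int = 18, min_q: int = 20,
               max_low_frac: float = 0.5, offset: int = 33):
    """Trim at the first exact adapter match, then drop short or low-quality reads.

    Returns (sequence, quality) for a retained read, or None if the read is discarded.
    """
    idx = seq.find(adapter)
    if idx != -1:
        seq, qual = seq[:idx], qual[:idx]
    if len(seq) < min_len:
        return None  # too short after trimming
    low = sum(1 for ch in qual if ord(ch) - offset < min_q)
    if low / len(seq) > max_low_frac:
        return None  # too many low-quality bases
    return seq, qual


read = "ACGT" * 5 + PLACEHOLDER_ADAPTER + "TT"
print(clean_read(read, "I" * len(read)))
```

Production trimmers also handle partial adapters at the read end and mismatches within the adapter; exact matching is used here only to keep the example short.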

All clean reads were aligned to the human genome (GRCh38) as the reference. The number of clean reads, the total alignment rate, the unique alignment rate, and the proportion of reads aligned to exonic regions were tallied, together with the positive- or negative-strand assignment of each successfully aligned clean read relative to its reference gene, from which the strand specificity of the library was calculated. The results are shown in Table 23. Most UHRR samples reached a total alignment rate of 93%-97% and a unique alignment rate of 60%-73%. UHRR samples with low input had somewhat lower alignment rates: UHRR-F-0.2ng had a total alignment rate of 43.63% and a unique alignment rate of 35.27%, and UHRR-F-2ng had a total alignment rate of 86.36% and a unique alignment rate of 61.15%. The plasma cell-free RNA samples, starting from 200 μL of plasma, had a total alignment rate of about 35% and a unique alignment rate of about 28%. The strand specificity calculated for the UHRR samples was above 80% in all cases and about 90% or higher in most; for the plasma samples it was about 70%.
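The alignment rates and strand specificity discussed above are simple ratios of read counts. The sketch below assumes that both alignment rates use the number of clean reads as the denominator and that strand specificity is the share of strand-assigned reads that fall on the expected strand; the source does not spell out these definitions, and the counts shown are invented for illustration.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def alignment_summary(clean_reads: int, mapped_reads: int, uniquely_mapped: int,
                      expected_strand_reads: int, opposite_strand_reads: int) -> dict[str, float]:
    """Summarize alignment and strand-specificity metrics for one library."""
    return {
        "total_alignment_rate": percentage(mapped_reads, clean_reads),
        "unique_alignment_rate": percentage(uniquely_mapped, clean_reads),
        "strand_specificity": percentage(
            expected_strand_reads, expected_strand_reads + opposite_strand_reads),
    }


# Invented counts, not taken from Table 23.
print(alignment_summary(clean_reads=1000, mapped_reads=900, uniquely_mapped=600,
                        expected_strand_reads=810, opposite_strand_reads=90))
```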

The total number of genes detected in each sample and the number of genes of each RNA type were counted. The RNA types analyzed in this example were mRNA (messenger RNA), lncRNA (long non-coding RNA), pseudogene RNA, miRNA (microRNA), tRNA (transfer RNA), mt-tRNA (mitochondrial transfer RNA), mt-rRNA (mitochondrial ribosomal RNA), snoRNA (small nucleolar RNA), and snRNA (small nuclear RNA). The results are shown in Figure 5. In every UHRR sample, whether the input was 0.2 ng, 2 ng, 10 ng, or 100 ng, more than 10,000 protein-coding genes were detected, along with about 4,000 lncRNA genes, about 3,000 to 10,000 pseudogene RNAs, about 200-900 miRNA genes, about 200-300 tRNA genes, about 300-800 snoRNA genes, and about 40-90 snRNA genes; all 22 mitochondrial tRNAs and both mitochondrial rRNAs (mt-tRNA and mt-rRNA) were detected. For the plasma samples starting from 200 μL, fewer genes were detected in the cell-free RNA than in UHRR, but nearly 10,000 protein-coding genes were still detected, as were the other RNA types. Figure 6 shows, for some of the samples, the percentage of detected genes belonging to each RNA type.

RNA expression was quantified as TPM (transcripts per million), based on a whole-genome GTF (gene transfer format) annotation file, to identify and quantify the different RNA types and obtain RNA expression profiles. The expression profiles were then used to assess the consistency of detection across samples. As the expression levels in Figure 7 show, consistency was good both among UHRR samples with different input amounts and between the technical replicates of plasma cell-free RNA. Pearson correlation coefficients between all samples were calculated from both the raw TPM values and the transformed log2(TPM+1) values; the results are shown in Figure 8. Calculated from the raw TPM values, the Pearson correlation coefficients between UHRR samples were all at least 0.85, and the coefficient between the plasma cell-free RNA technical replicates was 0.99.
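TPM normalizes each gene's read count by its length and then rescales so that the values in a sample sum to one million. The sketch below shows the standard formula; gene names, counts, and lengths are invented, and the source does not say which tool computed its TPM values.

```python
def tpm(counts: dict[str, float], lengths_bp: dict[str, float]) -> dict[str, float]:
    """Transcripts per million: length-normalized counts rescaled to sum to 1e6."""
    rates = {}
    for gene, count in counts.items():
        length = lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"{gene}: length must be positive")
        rates[gene] = count / length  # reads per base
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads assigned to any gene")
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Invented example: gene_b has twice the reads but is twice as long, so TPM is equal.
example_counts = {"gene_a": 10, "gene_b": 20}
example_lengths = {"gene_a": 1000, "gene_b": 2000}
print(tpm(example_counts, example_lengths))
```

Because TPM values always sum to the same total within a sample, they can be compared across libraries of different sequencing depth, which is what the correlation analysis relies on.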

When the correlation is calculated from log2(TPM+1) values, fluctuations in lowly expressed genes weigh on the result, so the coefficients are lower than those calculated from the raw TPM values. The lowest correlation coefficient among UHRR samples was 0.67, and most were above 0.70; the coefficient between the two plasma cell-free RNA technical replicates was 0.83.
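The two correlation calculations differ only in whether the TPM values are log-transformed first. The sketch below implements the Pearson coefficient and the log2(TPM+1) transform in plain Python; the TPM vectors are invented and chosen only to show that a few highly expressed genes dominate the raw-scale coefficient, while the log scale gives low-expression genes more weight.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def log2_tpm(values: list[float]) -> list[float]:
    """Apply the log2(TPM + 1) transform element-wise."""
    return [math.log2(v + 1) for v in values]


# Invented TPM values for the same five genes in two samples.
sample_1 = [0.0, 1.0, 3.0, 500.0, 10000.0]
sample_2 = [2.0, 0.0, 7.0, 450.0, 11000.0]
print("raw TPM:", pearson(sample_1, sample_2))
print("log2(TPM+1):", pearson(log2_tpm(sample_1), log2_tpm(sample_2)))
```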

Notably, the UHRR samples in these examples cover a wide range of input amounts: 0.2 ng, 2 ng, 10 ng, and 100 ng, four levels in total. The lowest UHRR input was only 0.2 ng, and the plasma input was only 200 μL. These inputs are far below the RNA input used by most current commercial RNA kits and published methods. Even at such low inputs, and across two batches of experiments run with two different library preparation workflows (Example 3 and Example 4), the correlation coefficients remained high, indicating that the method of the present invention is stable and well suited to low-input samples.

Table 23 Sample sequencing data and alignment rates

In summary, the present invention provides a strand-specific library preparation method and high-throughput sequencing technique for rapid detection of multiple RNA types; the RNA types that can be detected simultaneously include, but are not limited to, mRNA, lncRNA, tRNA, and miRNA. It addresses the problems of conventional RNA library preparation methods: capture of only a single or limited range of RNA types, complicated experimental steps, and long turnaround time.

The RNA sequencing library prepared by the present invention contains no artificially added low-complexity sequences, which safeguards sequencing quality and data yield.

The sequencing library prepared by the present invention retains RNA strand-specific information, which benefits RNA analyses such as gene annotation.

By optimizing the reaction conditions, the present invention combines several reaction steps described in the experimental principle into a single operation. For example, polyadenylation and reverse transcription are combined into one step, and ligation of the adapters to the two ends of the cDNA is combined into one ligation step. This reduces the number of operations and the time needed for library preparation, making the procedure simpler and faster.

By design, the present invention adds the sequencing adapter sequences with a DNA ligase instead of the RNA ligase used in miRNA library preparation methods, which improves ligation efficiency, shortens the ligation time, and reduces cost.

The present invention amplifies the library by PCR and uses the PCR to introduce the barcode sequence for library identification and the structural sequences required for circularization and sequencing. Multiple library samples can therefore be pooled and sequenced together, raising throughput and lowering cost.

The method of the present invention is suitable for high-throughput sequencing of various sample types, such as tissue or cellular RNA and cell-free RNA, and can be applied to low-input RNA samples at the ng or even pg level.

The present invention can be used for cell-free RNA detection in the field of liquid biopsy. No dedicated library construction kit for cell-free RNA samples is currently on the market, so the method has broad application prospects.

The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also fall within the scope of protection of the present invention.

Claims (24)

多种类型RNA文库的制备方法,其特征在于,包括:将RNA样品末端加polyA,经逆转录和U碱基消化,制备cDNA文库;A method for preparing various types of RNA libraries, characterized by comprising: adding polyA to the end of an RNA sample, performing reverse transcription and U base digestion, and preparing a cDNA library; 所述RNA样品中含有总RNA、rRNA、mRNA、tRNA、miRNA和/或lncRNA中至少一种。The RNA sample contains at least one of total RNA, rRNA, mRNA, tRNA, miRNA and/or lncRNA. 根据权利要求1所述的制备方法,其特征在于,所述制备方法具体包括:DNA消化、RNA末端修饰、加polyA尾、逆转录、U碱基消化、cDNA末端修饰、变性、加接头和文库构建。The preparation method according to claim 1 is characterized in that the preparation method specifically includes: DNA digestion, RNA end modification, polyA tailing, reverse transcription, U base digestion, cDNA end modification, denaturation, linker addition and library construction. 根据权利要求2所述的制备方法,其特征在于,The preparation method according to claim 2, characterized in that 所述DNA消化和RNA末端修饰在第一体系中进行;The DNA digestion and RNA end modification are performed in a first system; 所述逆转录在第二体系中进行;The reverse transcription is performed in a second system; 所述加polyA尾的步骤在第一体系或第二体系中进行。The step of adding the polyA tail is performed in the first system or the second system. 根据权利要求2或3所述的制备方法,其特征在于,所述cDNA末端修饰和变性在第三体系中进行。The preparation method according to claim 2 or 3 is characterized in that the cDNA end modification and denaturation are carried out in a third system. 根据权利要求4所述的制备方法,其特征在于,第三体系中还包括U碱基消化的试剂。The preparation method according to claim 4 is characterized in that the third system also includes a reagent for U base digestion. 
根据权利要求3所述的制备方法,其特征在于,所述加polyA尾的步骤在第一体系中进行,其中:The preparation method according to claim 3, characterized in that the step of adding the polyA tail is carried out in a first system, wherein: 所述第一体系包括:RNA样本、PolyA聚合酶反应缓冲液、ATP、BSA、DNase I、T4多核苷酸激酶、PolyA聚合酶、RNA酶抑制剂和无核酸酶水;The first system includes: RNA sample, PolyA polymerase reaction buffer, ATP, BSA, DNase I, T4 polynucleotide kinase, PolyA polymerase, RNase inhibitor and nuclease-free water; 所述第二体系包括:所述第一体系的反应产物、逆转录引物、HiScript III反应缓冲液、HiScript III逆转录酶、dNTP Mix、RNA酶抑制剂和无核酸酶水。The second system includes: the reaction product of the first system, reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix, RNase inhibitor and nuclease-free water. 根据权利要求3所述的制备方法,其特征在于,所述加polyA尾的步骤在第二体系中进行,其中:The preparation method according to claim 3, characterized in that the step of adding the polyA tail is carried out in a second system, wherein: 所述第一体系包括:RNA样本、DNase I反应缓冲液、ATP、DNase I、T4多核苷酸激酶和RNA酶抑制剂; The first system comprises: RNA sample, DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase and RNase inhibitor; 所述第二体系包括:所述第一体系的反应产物、HiScript III反应缓冲液、BSA、PEG8000、dNTP Mix、逆转录引物、PolyA聚合酶、RNA酶抑制剂、HiScript III逆转录酶和无核酸酶水。The second system includes: the reaction product of the first system, HiScript III reaction buffer, BSA, PEG8000, dNTP Mix, reverse transcription primer, PolyA polymerase, RNase inhibitor, HiScript III reverse transcriptase and nuclease-free water. 
根据权利要求6或7所述的制备方法,其特征在于,所述逆转录引物具有如下核苷酸序列:poly(T)n-UVNm;The preparation method according to claim 6 or 7, characterized in that the reverse transcription primer has the following nucleotide sequence: poly(T)n-UVNm; 其中,n表示碱基T的数量,m表示碱基N的数量;Where n represents the number of bases T, and m represents the number of bases N; n为8~50的整数,m为1~4的整数n is an integer of 8 to 50, and m is an integer of 1 to 4 所述poly(T)n中至少一个T被替换为U;V选自碱基A、碱基C和碱基G中的任意一种;N选自碱基A、碱基T、碱基C和碱基G中的任意一种。At least one T in the poly(T)n is replaced by U; V is selected from any one of base A, base C and base G; and N is selected from any one of base A, base T, base C and base G. 根据权利要求8所述的制备方法,其特征在于,所述逆转录引物具有如SEQ ID NO:1所示的核苷酸序列。The preparation method according to claim 8 is characterized in that the reverse transcription primer has a nucleotide sequence as shown in SEQ ID NO:1. 根据权利要求4所述的制备方法,其特征在于,The preparation method according to claim 4, characterized in that 所述第三体系包括:所述经U碱基消化后反应产物、多核苷酸激酶反应缓冲液、T4多核苷酸激酶、pH 8.0的Tris缓冲液、超热稳定单链结合蛋白和无核酸酶水。The third system includes: the reaction product after U base digestion, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, ultra-thermostable single-stranded binding protein and nuclease-free water. 根据权利要求5所述的制备方法,其特征在于,The preparation method according to claim 5, characterized in that 所述第三体系包括:所述经polyA加尾和逆转录的反应产物、多核苷酸激酶反应缓冲液、T4多核苷酸激酶、pH 8.0的Tris缓冲液、超热稳定单链结合蛋白、尿嘧啶特异性切除试剂USER酶和无核酸酶水。The third system includes: the reaction product of polyA tailing and reverse transcription, polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0, ultra-thermostable single-stranded binding protein, uracil-specific excision reagent USER enzyme and nuclease-free water. 
根据权利要求2所述的制备方法,其特征在于,所述加接头步骤中,所述接头包括5’端接头和3’端接头,所述5’端接头序列具有如SEQ ID NO 2和SEQ ID NO 3所示核苷酸序列,所述3’端接头序列具有如SEQ ID NO 4和SEQ ID NO 5所示核苷酸序列。The preparation method according to claim 2 is characterized in that, in the step of adding a linker, the linker includes a 5' end linker and a 3' end linker, the 5' end linker sequence has a nucleotide sequence as shown in SEQ ID NO 2 and SEQ ID NO 3, and the 3' end linker sequence has a nucleotide sequence as shown in SEQ ID NO 4 and SEQ ID NO 5. 多种类型RNA文库的测序方法,其特征在于,以权利要求1~12任一项所述的制备方法制得的cDNA文库为样本进行上机测序。A method for sequencing multiple types of RNA libraries, characterized in that the cDNA library prepared by the preparation method according to any one of claims 1 to 12 is used as a sample for on-machine sequencing. 根据权利要求13所述的测序方法,其特征在于,所述cDNA文库经PCR扩增、纯化、样本混合、单链环化后上机测序。The sequencing method according to claim 13 is characterized in that the cDNA library is sequenced after PCR amplification, purification, sample mixing, and single-stranded circularization. 根据权利要求14所述的测序方法,其特征在于,所述PCR扩增的上游引物具有如SEQ ID NO 6所示的核苷酸序列,所述PCR扩增的下游引物具有如SEQ ID NO 7所示的核苷酸序列。The sequencing method according to claim 14 is characterized in that the upstream primer of the PCR amplification has a nucleotide sequence as shown in SEQ ID NO 6, and the downstream primer of the PCR amplification has a nucleotide sequence as shown in SEQ ID NO 7. 
Library construction reagents, characterized by comprising reagent I, reagent II and reagent III; reagent I comprises: DNase I reaction buffer, ATP, DNase I, T4 polynucleotide kinase and RNase inhibitor; reagent II comprises: a reverse transcription primer, HiScript III reaction buffer, HiScript III reverse transcriptase, dNTP Mix and RNase inhibitor; reagent III comprises: polynucleotide kinase reaction buffer, T4 polynucleotide kinase, Tris buffer at pH 8.0 and ultra-thermostable single-stranded binding protein.
The construction reagents according to claim 16, characterized in that reagent I further comprises: PolyA polymerase reaction buffer, BSA and PolyA polymerase.
The construction reagents according to claim 16, characterized in that reagent II further comprises: BSA, PEG8000 and PolyA polymerase.
The construction reagents according to claim 16, characterized in that reagent III further comprises a U base excision reagent, and the U base excision reagent comprises the uracil-specific excision reagent USER enzyme.
The construction reagents according to claim 16, characterized by further comprising an adapter ligation reagent, which comprises: Tris-HCl buffer, sodium chloride, EDTA, the adapters shown in SEQ ID NO: 2 to 5, ligation reaction buffer, T4 ligase and hexamminecobalt chloride.
The construction reagents according to claim 16, characterized by further comprising a purification reagent, and the purification reagent comprises Agencourt AMPure XP magnetic beads.
The construction reagents according to claim 16, characterized by further comprising a PCR amplification reagent, which comprises a PCR enzyme reaction solution, an upstream primer having the nucleotide sequence shown in SEQ ID NO: 6, and a downstream primer having the nucleotide sequence shown in SEQ ID NO: 7.
The construction reagents according to any one of claims 16 to 22, characterized by further comprising an RNA extraction reagent.
Use of the construction reagents according to any one of claims 16 to 23 in preparing libraries of multiple types of RNA.
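As an illustration of the reverse transcription primer pattern poly(T)n-UVNm recited in the claims above, the following Python sketch checks whether a candidate sequence fits the stated constraints (n from 8 to 50, m from 1 to 4, at least one T in the poly(T) stretch replaced by U, V being A, C or G, and N being A, T, C or G). It is not part of the application. The function name `parse_rt_primer` is hypothetical, and the sketch assumes that the poly(T) stretch consists of n positions that are each T or U, followed by a literal U, one V base and m N bases.

```python
from typing import Optional, Tuple

V_BASES = set("ACG")    # V: any one of A, C, G
N_BASES = set("ACGT")   # N: any one of A, T, C, G
POLY_T_BASES = set("TU")  # assumption: poly(T) positions are T, or U where replaced


def parse_rt_primer(seq: str) -> Optional[Tuple[int, int]]:
    """Return (n, m) if seq fits poly(T)n-UVNm under the assumptions above, else None.

    The sequence is read from the 3' end: m N bases, preceded by one V base,
    preceded by a literal U, preceded by the poly(T) stretch of length n.
    The smallest m that yields a valid parse is reported.
    """
    seq = seq.upper()
    for m in range(1, 5):               # m is an integer from 1 to 4
        if len(seq) < 8 + 2 + m:        # too short for n >= 8 plus U, V and m N bases
            continue
        tail = seq[-m:]                 # the m N bases
        v = seq[-m - 1]                 # the V base
        u = seq[-m - 2]                 # the literal U before V
        head = seq[:-m - 2]             # the poly(T) stretch
        n = len(head)
        if (
            8 <= n <= 50
            and set(head) <= POLY_T_BASES
            and "U" in head             # at least one T replaced by U
            and u == "U"
            and v in V_BASES
            and set(tail) <= N_BASES
        ):
            return n, m
    return None


if __name__ == "__main__":
    # Hypothetical example sequences, not taken from the sequence listing.
    print(parse_rt_primer("TTTTUTTTTTTTUAC"))
    print(parse_rt_primer("TTTTTTTTTTTTUAC"))
```

The example sequences are invented for the sketch and are not SEQ ID NO: 1. Reading from the 3' end avoids the ambiguity that would arise from scanning left to right, since the literal U adjoining V could otherwise be absorbed into the poly(T) stretch.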
PCT/CN2023/102155 2023-06-25 2023-06-25 Method for preparing strand-specific library for rapid detection of various types of RNAs, and high-throughput sequencing technique Pending WO2025000136A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/102155 WO2025000136A1 (en) 2023-06-25 2023-06-25 Method for preparing strand-specific library for rapid detection of various types of RNAs, and high-throughput sequencing technique

Publications (1)

Publication Number Publication Date
WO2025000136A1 (en) 2025-01-02

Family

ID=93936497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/102155 Pending WO2025000136A1 (en) 2023-06-25 2023-06-25 Method for preparing strand-specific library for rapid detection of various types of RNAs, and high-throughput sequencing technique

Country Status (1)

Country Link
WO (1) WO2025000136A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120330309A (en) * 2025-06-13 2025-07-18 北京明识至善生物技术有限公司 Method and kit for mixing high-throughput sequencing libraries in proportion

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104388426A (en) * 2012-02-28 2015-03-04 盛司潼 Oligo dT primer and method for constructing cDNA library
US20180030436A1 (en) * 2013-12-05 2018-02-01 New England Biolabs, Inc. Enrichment and Sequencing of RNA Species
CN111379031A (en) * 2018-12-28 2020-07-07 深圳华大智造科技有限公司 Nucleic acid library construction method, obtained nucleic acid library and use thereof
CN111961707A (en) * 2020-10-14 2020-11-20 苏州贝康医疗器械有限公司 Nucleic acid library construction method and application thereof in analysis of embryo chromosome structural abnormality before implantation
CN114736951A (en) * 2022-04-20 2022-07-12 深圳大学 A high-throughput sequencing library construction method for small RNA
CN115003867A (en) * 2020-03-16 2022-09-02 深圳华大智造科技股份有限公司 A kind of construction method of sequencing library of sample RNA to be tested
CN115896243A (en) * 2022-11-25 2023-04-04 上海厦维医学检验实验室有限公司 Ultrasensitive reverse transcription method based on mixed enzyme system and application thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHI HUAJUAN, ZHOU YING, JIA ERTENG, PAN MIN, BAI YUNFEI, GE QINYU: "Bias in RNA‐seq Library Preparation: Current Challenges and Solutions", BIOMED RESEARCH INTERNATIONAL, HINDAWI PUBLISHING CORPORATION, vol. 2021, no. 1, 1 January 2021 (2021-01-01), XP093252765, ISSN: 2314-6133, DOI: 10.1155/2021/6647597 *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23942669

Country of ref document: EP

Kind code of ref document: A1