Detailed Description
The invention is described in further detail below with reference to the drawings and by means of specific embodiments. In the following embodiments, numerous specific details are set forth to provide a better understanding of the present application. However, one skilled in the art will readily recognize that some of these features may be omitted in various situations, or replaced by other materials or methods. In some instances, certain operations related to the present application are not shown or described in the specification, to avoid obscuring its core portions; a detailed description of these operations is not necessary, because a person skilled in the art can fully understand them from the description herein and from general technical knowledge in the field.
Furthermore, the described features, operations, or characteristics of the description may be combined in any suitable manner in various embodiments. Also, various steps or acts in the method descriptions may be interchanged or modified in a manner apparent to those of ordinary skill in the art. Thus, the various orders in the description and drawings are for clarity of description of only certain embodiments, and are not meant to be required orders unless otherwise indicated.
The numbering of the components itself, e.g. "first", "second", etc., is used herein merely to distinguish between the described objects and does not have any sequential or technical meaning.
As used herein, cfDNA refers to nucleic acids in circulating blood, including circulating intracellular nucleic acids in the blood, free endogenous deoxyribonucleic acids, and exogenous DNA and RNA such as those present in pathological conditions like fungemia, bacteremia and viremia.
As used herein, "pathogenic microorganisms" refer to microorganisms, or pathogens, that can invade the human body and cause infection or even infectious disease. Among pathogens, the most harmful are bacteria and viruses. Pathogenic microorganisms include fungi, bacteria, spirochetes, mycoplasma, rickettsiae, chlamydia, or viruses.
As used herein, "DSN" stands for duplex-specific nuclease, also called double-strand specific nuclease, which is a thermostable nuclease. The enzyme selectively degrades the DNA in double-stranded DNA and in DNA-RNA hybrids, but has little effect on single-stranded nucleic acid molecules.
As used herein, human Cot-1 DNA is placental DNA that is predominantly 50 to 300 bp in size and is enriched for repetitive DNA sequences (e.g., Alu and Kpn family members). Human Cot-1 DNA is commonly used to block nonspecific hybridization in microarray screening. Human Cot-1 DNA is also effective in library screening for somatic cell hybrid libraries and flow-sorted chromosome libraries (constructed using somatic cell hybrids).
As used herein, a "host" refers to an organism that provides a living environment for parasitic organisms, including parasites, microorganisms (including but not limited to fungi, bacteria, spirochetes, mycoplasma, rickettsiae, chlamydia and viruses), and the like. Parasitic organisms obtain nutrition by residing in or on the body surface of the host, and can damage the host, causing disease or even death.
To address the various shortcomings of existing host removal technology, the invention provides an efficient and rapid method for improving the detection of pathogenic microorganisms in metagenomic sequencing.
According to a first aspect, in an embodiment, there is provided a method of constructing a sequencing library of microorganisms, comprising:
a hybridization step, comprising mixing a probe with nucleic acid molecules to which a sequencing adapter has been ligated, and reacting to obtain a hybridization product;
an enzyme treatment step, comprising removing or cleaving the double-stranded nucleic acid molecules in the system using a double-strand specific nuclease;
an amplification step, comprising performing PCR amplification on the nucleic acid from which the double-stranded nucleic acid molecules have been removed, to obtain a library, namely a sequencing library that can be used for microorganism detection.
In one embodiment, the invention significantly increases the microbial detection rate by enzymatic removal of host DNA.
In one embodiment, the kit used in the amplification step may be an existing kit provided by the sequencing platform manufacturer.
In one embodiment, in the hybridization step, probes are used to capture the sequences to be removed.
In one embodiment, in the hybridization step, the length of the probe is at least 9 nt or at least 9 bp.
In one embodiment, the length of the probe in the hybridization step is 9 to 400 nt or 9 to 400 bp. The organism rRNA probe and the organism synthetic probe are single-stranded, so their length is given in nt; the fragmented-genome probe and the Cot-1 probe are double-stranded, so their length is given in bp.
In one embodiment, the length of the probe in the hybridization step is 50 to 300nt or 50 to 300bp.
In one embodiment, the length of the probe in the hybridization step is 60 to 300nt or 60 to 300bp.
In one embodiment, the probe used in the hybridization step is unmodified. Unlike existing biotin-modified probes (which are typically used for subsequent magnetic bead hybridization capture), the probes used in the present invention require no modification, which significantly reduces cost.
In one embodiment, in the hybridization step, the probe comprises at least one of an organism rRNA (ribosomal RNA) probe, an organism whole genome probe (e.g., a probe obtained by fragmenting the human genome), an organism Cot-1 probe, and an organism synthetic probe.
In one embodiment, organism rRNA probes are used to capture DNA after reverse transcription of organism rRNA.
In one embodiment, a whole genome probe is prepared as follows: the host genome is fragmented by enzymatic digestion or sonication, and the resulting double-stranded genomic fragments of 50 to 300 bp are purified; after denaturation, these fragments can be used as the probe of the invention. This probe is also called the human genome fragmented probe.
In one embodiment, the length of the organism's fragmented-genome probe is 100 to 200 bp.
In one embodiment, in the hybridization step, the hybridization system further comprises an additive.
In one embodiment, the additives in the hybridization system include, but are not limited to, at least one of tetramethyl ammonium chloride (TMAC), dimethyl sulfoxide (DMSO), betaine, and formamide. The additive can reduce the loss of pathogen nucleic acid caused by nonspecific hybridization.
In one embodiment, in the hybridization step, the probes include, but are not limited to, at least one of a fragmented-genome probe of an organism, a Cot-1 nucleic acid fragment of an organism, a probe for capturing ribosomal gene sequences of an organism, and an artificially synthesized nucleic acid probe for capturing a target sequence to be removed.
The Cot-1 nucleic acid fragment is a non-artificial synthetic sequence and is prepared by using a natural sample.
In one embodiment, the target sequence to be removed comprises a genomic repeat region of a host of the microorganism.
In one embodiment, the hybridization program is as follows: 90-95 ℃ for 3-10 min, then 55-70 ℃ for 10-30 min. The ligation product is denatured and opened into single strands, and the undenatured double strands are completely cleared by the DSN in the next step.
In one embodiment, in the enzyme treatment step, the reaction program is as follows: 55-70 ℃ for 10-30 min, then 90-98 ℃ for 3-10 min. The high-temperature stage aims to reduce the activity of the DSN enzyme.
In one embodiment, the probe is used to capture DNA from the host of the microorganism and/or DNA obtained by reverse transcription of RNA during the hybridization step.
In one embodiment, in the hybridization step, the nucleic acid molecules are extracted from a biological sample.
In one embodiment, in the hybridization step, the biological sample includes, but is not limited to, a body fluid sample or a tissue sample taken from an organism. Biological samples containing host nucleic acids are suitable for use in the present invention.
In one embodiment, the body fluid sample includes, but is not limited to, at least one of blood, urine, cerebrospinal fluid, pleural effusion, vitreous humor, interstitial fluid, alveolar lavage fluid, saliva, stool, and the like.
In one embodiment, in the hybridization step, the biological sample includes, but is not limited to, at least one of an oral swab sample, a pharyngeal swab, a nasal swab, and the like.
In one embodiment, the organism comprises a human, animal or plant, including all mammals such as primates (particularly higher primates), sheep, dogs, rodents (e.g., mice or rats), guinea pigs, goats, pigs, cats, rabbits, and cattle. All plants are also included.
In one embodiment, the organism is a host of a microorganism.
In one embodiment, when the host organism of the microorganism is a plant, the organism Cot-1 probe used for hybridization is the corresponding Cot-1 probe of the plant host.
In one embodiment, in the hybridization step, the nucleic acid molecule with the sequencing adapter attached is prepared by extracting nucleic acid from the biological sample and then sequentially performing fragmentation, end repair, A-tailing, sequencing adapter ligation and purification.
In one embodiment, in the hybridizing step, the nucleic acid extracted from the biological sample comprises at least one of DNA and RNA.
In one embodiment, in the hybridization step, the nucleic acid extracted from the biological sample comprises at least one of a single-stranded nucleic acid molecule and a double-stranded nucleic acid molecule.
In one embodiment, nucleic acid extraction from a biological sample may be performed using an existing kit according to the kit's instructions. Suitable kits include, but are not limited to, the nudiflunian Magnetic Pathogen Microorganism DNA/RNA Kit and kits from Qiagen, radix angelicae, and the like, which are capable of extracting DNA, RNA, or both DNA and RNA.
In one embodiment, in the hybridization step, when the nucleic acid extracted from the biological sample contains RNA, reverse transcription is performed first to obtain double-stranded cDNA; the double-stranded cDNA product in the system, together with the extracted DNA, is then subjected to fragmentation, end repair, A-tailing, sequencing adapter ligation and purification to obtain the nucleic acid with the sequencing adapter attached.
In one embodiment, in the hybridization step, the complementary sequence on the sequencing adapter is less than 9bp.
In one embodiment, in the hybridization step, the ratio of the molar concentration of the probe in the hybridization system to the molar concentration of the nucleic acid molecule to which the sequencing adapter is attached is 1 or more, preferably 5 or more, more preferably 5 to 50. In practical experiments, the probe is usually quantified by mass-to-volume concentration, whereas the probe ratio is usually calculated using molar concentration.
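The conversion from a mass-based quantity to a molar probe ratio can be sketched as follows. This is a minimal illustration only: it assumes the common rule-of-thumb average molar masses of about 660 g/mol per base pair for double-stranded DNA and about 330 g/mol per nucleotide for single-stranded DNA, and the function names and the input masses and lengths are hypothetical rather than taken from the embodiments.

```python
# Rule-of-thumb average molar masses (g/mol); assumptions, not values from this document.
DSDNA_G_PER_MOL_PER_BP = 660.0
SSDNA_G_PER_MOL_PER_NT = 330.0


def ng_to_pmol(mass_ng: float, length: int, double_stranded: bool = True) -> float:
    """Convert a nucleic acid mass (ng) into an amount of substance (pmol)."""
    if mass_ng < 0 or length <= 0:
        raise ValueError("mass_ng must be >= 0 and length must be > 0")
    unit_mass = DSDNA_G_PER_MOL_PER_BP if double_stranded else SSDNA_G_PER_MOL_PER_NT
    # ng divided by g/mol gives nmol; multiply by 1000 to express the result in pmol.
    return mass_ng / (length * unit_mass) * 1000.0


def probe_to_library_ratio(probe_pmol: float, library_pmol: float) -> float:
    """Molar ratio of probe to adapter-ligated library in the same reaction volume."""
    if library_pmol <= 0:
        raise ValueError("library_pmol must be > 0")
    return probe_pmol / library_pmol


# Hypothetical inputs: 240 ng of a 150 bp double-stranded probe, 50 ng of a 300 bp library.
probe = ng_to_pmol(240.0, 150)
library = ng_to_pmol(50.0, 300)
ratio = probe_to_library_ratio(probe, library)
print(f"probe {probe:.2f} pmol, library {library:.2f} pmol, ratio {ratio:.1f}x")
```

Because the probe and the library share one reaction volume, the ratio of their molar amounts equals the ratio of their molar concentrations, so the volume does not need to enter the calculation.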
In one embodiment, in the amplification step, the nucleic acid from which the double-stranded nucleic acid molecule has been removed is subjected to PCR amplification and then purified to obtain a library that can be used for on-machine sequencing.
In one embodiment, the purification in the hybridization step and the amplification step includes, but is not limited to, magnetic bead purification and column purification, preferably magnetic bead purification.
In one embodiment, the microorganism is a microorganism having a nucleic acid.
In one embodiment, the microorganism includes, but is not limited to, at least one of fungi, bacteria, spirochetes, mycoplasma, rickettsia, chlamydia and viruses.
In one embodiment, the microorganism comprises a pathogenic microorganism.
According to a second aspect, in an embodiment, there is provided a sequencing library constructed by the method of any one of the first aspects.
According to a third aspect, in an embodiment, there is provided a method of detecting a microorganism, comprising a detection step of detecting nucleic acid from the microorganism from the sequencing library of the second aspect.
In an embodiment, the detecting step may be performed on an existing sequencing platform, including but not limited to an MGI sequencing platform or an Illumina sequencing platform.
According to a fourth aspect, in an embodiment, a kit is provided for detecting nucleic acid from a microorganism from a biological sample, the kit comprising a double-strand specific nuclease and a probe for capturing a target sequence to be removed.
In one embodiment, the probe comprises at least one of an organism rRNA probe, an organism whole genome probe, an organism Cot-1 probe, and an organism synthetic probe.
In one embodiment, the probe comprises at least one of a probe for genome fragmentation of an organism, a fragment of an organism Cot-1 nucleic acid, a probe for capturing ribosomal gene sequences of an organism, and an artificially synthesized nucleic acid probe for capturing a target sequence to be removed.
In one embodiment, the target sequence to be removed comprises a genomic repeat region sequence of a host of the microorganism.
In one embodiment, the kit further comprises a buffer and instructions for use.
In one embodiment, the target sequence to be removed comprises a genomic repeat region of a host of the microorganism.
In response to the shortcomings of the current techniques for removing host nucleic acids in the market, in one embodiment, a method is provided that is efficient and rapid and that can enhance the detection of pathogenic microorganisms in metagenomic sequencing.
In one embodiment, the present invention achieves rapid enrichment of the target region.
In one embodiment, the method of removing host nucleic acid in the present invention is based on the principle that DSN (duplex-specific nuclease, also called double-strand specific nuclease) can specifically cleave double-stranded DNA molecules but cannot, or can hardly, cleave single-stranded DNA molecules; the removal of host nucleic acid is performed after adaptor ligation.
In one embodiment, nucleic acid is first extracted from the sample to be tested; the extracted nucleic acid comprises DNA alone, RNA alone, or a mixture of DNA and RNA. Conventional library construction is then carried out, including reverse transcription, second-strand synthesis, end repair, A-tailing and adaptor ligation. Removal of host nucleic acid is performed after adaptor ligation.
In one embodiment, the principle of the invention is as follows: a specific probe capable of capturing the sequence to be removed (such as a human genome fragmented nucleic acid probe, a human Cot-1 nucleic acid fragment, a human ribosomal gene sequence probe, a synthetic probe, and the like) is added to the purified ligation product, and the mixture is heat-denatured and then cooled for annealing and hybridization. Nucleic acid molecules with complementary sequences form double-stranded structures, whereas nucleic acid molecules that cannot find a complementary sequence remain single-stranded.
In one embodiment, FIG. 1 shows a schematic flow chart for removing host nucleic acid. Because the nucleic acid fragments of pathogenic microorganisms are very few in the whole system, and the hybridization reaction takes place in the presence of a large excess of probes, these fragments, once denatured into single strands, can hardly find their complementary strands to re-form double strands within the limited hybridization time, and therefore remain single-stranded. DSN, which specifically cuts double-stranded structures, is then added to remove or cleave the double-stranded nucleic acid molecules in the system. The cleaved nucleic acid fragments can no longer be amplified in the next step via the adaptor sequences ligated at both ends, whereas the uncut single-stranded molecules are amplified normally, thereby enriching the nucleic acid fragments of pathogenic microorganisms.
In FIG. 1, the dark lines represent microbial nucleic acid and the light lines represent host nucleic acid. The magnetic-bead-purified product shown in the lower right corner of FIG. 1 contains a small amount of host sequence mixed with the microbial sequences: because host nucleic acid is so abundant, the invention cannot remove all host sequences, and a small amount remains in the microbial library. Nevertheless, the proportion of host sequences in the product is significantly smaller than that obtained with existing methods without host removal.
In one embodiment, the double-strand specific cleavage property of DSN is combined with the formation of specific double strands between host nucleic acids and probes: adaptor-ligated host fragments that form duplexes are cut, so that they cannot take part in subsequent amplification or library construction, while nucleic acids that do not form double strands are unaffected. The uncleaved nucleic acid is thus enriched, which improves the detection of pathogen nucleic acid sequences in the sample and reduces the waste of a large amount of sequencing data.
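The enrichment principle described above can be illustrated with a deliberately idealized calculation. The sketch assumes that a fixed fraction of host fragments hybridizes to probe and is cut, that non-host fragments all stay single-stranded and intact, and that every surviving fragment amplifies equally; the function name, the parameter `removal_efficiency` and the input values are hypothetical and are not measurements from the embodiments.

```python
def non_host_fraction_after_depletion(host_fraction: float, removal_efficiency: float) -> float:
    """Fraction of amplifiable fragments that are non-host after duplex cleavage.

    host_fraction: share of adaptor-ligated fragments that come from the host (0..1).
    removal_efficiency: share of host fragments that form duplexes and are cut (0..1).
    """
    if not 0.0 <= host_fraction <= 1.0:
        raise ValueError("host_fraction must be between 0 and 1")
    if not 0.0 <= removal_efficiency <= 1.0:
        raise ValueError("removal_efficiency must be between 0 and 1")
    surviving_host = host_fraction * (1.0 - removal_efficiency)
    non_host = 1.0 - host_fraction  # assumed untouched in this idealized model
    total = surviving_host + non_host
    if total == 0.0:
        raise ValueError("no amplifiable fragments remain")
    return non_host / total


# Hypothetical scenario: 99% host fragments before depletion, 90% of them removed.
before = 1.0 - 0.99
after = non_host_fraction_after_depletion(0.99, 0.90)
print(f"non-host share: {before:.3f} before, {after:.3f} after")
```

In this model the gain in the non-host share depends mainly on how completely the abundant host fragments are converted to duplexes, which is consistent with the use of an excess of probe in the hybridization step.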
In one embodiment, the fact that DSN cannot cleave duplexes shorter than 9 bp is exploited: the complementary sequence of the adaptor is designed to be shorter than 9 bp, so that non-host nucleic acids are not cleared by mistake. The adaptor may form a double-stranded structure during annealing and would then be cut; when the complementary sequence is shorter than 9 bp, DSN cannot cut it, which reduces the risk of the adaptor sequence being cut. The adaptors in some commercially available kits meet this length requirement, while those in other kits do not (e.g., the bubble adaptor of MGI); in the latter case, adaptors that meet the requirement can be designed by the user.
In one embodiment, the amount of DSN used is tested such that the double stranded cleavage system for host nucleic acid clearance is at an optimal level.
In one embodiment, different types and concentrations of probes are used to form host-probe duplexes for cleavage by DSN. Probe types include host rRNA probes, host whole genome probes, host Cot-1 probes, and host synthetic sequence probes. The probe concentration ratio is 1× or more, or 5× to 100×, or 5× to 50×, including, but not limited to, 5×, 10×, 20×, 40×, 50×, and the like. The probe concentration ratio refers to the ratio of the molar concentration of the probe to the molar concentration of the library DNA. The probe is added at a high concentration so that it binds the host sequences competitively; however, too high a probe concentration reduces the efficiency of the subsequent PCR. The probe concentration ratio is therefore preferably 5× to 100×, more preferably 5× to 50×.
In one embodiment, because the method introduces only a single biochemical reaction step into the library construction process, it can be combined with more technologies and has a wider range of application. For example, it can be combined with other host nucleic acid removal methods commonly used on the market, such as methylation-site cleavage and differential lysis, to achieve better results; the present invention does not conflict with these methods.
In an embodiment, compared with host removal products commonly used on the market, the method removes the target nucleic acid fragments during library construction, is not affected by the state of the host and the pathogens in the sample at the time of extraction, and markedly reduces the possibility of losing pathogen information.
In one embodiment, since nucleic acid removal is performed during library construction (after adaptor ligation), host removal can be achieved simultaneously for samples starting from DNA and from RNA, and the DNA and RNA information of the non-host nucleic acids in the sample is preserved.
In one embodiment, the present invention is simpler to operate and takes less time than the host removal products commonly used on the market.
Example 1
In this example, host nucleic acid removal is performed after linker ligation.
Escherichia coli, Staphylococcus aureus, Candida albicans, Aspergillus brasiliensis, respiratory adenovirus type 7 and respiratory syncytial virus are common pathogenic microorganisms of different types, representing gram-negative bacteria, gram-positive bacteria, mononuclear fungi, polynuclear fungi, DNA viruses and RNA viruses, respectively. This example uses the detection of these pathogenic microorganisms in a sample to illustrate the method of the present invention for efficiently and rapidly improving the detection of pathogenic microorganisms in metagenomic sequencing.
The method comprises the following specific steps:
1. Preparing a pathogenic microorganism standard: HeLa cells were added to phosphate buffer to prepare a cell background solution at a concentration of 1 × 10^5 cells/mL. Escherichia coli (final concentration 500 CFU/mL), Staphylococcus aureus (final concentration 500 CFU/mL), Candida albicans (final concentration 500 CFU/mL), Aspergillus brasiliensis (final concentration 100 CFU/mL) and respiratory syncytial virus (2500 TCID50/mL) were added to the prepared cell solution and mixed thoroughly to prepare the standard mixture.
2. Total DNA and RNA in the prepared pathogen standard were extracted using a Northenpran Magnetic Pathogen Microorganism DNA/RNA Kit extraction Kit.
The post-extraction products were quantified using Qubit HS dsDNA and RNA quantification reagents.
3. Reverse transcription and second-strand cDNA synthesis were carried out on the RNA in the sample using the following reaction system:
TABLE 1
| Component | Volume (μL) |
| Total DNA and RNA after extraction | 14 |
| Random Primer | 1 |
| Total volume | 15 |
The Random Primer used in this example is a reagent in the Hieff NGS ds-cDNA Synthesis Kit (Cat# 13488) from Yeasen Biotechnology (Shanghai) Co., Ltd.
After thorough mixing according to Table 1, the mixture was heat-denatured at 70 ℃ for 5 min and then placed on ice for 3 min for primer annealing. Then 8 μL of 1st Reaction Buffer and 2 μL of 1st Strand Enzyme Mix were added and mixed thoroughly, and reverse transcription of the RNA was performed with the following program: 25 ℃ for 5 min; 42 ℃ for 30 min; 85 ℃ for 5 min.
After completion of the reverse transcription, 7 μL of 2nd Reaction Buffer and 3 μL of 2nd Strand Enzyme Mix were added to the system and mixed well, and the mixture was reacted at 16 ℃ for 5 min to synthesize the second cDNA strand.
4. After completion of second-strand synthesis, the reaction product was placed on ice, 10 μL of Smearase Buffer and 5 μL of Smearase Enzyme Mix were added, and 10 μL of NF H2O was added to bring the reaction system to 60 μL. After thorough mixing, the mixture was reacted at 30 ℃ for 25 min and then at 72 ℃ for 20 min. This step fragments, end-repairs and A-tails the total DNA and the double-stranded cDNA synthesized from RNA in the reaction system.
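The volumes added in steps 3 and 4 can be tallied to confirm that the system reaches the stated 60 μL. The sketch below is only a bookkeeping aid; the list and function names are illustrative, and the volumes are those given in Table 1 and in the text of steps 3 and 4.

```python
from itertools import accumulate

# (reagent, volume in microlitres), in the order of addition described in steps 3 and 4.
ADDITIONS_UL = [
    ("Table 1 mix (extracted nucleic acid + Random Primer)", 15.0),
    ("1st Reaction Buffer", 8.0),
    ("1st Strand Enzyme Mix", 2.0),
    ("2nd Reaction Buffer", 7.0),
    ("2nd Strand Enzyme Mix", 3.0),
    ("Smearase Buffer", 10.0),
    ("Smearase Enzyme Mix", 5.0),
    ("NF H2O", 10.0),
]


def running_volumes(additions):
    """Return the cumulative reaction volume after each addition."""
    return list(accumulate(volume for _, volume in additions))


totals = running_volumes(ADDITIONS_UL)
for (name, volume), total in zip(ADDITIONS_UL, totals):
    print(f"+{volume:5.1f} uL {name:<55} -> {total:5.1f} uL")

# The final volume should match the 60 uL stated for the fragmentation reaction.
final_volume_ul = totals[-1]
```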
5. The reagents in Table 2 were added sequentially to the reaction product to carry out adaptor ligation:
TABLE 2
| Component | Volume (μL) |
| The product of step 4 | 60 |
| Adaptor solution (3 μM) | 5 |
| Ligation enhancer | 30 |
| T4 DNA ligase | 5 |
| Total volume | 100 |
The ligation enhancer and T4 DNA ligase were from the Hieff NGS C37P4 OnePot cDNA&gDNA Library Prep Kit of Yeasen Biotechnology (Shanghai) Co., Ltd.
The complementary sequence of the adaptor used in this example is shorter than 9 bp. The adaptor sequences used in this example are as follows:
Adaptor oligonucleotide strand 1: 5'-GAACGACATGGCTACGATCCGACTT-3' (SEQ ID No. 1);
Adaptor oligonucleotide strand 2: 5'-Pho-AGTCGGAGGCCAAGCGGTCTTAGGAAGAC-3' (SEQ ID No. 2).
The underlined sequences are the reverse-complementary paired sequences. The 3' end of adaptor oligonucleotide strand 1 carries an overhanging base T, which helps improve adaptor ligation efficiency.
"Pho" refers to a phosphorylation modification designed to help a ligase successfully ligate a linker to a nucleic acid in a sample.
The reaction mixture was mixed thoroughly and reacted at 20 ℃ for 15 min for adaptor ligation. After the reaction, 60 μL of magnetic beads were added for magnetic bead purification, and the purified product was eluted in 18 μL of NF H2O.
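The length requirement on the adaptor's complementary region can be checked computationally. The following is a simplified sketch: it only looks for the longest stretch of perfect, contiguous Watson-Crick pairing between the two adaptor strands listed above, and it ignores mismatched duplexes and hairpins within a single strand. The function names are illustrative, and the 9 bp threshold is the DSN cleavage limit stated in this description.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
DSN_MIN_DUPLEX_BP = 9  # duplexes shorter than this are described as not cleavable by DSN

STRAND_1 = "GAACGACATGGCTACGATCCGACTT"      # SEQ ID No. 1
STRAND_2 = "AGTCGGAGGCCAAGCGGTCTTAGGAAGAC"  # SEQ ID No. 2 (5' phosphate omitted)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_common_run(a: str, b: str) -> int:
    """Length of the longest substring shared by a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for char_a in a:
        cur = [0] * (len(b) + 1)
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def max_perfect_duplex(strand_a: str, strand_b: str) -> int:
    """Longest perfectly paired stretch that the two strands could form."""
    return longest_common_run(strand_a.upper(), reverse_complement(strand_b))


duplex_bp = max_perfect_duplex(STRAND_1, STRAND_2)
adaptor_is_dsn_safe = duplex_bp < DSN_MIN_DUPLEX_BP
print(f"longest perfect duplex: {duplex_bp} bp, below DSN limit: {adaptor_is_dsn_safe}")
```

Under these simplifying assumptions the two strands pass the check. The same function could be applied to a candidate custom adaptor before ordering it, with the caveat that a real design would also need to consider imperfect pairing.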
6. Hybridization of the probe:
Host nucleic acid removal probes were added to the adaptor-ligated product according to the following table:
TABLE 3
| Component | Volume (μL) |
| The linker ligation product in step 5 | 16 |
| Human Cot-1 DNA (300 ng/μL) | 0.8 |
| Human rRNA probes | 0.4 |
| Human genome fragmented probe (100 ng/μL) | 0.4 |
| DSN buffer | 2 |
| Tetramethyl ammonium chloride (5 M) | 0.4 |
| Total volume | 20 |
The human Cot-1 DNA used in this example was purchased from Thermo Fisher (catalog number 15279011).
The human rRNA probe used in this example is from the Hieff NGS MaxUp rRNA Depletion Kit (Human/Mouse/Rat) of Yeasen Biotechnology (Shanghai) Co., Ltd. Other types of human or host rRNA probes may also be used in this example. The initial concentration of the human rRNA probe was about 100 μM.
The human genome fragmented probe was obtained by shearing the human genome into fragments of 100 to 200 bp using a Covaris E220 and then purifying the fragments with magnetic beads.
The DSN buffer is the reaction buffer of the DSN enzyme and contains the ions required for the DSN reaction. Our tests showed that it also improves probe hybridization, so the DSN buffer is added at this stage, before the enzyme. The DSN enzyme and its kit buffer were purchased from Evrogen.
After the reaction mixture was thoroughly mixed, probe hybridization was performed at the following reaction temperatures:
TABLE 4
| Temperature | Time |
| 95 ℃ | 5 min |
| 65 ℃ | 20 min |
After the hybridization was completed, the hybridization system was kept at 65 ℃.
7. DSN enzymatic digestion:
1 μL of DSN nuclease was added to the system after the probe hybridization in step 6. After thorough mixing, the reaction was continued at the following temperatures to remove the double-stranded nucleic acid molecules in the system.
TABLE 5
| Temperature | Time |
| 65 ℃ | 10 min |
| 95 ℃ | 5 min |
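The temperature programs used in this example for probe hybridization and DSN digestion can be checked against the ranges given in the embodiments (hybridization: 90-95 ℃ for 3-10 min, then 55-70 ℃ for 10-30 min; enzyme treatment: 55-70 ℃ for 10-30 min, then 90-98 ℃ for 3-10 min). The sketch below is one possible way to encode such a check; the class and function names are illustrative and not part of the method.

```python
from typing import NamedTuple, Sequence


class Step(NamedTuple):
    temp_c: float
    minutes: float


class Window(NamedTuple):
    temp_min_c: float
    temp_max_c: float
    min_minutes: float
    max_minutes: float

    def contains(self, step: Step) -> bool:
        """True if the step's temperature and duration both fall inside the window."""
        return (self.temp_min_c <= step.temp_c <= self.temp_max_c
                and self.min_minutes <= step.minutes <= self.max_minutes)


# Ranges stated in the embodiments.
HYBRIDIZATION_WINDOWS = [Window(90, 95, 3, 10), Window(55, 70, 10, 30)]
DSN_WINDOWS = [Window(55, 70, 10, 30), Window(90, 98, 3, 10)]

# Programs used in this example for probe hybridization and DSN digestion.
EXAMPLE_HYBRIDIZATION = [Step(95, 5), Step(65, 20)]
EXAMPLE_DSN = [Step(65, 10), Step(95, 5)]


def program_within(program: Sequence[Step], windows: Sequence[Window]) -> bool:
    """True if the program has one step per window and each step fits its window."""
    return len(program) == len(windows) and all(
        window.contains(step) for step, window in zip(program, windows)
    )


print("hybridization OK:", program_within(EXAMPLE_HYBRIDIZATION, HYBRIDIZATION_WINDOWS))
print("DSN digestion OK:", program_within(EXAMPLE_DSN, DSN_WINDOWS))
```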
8. PCR amplification:
The digested system from step 7 was placed on ice, and 25 μL of 2× Ultima HF Amplification Mix and 4 μL of F/R Index primer (20 μM) were added in sequence. After thorough mixing, amplification was performed with the following program: 98 ℃ for 1 min; 8 cycles of 98 ℃ for 10 s, 60 ℃ for 30 s and 72 ℃ for 30 s; 72 ℃ for 1 min; hold at 4 ℃. After amplification, 45 μL of magnetic beads were added to the amplified product for bead purification, and the purified product was eluted in 30 μL of TE.
9. Sequencing and data analysis
The library amplified in step 8 was made into DNBs (DNA nanoballs) using the MGI single-strand circularization kit and the MGISEQ-200RS high-throughput sequencing kit (SE50), and sequenced on an MGISEQ-200 sequencer. The sequencing results were analyzed by alignment against a pathogen database using BWA (Burrows-Wheeler Alignment Tool) software. The pathogen database used in this example is a microbial database constructed from high-quality microbial genomes in the RefSeq and GenBank databases.
Experimental results:
The detection of pathogenic microorganisms was evaluated by sequencing and alignment, using the method of this example for efficiently and rapidly improving the detection of pathogenic microorganisms in metagenomic sequencing.
TABLE 6
| Method | Host rRNA ratio | Host total nucleic acid ratio |
| Host nucleic acid is not removed | 65.80% | 90.26% |
| Example 1 | 1.48% | 88.93% |
Host rRNA ratio refers to the number of rRNA reads of the host (HeLa cells) in the tested sample relative to the total number of sequenced reads. In the analysis pipeline of this example, the host rRNA ratio and the host total nucleic acid ratio were calculated separately; the host rRNA reads are included in the host total nucleic acid reads.
Similarly, the host total nucleic acid ratio refers to the total number of nucleic acid reads of the host (HeLa cells) in the tested sample relative to the total number of sequenced reads.
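The two ratios defined above can be computed from read counts as in the following sketch. The function name and the read counts in the usage lines are hypothetical; they are not the counts behind Table 6, which are not given in this description.

```python
def host_ratios(total_reads: int, host_reads: int, host_rrna_reads: int) -> dict:
    """Compute the host rRNA ratio and host total nucleic acid ratio.

    Both ratios use the total number of sequenced reads as the denominator,
    and host rRNA reads are counted as a subset of the host reads.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= host_rrna_reads <= host_reads <= total_reads:
        raise ValueError("expected 0 <= host_rrna_reads <= host_reads <= total_reads")
    return {
        "host_rrna_ratio": host_rrna_reads / total_reads,
        "host_total_ratio": host_reads / total_reads,
    }


# Hypothetical counts for illustration only.
ratios = host_ratios(total_reads=2_000_000, host_reads=1_800_000, host_rrna_reads=30_000)
print(f"host rRNA ratio: {ratios['host_rrna_ratio']:.2%}")
print(f"host total nucleic acid ratio: {ratios['host_total_ratio']:.2%}")
```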
In FIG. 2, "host nucleic acid was not removed" (control group) is a microorganism detection result obtained by directly performing PCR amplification and performing the subsequent steps without performing the host nucleic acid removal step (probe hybridization and DSN enzyme digestion treatment) using the same adaptor ligation product.
In FIG. 2, "the method of the present invention" is the method of Example 1. In the prior art, a single site is detected by qPCR (real-time fluorescence quantitative PCR), which has low accuracy; this example uses mNGS sequencing analysis to analyze all nucleic acid sequences in the sample, which gives higher accuracy.
Table 6 shows the host rRNA sequence ratios. As can be seen from Table 6, host rRNA accounted for as much as 65.80% of the sequencing result without host nucleic acid removal, but only 1.48% after host nucleic acid removal by the method of Example 1, confirming that the method of Example 1 significantly reduces the host rRNA ratio in the sample.
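The size of the change reported in Table 6 can be expressed as a fold reduction and as a difference in percentage points. The sketch below uses only the four values from Table 6; the function and variable names are illustrative.

```python
# Values from Table 6, in percent of total sequenced reads.
TABLE_6 = {
    "not_removed": {"host_rrna": 65.80, "host_total": 90.26},
    "example_1": {"host_rrna": 1.48, "host_total": 88.93},
}


def fold_reduction(before_pct: float, after_pct: float) -> float:
    """How many times smaller the ratio is after treatment."""
    if after_pct <= 0:
        raise ValueError("after_pct must be positive")
    return before_pct / after_pct


def point_drop(before_pct: float, after_pct: float) -> float:
    """Absolute decrease in percentage points."""
    return before_pct - after_pct


rrna_fold = fold_reduction(TABLE_6["not_removed"]["host_rrna"], TABLE_6["example_1"]["host_rrna"])
rrna_drop = point_drop(TABLE_6["not_removed"]["host_rrna"], TABLE_6["example_1"]["host_rrna"])
total_drop = point_drop(TABLE_6["not_removed"]["host_total"], TABLE_6["example_1"]["host_total"])
print(f"host rRNA: {rrna_fold:.1f}-fold lower ({rrna_drop:.2f} percentage points)")
print(f"host total nucleic acid: {total_drop:.2f} percentage points lower")
```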
FIG. 2 shows the detection results for the microorganisms in the standard. Compared with no host nucleic acid removal, the host nucleic acid removal method of Example 1 improves the detection of pathogenic microorganisms by 3 times or more. The different species in FIG. 2 represent different types of pathogenic microorganisms: from left to right on the abscissa of FIG. 2, the results are shown for Escherichia coli (gram-negative bacteria), Staphylococcus aureus (gram-positive bacteria), Candida albicans (fungi), Aspergillus brasiliensis (fungi), adenovirus type 7 (DNA virus) and respiratory syncytial virus (RNA virus).
Example 2
The present embodiment provides a DNA library construction method, which is specifically as follows:
1. Ten oral swab samples were collected, placed in 2 mL of phosphate buffer and vortexed vigorously for 5 min. After the swabs were removed, all of the liquid was mixed thoroughly to prepare the oral microorganism test sample.
2. Total DNA in the prepared oral microorganism test sample was extracted using the Nuo Zhan Microbiome DNA Isolation Kit. The extracted product was quantified using the Qubit HS dsDNA quantification reagent.
3. Fragmentation, end repair and A-tailing of the total DNA in the sample: 10 μL of Smearase Buffer and 5 μL of Smearase Enzyme Mix were added to the extracted DNA, and 10 μL of NF H2O was added to bring the reaction system to 60 μL. After thorough mixing, the mixture was reacted at 30 ℃ for 25 min and then at 72 ℃ for 20 min.
4. The reagents in the following table were added sequentially to the reaction product to carry out adaptor ligation.
TABLE 7
| Component | Volume (μL) |
| --- | --- |
| Product of step 3 | 60 |
| Adaptor solution (3 μM) | 5 |
| Ligation enhancer | 30 |
| T4 DNA ligase | 5 |
| Total volume | 100 |
After thorough mixing, the reaction mixture was incubated at 20 ℃ for 15 min for adaptor ligation. After the reaction, 60 μL of magnetic beads was added for bead purification, and the purified product was eluted in 18 μL of NF H₂O.
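By way of illustration only, the arithmetic of the ligation step can be checked with a short script. The sketch below sums the component volumes of Table 7 and derives the bead-to-sample ratio of the cleanup; the dictionary keys and function names are labels chosen here, and the bead ratio is an inference from the stated volumes rather than a value given in the text.

```python
# Volumes (in microliters) of the adaptor ligation reaction, taken from Table 7.
ligation_mix_ul = {
    "step 3 product": 60,
    "adaptor solution (3 uM)": 5,
    "ligation enhancer": 30,
    "T4 DNA ligase": 5,
}


def total_volume(mix):
    """Return the summed volume of all components in a reaction mix."""
    return sum(mix.values())


def bead_ratio(bead_volume_ul, sample_volume_ul):
    """Return the bead-to-sample volume ratio of a bead cleanup."""
    return bead_volume_ul / sample_volume_ul


total = total_volume(ligation_mix_ul)
print(f"ligation reaction volume: {total} uL")
# 60 uL of beads is the volume stated for the cleanup after ligation.
print(f"bead ratio: {bead_ratio(60, total):.1f}x")
```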
5. Probe hybridization
Host nucleic acid removal probes were added to the adaptor-ligated products according to the following table.
TABLE 8
| Component | Volume (μL) |
| --- | --- |
| Adaptor ligation product from step 4 | 16 |
| Human Cot-1 DNA (300 ng/μL) | 0.8 |
| Human rRNA probes | 0.4 |
| Human genome fragmented probe (100 ng/μL) | 0.4 |
| DSN buffer | 2 |
| Tetramethylammonium chloride | 0.4 |
| Total volume | 20 |
The source of each probe is the same as in Example 1; see Table 3 and its accompanying description.
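The composition of the hybridization system in Table 8 can be sanity-checked as follows. This is a minimal sketch: the volumes and stock concentrations come from Table 8, while the data layout and function names are assumptions made for illustration. Components whose stock concentration is not given in the table are left as `None` rather than guessed.

```python
# Hybridization mix from Table 8: name -> (volume in uL, stock concentration in ng/uL or None).
hyb_mix = {
    "adaptor ligation product": (16.0, None),
    "human Cot-1 DNA": (0.8, 300.0),
    "human rRNA probes": (0.4, None),  # stock concentration not stated in Table 8
    "human genome fragmented probe": (0.4, 100.0),
    "DSN buffer": (2.0, None),
    "tetramethylammonium chloride": (0.4, None),
}


def total_volume_ul(mix):
    """Sum the volumes of all components."""
    return sum(volume for volume, _ in mix.values())


def input_mass_ng(mix, name):
    """Mass of a component added to the reaction (volume x stock concentration)."""
    volume, concentration = mix[name]
    if concentration is None:
        raise ValueError(f"no stock concentration given for {name!r}")
    return volume * concentration


print(f"total volume: {total_volume_ul(hyb_mix):.1f} uL")
print(f"Cot-1 DNA added: {input_mass_ng(hyb_mix, 'human Cot-1 DNA'):.0f} ng")
print(f"genome probe added: {input_mass_ng(hyb_mix, 'human genome fragmented probe'):.0f} ng")
```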
After the reaction mixture was thoroughly mixed, probe hybridization was performed according to the following temperature program.
TABLE 9
| Temperature | Time |
| --- | --- |
| 95 ℃ | 5 min |
| 65 ℃ | 20 min |
After the hybridization was completed, the hybridization system was kept at 65 ℃.
6. DSN enzymatic digestion
1 μL of DSN nuclease was added to the system after the probe hybridization of step 5 and, after thorough mixing, the double-stranded nucleic acid molecules in the system were removed according to the following temperature program.
TABLE 10
| Temperature | Time |
| --- | --- |
| 65 ℃ | 10 min |
| 95 ℃ | 5 min |
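For illustration, the two temperature programs of Tables 9 and 10 can be written down as data and their combined hold time added up. The sketch below is an assumption-laden convenience, not part of the method: the `Step` type and function name are hypothetical, and ramp times and the time spent adding the enzyme at 65 ℃ are not given in the text and are therefore ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temperature_c: int
    minutes: int


# Programs as listed in Table 9 (probe hybridization) and Table 10 (DSN digestion).
HYBRIDIZATION = [Step(95, 5), Step(65, 20)]
DSN_DIGESTION = [Step(65, 10), Step(95, 5)]


def total_minutes(*programs):
    """Sum the programmed hold times of one or more temperature programs."""
    return sum(step.minutes for program in programs for step in program)


# The hybridization ends at the temperature at which the digestion starts,
# consistent with keeping the system at 65 C while the nuclease is added.
same_temperature = HYBRIDIZATION[-1].temperature_c == DSN_DIGESTION[0].temperature_c
print(f"hand-over at constant temperature: {same_temperature}")
print(f"hybridization + digestion hold time: {total_minutes(HYBRIDIZATION, DSN_DIGESTION)} min")
```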
7. PCR amplification
The digested system from step 6 was placed on ice, and 25 μL of 2X Ultima HF Amplification Mix and 4 μL of F/R Index primers (20 μM) were added in this order. After thorough mixing, amplification was performed with the following cycling program: 98 ℃ for 1 min; 8 cycles of 98 ℃ for 10 s, 60 ℃ for 30 s, and 72 ℃ for 30 s; 72 ℃ for 1 min; hold at 4 ℃. After amplification, 45 μL of magnetic beads was added to the amplified product for bead purification, and the purified product was eluted in 30 μL of TE.
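As a worked check of step 7, the programmed hold time and the reaction volume can be computed from the stated values. The following is a sketch under stated assumptions: ramp times and the final 4 ℃ hold are ignored, and the 21 μL template volume is inferred from the 20 μL hybridization system of Table 8 plus the 1 μL of DSN nuclease, which the text does not state explicitly. All names are illustrative.

```python
# Hold times from the step 7 cycling program, in seconds.
INITIAL_DENATURATION_S = 60   # 98 C for 1 min
CYCLE_STEPS_S = (10, 30, 30)  # 98 C 10 s, 60 C 30 s, 72 C 30 s
CYCLES = 8
FINAL_EXTENSION_S = 60        # 72 C for 1 min


def program_hold_seconds(initial_s, cycle_steps_s, cycles, final_s):
    """Total programmed hold time, excluding ramps and the final 4 C hold."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Volumes in microliters; the template volume is an inference (20 uL + 1 uL DSN).
PCR_MIX_UL = {
    "digested system": 21,
    "2X amplification mix": 25,
    "F/R index primers": 4,
}

total_s = program_hold_seconds(
    INITIAL_DENATURATION_S, CYCLE_STEPS_S, CYCLES, FINAL_EXTENSION_S
)
minutes, seconds = divmod(total_s, 60)
reaction_volume = sum(PCR_MIX_UL.values())
print(f"programmed hold time: {minutes} min {seconds} s")
print(f"reaction volume: {reaction_volume} uL")
# 45 uL of beads is the volume stated for the post-amplification cleanup.
print(f"bead ratio: {45 / reaction_volume:.1f}x")
```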
8. Sequencing and data analysis
The library amplified in step 7 was used to prepare DNA nanoballs (DNBs) with the MGI single-strand circularization kit and the MGISEQ-200RS high-throughput sequencing kit set (SE50), and was sequenced on an MGISEQ-200 sequencer. The sequencing results were analyzed by alignment against the pathogen database using BWA.
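The alignment itself is performed by BWA; how the aligned reads are then tallied is not described in the text. One plausible way, shown here purely as a sketch, is to count primary mapped reads per reference sequence in the SAM output. The function name is hypothetical, the reference names `ref_A` and `ref_B` are placeholders, and the records below are made-up toy input, not data from the examples.

```python
from collections import Counter

# Bit flags defined by the SAM format.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_primary_hits(sam_lines):
    """Count primary mapped reads per reference name in SAM-formatted lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue  # keep only primary alignments of mapped reads
        counts[fields[2]] += 1  # column 3 is the reference name (RNAME)
    return counts


# Toy records for illustration only.
toy_sam = [
    "@SQ\tSN:ref_A\tLN:1000",
    "r1\t0\tref_A\t10\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tref_A\t90\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t256\tref_B\t5\t0\t50M\t*\t0\t0\t*\t*",
    "r5\t0\tref_B\t7\t60\t50M\t*\t0\t0\t*\t*",
]
print(dict(count_primary_hits(toy_sam)))
```

Counting only primary alignments keeps each read from being tallied more than once; whether the original analysis applied such a filter is not stated.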
Experimental results:
The method of this example for efficiently and rapidly improving the detection of pathogenic microorganisms in metagenomic sequencing was applied, and the detection of pathogenic microorganisms was evaluated by sequence alignment.
In FIGS. 3 and 4, "host nucleic acid not removed" (control group) refers to the microorganism detection result obtained when the same adaptor ligation product was PCR-amplified directly, without the host nucleic acid removal steps (probe hybridization and DSN enzymatic digestion), and then carried through the subsequent steps.
In FIG. 3, the ordinate "number of detected reads/sequencing data (M)" is the number of reads detected for a microorganism divided by the total sequencing data of the nucleic acids in the sample, expressed in millions of reads (M); that is, reads per million (RPM), a common index for mNGS microorganism detection.
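The RPM index defined above amounts to a single division. A minimal sketch follows; the function name and the read counts in the usage line are hypothetical and are not taken from the figures.

```python
def reads_per_million(microbe_reads, total_reads):
    """Reads detected for a microorganism per million total sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return microbe_reads / (total_reads / 1_000_000)


# Hypothetical numbers: 150 reads for one microorganism out of 20 million reads in total.
print(reads_per_million(150, 20_000_000))
```

Normalizing by the total data volume makes libraries of different sequencing depth comparable.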
The 9 microorganisms shown in FIG. 3 are the 9 microorganisms with the highest read counts in the "host nucleic acid not removed" (control) group after alignment.
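Selecting the microorganisms to plot in this way is a simple ranking by control-group read count. The sketch below illustrates it under assumptions: the function name is hypothetical, and the organism labels and counts are placeholders rather than results from FIG. 3.

```python
def top_n_by_control(control_reads, n=9):
    """Return the names of the n microorganisms with the most control-group reads."""
    ranked = sorted(control_reads.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Placeholder labels and counts, for illustration only.
control_reads = {"org_a": 120, "org_b": 950, "org_c": 40, "org_d": 310}
print(top_n_by_control(control_reads, n=3))
```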
FIG. 3 is a graph showing the change in detection of microorganisms in an oral swab. From left to right, the microorganisms on the abscissa of FIG. 3 are E. parvulus, H. parainfluenza, S. gossypii, S. haemophilus, C. gingivalis, P. oral, S. mutans, H. haemolyticus, and F. nucleatum. It can be seen that the host nucleic acid removal method of Example 2 significantly improves the detection of microorganisms in the oral swab. Because different microorganisms account for different proportions of the nucleic acid in the sample, the degree of improvement differs between microorganisms.
In FIG. 4, "differential lysis" refers to a control group in which the host was removed during extraction using a commercial kit, the Vazyme FastPure Host Removal and Microbiome DNA Isolation Kit (Cat. No. DC501-01), followed by conventional library construction. The main steps were: (1) 1 mL of the sample was transferred to a Sample Tube, 500 μL of Buffer SL was added, and the tube was mixed by inversion at room temperature for 20 min. (2) The Sample Tube was centrifuged at 10,200 rpm for 3 min, and the supernatant was carefully discarded with a pipette. (3) 190 μL of Buffer NAD and 5 μL of Ultra Nuclease were added to the Sample Tube, mixed by vortexing, and incubated in a 37 ℃ water bath for 10 min. (4) 20 μL of Proteinase K was added to the Sample Tube, mixed by vortexing, and incubated in a 56 ℃ water bath for 10 min. (5) After the reaction, the tube was briefly centrifuged, the entire mixture was transferred to a Lysis Tube, and the subsequent extraction was carried out immediately. The subsequent extraction steps were the same as those for the "host nucleic acid not removed" (control) group. The "method of the present invention" in FIG. 4 is the method of this example.
FIG. 4 is a graph showing the change in detection of representative microorganisms in libraries prepared from the same oral swab with different host removal methods. The ordinate "RPM fold change" of FIG. 4 is the ratio of a microorganism's RPM value with host nucleic acid removal to its RPM value without host nucleic acid removal. It can be seen that, although the differential lysis method can raise the RPM fold change of a microorganism more than the method of the present invention, it can also cause pathogenic microorganisms to be lost: in a freeze-thawed sample or a sample whose microbial structure is damaged, differential lysis releases the microbial nucleic acid, which is then cleared together with the host nucleic acid by the nuclease used in the differential lysis method. The method of Example 2 avoids this partial loss of microorganism detection caused by the differential lysis method (FIG. 4).
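The "RPM fold change" on the ordinate of FIG. 4 can be expressed as follows. This is a sketch only: the function names are hypothetical, and treating a fold change below 1 as a loss of detection is an interpretive assumption made here to mirror the loss discussed above, not a threshold stated in the text.

```python
def rpm_fold_change(rpm_removed, rpm_control):
    """Ratio of RPM with host nucleic acid removal to RPM without removal."""
    if rpm_control <= 0:
        raise ValueError("rpm_control must be positive")
    return rpm_removed / rpm_control


def is_detection_loss(rpm_removed, rpm_control):
    """True when host removal lowered the RPM of a microorganism (fold change < 1)."""
    return rpm_fold_change(rpm_removed, rpm_control) < 1.0


# Hypothetical RPM values, for illustration only.
print(rpm_fold_change(36.0, 12.0))
print(is_detection_loss(3.0, 12.0))
```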
Example 3
This example was performed with reference to Example 2, except that a different oral swab source was used. Three probe hybridization systems were set up: one without an additive, one using dimethyl sulfoxide (DMSO) as the additive, and one using tetramethylammonium chloride (TMAC) as the additive. The detection results are shown in FIG. 5.
In FIG. 5, "host nucleic acid is not removed" refers to the microorganism detection result obtained when the same adaptor ligation product was PCR-amplified directly, without the host nucleic acid removal steps (probe hybridization and DSN enzymatic digestion), and then carried through the subsequent steps. "Use additive TMAC" and "use additive DMSO" mean that TMAC at a final concentration of 100 mM or DMSO at a final concentration of 1%, respectively, was added during the probe hybridization of step 5. "Unused additive" means that neither DMSO nor TMAC was used during the probe hybridization of step 5 and enzyme-free water was used instead.
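The stated final concentrations can be related to the pipetted volumes by the usual dilution relation (stock concentration × added volume = final concentration × total volume). The sketch below works this through under assumptions that the text does not confirm: the 20 μL total volume and 0.4 μL additive volume are taken from Table 8 of Example 2, and DMSO is assumed to occupy the same 0.4 μL as TMAC. The resulting stock concentrations are therefore inferences, not stated values.

```python
def implied_stock_concentration(final_conc, total_volume_ul, added_volume_ul):
    """Stock concentration needed to reach final_conc after dilution (same unit as final_conc)."""
    if added_volume_ul <= 0:
        raise ValueError("added_volume_ul must be positive")
    return final_conc * total_volume_ul / added_volume_ul


TOTAL_UL = 20.0    # hybridization system volume (Table 8)
ADDITIVE_UL = 0.4  # additive volume (Table 8); assumed to apply to DMSO as well

tmac_stock_mM = implied_stock_concentration(100.0, TOTAL_UL, ADDITIVE_UL)
dmso_stock_pct = implied_stock_concentration(1.0, TOTAL_UL, ADDITIVE_UL)
print(f"implied TMAC stock: {tmac_stock_mM / 1000:.1f} M")
print(f"implied DMSO stock: {dmso_stock_pct:.0f} % (v/v)")
```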
In FIG. 5, the ordinate "number of detected reads/sequencing data (M)" is the number of reads detected for a microorganism divided by the total sequencing data of the nucleic acids in the sample, expressed in millions of reads (M); that is, reads per million (RPM), a common index for mNGS microorganism detection.
As can be seen from the results of FIG. 5, both additives, TMAC and DMSO, increase the amount of microbial nucleic acid detected by the host nucleic acid removal method of the invention (for example, for Streptococcus pneumoniae, Streptococcus mitis, and Streptococcus haemolyticus, the amounts detected with TMAC and with DMSO are each significantly higher than those obtained without host nucleic acid removal and those obtained without an additive), although the size of the increase for the same microorganism varies. Because different microorganisms account for different proportions of the nucleic acid in the sample, the degree of improvement also differs between microorganisms.
In one embodiment, the present invention can remove host nucleic acids (including DNA and RNA) from free nucleic acids (including cfDNA in plasma and free gDNA in respiratory tract samples) without losing the nucleic acids of the pathogenic microorganisms to be detected.
In one embodiment, the invention does not adversely affect the nucleic acids in samples containing large numbers of dead cells and dead bacteria, in mycoplasma with poor cell structure, or in samples rich in viruses, which have no cell structure.
In one embodiment, the present invention requires less time than other techniques, is simpler to operate, is carried out in a single tube with no additional bead purification step, and significantly reduces the reagent cost of host removal.
In one embodiment, the invention can increase the number of detected pathogenic microorganisms in a sample by at least 3-fold by removing host nucleic acids (including DNA and RNA).
In one embodiment, the invention can be used to remove host rRNA in DNA and RNA co-library construction without affecting the library construction of the DNA.
In one embodiment, the present invention facilitates targeted applications for different de-hosting needs.
In one embodiment, additives (including but not limited to at least one of tetramethylammonium chloride, dimethyl sulfoxide, betaine, and formamide) may additionally be added to the hybridization and DSN digestion system of the invention to reduce non-specific hybridization to pathogen nucleic acids. Specifically, in the absence of additives, a host-removal probe may bind non-specifically to a pathogen sequence, which is then digested; with an additive, the binding specificity of the probe is improved and the risk of such digestion is significantly reduced.
The foregoing description of the invention has been presented for purposes of illustration and description, and is not intended to be limiting. Several simple deductions, modifications or substitutions may also be made by a person skilled in the art to which the invention pertains, based on the idea of the invention.