CN106778075A - Device for detecting blood-disease-related somatic mutations

Info

Publication number: CN106778075A
Application number: CN201710067161.6A
Authority: CN
Inventors: 陈玥茏; 侯光远; 李停; 方真; 刘伟; 玄兆伶; 李大为; 梁峻彬; 陈重建
Original assignee: ANNOROAD GENETIC TECHNOLOGY (BEIJING) Co Ltd
Current assignee: Annoroad Genetic Technology (Beijing) Co., Ltd.; Annuo uni-data (Yiwu) Medical Inspection Co. Ltd.; Zhejiang Annuo uni-data Biotechnology Co. Ltd.
Priority date: 2016-12-29
Filing date: 2017-02-07
Publication date: 2017-05-31

Abstract

The present invention relates to a device for detecting blood-disease-related somatic mutations, comprising a data acquisition module, a mutation frequency statistics module, a comparison module, a determination module and a detection result output module. The device of the invention distinguishes systematic errors from real somatic mutations more accurately, which not only increases sensitivity but also reduces false positives and false negatives.

Description

Device for detecting blood-disease-related somatic mutations

Technical field

The present invention relates to the field of low-frequency mutation detection, and in particular to a device and method for detecting blood-disease-related somatic mutations.

Background art

Blood diseases are non-solid tumors. Gene-related research on them has been at the forefront of cancer research, and the detection of blood-disease-related genes was among the first to enter clinical practice. In recent years, with the development of molecular biology techniques, the understanding of molecular genetic changes in blood disease cells has continued to deepen. Gene mutations related to blood diseases are somatic mutations (SNVs). At least dozens of fusion genes related to blood diseases have been reported so far. It is recognized that chromosomal structural aberrations, including deletions, duplications, inversions and translocations, are present in most blood diseases. These aberrations cause structural variation of proto-oncogenes and tumor suppressor genes, activation of proto-oncogenes or inactivation of tumor suppressor genes, and the production of new fusion genes that encode fusion proteins. Some genes are transcription factors that regulate cell proliferation, differentiation and apoptosis; when such a gene mutates, downstream signaling pathways are directly affected, leading to enhanced cell proliferation, impaired apoptosis, impaired differentiation and so on, and producing the blood disease phenotype. With deeper study of blood disease pathogenesis and the development of gene detection technology, research on changes in the genetic material of blood diseases has progressed from chromosome karyotype analysis (cytogenetics) and fusion gene detection to the detection of point mutations and microdeletions/microduplications. Detection at these three levels has gradually become the basis and reference for the diagnosis and treatment of blood diseases.

On the other hand, mainstream second-generation sequencing platforms generally use sequencing by synthesis (SBS) technology for nucleic acid sequencing. Before sequencing, a sequencing library must be constructed from the nucleic acid (DNA or RNA) sample. The basic procedure is as follows: first, the ends of the fragmented DNA are repaired; next, an "A" base is added to the 3' end of the repaired fragments; then the fragments are ligated to DNA adapters containing sequencing primer binding sites; finally, the product is amplified by PCR, completing the construction of the sequencing library.

The difficulty of genetic testing for blood diseases is that blood-disease-related samples do not consist purely of cancer cells. They also contain a large number of normal white blood cells, and detection becomes harder as the proportion of cancer cells decreases. How to distinguish real SNVs from the noise introduced in second-generation sequencing by PCR errors, sequencing false positives, inaccurate alignment and the like is a major problem currently faced.

Summary of the invention

Technical problem to be solved by the invention

As described above, on existing platforms the difficulty in predicting SNVs from blood-disease-related samples lies in accurately distinguishing sequencing errors from real SNVs.

Therefore, an object of the present invention is to provide a device and method that can distinguish sequencing errors from real SNVs more accurately and thereby detect blood-disease-related SNVs more accurately.

Through intensive research, the inventors found that by collecting a large number of samples from healthy people and testing them in parallel, the error rate at each position of the genome can be determined, so that sequencing errors and SNVs can be distinguished more accurately while both false positives and false negatives are reduced.

That is, the present invention includes:

A device for detecting blood-disease-related somatic mutations (SNVs), comprising:

a data acquisition module for obtaining sequencing data of blood-disease-related sample DNA and sequencing data of healthy population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at each corresponding site of the DNA of each individual in the healthy population; generally, the sequencing data of the blood-disease-related sample DNA may be obtained by sequencing the blood-disease-related sample DNA under test; the sequencing data of the healthy population DNA may come from an established healthy-population DNA database, or may be obtained by sequencing DNA from biological samples of the healthy population (the sequencing method should be the same as that used for the blood-disease-related sample DNA under test, i.e., parallel sequencing);

a mutation frequency statistics module, connected to the data acquisition module, for computing the distribution of mutation frequencies at each site of the healthy population DNA to obtain a healthy-population mutation frequency statistical model;

a comparison module, connected to the data acquisition module and the mutation frequency statistics module, for comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model to obtain a comparison result;

a determination module, connected to the comparison module, for determining whether the mutation at each site of the blood-disease-related sample DNA is a real somatic mutation to obtain a determination result; wherein, when the comparison result shows no significant difference, the determination result is a non-somatic mutation (including systematic errors and some germline mutations); when the comparison result shows a significant difference and the mutation frequency is below a set value, the determination result is a real somatic mutation; when the comparison result shows a significant difference and the mutation frequency is greater than or equal to the set value, the determination result is a germline mutation; the set value can be chosen reasonably according to the actual sequencing conditions, for example, at a sequencing depth of 100×, a preferred set value is 35%; and

a detection result output module, connected to the determination module, for outputting the determination result of the determination module.
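By way of non-limiting illustration, the three-way decision rule of the determination module can be sketched in Python as follows. The names `Call` and `classify_site` are hypothetical; the default cutoff of 0.35 is the preferred set value given above for a sequencing depth of 100×, and the significance flag is assumed to come from the comparison module.

```python
from enum import Enum


class Call(Enum):
    NON_SOMATIC = "non-somatic (systematic error or some germline mutations)"
    SOMATIC = "real somatic mutation"
    GERMLINE = "germline mutation"


def classify_site(significant: bool, mutation_freq: float, cutoff: float = 0.35) -> Call:
    """Apply the decision rule described for the determination module.

    significant   -- whether the site differs significantly from the healthy-population model
    mutation_freq -- mutation frequency at the site, as a fraction between 0 and 1
    cutoff        -- set value (0.35 is the example given for 100x depth)
    """
    if not significant:
        return Call.NON_SOMATIC
    if mutation_freq < cutoff:
        return Call.SOMATIC
    return Call.GERMLINE  # significant and frequency >= cutoff
```

For example, a significant site with a mutation frequency of 12.34% would be reported as a somatic mutation, whereas a significant site at exactly 35% would be reported as a germline mutation.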

Preferably, the data acquisition module includes a module for obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following submodules:

a filter submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data;

an alignment submodule, connected to the filter submodule, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequenced fragment in the genome;

a preprocessing submodule, connected to the alignment submodule, for removing duplicate sequenced fragments; and

a statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of the blood-disease-related sample DNA.
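The quantity computed by the statistics submodule can be illustrated with a minimal Python sketch. It assumes that the mutation frequency at a site is the fraction of covering reads that carry a given non-reference base, which the text does not define explicitly; the function name `mutation_frequencies` and the list-of-bases input are hypothetical.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}


def mutation_frequencies(bases, ref):
    """Return {non-reference base: fraction of reads} for a single site.

    bases -- bases observed at the site, one per covering read (after deduplication)
    ref   -- reference base at the site
    """
    ref = ref.upper()
    # Count only unambiguous bases; calls such as "N" are ignored.
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: n / depth for base, n in counts.items() if base != ref}
```

With ten reads covering a site, eight carrying the reference base A and two carrying T, the sketch returns a frequency of 0.2 for T.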

Preferably, the statistics submodule selects the sites of the blood-disease-related sample DNA whose confidence value (LOD value) is greater than a set value (for example, 100) and computes mutation frequencies for them. For each site i, i ∈ {human genome}, of each sample, the LOD for detecting a variant at that site in the sample under test is calculated as follows:

The terms of the formula are obtained from the following equations:

The data are described by the following two models:

Model M₀ represents that there is no variant at the site, and any non-reference base is regarded as sequencing noise;

The second model represents that a true variant m exists at the site, with allele frequency f.

M₀ is equivalent to the second model with f = 0.

The reference base is r ∈ {A, T, C, G},

and for each read i (i = 1 ... d), the base covering the site is b_i and the error probability of that base is e_i (this error probability is obtained from the quality value of each base).
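Since the LOD formula itself is not reproduced here, the following Python code is only an illustrative sketch and is not asserted to be the formula of the present device. It assumes the likelihood-ratio form published for the MuTect caller, which is defined over the same quantities (M₀, reference base r, variant m, allele frequency f, read bases b_i and error probabilities e_i). It further assumes Phred-scaled quality values, so that e_i = 10^(-q_i/10), and it estimates f as the observed fraction of reads carrying m. All function names are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred-scaled base quality to an error probability."""
    return 10 ** (-q / 10)


def log10_likelihood(bases, quals, ref, alt, f):
    """log10 likelihood of the reads under a model with variant `alt` at frequency f."""
    total = 0.0
    for b, q in zip(bases, quals):
        e = phred_to_error(q)
        if b == ref:
            p = f * e / 3 + (1 - f) * (1 - e)
        elif b == alt:
            p = f * (1 - e) + (1 - f) * e / 3
        else:
            p = e / 3  # a third base is explained by sequencing error only
        total += math.log10(p)
    return total


def lod(bases, quals, ref, alt):
    """log10 ratio of the variant model (at the observed frequency) to M0 (f = 0)."""
    if not bases:
        raise ValueError("site has no covering reads")
    f_hat = sum(b == alt for b in bases) / len(bases)
    return (log10_likelihood(bases, quals, ref, alt, f_hat)
            - log10_likelihood(bases, quals, ref, alt, 0.0))
```

Under this sketch a site whose reads all match the reference has an LOD of exactly 0, and the value grows as more high-quality reads support the variant. Because the scale of the LOD depends on the exact formula used, the threshold of 100 mentioned above should not be assumed to carry over to this sketch.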

Preferably, the data acquisition module includes a module for obtaining the mutation frequency at each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA, which further includes the following submodules:

a statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA.

Preferably, the mutation frequency statistics module includes a model correction module, which uses the obtained healthy-population mutation frequency statistical model to evaluate each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA, discards the sites that deviate significantly, and recomputes the distribution of mutation frequencies over the remaining sites to obtain a new healthy-population mutation frequency statistical model.

Preferably, the determination module includes the following submodules:

a mutation significance determination submodule, connected to the comparison module, for determining the significance of the mutation at each site of the blood-disease-related sample DNA; and

a mutation type determination submodule, connected to the mutation significance determination submodule, for determining whether each significant mutation in the blood-disease-related sample DNA is a somatic mutation or a germline mutation.

Preferably, the mutation significance determination submodule determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model (for example, using a normal distribution with P < 0.05 as the criterion); a significant difference indicates a true mutation, and no significant difference indicates a false-positive mutation.
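One possible reading of the normal-distribution criterion is sketched below in Python. It assumes that the healthy-population model for a site is summarized by the mean and sample standard deviation of the healthy individuals' mutation frequencies, and that the test is one-sided (the sample frequency is unusually high). The text specifies neither point, so both are assumptions, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def upper_tail_p(sample_freq, healthy_freqs):
    """One-sided p-value of sample_freq under a normal fit to healthy_freqs.

    healthy_freqs must contain at least two values.
    """
    mu = mean(healthy_freqs)
    sigma = stdev(healthy_freqs)
    if sigma == 0:
        # Degenerate model: any excess over the constant background is extreme.
        return 0.0 if sample_freq > mu else 1.0
    z = (sample_freq - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_significant(sample_freq, healthy_freqs, alpha=0.05):
    """True when the site differs significantly from the healthy-population model."""
    return upper_tail_p(sample_freq, healthy_freqs) < alpha
```

A site whose frequency sits at the healthy-population mean yields a p-value of about 0.5 and is treated as a false positive, while a frequency far above the background yields a p-value near 0 and is treated as a true mutation.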

Preferably, the detection result output module outputs the position and mutation type of each significant mutation in the blood-disease-related sample DNA.

Preferably, the blood-disease-related sample is peripheral blood or bone marrow.

Here, the somatic mutation refers to a somatic mutation related to a blood disease.

In addition, the present invention also provides:

A method for detecting blood-disease-related somatic mutations (SNVs), comprising:

a data acquisition step of obtaining sequencing data of blood-disease-related sample DNA and sequencing data of healthy population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at each corresponding site of the DNA of each individual in the healthy population; generally, the sequencing data of the blood-disease-related sample DNA may be obtained by sequencing the blood-disease-related sample DNA under test; the sequencing data of the healthy population DNA may come from an established healthy-population DNA database, or may be obtained by sequencing DNA from biological samples of the healthy population (the sequencing method should be the same as that used for the blood-disease-related sample DNA under test, i.e., parallel sequencing);

a mutation frequency statistics step of computing the distribution of mutation frequencies at each site of the healthy population DNA to obtain a healthy-population mutation frequency statistical model;

a comparison step of comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model to obtain a comparison result;

a determination step of determining whether the mutation at each site of the blood-disease-related sample DNA is a real somatic mutation to obtain a determination result; wherein, when the comparison result shows no significant difference, the determination result is a non-somatic mutation (including systematic errors and some germline mutations); when the comparison result shows a significant difference and the mutation frequency is below a set value, the determination result is a real somatic mutation; when the comparison result shows a significant difference and the mutation frequency is greater than or equal to the set value, the determination result is a germline mutation; the set value can be chosen reasonably according to the actual sequencing conditions, for example, at a sequencing depth of 100×, a preferred set value is 35%; and

a detection result output step of outputting the determination result of the determination step.

Preferably, the data acquisition step includes a step of obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following sub-steps:

a filtering sub-step of performing quality control on the sequencing data and filtering out low-quality sequencing data;

an alignment sub-step of aligning the filtered sequencing data to a reference sequence to obtain the position of each sequenced fragment in the genome;

a preprocessing sub-step of removing duplicate sequenced fragments; and

a statistics sub-step of computing the mutation frequency at each site of the blood-disease-related sample DNA.

Preferably, the statistics sub-step selects the sites of the blood-disease-related sample DNA whose confidence value (LOD value) is greater than a set value (for example, 100) and computes mutation frequencies for them. For each site i, i ∈ {human genome}, of each sample, the LOD for detecting a variant at that site in the sample under test is calculated as follows:

The terms of the formula are obtained from the following equations:

The data are described by the following two models:

M₀ corresponds to the case f = 0.

The reference base is r ∈ {A, T, C, G}.

Preferably, the data acquisition step includes a step of obtaining the mutation frequency at each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA, which further includes the following sub-steps:

a preprocessing sub-step of removing duplicate sequenced fragments; and

a statistics sub-step of computing the mutation frequency at each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA.

Preferably, the mutation frequency statistics step includes a model correction sub-step, which uses the obtained healthy-population mutation frequency statistical model to evaluate each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA, discards the sites that deviate significantly, and recomputes the distribution of mutation frequencies over the remaining sites to obtain a new healthy-population mutation frequency statistical model.

Preferably, the determination step includes the following sub-steps:

a mutation significance determination sub-step of determining the significance of the mutation at each site of the blood-disease-related sample DNA; and

a mutation type determination sub-step of determining whether each significant mutation in the blood-disease-related sample DNA is a somatic mutation or a germline mutation.

Preferably, the mutation significance determination sub-step determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model (for example, using a normal distribution with P < 0.05 as the criterion); a significant difference indicates a true mutation, and no significant difference indicates a false-positive mutation.

Preferably, the detection result output step outputs the position and mutation type of each significant mutation in the blood-disease-related sample DNA.

According to the present invention, systematic errors can be distinguished from real SNVs more accurately, which not only increases sensitivity but also reduces false positives and false negatives.

Brief description of the drawings

Fig. 1 is a schematic diagram of the device of the present invention for detecting blood-disease-related somatic mutations.

Detailed description of the invention

The scientific and technical terms used in this specification have the same meanings as commonly understood by those skilled in the art; in case of conflict, the definitions in this specification prevail.

In general, the term used in this specification has following implication.

Beta distribution: a continuous distribution that describes a probability p, with values ranging from 0 to 1. The Beta distribution has two parameters, α and β, where α is the number of successes plus 1 and β is the number of failures plus 1.
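The parameterization in this definition can be illustrated with a short Python sketch. The function names are hypothetical, and treating reads that support a variant as "successes" is an assumption made only for the purpose of the example.

```python
import math


def beta_params(successes, failures):
    """Return (alpha, beta) as defined above: counts plus one."""
    return successes + 1, failures + 1


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_pdf(p, alpha, beta):
    """Density of Beta(alpha, beta) at p, for 0 < p < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p))
```

For instance, 3 successes and 97 failures give α = 4 and β = 98, and the mean of the distribution is 4/102. With no observations, α = β = 1 and the density is uniform on (0, 1).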

Subclone: for cultured cells, cells with a certain characteristic are selected from the original clone and cultured further; this is subcloning.

Target sequence capture sequencing: a research strategy in which specific probes customized for genomic regions of interest are hybridized with genomic DNA on a sequence capture chip (or in solution), and the DNA fragments of the target genomic regions are enriched and then sequenced using second-generation sequencing technology.

Somatic mutation (SNV): a mutation occurring in somatic cells, i.e., cells other than germ cells. It does not cause genetic changes in offspring, but can change the genetic structure of some cells in the current generation.

Germline mutation (SNP): an inherited genetic defect transmitted through the egg or sperm; all embryonic cells carry the same genetic defect, which is present in the germ cells and is passed on from generation to generation.

Plus strand: the DNA single strand whose sequence is the same as the RNA sequence; in replication, the plus strand is the original single strand whose sequence is the same as that of the new strand, i.e., the non-template strand.

Embodiments

The present invention is described more specifically with the embodiments given below, but the invention is not limited to these embodiments.

Embodiment 1: the device of the present invention for detecting blood-disease-related somatic mutations

The device of Embodiment 1 for detecting blood-disease-related somatic mutations comprises:

a data acquisition module for obtaining sequencing data of blood-disease-related sample DNA and sequencing data of healthy population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at each corresponding site of the DNA of each individual in the healthy population; the sequencing data of the blood-disease-related sample DNA are obtained by sequencing the blood-disease-related sample DNA under test, and the sequencing data of the healthy population DNA come from an established healthy-population DNA database;

a comparison module, connected to the data acquisition module and the mutation frequency statistics module, for comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model to obtain a comparison result;

a determination module, connected to the comparison module, for determining whether the mutation at each site of the blood-disease-related sample DNA is a real somatic mutation to obtain a determination result; wherein, when the comparison result shows a significant difference and the mutation frequency is below the set value, the determination result is a real somatic mutation; and

The data acquisition module includes a module for obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following submodules:

a filter submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data (below Q30) to obtain clean fastq data;

an alignment submodule, connected to the filter submodule, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequenced fragment (read) in the genome; specifically, the clean fastq data are aligned with the BWA software to obtain a file in sam format, and the sam file is converted with samtools to bam format (which contains the positions of the reads in the genome), saving storage space;

a preprocessing submodule, connected to the alignment submodule, for removing duplicate sequenced fragments; specifically, the preprocessing submodule processes the bam file and removes duplicate reads to obtain a deduplicated bam file;

a statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of the blood-disease-related sample DNA;

Specifically, in the statistics submodule, for each site i, i ∈ {human genome}, of each sample, the LOD for detecting a variant at that site in the sample under test is calculated as follows:

The terms of the formula are obtained from the following equations:

The data are described by the following two models:

M₀ corresponds to the case f = 0.

The reference base is r ∈ {A, T, C, G}, and for each read i (i = 1 ... d), the base covering the site is b_i and the error probability of that base is e_i (this error probability is obtained from the quality value of each base). Finally, sites with LOD > 100 are selected and their mutation frequencies are obtained.
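As a non-limiting sketch of the filter submodule of this embodiment, the following Python code discards reads of low quality. It assumes Phred+33 encoded quality strings, as commonly used in fastq files, and it reads "below Q30" as a mean base quality below 30. The embodiment does not state how the Q30 criterion is applied, so both points are assumptions, and the function names and the (name, sequence, quality) record layout are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a fastq quality string into a list of Phred scores."""
    return [ord(c) - offset for c in quality_line]


def passes_q30(quality_line, min_mean_q=30):
    """True when the mean base quality of the read is at least min_mean_q."""
    scores = phred_scores(quality_line)
    return bool(scores) and sum(scores) / len(scores) >= min_mean_q


def filter_fastq_records(records, min_mean_q=30):
    """Keep (name, sequence, quality) records that pass the quality check."""
    return [rec for rec in records if passes_q30(rec[2], min_mean_q)]
```

In this encoding the character "I" corresponds to Q40 and "#" to Q2, so a read whose quality string is "IIII" is kept and one whose quality string is "####" is removed.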

The data acquisition module also includes a module for obtaining the mutation frequency at each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA. This module differs from the module for the blood-disease-related sample DNA in that its statistics submodule does not select sites with an LOD value above the set value, but instead obtains the mutation frequencies at all sites of the DNA of each individual in the healthy population that correspond to the sites of the blood-disease-related sample DNA.

The mutation frequency statistics module computes the distribution of mutation frequencies at each site of the healthy population DNA to obtain a healthy-population mutation frequency statistical model. It includes a model correction module, which uses the obtained model to evaluate each site of the DNA of each individual in the healthy population corresponding to the sites of the blood-disease-related sample DNA, discards the sites that deviate significantly (normal distribution, P > 0.05), and recomputes the mutation frequency distribution over the remaining sites, repeating until no significantly deviating points remain, to obtain a new healthy-population mutation frequency statistical model.
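The iterative correction performed by the model correction module can be sketched in Python under stated assumptions. The sketch treats the model for one site as a normal fit (mean and sample standard deviation) to the healthy individuals' mutation frequencies, and it reads "deviates significantly" as a two-sided p-value below 0.05 under that fit. This interpretation, the `min_points` safeguard and the function names are all assumptions rather than details taken from the embodiment.

```python
import math
from statistics import mean, stdev


def two_sided_p(x, mu, sigma):
    """Two-sided p-value of x under a normal distribution N(mu, sigma^2)."""
    return math.erfc(abs(x - mu) / (sigma * math.sqrt(2)))


def corrected_model(freqs, alpha=0.05, min_points=3):
    """Refit the per-site model until no remaining value deviates significantly.

    freqs -- non-empty list of healthy-population mutation frequencies at one site
    Returns (mean, standard deviation, values kept).
    """
    kept = list(freqs)
    while len(kept) >= min_points:
        mu, sigma = mean(kept), stdev(kept)
        if sigma == 0:
            break  # all remaining values are identical
        survivors = [x for x in kept if two_sided_p(x, mu, sigma) >= alpha]
        if len(survivors) == len(kept):
            break  # nothing deviates any more
        kept = survivors
    sigma = stdev(kept) if len(kept) > 1 else 0.0
    return mean(kept), sigma, kept
```

With nine background frequencies near 0.001 and a single value of 0.05, the first pass removes the value of 0.05 and the second pass finds no further deviation, so the corrected model is fitted to the nine background values only.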

The determination module includes the following submodules:

The mutation significance determination submodule determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model, for example using a normal distribution with P < 0.05 as the criterion; a significant difference indicates a true mutation, and no significant difference indicates a false-positive mutation. For a true mutation with a significant difference, a mutation frequency below 35% is judged to be a real somatic mutation, and a mutation frequency of 35% or more is judged to be a germline mutation.

The information output by the detection result output module includes: the position of the true mutation (for example, absolute position 1444444 on chromosome 12, with HG19 as the reference genome), the mutation type (e.g., somatic mutation), the base change (e.g., A->T, R172K), the mutation frequency (e.g., 12.34%), the mutated gene (e.g., EGFR), and details (e.g., gene, transcript, exon, base change, amino acid change, etc.).

Embodiment 2

Somatic mutation detection was performed on a blood sample from an inpatient with a blood disease.

1.1 Blood sample DNA extraction

Genomic DNA was extracted from the blood sample by the column (membrane) method; for the specific steps, refer to the operation manual of the Tiangen blood/cell/tissue genomic DNA extraction kit.

1.2 End repair (End Repair)

(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts to prepare for a single sample are given in Table 1.

Table 1

(2) End-repair reaction: after adding the DNA sample, place the 1.5 mL centrifuge tube in a Thermomixer and incubate at 20 °C for 30 minutes. After the reaction, recover and purify the DNA from the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 32 μL EB.

1.3 Adding "A" to the ends (A-Tailing)

(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts to prepare for a single sample are given in Table 2:

Table 2

(2) A-tailing reaction: add the 32 μL of DNA recovered from the previous purification step to a 1.5 mL centrifuge tube, place it in a Thermomixer and incubate at 37 °C for 30 minutes. Recover and purify the DNA from the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 18 μL EB.

1.4 Adapter ligation (Adapter Ligation)

(1) Take the required reagents out of the kit stored at -20 °C in advance; the amounts to prepare for a single sample are given in Table 3:

Table 3

(2) Adapter ligation reaction: add the 18 μL of DNA recovered from the previous purification step to the sample tube, place it in a Thermomixer and incubate at 20 °C for 15 minutes. Recover and purify the DNA from the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 30 μL EB.

1.5 PCR reaction

(1) Take the required reagents out of the kit stored at -20 °C and prepare the PCR reaction system in a 2mL PCR tube:

Table 4

(2) Set up the PCR program; the program for the PCR reaction is as follows:

When the reaction ends, promptly take out the samples and store them in a 4 °C refrigerator, then exit the program or shut down the instrument as required.

(3) Recover and purify the DNA from the reaction system with 0.9× nucleic acid purification magnetic beads and dissolve the purified library in 20 μL ddH₂O. Quantify the library with Qubit and submit it for analysis on the Agilent 2100.

1.6 Hybridization of the library with the blood-disease target-region capture chip

(1) In this experiment, the buffer that provides the ionic environment for the hybrid capture reaction, and the wash and rinse solutions used to remove physically adsorbed or non-specifically hybridized material, were obtained commercially.

(2) Prepare the hybridization library: thaw the DNA library to be hybridized on ice and take a total mass of 1 μg (this DNA library is referred to as the sample library in the subsequent steps).

(3) Prepare the Ann primer pool: mix 1000 pmol each of the index primer In1 (100 μM) corresponding to the index of the sample library and the universal primer (1000 μM) (this mixture is referred to as the Ann primer pool in the subsequent steps).

(4) Prepare the hybridization sample: add 5 μL COT DNA (Human Cot-1 DNA, Life Technologies, 1 mg/mL), 1 μg of sample library and the Ann primer pool to a 1.5 mL EP tube. Seal the prepared hybridization sample EP tube with sealing film, and place the EP tube containing the sample library pool/COT DNA/Ann primer pool in a vacuum device until it is completely dry.

(5) Dissolve the hybridization sample: add the following to the dried sample library pool/COT DNA/Ann primer pool:

7.5 μL 2× hybridization buffer

3 μL hybridization component A

(6) After mixing thoroughly, place the mixture on a heating block prepared in advance at 95 °C and denature for 10 minutes.

(7) Transfer the mixture to a 0.2 mL flat-cap PCR tube containing 4.5 μL of capture chip. Vortex thoroughly for 3 seconds and place the hybridization mixture on a 47 °C heating block for 16 hours. The heated lid of the heating block must be set to 57 °C. After hybridization, the product must go on to the washing and recovery steps.

(8) Dilute the 10× wash solutions (I, II and III), the 10× rinse solution and the 2.5× bead wash solution to 1× working solutions.

Table 5

(9) Preheat the following reagents in a 47 °C heating block:

400 μL 1× rinse solution

100 μL 1× wash solution I
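The reagent arithmetic behind steps (3) and (8) above can be checked with a short Python sketch. It relies only on the identities 1 μM = 1 pmol/μL and C₁V₁ = C₂V₂; the function names are hypothetical, and the final volumes used in the dilution example are illustrative because the contents of Table 5 are not reproduced here.

```python
def volume_ul_for_amount(amount_pmol, stock_um):
    """Volume in microliters that contains amount_pmol at a stock concentration in uM.

    1 uM equals 1 pmol per microliter, so the volume is amount / concentration.
    """
    return amount_pmol / stock_um


def dilution_volumes(stock_x, final_x, final_volume_ul):
    """Return (stock volume, diluent volume) in microliters for an N-fold concentrate."""
    stock_ul = final_volume_ul * final_x / stock_x
    return stock_ul, final_volume_ul - stock_ul
```

Taking 1000 pmol of the 100 μM index primer requires 10 μL, and 1000 pmol of the 1000 μM universal primer requires 1 μL. Preparing, for example, 400 μL of 1× rinse solution from the 10× stock requires 40 μL of stock and 360 μL of diluent.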

1.7 Preparation of the affinity-capture magnetic beads

(1) Equilibrate the streptavidin magnetic beads (Dynabeads M-280 Streptavidin, hereinafter referred to as magnetic beads) at room temperature for 30 minutes, then vortex thoroughly for 15 seconds to mix.

(2) Dispense 100 μL of magnetic beads into a 1.5 mL centrifuge tube and place the tube on a magnetic rack. After about 5 minutes, carefully aspirate and discard the supernatant, add 1× bead wash solution at twice the initial bead volume, and vortex for 10 seconds to mix. Return the tube to the magnetic rack to capture the beads. Once the solution is clear, aspirate and discard the supernatant. Repeat this step once, for two washes in total.

(3) After washing, aspirate and discard the bead wash solution, resuspend the beads by vortexing in 1× bead wash solution equal to the initial bead volume, and transfer them to a 0.2 mL PCR tube. Place the PCR tube on a magnetic rack and, once the beads are captured and the solution is clear, aspirate and discard the supernatant.

1.8 Binding and rinsing of DNA with affinity adsorption magnetic beads

(1) Transfer the hybridized sample library into the 0.2 mL PCR tube containing the affinity adsorption magnetic beads and vortex to mix.

(2) Place the 0.2 mL PCR tube in a 47 °C heating module for 45 minutes, vortexing once every 15 minutes, so that the DNA binds to the magnetic beads.

(3) After the 45-minute incubation, add 100 μL of 1× cleaning fluid I preheated to 47 °C to the 15 μL of captured DNA sample. Vortex for 10 seconds to mix. Transfer the entire contents of the 0.2 mL PCR tube into a 1.5 mL centrifuge tube. Place the 1.5 mL centrifuge tube on the magnetic rack to adsorb the beads, and discard the supernatant.

(4) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 200 μL of 1× rinsing liquid preheated to 47 °C. Pipette up and down 10 times to mix (work quickly so that the reagent and sample temperature does not fall below 47 °C). After mixing, place the sample on a 47 °C heating module for 5 minutes. Repeat this step, for a total of two washes with 47 °C 1× rinsing liquid. Place the 1.5 mL centrifuge tube on the magnetic rack, adsorb the beads, and discard the supernatant.

(5) Add 200 μL of room-temperature 1× cleaning fluid I to the above 1.5 mL centrifuge tube and vortex for 2 minutes. Place the tube on the magnetic rack, adsorb the beads, and discard the supernatant. Add 200 μL of room-temperature 1× cleaning fluid II to the same tube and vortex for 1 minute. Place the tube on the magnetic rack, adsorb the beads, and discard the supernatant. Add 200 μL of room-temperature 1× cleaning fluid III to the same tube and vortex for 30 seconds. Place the tube on the magnetic rack, adsorb the beads, and discard the supernatant.

(6) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 45 μL of PCR-grade water to resuspend and elute the bead-captured sample.

1.9 PCR amplification of captured DNA

(1) Prepare the post-capture PCR mix according to the table below, and vortex to mix after preparation. Enrichment primer F and enrichment primer R were purchased from Invitrogen.

(2) The amplification program for PCR of the bead-adsorbed DNA is set as follows:

(3) Recovery and purification of the hybrid-capture DNA PCR products: Recover and purify the DNA in the reaction system with nucleic acid purification magnetic beads, using a bead volume of 0.9×. Dissolve the purified library in 30 μL of ddH₂O.

1.10 Library quantification

The library is checked on a 2100 Bioanalyzer (Agilent)/LabChip GX (Caliper) and by qPCR, and the library concentration is recorded.

1.11 Library sequencing

The constructed library is sequenced (PE75) on a NextSeq 550AR.

1.12 Data processing and analysis

The sequencing data obtained are input into the device of Embodiment 1 to detect somatic mutations. The detection results are shown in the table below.

Mutated gene	Details	Mutation frequency
KIT	N822K, c.2466T>A	20.9%

1.13 Result verification

First-generation sequencing was used to verify whether the bone marrow sample from the same patient carries a somatic mutation at the above site. The result shows that the KIT gene carries the N822K, c.2466T>A mutation at a frequency of about 20%, which is consistent with the detection result in 1.12. The detection device of the present invention can successfully detect blood disease-related somatic mutations in blood samples.

Embodiment 3

Somatic mutation detection was performed on a bone marrow sample from a patient with chronic lymphocytic leukemia (CLL). Genomic DNA was extracted from the bone marrow sample using a membrane-based method; for the specific steps, refer to the operation manual of the Tiangen blood/cell/tissue genomic DNA extraction kit.

The detection results are shown in the table below.

Mutated gene	Details	Mutation frequency
TP53	S46fs, c.137_144del	54%

First-generation sequencing was used to verify whether the remaining bone marrow sample from the same patient carries a somatic mutation at the above site. The result shows that the TP53 gene carries the S46fs, c.137_144del deletion at a frequency of about 50%, which is consistent with the detection result in the table above. The detection device of the present invention can successfully detect blood disease-related somatic mutations in bone marrow samples.

Industrial applicability

According to the present invention, there is provided a device and method that can more accurately distinguish sequencing errors from true SNVs, and can therefore more accurately detect SNVs using blood samples.

Claims

1. A device for detecting blood disease-related somatic mutations, comprising:

A data acquisition module, for obtaining sequencing data of blood disease-related sample DNA and sequencing data of healthy population DNA, the sequencing data including the mutation frequency at each site of the blood sample DNA and the mutation frequency at each site, corresponding to each site of the blood sample DNA, of the DNA of each individual in the healthy population;

A mutation frequency statistical module, connected to the data acquisition module, for computing the distribution of the mutation frequency at each site of the DNA across the healthy population, to obtain a healthy population mutation frequency statistical model;

A contrast module, connected to the data acquisition module and the mutation frequency statistical module, for comparing the mutation frequency at each site of the blood sample DNA with the healthy population mutation frequency statistical model, to obtain a comparison result;

A determination module, connected to the contrast module, for determining whether the mutation at each site of the blood sample DNA is a true somatic mutation, to obtain a determination result; wherein, when the comparison result shows a significant difference and the mutation frequency is lower than a set value, the determination result is a true somatic mutation; and

A testing result output module, connected to the determination module, for outputting the determination result of the determination module.
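The data flow between the modules of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the per-site model is assumed to be a mean and standard deviation, the comparison is assumed to be a one-sided z-score cutoff, and the site names, frequencies, `z_cutoff` and `max_freq` values are all hypothetical.

```python
from statistics import mean, stdev


def fit_healthy_model(healthy_freqs):
    """Statistical module: per-site (mean, sd) of mutation frequency across healthy individuals."""
    return {site: (mean(v), stdev(v)) for site, v in healthy_freqs.items()}


def compare(sample_freqs, model, z_cutoff=1.645):
    """Contrast module: flag sites whose frequency lies significantly above the healthy model."""
    result = {}
    for site, freq in sample_freqs.items():
        mu, sd = model[site]
        if sd > 0:
            result[site] = (freq - mu) / sd > z_cutoff
        else:
            # Degenerate model with no spread: any excess over the mean is flagged.
            result[site] = freq > mu
    return result


def determine(sample_freqs, significant, max_freq):
    """Determination module: significant difference AND frequency below the set value."""
    return {s: significant[s] and sample_freqs[s] < max_freq for s in sample_freqs}


# Hypothetical inputs: four healthy individuals, two sites, one patient sample.
healthy = {
    "site_A": [0.0010, 0.0020, 0.0015, 0.0010],
    "site_B": [0.0100, 0.0120, 0.0110, 0.0090],
}
sample = {"site_A": 0.209, "site_B": 0.0105}

model = fit_healthy_model(healthy)
calls = determine(sample, compare(sample, model), max_freq=0.3)
print(calls)
```

Keeping the three stages as separate functions mirrors the module boundaries in the claim, so the comparison test or the set value can be changed without touching the model fitting.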

2. The device according to claim 1, wherein the data acquisition module includes a module for obtaining the mutation frequency at each site of the blood sample DNA, and this module further includes the following submodules:

A filter submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data;

An alignment submodule, connected to the filter submodule, for aligning the filtered sequencing data with a reference sequence, to obtain the position in the genome corresponding to each sequenced fragment;

A statistics submodule, connected to the pretreatment submodule, for computing the mutation frequency at each site of the blood sample DNA.

3. The device according to claim 2, wherein the statistics submodule selects, from the sites of the blood sample DNA, those sites whose confidence value is greater than a set value, and computes the mutation frequency for them.
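The per-site statistics of claims 2 and 3 can be illustrated with a short sketch. It assumes the alignment step has already been reduced to, for each site, the reference base, the string of observed bases, and a site-level `confidence` score; the claims do not define how that score is computed, and the record layout, site names and threshold here are hypothetical.

```python
from collections import Counter


def site_mutation_frequencies(pileup, min_confidence):
    """Return {site: non-reference fraction} for sites whose confidence exceeds the set value."""
    freqs = {}
    for site, rec in pileup.items():
        if rec["confidence"] <= min_confidence:
            continue  # claim 3: only sites above the set value are counted
        counts = Counter(rec["bases"])
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, frequency undefined
        freqs[site] = (depth - counts[rec["ref"]]) / depth
    return freqs


# Hypothetical aligned observations at two sites.
pileup = {
    "site_A": {"ref": "T", "bases": "TTTTATTTTA", "confidence": 40},
    "site_B": {"ref": "C", "bases": "CCCC", "confidence": 10},
}
print(site_mutation_frequencies(pileup, min_confidence=20))
```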

4. The device according to any one of claims 1 to 3, wherein the mutation frequency statistical module includes a model correction module, the model correction module being used to evaluate, using the obtained healthy population mutation frequency statistical model, each site of the DNA of each individual in the healthy population corresponding to each site of the blood disease-related sample DNA, to discard sites that deviate obviously, and to compute the distribution of the mutation frequency at each of the remaining sites, to obtain a new healthy population mutation frequency statistical model.
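One way to picture the model correction of claim 4 is a single trim-and-refit pass. The sketch below assumes the model is a per-site mean and standard deviation and that "obvious deviation" means a z-score beyond a cutoff; both choices, along with the cutoff of 2.0 and the example values, are assumptions made for illustration.

```python
from statistics import mean, stdev


def fit(values):
    """Per-site model: (mean, sample standard deviation)."""
    return mean(values), stdev(values)


def corrected_model(healthy_freqs, z_cutoff=2.0):
    """Fit, drop healthy-individual values that deviate obviously, then refit."""
    model = {}
    for site, values in healthy_freqs.items():
        mu, sd = fit(values)
        kept = [v for v in values if sd == 0 or abs(v - mu) / sd <= z_cutoff]
        # Fall back to the initial fit if too few values survive to refit.
        model[site] = fit(kept) if len(kept) >= 2 else (mu, sd)
    return model


# Hypothetical site: seven typical healthy individuals and one obvious outlier.
healthy = {"site_A": [0.0010, 0.0012, 0.0011, 0.0009, 0.0010, 0.0011, 0.0009, 0.05]}
print(fit(healthy["site_A"]))
print(corrected_model(healthy)["site_A"])
```

With a small healthy cohort a single extreme value inflates the standard deviation and masks itself, so the cutoff has to be chosen with the cohort size in mind.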

5. The device according to any one of claims 1 to 4, wherein the determination module includes the following submodules:

A mutation significance determination submodule, connected to the contrast module, for determining the significance of the mutation at each site of the blood sample DNA; and

A mutation type determination submodule, connected to the mutation significance determination submodule, for determining whether the type of each significant mutation at the sites of the blood sample DNA is a somatic mutation.

6. The device according to claim 5, wherein the mutation significance determination submodule determines whether there is a significant difference between the mutation frequency at each site of the blood disease-related sample DNA and the mutation frequency at the corresponding site in the healthy population mutation frequency statistical model; the criterion is P<0.05 under a normal distribution, and a site with a significant difference is a true mutation.

7. The device according to any one of claims 1 to 6, wherein the testing result output module outputs the position and mutation type of each significant mutation at the sites of the blood disease-related sample DNA.

8. The device according to any one of claims 1 to 7, wherein the blood disease-related sample is peripheral blood or bone marrow.
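The P<0.05 criterion of claim 6 can be made concrete as a normal tail probability. The sketch below assumes a one-sided (upper-tail) test against a per-site healthy mean `mu` and standard deviation `sd`; the claim does not say whether the test is one-sided or two-sided, and the function names and example numbers are illustrative.

```python
import math


def upper_tail_p(freq, mu, sd):
    """P(X >= freq) for X ~ Normal(mu, sd), computed with the complementary error function."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (freq - mu) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_significant(freq, mu, sd, alpha=0.05):
    """True if the observed frequency is significantly above the healthy model."""
    return upper_tail_p(freq, mu, sd) < alpha


# Hypothetical healthy model for one site: mean 0.1 %, sd 0.05 %.
print(is_significant(0.209, mu=0.001, sd=0.0005))   # far above background
print(is_significant(0.001, mu=0.001, sd=0.0005))   # at the healthy mean
```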