
CN106778075A - A device for detecting blood-disease-related somatic mutations - Google Patents

A device for detecting blood-disease-related somatic mutations

Info

Publication number
CN106778075A
Authority
CN
China
Prior art keywords
mutation
frequency
module
dna
sites
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710067161.6A
Other languages
Chinese (zh)
Inventor
陈玥茏
侯光远
李停
方真
刘伟
玄兆伶
李大为
梁峻彬
陈重建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Annoroad Genetic Technology (Beijing) Co., Ltd.
Annuo uni-data (Yiwu) Medical Inspection Co. Ltd.
Zhejiang Annuo uni-data Biotechnology Co. Ltd.
Original Assignee
ANNOROAD GENETIC TECHNOLOGY (BEIJING) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANNOROAD GENETIC TECHNOLOGY (BEIJING) Co Ltd
Publication of CN106778075A

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 - ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention relates to a device for detecting blood-disease-related somatic mutations. The device comprises a data acquisition module, a mutation frequency statistics module, a comparison module, a determination module, and a detection result output module. The device distinguishes systematic errors from true somatic mutations more accurately, which both increases sensitivity and reduces false positives and false negatives.

Description

A device for detecting blood-disease-related somatic mutations
Technical field
The present invention relates to the field of low-frequency mutation detection, and in particular to a device and method for detecting blood-disease-related somatic mutations.
Background
As non-solid tumors, blood diseases have been at the forefront of gene-related cancer research, and testing of blood-disease-related genes was among the first to enter clinical practice. In recent years, with the development of molecular biology techniques, the understanding of the molecular genetic changes in blood disease cells has continued to deepen. The gene mutations related to blood diseases are somatic mutations (SNVs). At least several dozen blood-disease-related fusion genes have been reported to date. It is recognized that chromosomal structural aberrations, including deletions, duplications, inversions, and translocations, exist in most blood diseases. These aberrations cause structural variation of proto-oncogenes and tumor suppressor genes, activation of proto-oncogenes or inactivation of tumor suppressor genes, and the formation of new fusion genes that encode fusion proteins. Some genes are transcription factors that regulate cell proliferation, differentiation, and apoptosis; when such a gene is altered, downstream signaling pathways are directly affected, leading to enhanced cell proliferation, impaired apoptosis, impaired differentiation, and so on, which produce the blood disease phenotype. As understanding of blood disease pathogenesis has deepened and gene detection technology has developed, research on the genetic changes in blood diseases has progressed from karyotype analysis (cytogenetic detection), to fusion gene detection, to the detection of point mutations and microdeletions/microduplications. Detection at these three levels has gradually become the basis and reference for the diagnosis and treatment of blood diseases.
On the other hand, mainstream second-generation sequencing platforms generally use sequencing-by-synthesis (Sequencing By Synthesis, SBS) technology for nucleic acid sequencing. Before sequencing, a sequencing library must be constructed from the nucleic acid (DNA or RNA) sample. The basic procedure is as follows: the ends of the fragmented DNA are first repaired; an "A" base is then added to the 3' end of the repaired fragments; DNA adapters (Adapter) containing sequencing primer binding sites are then ligated to the fragments; finally, the fragments are amplified by PCR to complete library construction.
The difficulty of genetic testing for blood diseases is that blood-disease-related samples do not consist purely of cancer cells. They also contain a large number of normal white blood cells, and detection becomes harder as the proportion of cancer cells decreases. How to distinguish true SNVs from the noise introduced in second-generation sequencing by PCR errors, sequencing false positives, inaccurate alignment, and so on is a major problem currently faced.
Summary of the invention
Technical problem to be solved by the invention
As described above, on existing platforms the difficulty in calling SNVs from blood-disease-related samples lies in accurately distinguishing sequencing errors from true SNVs.
Therefore, an object of the present invention is to provide a device and method that distinguish sequencing errors from true SNVs more accurately and thereby detect blood-disease-related SNVs more accurately.
Through intensive research, the inventors found that by collecting a large number of samples from healthy individuals and testing them in parallel, the error rate at each position of the genome can be determined, so that sequencing errors and SNVs can be distinguished more accurately while both false positives and false negatives are reduced.
That is, the present invention includes:
A device for detecting blood-disease-related somatic mutations (SNVs), comprising:
A data acquisition module, for obtaining sequencing data of the blood-disease-related sample DNA and sequencing data of healthy-population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at the corresponding site in the DNA of each individual in the healthy population. Generally, the sequencing data of the blood-disease-related sample DNA may come from sequencing the blood-disease-related sample DNA to be tested; the sequencing data of healthy-population DNA may come from an established healthy-population DNA database, or from sequencing biological sample DNA from a healthy population (using the same sequencing method as for the blood-disease-related sample DNA to be tested, i.e., parallel sequencing);
A mutation frequency statistics module, connected to the data acquisition module, for computing the distribution of the mutation frequency at each site of the DNA across the healthy population, to obtain a healthy-population mutation frequency statistical model;
A comparison module, connected to the data acquisition module and the mutation frequency statistics module, for comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model, to obtain a comparison result;
A determination module, connected to the comparison module, for determining whether the mutation at each site of the blood-disease-related sample DNA is a true somatic mutation, to obtain a determination result. When the comparison result shows no significant difference, the determination result is a non-somatic mutation (including systematic errors and some germline mutations); when the comparison result shows a significant difference and the mutation frequency is below a set value, the determination result is a true somatic mutation; when the comparison result shows a significant difference and the mutation frequency is greater than or equal to the set value, the determination result is a germline mutation. The set value can be chosen according to the actual sequencing conditions; for example, at a sequencing depth of 100×, a preferred set value is 35%; and
A detection result output module, connected to the determination module, for outputting the determination result of the determination module.
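The three-way rule applied by the determination module can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `Call` and `classify_site` are hypothetical, the significance flag is assumed to be supplied by the comparison module, and the 35% default is the example set value given for a sequencing depth of 100×.

```python
from enum import Enum


class Call(Enum):
    NON_SOMATIC = "non-somatic (systematic error or germline)"
    SOMATIC = "somatic mutation"
    GERMLINE = "germline mutation"


def classify_site(significant: bool, mutation_freq: float,
                  freq_cutoff: float = 0.35) -> Call:
    """Apply the determination rule to one site.

    significant   -- True if the site differs significantly from the
                     healthy-population model (assumed input)
    mutation_freq -- observed mutation frequency in the sample (0..1)
    freq_cutoff   -- the "set value"; 0.35 is the example for 100x depth
    """
    if not significant:
        return Call.NON_SOMATIC
    if mutation_freq < freq_cutoff:
        return Call.SOMATIC
    # Significant and at or above the set value.
    return Call.GERMLINE
```

A site at exactly the set value falls on the germline side, matching the "greater than or equal to" wording above.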
Preferably, the data acquisition module includes a module for obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following submodules:
A filtering submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data;
An alignment submodule, connected to the filtering submodule, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing fragment in the genome;
A preprocessing submodule, connected to the alignment submodule, for removing duplicate sequencing fragments; and
A statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of the blood-disease-related sample DNA.
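As a rough illustration of what the statistics submodule computes, the sketch below derives a mutation frequency at one site from the bases that the deduplicated reads show at that site. The function name `mutation_frequency` and the convention of reporting only the most common non-reference base are assumptions; the source does not specify how several alternative bases at one site are handled.

```python
from collections import Counter
from typing import Optional, Tuple


def mutation_frequency(bases: str, ref: str) -> Tuple[Optional[str], float]:
    """Return (most common non-reference base, its fraction of all reads).

    bases -- one character per read covering the site, e.g. "AAAT"
    ref   -- the reference base at the site
    """
    counts = Counter(bases.upper())
    depth = sum(counts.values())
    alt_counts = {b: n for b, n in counts.items() if b != ref.upper()}
    if depth == 0 or not alt_counts:
        return None, 0.0
    alt, n_alt = max(alt_counts.items(), key=lambda item: item[1])
    return alt, n_alt / depth
```

For example, nine reads showing the reference base and one showing a different base give a frequency of 0.1 for that base.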
Preferably, the statistics submodule selects the sites of the blood-disease-related sample DNA whose confidence value (LOD value) exceeds a set value (for example, 100) and computes the mutation frequency for those sites. For each site i of each sample, i ∈ {human genome}, the LOD of the sample to be tested at that site is computed as follows:
The terms of the formula are obtained from the following equations:
The data are described by the following two models:
Model $M_0$ indicates that there is no variant at the site; any non-reference base is regarded as sequencing noise;
Model $M^m_f$ indicates that there is a true variant m at the site, with allele frequency f.
$M_0$ is equivalent to $M^m_f$ with f = 0.
The reference base is r ∈ {A, T, C, G},
and for each read i (i = 1 … d), the base covering this site is $b_i$ and the error probability of this base is $e_i$ (this error probability is obtained from the quality value of each base).
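The formula images that the text above refers to are not reproduced here. The surrounding definitions (a no-variant model, a variant model with allele frequency f, a reference base r, and observed bases $b_i$ with error probabilities $e_i$) match a widely used log-odds form, sketched below. It is offered as an assumption consistent with those definitions, not as the patent's confirmed formula. The symbol $q_i$ for the Phred-scaled base quality and the equal split of an error over the three other bases are likewise assumptions.

```latex
\mathrm{LOD}(m, f) = \log_{10}\frac{L(M^{m}_{f})}{L(M_{0})},
\qquad
L(M^{m}_{f}) = \prod_{i=1}^{d} P(b_i \mid e_i, r, m, f)

P(b_i \mid e_i, r, m, f) =
\begin{cases}
f\,\dfrac{e_i}{3} + (1-f)\,(1-e_i) & \text{if } b_i = r \\[6pt]
f\,(1-e_i) + (1-f)\,\dfrac{e_i}{3} & \text{if } b_i = m \\[6pt]
\dfrac{e_i}{3} & \text{otherwise}
\end{cases}
\qquad
e_i = 10^{-q_i/10}
```

Setting f = 0 in the case expression gives the likelihood under $M_0$, which agrees with the statement that $M_0$ is the f = 0 case of $M^m_f$.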
Preferably, the data acquisition module includes a module for obtaining the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, which further includes the following submodules:
A filtering submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data;
An alignment submodule, connected to the filtering submodule, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing fragment in the genome;
A preprocessing submodule, connected to the alignment submodule, for removing duplicate sequencing fragments; and
A statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA.
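The removal of duplicate sequencing fragments by the preprocessing submodule can be illustrated with the sketch below. The source does not say how duplicates are identified; the sketch assumes the common convention that reads aligned to the same chromosome, start position, and strand are duplicates, and that the read with the highest mean quality is kept. The `Read` type and `remove_duplicates` function are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int             # leftmost aligned position
    reverse: bool        # True if aligned to the reverse strand
    mean_quality: float  # mean Phred quality of the read


def remove_duplicates(reads: List[Read]) -> List[Read]:
    """Keep one read per (chrom, pos, strand), preferring higher quality."""
    best: Dict[Tuple[str, int, bool], Read] = {}
    for read in reads:
        key = (read.chrom, read.pos, read.reverse)
        kept = best.get(key)
        if kept is None or read.mean_quality > kept.mean_quality:
            best[key] = read
    return list(best.values())
```

Without this step, PCR copies of one original fragment would be counted as independent observations and would distort the per-site mutation frequency.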
Preferably, the mutation frequency statistics module includes a model correction module. The model correction module uses the obtained healthy-population mutation frequency statistical model to evaluate each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, discards the sites that deviate markedly, and recomputes the distribution of the mutation frequency over the remaining sites to obtain a new healthy-population mutation frequency statistical model.
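One possible reading of the model correction is an iterative fit-and-discard loop, sketched below for the healthy-population frequencies observed at a single genomic site. The sketch assumes a normal model, a two-sided test at a hypothetical `alpha` of 0.05, and repetition until no value is discarded; the function name `fit_background` is also an assumption, and the input list must not be empty.

```python
from statistics import NormalDist, mean, pstdev
from typing import List, Tuple


def fit_background(freqs: List[float],
                   alpha: float = 0.05) -> Tuple[float, float, List[float]]:
    """Fit a normal model, drop marked outliers, and refit until stable.

    Returns (mean, standard deviation, values kept).
    """
    kept = list(freqs)
    while len(kept) > 2:
        mu, sigma = mean(kept), pstdev(kept)
        if sigma == 0:
            break  # all remaining values are identical
        dist = NormalDist(mu, sigma)
        # Two-sided tail probability of each value under the current model.
        inliers = [x for x in kept
                   if 2 * (1 - dist.cdf(mu + abs(x - mu))) >= alpha]
        if len(inliers) == len(kept):
            break  # nothing deviates markedly any more
        kept = inliers
    return mean(kept), pstdev(kept), kept
```

With nine healthy individuals near 1% and one at 30%, the first pass discards the 30% value and the refit model is centered near 1%.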
Preferably, the determination module includes the following submodules:
A mutation significance determination submodule, connected to the comparison module, for determining the significance of the mutation at each site of the blood-disease-related sample DNA; and
A mutation type determination submodule, connected to the mutation significance determination submodule, for determining whether each significant mutation in the blood-disease-related sample DNA is a somatic mutation or a germline mutation.
Preferably, the mutation significance determination submodule determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model (for example, the criterion is a normal distribution with P<0.05). A significant difference indicates a true mutation; no significant difference indicates a false-positive mutation.
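The significance check can be illustrated as follows, assuming the healthy-population model at a site is summarized by a mean and a standard deviation of a normal distribution. The one-sided (upper-tail) test and the names `is_significant`, `healthy_mean`, and `healthy_sd` are assumptions; the source gives only the example criterion of a normal distribution with P<0.05.

```python
from statistics import NormalDist


def is_significant(sample_freq: float, healthy_mean: float,
                   healthy_sd: float, alpha: float = 0.05) -> bool:
    """True if the sample frequency is significantly above the background."""
    if healthy_sd <= 0:
        # Degenerate model: any excess over the background counts.
        return sample_freq > healthy_mean
    background = NormalDist(healthy_mean, healthy_sd)
    p_value = 1 - background.cdf(sample_freq)  # upper-tail probability
    return p_value < alpha
```

A sample frequency of 12.34% against a background of 1.0% ± 0.2% is flagged as significant, whereas 1.1% against the same background is not.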
Preferably, the detection result output module outputs the position and mutation type of each significant mutation in the blood-disease-related sample DNA.
Preferably, the blood-disease-related sample is peripheral blood or bone marrow.
Here, the somatic mutation refers to a blood-disease-related somatic mutation.
In addition, the present invention also provides:
A method for detecting blood-disease-related somatic mutations (SNVs), comprising:
A data acquisition step of obtaining sequencing data of the blood-disease-related sample DNA and sequencing data of healthy-population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at the corresponding site in the DNA of each individual in the healthy population. Generally, the sequencing data of the blood-disease-related sample DNA may come from sequencing the blood-disease-related sample DNA to be tested; the sequencing data of healthy-population DNA may come from an established healthy-population DNA database, or from sequencing biological sample DNA from a healthy population (using the same sequencing method as for the blood-disease-related sample DNA to be tested, i.e., parallel sequencing);
A mutation frequency statistics step of computing the distribution of the mutation frequency at each site of the DNA across the healthy population, to obtain a healthy-population mutation frequency statistical model;
A comparison step of comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model, to obtain a comparison result;
A determination step of determining whether the mutation at each site of the blood-disease-related sample DNA is a true somatic mutation, to obtain a determination result. When the comparison result shows no significant difference, the determination result is a non-somatic mutation (including systematic errors and some germline mutations); when the comparison result shows a significant difference and the mutation frequency is below a set value, the determination result is a true somatic mutation; when the comparison result shows a significant difference and the mutation frequency is greater than or equal to the set value, the determination result is a germline mutation. The set value can be chosen according to the actual sequencing conditions; for example, at a sequencing depth of 100×, a preferred set value is 35%; and
A detection result output step of outputting the determination result of the determination step.
Preferably, the data acquisition step includes a step of obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following sub-steps:
A filtering sub-step of performing quality control on the sequencing data and filtering out low-quality sequencing data;
An alignment sub-step of aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing fragment in the genome;
A preprocessing sub-step of removing duplicate sequencing fragments; and
A statistics sub-step of computing the mutation frequency at each site of the blood-disease-related sample DNA.
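The filtering sub-step can be illustrated with a small quality filter. Embodiment 1 names Q30 as the cutoff but does not say how it is applied, so the sketch assumes Phred+33 encoded quality strings (the usual FASTQ encoding) and a threshold on the mean quality of each read. The function names are hypothetical.

```python
from typing import List


def phred_scores(quality_string: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def passes_quality(quality_string: str, min_mean_q: float = 30.0) -> bool:
    """True if the read's mean Phred quality is at least min_mean_q."""
    scores = phred_scores(quality_string)
    return bool(scores) and sum(scores) / len(scores) >= min_mean_q
```

Other conventions, such as requiring a minimum fraction of bases at or above Q30, would fit the same interface.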
Preferably, the statistics sub-step selects the sites of the blood-disease-related sample DNA whose confidence value (LOD value) exceeds a set value (for example, 100) and computes the mutation frequency for those sites. For each site i of each sample, i ∈ {human genome}, the LOD of the sample to be tested at that site is computed as follows:
The terms of the formula are obtained from the following equations:
The data are described by the following two models:
Model $M_0$ indicates that there is no variant at the site; any non-reference base is regarded as sequencing noise;
Model $M^m_f$ indicates that there is a true variant m at the site, with allele frequency f.
$M_0$ is equivalent to $M^m_f$ with f = 0.
The reference base is r ∈ {A, T, C, G},
and for each read i (i = 1 … d), the base covering this site is $b_i$ and the error probability of this base is $e_i$ (this error probability is obtained from the quality value of each base).
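One way to compute a LOD of this kind is sketched below. It assumes a per-read likelihood in which a sequencing error is split equally over the three other bases, Phred-scaled quality values (e = 10^(-q/10)), and f set to the observed fraction of reads carrying the variant base. None of these choices is confirmed by the source, and all function names are hypothetical.

```python
import math
from typing import Sequence


def phred_to_error(q: float) -> float:
    """Convert a Phred quality value to an error probability."""
    return 10 ** (-q / 10)


def log10_likelihood(bases: str, quals: Sequence[float],
                     ref: str, alt: str, f: float) -> float:
    """log10 likelihood of the observed bases under allele frequency f."""
    total = 0.0
    for base, q in zip(bases, quals):
        e = phred_to_error(q)
        if base == ref:
            p = f * e / 3 + (1 - f) * (1 - e)
        elif base == alt:
            p = f * (1 - e) + (1 - f) * e / 3
        else:
            p = e / 3
        total += math.log10(p)
    return total


def lod(bases: str, quals: Sequence[float], ref: str, alt: str) -> float:
    """log10 odds of the variant model (f > 0) over the no-variant model."""
    if not bases:
        return 0.0
    f_hat = sum(base == alt for base in bases) / len(bases)
    return (log10_likelihood(bases, quals, ref, alt, f_hat)
            - log10_likelihood(bases, quals, ref, alt, 0.0))
```

Under this sketch a site where every read shows the reference base scores exactly zero, and the score grows with the number and quality of reads that carry the variant base.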
Preferably, the data acquisition step includes a step of obtaining the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, which further includes the following sub-steps:
A filtering sub-step of performing quality control on the sequencing data and filtering out low-quality sequencing data;
An alignment sub-step of aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing fragment in the genome;
A preprocessing sub-step of removing duplicate sequencing fragments; and
A statistics sub-step of computing the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA.
Preferably, the mutation frequency statistics step includes a model correction sub-step. The model correction sub-step uses the obtained healthy-population mutation frequency statistical model to evaluate each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, discards the sites that deviate markedly, and recomputes the distribution of the mutation frequency over the remaining sites to obtain a new healthy-population mutation frequency statistical model.
Preferably, the determination step includes the following sub-steps:
A mutation significance determination sub-step of determining the significance of the mutation at each site of the blood-disease-related sample DNA; and
A mutation type determination sub-step of determining whether each significant mutation in the blood-disease-related sample DNA is a somatic mutation or a germline mutation.
Preferably, the mutation significance determination sub-step determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model (for example, the criterion is a normal distribution with P<0.05). A significant difference indicates a true mutation; no significant difference indicates a false-positive mutation.
Preferably, the detection result output step outputs the position and mutation type of each significant mutation in the blood-disease-related sample DNA.
Preferably, the blood-disease-related sample is peripheral blood or bone marrow.
Here, the somatic mutation refers to a blood-disease-related somatic mutation.
According to the present invention, systematic errors can be distinguished from true SNVs more accurately, which both increases sensitivity and reduces false positives and false negatives.
Brief description of the drawings
Fig. 1 is a schematic diagram of the device of the present invention for detecting blood-disease-related somatic mutations.
Detailed description of the embodiments
The scientific and technical terms used in this specification have the meanings commonly understood by those skilled in the art; in case of conflict, the definitions in this specification prevail.
In general, the terms used in this specification have the following meanings.
Beta distribution: The Beta distribution is a continuous distribution that describes a probability p, with values ranging from 0 to 1. It has two parameters, α and β, where α is the number of successes plus 1 and β is the number of failures plus 1.
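The parameterization in this definition can be made concrete with a short sketch. Treating reads that carry a mutation as "successes" and reads that do not as "failures" is an assumption made only for illustration, as are the function names.

```python
from typing import Tuple


def beta_parameters(successes: int, failures: int) -> Tuple[int, int]:
    """Return (alpha, beta) as defined above: each count plus 1."""
    return successes + 1, failures + 1


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

For 12 mutated reads out of 100, the parameters are α = 13 and β = 89, and the mean of the distribution is 13/102, slightly above the raw fraction of 12/100.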
Subclone: For cultured cells, subcloning means selecting cells with a certain characteristic from the original clone and culturing them.
Target sequence capture sequencing: A research strategy in which specific probes are designed for the genomic regions of interest and hybridized with genomic DNA on a sequence capture chip (or in solution), so that DNA fragments from the target genomic regions are enriched and then sequenced using second-generation sequencing technology.
Somatic mutation (SNV): A mutation that occurs in somatic cells, that is, cells other than germ cells. It does not cause heritable changes in offspring, but it can change the genetic makeup of some cells of the present generation.
Germline mutation (SNP): An inherited genetic defect transmitted through the egg or sperm. All cells of the embryo carry the same genetic defect, which is present in the germ cells and is passed on from generation to generation.
Plus strand: The DNA single strand whose sequence is identical to the RNA sequence. In replication, the plus strand is the original single strand whose sequence is identical to that of the new strand, i.e., the non-template strand.
Embodiments
The present invention is described more specifically below with reference to embodiments, but the invention is not limited to these embodiments.
Embodiment 1: The device of the present invention for detecting blood-disease-related somatic mutations
The device of Embodiment 1 for detecting blood-disease-related somatic mutations comprises:
A data acquisition module, for obtaining sequencing data of the blood-disease-related sample DNA and sequencing data of healthy-population DNA, the sequencing data including the mutation frequency at each site of the blood-disease-related sample DNA and the mutation frequency at the corresponding site in the DNA of each individual in the healthy population. Generally, the sequencing data of the blood-disease-related sample DNA come from sequencing the blood-disease-related sample DNA to be tested, and the sequencing data of healthy-population DNA come from an established healthy-population DNA database;
A mutation frequency statistics module, connected to the data acquisition module, for computing the distribution of the mutation frequency at each site of the DNA across the healthy population, to obtain a healthy-population mutation frequency statistical model;
A comparison module, connected to the data acquisition module and the mutation frequency statistics module, for comparing the mutation frequency at each site of the blood-disease-related sample DNA with the healthy-population mutation frequency statistical model, to obtain a comparison result;
A determination module, connected to the comparison module, for determining whether the mutation at each site of the blood-disease-related sample DNA is a true somatic mutation, to obtain a determination result; when the comparison result shows a significant difference and the mutation frequency is below a set value, the determination result is a true somatic mutation; and
A detection result output module, connected to the determination module, for outputting the determination result of the determination module.
The data acquisition module includes a module for obtaining the mutation frequency at each site of the blood-disease-related sample DNA, which further includes the following submodules:
A filtering submodule, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data (below Q30), to obtain clean fastq data;
An alignment submodule, connected to the filtering submodule, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing fragment (read) in the genome. Specifically, the clean fastq data are aligned with the BWA software to produce a SAM file, and the SAM file is converted with samtools to BAM format (which contains the position of each read in the genome) to save storage space;
A preprocessing submodule, connected to the alignment submodule, for removing duplicate sequencing fragments. Specifically, the preprocessing submodule processes the BAM file and removes duplicate reads to obtain a deduplicated (unique) BAM file;
A statistics submodule, connected to the preprocessing submodule, for computing the mutation frequency at each site of the blood-disease-related sample DNA;
Specifically, for each site i of each sample, i ∈ {human genome}, the statistics submodule computes the LOD of the sample to be tested at that site as follows:
The terms of the formula are obtained from the following equations:
The data are described by the following two models:
Model $M_0$ indicates that there is no variant at the site; any non-reference base is regarded as sequencing noise;
Model $M^m_f$ indicates that there is a true variant m at the site, with allele frequency f.
$M_0$ is equivalent to $M^m_f$ with f = 0.
The reference base is r ∈ {A, T, C, G}, and for each read i (i = 1 … d),
the base covering this site is $b_i$ and the error probability of this base is $e_i$ (this error probability is obtained from the quality value of each base). Finally, the sites with LOD>100 are selected and their mutation frequencies are obtained.
The data acquisition module also includes a module for obtaining the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA. This module differs from the module for the blood-disease-related sample DNA in that its statistics submodule does not select sites whose LOD value exceeds the set value, but instead obtains the mutation frequency at every site of each individual's DNA in the healthy population that corresponds to a site of the blood-disease-related sample DNA.
The mutation frequency statistics module computes the distribution of the mutation frequency at each site of the DNA across the healthy population, to obtain a healthy-population mutation frequency statistical model. It includes a model correction module, which uses the obtained healthy-population mutation frequency statistical model to evaluate each site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, discards the sites that deviate markedly (normal distribution, P > 0.05), and recomputes the distribution of the mutation frequency over the remaining sites. This is repeated until no site deviates markedly, giving a new healthy-population mutation frequency statistical model.
The determination module includes the following submodules:
A mutation significance determination submodule, connected to the comparison module, for determining the significance of the mutation at each site of the blood-disease-related sample DNA; and
A mutation type determination submodule, connected to the mutation significance determination submodule, for determining whether each significant mutation in the blood-disease-related sample DNA is a somatic mutation or a germline mutation.
The mutation significance determination submodule determines whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model; for example, the criterion is a normal distribution with P<0.05. A significant difference indicates a true mutation; no significant difference indicates a false-positive mutation. For a true mutation with a significant difference, a mutation frequency below 35% is determined to be a true somatic mutation, and a mutation frequency greater than or equal to 35% is determined to be a germline mutation.
The information output by the detection result output module includes: the position of the true mutation (for example, absolute position 1444444 on chromosome 12, with HG19 as the reference genome), the mutation type (for example, somatic mutation), the mutated base (for example, A->T, R172K), the mutation frequency (for example, 12.34%), the mutated gene (for example, EGFR), and details (for example, gene, transcript, exon, base change, amino acid change, etc.).
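The output fields listed above could be carried in a simple record type, as in the sketch below. The class name `MutationRecord`, the field names, and the tab-separated layout are assumptions for illustration; the source lists the kinds of information reported but not an output format.

```python
from dataclasses import dataclass


@dataclass
class MutationRecord:
    chrom: str
    position: int
    reference_genome: str
    mutation_type: str
    base_change: str
    amino_acid_change: str
    frequency: float  # fraction between 0 and 1
    gene: str

    def to_tsv(self) -> str:
        """Format the record as one tab-separated line."""
        return "\t".join([
            self.chrom,
            str(self.position),
            self.reference_genome,
            self.mutation_type,
            self.base_change,
            self.amino_acid_change,
            f"{self.frequency:.2%}",
            self.gene,
        ])
```

Filled with the example values from the text, the frequency column of such a record reads 12.34%.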
Embodiment 2
Somatic mutation detection is carried out to a blood sample for inpatient with haematological diseases.
1.1 blood sample DNA are extracted
Embrane method was used to extract blood sample genomic DNA, specific steps are with reference to Tiangeng company blood/cell/tissue base Because of a group DNA extraction kit operation manual
Repair (End Repair) in 1.2 ends
(1) reagent needed for being taken out from -20 DEG C of kits of preservation in advance, single sample amount of preparation is referring to table 1.
Table 1
(2) reaction is repaired in end:1.5mL centrifuge tubes are placed in 20 DEG C of warm bath 30 in Thermomixer after adding DNA sample Minute.Reaction uses the DNA in 1.8 × nucleic acid purification magnetic bead recovery purifying reaction system after terminating, be dissolved in 32 μ LEB.
1.3 Adding "A" to the ends (A-Tailing)
(1) Take the required reagents in advance from the kit stored at -20 °C; the amounts to prepare for a single sample are given in Table 2:
Table 2
(2) A-tailing reaction: add the 32 μL of DNA purified and recovered in the previous step to a 1.5 mL centrifuge tube and incubate in a Thermomixer at 37 °C for 30 minutes. Recover and purify the DNA in the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 18 μL EB.
1.4 Adapter ligation (Adapter Ligation)
(1) Take the required reagents in advance from the kit stored at -20 °C; the amounts to prepare for a single sample are given in Table 3:
Table 3
(2) Adapter ligation reaction: add the 18 μL of DNA purified and recovered in the previous step to the sample tube and incubate in a Thermomixer at 20 °C for 15 minutes. Recover and purify the DNA in the reaction system with 1.8× nucleic acid purification magnetic beads and dissolve it in 30 μL EB.
1.5 PCR reaction
(1) Take the required reagents from the kit stored at -20 °C and prepare the PCR reaction system in a 2 mL PCR tube:
Table 4
(2) Set the PCR program; the program for the PCR reaction is as follows:
After the reaction ends, promptly take out the sample and store it in a 4 °C refrigerator, then exit the program or switch off the instrument as required.
(3) Recover and purify the DNA in the reaction system with 0.9× nucleic acid purification magnetic beads, and dissolve the purified library in 20 μL ddH2O. Quantify the library with Qubit and submit it for Agilent 2100 analysis.
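The purification steps above give the magnetic bead amount as a multiple of the reaction volume (1.8× after end repair, A-tailing and adapter ligation; 0.9× after PCR). A minimal sketch of that arithmetic follows. The function name is hypothetical, and the 50 μL reaction volume is an assumed example value, because the reaction volumes belong to tables that are not reproduced here.

```python
def bead_volume_ul(reaction_volume_ul: float, ratio: float) -> float:
    """Return the bead volume (in microliters) for a given bead-to-sample ratio."""
    if reaction_volume_ul <= 0 or ratio <= 0:
        raise ValueError("reaction volume and ratio must be positive")
    return reaction_volume_ul * ratio


# Assumed 50 uL reaction volume, for illustration only.
for ratio in (1.8, 0.9):
    print(f"{ratio}x beads for 50 uL: {bead_volume_ul(50, ratio):.1f} uL")
```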
1.6 Hybridization of the library with the blood-disease target-region capture chip
(1) In this experiment, the buffer that provides the ionic environment for the hybrid capture reaction, and the cleaning and rinsing solutions used to wash away physically adsorbed or non-specifically hybridized material, were obtained commercially.
(2) Prepare the hybridization library: thaw the DNA library to be hybridized on ice and take a total mass of 1 μg (this DNA library is referred to as the sample library in the subsequent steps).
(3) Prepare the Ann primer pool: mix 1000 pmol each of the tag primer In1 (100 μM) corresponding to the sample library index and the universal primer (1000 μM) (this mixture is referred to as the Ann primer pool in the subsequent steps).
(4) Prepare the hybridization sample: add 5 μL COT DNA (Human Cot-1 DNA, Life Technologies, 1 mg/mL), 1 μg sample library and the Ann primer pool to a 1.5 mL EP tube. Seal the EP tube containing the prepared hybridization sample with sealing film, and place the EP tube containing the sample library pool/COT DNA/Ann primer pool in a vacuum device until it is completely dry.
(5) Dissolve the hybridization sample: add the following to the dry powder of sample library pool/COT DNA/Ann primer pool:
7.5 μL 2× hybridization buffer
3 μL hybridization component A
(6) After thorough mixing, place the mixture on a heating module preheated to 95 °C and denature for 10 minutes.
(7) Transfer the mixture to a 0.2 mL flat-cap PCR tube containing 4.5 μL of capture chip. Vortex thoroughly for 3 seconds, then place the hybridization sample mixture on a 47 °C heating module for 16 hours. The heated-lid temperature of the heating module must be set to 57 °C. After hybridization, the product must go through the subsequent elution and recovery steps.
(8) Dilute the 10× cleaning solutions (I, II and III), the 10× rinsing solution and the 2.5× magnetic bead cleaning solution to 1× working solutions.
Table 5
(9) Preheat the following reagents in a 47 °C heating module:
400 μL 1× rinsing solution
100 μL 1× cleaning solution I
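Two pieces of arithmetic are implicit in the steps above: the stock volume that delivers 1000 pmol of each primer, and the dilution of concentrated wash solutions to 1× working solutions. The sketch below works through both. The function names are hypothetical, and the final volumes used in the dilution example are assumptions for illustration, since the preparation volumes of Table 5 are not reproduced here.

```python
def volume_for_amount_ul(amount_pmol: float, stock_um: float) -> float:
    """Volume of stock (uL) that contains amount_pmol.

    Uses the identity 1 uM = 1 pmol/uL.
    """
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    return amount_pmol / stock_um


def dilution_volumes_ul(stock_x: float, final_x: float,
                        final_volume_ul: float) -> tuple[float, float]:
    """Return (stock volume, diluent volume) from C1 * V1 = C2 * V2."""
    if stock_x <= 0 or final_x <= 0 or final_x > stock_x:
        raise ValueError("need 0 < final_x <= stock_x")
    stock_ul = final_volume_ul * final_x / stock_x
    return stock_ul, final_volume_ul - stock_ul


# Primer pool: 1000 pmol each of the tag primer and the universal primer.
print(volume_for_amount_ul(1000, 100))    # tag primer In1 at 100 uM
print(volume_for_amount_ul(1000, 1000))   # universal primer at 1000 uM

# Working solutions (final volumes are assumed example values).
print(dilution_volumes_ul(10, 1, 400))    # 10x rinsing solution to 1x
print(dilution_volumes_ul(2.5, 1, 200))   # 2.5x bead cleaning solution to 1x
```

With these concentrations the primer pool consists of 10 μL of tag primer and 1 μL of universal primer before drying.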
1.7 Preparation of affinity adsorption magnetic beads
(1) Equilibrate the streptavidin magnetic beads (Dynabeads M-280 Streptavidin, hereinafter referred to as magnetic beads) at room temperature for 30 minutes, then vortex thoroughly for 15 seconds to mix.
(2) Dispense 100 μL of magnetic beads into a 1.5 mL centrifuge tube and place the tube on a magnetic rack. After about 5 minutes, carefully aspirate and discard the supernatant, add 1× magnetic bead cleaning solution at twice the initial bead volume, and vortex for 10 seconds. Return the tube to the magnetic rack to adsorb the beads. Once the solution is clear, aspirate and discard the supernatant. Repeat this step for a total of two washes.
(3) After washing, aspirate and discard the magnetic bead cleaning solution, resuspend the beads by vortexing in 1× magnetic bead cleaning solution equal to the initial bead volume, and transfer them to a 0.2 mL PCR tube. Place the PCR tube on the magnetic rack and, once the beads are adsorbed and the solution is clear, aspirate and discard the supernatant.
1.8 Binding of DNA to the affinity adsorption magnetic beads and rinsing
(1) Transfer the hybridized sample library to the 0.2 mL PCR tube containing the affinity adsorption magnetic beads and vortex to mix.
(2) Place the 0.2 mL PCR tube in the 47 °C heating module for 45 minutes, vortexing once every 15 minutes, so that the DNA binds to the magnetic beads.
(3) After the 45-minute incubation, add 100 μL of 1× cleaning solution I preheated to 47 °C to the 15 μL of captured DNA sample. Vortex for 10 seconds. Transfer the entire contents of the 0.2 mL PCR tube to a 1.5 mL centrifuge tube. Place the 1.5 mL centrifuge tube on the magnetic rack to adsorb the beads and discard the supernatant.
(4) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 200 μL of 1× rinsing solution preheated to 47 °C. Mix by pipetting up and down 10 times (work quickly so that the temperature of the reagent and sample does not fall below 47 °C). After mixing, place the sample on the 47 °C heating module for 5 minutes. Repeat this step for a total of two washes with 47 °C 1× rinsing solution. Place the 1.5 mL centrifuge tube on the magnetic rack, adsorb the beads and discard the supernatant.
(5) Add 200 μL of room-temperature 1× cleaning solution I to the 1.5 mL centrifuge tube and vortex for 2 minutes. Place the tube on the magnetic rack, adsorb the beads and discard the supernatant. Add 200 μL of room-temperature 1× cleaning solution II to the tube and vortex for 1 minute. Place the tube on the magnetic rack, adsorb the beads and discard the supernatant. Add 200 μL of room-temperature 1× cleaning solution III to the tube and vortex for 30 seconds. Place the tube on the magnetic rack, adsorb the beads and discard the supernatant.
(6) Remove the 1.5 mL centrifuge tube from the magnetic rack and add 45 μL of PCR-grade water to resuspend and elute the sample captured on the magnetic beads.
1.9 PCR amplification of the captured DNA
(1) Prepare the post-capture PCR mix according to the table below and vortex to mix after preparation. Enrichment primer F and enrichment primer R were purchased from Invitrogen.
(2) The amplification program for PCR of the bead-adsorbed DNA is set as follows:
(3) Recovery and purification of the PCR product of the hybrid-captured DNA: recover and purify the DNA in the reaction system with nucleic acid purification magnetic beads at 0.9×, and dissolve the purified library in 30 μL ddH2O.
1.10 Library quantification
The library was analyzed with the 2100 Bioanalyzer (Agilent)/LabChip GX (Caliper) and by QPCR, and the library concentration was recorded.
1.11 Library sequencing
The constructed library was sequenced on a NextSeq 550AR (PE75).
1.12 Data processing and analysis
The sequencing data obtained were input into the device of Embodiment 1 to detect somatic mutations. The detection results are shown in the table below.
Mutated gene | Details | Mutation frequency
KIT | N822K, c.2466T>A | 20.9%
1.13 Verification of results
First-generation sequencing was used to verify whether the somatic mutation at the above site was present in a bone marrow sample from the same patient. The result showed that the KIT gene carried the N822K, c.2466T>A mutation at a frequency of about 20%, which is consistent with the detection result of 1.12. The detection device of the invention can therefore successfully detect blood-disease-related somatic mutations in blood samples.
Embodiment 3
Somatic mutation detection was performed on a bone marrow sample from a patient with chronic lymphocytic leukemia (CLL). Genomic DNA was extracted from the bone marrow sample by the membrane-column method; for the specific steps, refer to the operation manual of the Tiangen blood/cell/tissue genomic DNA extraction kit.
The detection results are shown in the table below.
Mutated gene | Details | Mutation frequency
TP53 | S46fs, c.137_144del | 54%
First-generation sequencing was used to verify whether the somatic mutation at the above site was present in the remaining bone marrow sample from the same patient. The result showed that the TP53 gene carried the S46fs, c.137_144del deletion at a frequency of about 50%, which is consistent with the detection result in the table above. The detection device of the invention can therefore successfully detect blood-disease-related somatic mutations in bone marrow samples.
Industrial applicability
The present invention provides a device and method that distinguish sequencing errors from true SNVs more accurately, and thereby detect SNVs in blood samples more accurately.

Claims (8)

1. A device for detecting blood-disease-related somatic mutations, comprising:
a data acquisition module for obtaining sequencing data of blood-disease-related sample DNA and sequencing data of healthy-population DNA, the sequencing data including the mutation frequency at each site of the blood sample DNA and the mutation frequency at each site of each individual's DNA in the healthy population corresponding to each site of the blood sample DNA;
a mutation frequency statistical module, connected to the data acquisition module, for computing the distribution of the mutation frequency at each of the sites of the healthy-population DNA to obtain a healthy-population mutation frequency statistical model;
a comparison module, connected to the data acquisition module and the mutation frequency statistical module, for comparing the mutation frequency at each site of the blood sample DNA with the healthy-population mutation frequency statistical model to obtain a comparison result;
a determination module, connected to the comparison module, for determining whether the mutation at each site of the blood sample DNA is a real somatic mutation to obtain a determination result; wherein, when the comparison result shows a significant difference and the mutation frequency is less than a set value, the determination result is a real somatic mutation; and
a detection result output module, connected to the determination module, for outputting the determination result of the determination module.
2. The device according to claim 1, wherein the data acquisition module includes a mutation frequency acquisition module for each site of the blood sample DNA, and the mutation frequency acquisition module further includes the following sub-modules:
a filtering sub-module, connected to the data acquisition module, for performing quality control on the sequencing data and filtering out low-quality sequencing data;
an alignment sub-module, connected to the filtering sub-module, for aligning the filtered sequencing data to a reference sequence to obtain the position of each sequencing read in the genome;
a preprocessing sub-module, connected to the alignment sub-module, for removing duplicate sequencing reads; and
a statistics sub-module, connected to the preprocessing sub-module, for computing the mutation frequency at each site of the blood sample DNA.
3. The device according to claim 2, wherein the statistics sub-module selects, from the sites of the blood sample DNA, the sites whose confidence value is greater than a set value and computes mutation frequency statistics for them.
4. The device according to any one of claims 1 to 3, wherein the mutation frequency statistical module includes a model correction module, which uses the obtained healthy-population mutation frequency statistical model to evaluate every site of each individual's DNA in the healthy population corresponding to each site of the blood-disease-related sample DNA, discards the clearly deviating sites, and computes the distribution of the mutation frequency at each of the remaining sites to obtain a new healthy-population mutation frequency statistical model.
5. The device according to any one of claims 1 to 4, wherein the determination module includes the following sub-modules:
a mutation significance determination sub-module, connected to the comparison module, for determining the significance of the mutation at each site of the blood sample DNA; and
a mutation type determination sub-module, connected to the mutation significance determination sub-module, for determining whether the type of each significant mutation at the sites of the blood sample DNA is a somatic mutation.
6. The device according to claim 5, wherein the criterion by which the mutation significance determination sub-module judges whether the mutation frequency at each site of the blood-disease-related sample DNA differs significantly from the mutation frequency at the corresponding site in the healthy-population mutation frequency statistical model is a normal distribution with P<0.05, a significant difference indicating a true mutation.
7. The device according to any one of claims 1 to 6, wherein the detection result output module outputs the position and mutation type of each significant mutation at the sites of the blood-disease-related sample DNA.
8. The device according to any one of claims 1 to 7, wherein the blood-disease-related sample is peripheral blood or bone marrow.
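The model correction recited in claim 4 (fit a healthy-population model, discard clearly deviating values, refit on the remainder) can be illustrated for a single site as follows. This is a sketch under stated assumptions, not the claimed implementation: the claims do not define "clearly deviating", so the median/MAD rule, the cutoff `k = 3.0` and the function name `build_site_model` are all hypothetical choices.

```python
from statistics import mean, median, stdev


def build_site_model(healthy_freqs: list[float],
                     k: float = 3.0) -> tuple[float, float, int]:
    """Return (mean, standard deviation, number discarded) for one site.

    Values farther than k robust standard deviations from the median are
    treated as clearly deviating and dropped before the final fit.
    Requires at least two retained values.
    """
    med = median(healthy_freqs)
    # Median absolute deviation, scaled to match a normal standard deviation.
    mad = median(abs(f - med) for f in healthy_freqs)
    scale = 1.4826 * mad
    if scale == 0:
        kept = list(healthy_freqs)  # no spread estimate, keep everything
    else:
        kept = [f for f in healthy_freqs if abs(f - med) <= k * scale]
    return mean(kept), stdev(kept), len(healthy_freqs) - len(kept)


# Made-up frequencies for one site; the last individual deviates clearly.
freqs = [0.001, 0.002, 0.0015, 0.0025, 0.001, 0.002, 0.0015, 0.05]
mu, sigma, dropped = build_site_model(freqs)
print(f"mean={mu:.5f} sd={sigma:.5f} discarded={dropped}")
```

A median-based rule is used here because, with a small healthy cohort, a single extreme value inflates the ordinary standard deviation enough to hide itself from a mean-based cutoff.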
CN201710067161.6A 2016-12-29 2017-02-07 A kind of device for detecting blood disease correlation somatic mutation Pending CN106778075A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016112473753 2016-12-29
CN201611247375 2016-12-29

Publications (1)

Publication Number Publication Date
CN106778075A true CN106778075A (en) 2017-05-31

Family

ID=58956177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710067161.6A Pending CN106778075A (en) 2016-12-29 2017-02-07 A kind of device for detecting blood disease correlation somatic mutation

Country Status (1)

Country Link
CN (1) CN106778075A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462869A (en) * 2014-11-28 2015-03-25 天津诺禾致源生物信息科技有限公司 Method and device for detecting somatic cell SNP
CN105154447A (en) * 2015-09-16 2015-12-16 复旦大学 Prostatic cancer molecular target AC016745.3 and application thereof to diagnostic kit
CN105420351A (en) * 2015-10-16 2016-03-23 深圳华大基因研究院 Method and system for determining individual gene mutation
US20160130664A1 (en) * 2014-11-12 2016-05-12 Neogenomics Laboratories, Inc. Determining tumor load and biallelic mutation in patients with calr mutation using peripheral blood plasma
CN105734070A (en) * 2016-04-11 2016-07-06 苏州大学 Corin gene variant and application thereof
CN105969656A (en) * 2016-05-13 2016-09-28 万康源(天津)基因科技有限公司 Detection and analysis platform for sequencing tumor somatic mutation by single-cell exons
CN105969856A (en) * 2016-05-13 2016-09-28 万康源(天津)基因科技有限公司 Detection method for sequencing tumor somatic mutation by single-cell exons
CN105986032A (en) * 2016-03-30 2016-10-05 广州精科生物技术有限公司 Kit, library establishment method, and method and system for detecting target region variation
CN106048009A (en) * 2016-06-03 2016-10-26 人和未来生物科技(长沙)有限公司 Label joint for detection of ultra-low-frequency gene mutation and application of label joint
CN106065414A (en) * 2016-06-15 2016-11-02 浙江大学 Noninvasive cancer of pancreas polygenes detection method and kit based on blood plasma cfDNA detection technique

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NOORI, P 等: ""A comparison of somatic mutational spectra in healthy study populations from Russia, Sweden and USA"", 《CARCINOGENESIS》 *
李林海: ""线粒体DNA突变与乳腺癌风险相关性研究"", 《中国博士学位论文全文数据库 医药卫生科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491666A (en) * 2017-09-01 2017-12-19 深圳裕策生物科技有限公司 Single sample somatic mutation loci detection method, device and storage medium in abnormal structure
CN107491666B (en) * 2017-09-01 2020-11-10 深圳裕策生物科技有限公司 Method, device and storage medium for detecting mutant site of single sample somatic cell in abnormal tissue
CN114155910A (en) * 2021-11-12 2022-03-08 哈尔滨工业大学 Method for predicting cancer somatic mutation function influence
CN114155910B (en) * 2021-11-12 2022-07-29 哈尔滨工业大学 Method for predicting cancer somatic mutation function influence

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20180124

Address after: 100176 Beijing branch of Beijing economic and Technological Development Zone Street 88 Hospital No. 8 Building 2 unit 701 room

Applicant after: Annoroad Genetic Technology (Beijing) Co., Ltd.

Applicant after: Zhejiang Annuo uni-data Biotechnology Co. Ltd.

Applicant after: Annuo uni-data (Yiwu) Medical Inspection Co. Ltd.

Address before: 100176 Beijing branch of Daxing District economic and Technological Development Zone Street 88 Hospital No. 8 Building 2 unit 701 room

Applicant before: Annoroad Genetic Technology (Beijing) Co., Ltd.

SE01 Entry into force of request for substantive examination