
WO2019061199A1 - A primer combination for performing simultaneous amplification of target region and whole genome, gene amplification method and application thereof


Publication number
WO2019061199A1
WO2019061199A1 PCT/CN2017/104110 CN2017104110W WO2019061199A1 WO 2019061199 A1 WO2019061199 A1 WO 2019061199A1 CN 2017104110 W CN2017104110 W CN 2017104110W WO 2019061199 A1 WO2019061199 A1 WO 2019061199A1
Authority
WO
WIPO (PCT)
Prior art keywords
primer
sequencing
target region
specific
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/104110
Other languages
French (fr)
Chinese (zh)
Inventor
杨林
张韶红
高雅
黄国栋
张艳艳
王雨倩
张薇婷
陈芳
赵佳
蒋慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MGI Tech Co Ltd
Original Assignee
MGI Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MGI Tech Co Ltd
Priority to PCT/CN2017/104110
Publication of WO2019061199A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • The invention belongs to the field of biotechnology and relates to a primer composition, an amplification method and applications for simultaneous target-region and whole-genome amplification.
  • Prenatal diagnosis is an important diagnostic procedure in clinical obstetrics. It assesses the genetic status of the fetus using physical, biochemical and genetic techniques, with the aim of giving families at genetic risk sufficient and reliable information to make appropriate choices during pregnancy.
  • Prenatal diagnosis methods fall into two categories: invasive and non-invasive. Invasive methods obtain fetal genetic material for testing mainly through amniocentesis, chorionic villus sampling or umbilical vein blood sampling, but they carry risks of miscarriage (0.2-3%), chorioamnionitis, amniotic fluid leakage, bleeding and infection at the puncture site, fetal bradycardia and premature rupture of membranes.
  • Developing non-invasive prenatal diagnosis technology is therefore a shared goal of clinicians and researchers.
  • In 1997, Lo et al. extracted cell-free DNA (cfDNA) from the plasma and serum of pregnant women and used Y-chromosome-specific sequences to determine fetal sex, confirming for the first time that fetal cell-free DNA is present in maternal plasma. This finding opened the way for non-invasive prenatal diagnosis. With the development of next-generation sequencing, cfDNA has been successfully applied to non-invasive prenatal screening for fetal chromosomal aneuploidy, in which fetal chromosomes are assessed by high-throughput sequencing of cell-free DNA in maternal plasma.
  • In addition to genetic diseases caused by chromosomal abnormalities, gene defects such as single-nucleotide substitutions, deletions and insertions, frameshift mutations and splicing mutations can also cause genetic disease. These defective genes usually originate in the parents' germ cells and are passed to the next generation, so the resulting conditions are called single-gene genetic diseases. There are many types: according to incomplete statistics, more than 6,600 single-gene diseases have been identified, and dozens of new ones are added every year as research progresses.
  • Single-gene diseases can be divided into five types: autosomal dominant genetic diseases (AD, such as brachydactyly and achondroplasia), autosomal recessive genetic diseases (AR, such as albinism), X-linked dominant genetic diseases (XD, such as vitamin D-resistant rickets), X-linked recessive genetic diseases (XR, such as color blindness) and Y-linked genetic diseases (YL, such as hairy ears).
  • AD autosomal dominant genetic diseases
  • AR autosomal recessive genetic diseases
  • XD X-linked dominant genetic disease
  • XR X-linked recessive genetic disease
  • YL Y-linked genetic disease
  • Chromosomal aneuploidy screening methods mainly include massively parallel shotgun sequencing (MPSS), targeted massively parallel sequencing (t-MPS) and SNP-based multiplex PCR targeted sequencing.
  • MPSS massively parallel shotgun sequencing
  • t-MPS targeted massively parallel sequencing
  • The most widely used is MPSS, on which NIFTY, the chromosomal aneuploidy screening product of BGI (Huada Gene), is based.
  • This method amplifies and sequences the maternal and fetal cfDNA isolated from maternal plasma to obtain the number of nucleic acid fragments mapped to each chromosome; the counts are then compared to assess fetal chromosomal aneuploidy. The method can also detect large deletions and duplications.
  • Targeted enrichment sequences specific nucleic acid fragments rather than the entire genome, which achieves far greater sequencing depth and sensitivity than whole-genome sequencing and greatly improves mutation detection.
  • Commonly used targeted enrichment methods rely mainly on probe capture of the target sequences, for example Roche's SeqCap and Agilent's SureSelect, but they are cumbersome, time-consuming and costly, which limits the clinical adoption of single-gene disease testing.
  • Current screening products for hereditary diseases mainly include NIFTY, which screens for fetal chromosomal abnormalities by massively parallel sequencing; Harmony, which screens for fetal chromosomal abnormalities by deep sequencing of target regions; and Arise, which screens for fetal chromosomal abnormalities using SNP information in target regions.
  • NIFTY amplifies and sequences all maternal and fetal cfDNA without needing target sequence information, which gives it a clear advantage in fetal chromosomal aneuploidy analysis; its drawbacks are the large amount of data required and poor performance for mutation detection in specific target regions. Harmony amplifies targeted regions to detect target chromosomes and single-gene diseases, but it screens poorly for large duplications and deletions that may occur elsewhere in the genome.
  • To detect chromosomal disorders and single-gene diseases at the same time, application No. 201510794535.5 discloses a method for simultaneous gene locus, chromosome and linkage analysis. It combines whole-genome amplification with high-throughput sequencing to complete several comprehensive tests in one step, avoiding the use of multiple methods and steps to test separately for single-gene disease mutation sites, chromosomal diseases and linkage.
  • However, that method performs whole-genome amplification and amplification of the target gene mutation sites separately, mixes the two amplification products in a fixed ratio, and only then constructs and sequences the library. It cannot amplify the whole genome and the target genes from the same sample in a single reaction system.
  • As a result it consumes more sample, the products are highly prone to contamination during mixing, and the procedure is operationally demanding.
  • The present invention provides a primer composition, an amplification method and applications for simultaneous target-region and whole-genome amplification, so that chromosomal abnormalities can be screened at the whole-genome level while mutations in the targeted regions are also detected.
  • To this end, the present invention adopts the following technical solutions:
  • The invention provides a primer composition for simultaneous target-region and whole-genome amplification, comprising:
  • a first primer set comprising a primer pool of specific primer 1 and sequencing universal primer 1, wherein the 3' end of specific primer 1 comprises a target-region-specific forward primer, and the 3' end of sequencing universal primer 1 comprises the complementary sequence of sequencing linker sequence 1 at one end of the target region;
  • a second primer set comprising a primer pool of specific primer 2, sequencing universal primer 1 and sequencing universal primer 2, wherein the 3' end of specific primer 2 comprises a target-region-specific forward primer and its 5' end comprises sequencing linker sequence 2 at the other end of the target region, and sequencing universal primer 2 comprises sequencing linker sequence 2 at the other end of the target region.
  • The 5' end of specific primer 1 in the first primer set contains 2-10 CG bases, meaning that the total number of C and G bases is 2-10.
  • The 5' end of specific primer 1 in the first primer set comprises 2 CG bases.
  • The CG bases balance properties such as the Tm value across the different primers, so that the primers behave more alike and amplify more uniformly, which ensures the uniformity and stability of the data for each target region.
  • The 5' end of specific primer 2 in the second primer set carries sequencing linker sequence 2, which later serves as the anchoring site for the sequencing universal primer.
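The role of the 5' C/G bases in evening out Tm can be illustrated with a small sketch. It assumes the Wallace rule (Tm ≈ 2·(A+T) + 4·(G+C)), a rough estimate for short oligonucleotides that the source does not mention. The function names, the alternating "CG" pattern and the example sequences are hypothetical and are not taken from the patent's primer tables.

```python
def wallace_tm(seq: str) -> int:
    """Rough Tm in deg C by the Wallace rule (an assumption, not the patent's method)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def add_cg_bases(primer: str, target_tm: int,
                 min_bases: int = 2, max_bases: int = 10) -> str:
    """Prepend 2-10 C/G bases to the 5' end until the estimated Tm reaches target_tm.

    The 2-10 window follows the range stated in the text; the alternating
    C/G pattern is a hypothetical choice made for illustration.
    """
    pattern = "CG" * max_bases
    for n in range(min_bases, max_bases + 1):
        candidate = pattern[:n] + primer
        if wallace_tm(candidate) >= target_tm:
            return candidate
    # Target not reachable within the allowed window: use the maximum.
    return pattern[:max_bases] + primer


# Hypothetical AT-rich target-specific sequence with a low estimated Tm.
low_tm_primer = "ATATATATAT"
balanced = add_cg_bases(low_tm_primer, target_tm=36)
print(balanced, wallace_tm(balanced))
```

Under this rule each added C or G raises the estimate by 4 °C, so primers with a low Tm receive more 5' bases than primers that are already GC-rich.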
  • The primer pool of specific primer 1 and the primer pool of specific primer 2 refer to all primers designed for the target region that are capable of amplifying it.
  • After the first round of amplification, a target-region product is obtained that carries sequencing linker sequence 1 at one end and specific primer 1 at the other, while the original whole-genome product with adaptors at both ends remains in the system.
  • Using these products as templates, the second round of amplification is carried out with the second primer set, which yields the whole-genome library. In the same round, specific primer 2 binds the target-region-specific sequence
  • and sequencing universal primer 1 binds adaptor sequence 1; nested specific amplification gives a target-region product with a linker sequence at both ends, which is then amplified by sequencing universal primer 1 and sequencing universal primer 2 to give the target-region library.
  • Specific primer 1 and specific primer 2 are positioned anywhere from a 10 bp overlap to a 15 bp gap, that is, the distance between them is -10 bp to 15 bp, for example -10 bp, -9 bp, -8 bp, -7 bp, -6 bp, -5 bp, -4 bp, -3 bp, -2 bp, -1 bp, 0 bp, 1 bp, 2 bp, 3 bp, 4 bp, 5 bp, 6 bp, 7 bp, 8 bp, 9 bp, 10 bp, 11 bp, 12 bp, 13 bp, 14 bp or 15 bp, preferably -5 bp to 10 bp.
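The spacing rule above can be checked programmatically. The sketch below assumes both primers are described by 1-based inclusive coordinates on the same strand, and that the distance runs from the last base of specific primer 1 to the first base of specific primer 2. The source does not state how the distance is measured, so this convention and the function names are hypothetical.

```python
def primer_gap(p1_end: int, p2_start: int) -> int:
    """Bases between the end of specific primer 1 and the start of specific primer 2.

    Negative values mean the two primers overlap by that many bases.
    """
    return p2_start - p1_end - 1


def classify_gap(gap: int) -> str:
    """Compare a gap with the ranges given in the text (-10..15 bp, preferably -5..10 bp)."""
    if -5 <= gap <= 10:
        return "preferred"
    if -10 <= gap <= 15:
        return "allowed"
    return "out of range"


# Hypothetical coordinates: primer 1 covers 100-120, primer 2 starts at 116.
gap = primer_gap(p1_end=120, p2_start=116)
print(gap, classify_gap(gap))
```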
  • Linker sequence 1 and linker sequence 2 are independently selected from the sequencing linkers of any of the next-generation sequencing platforms BGISEQ-100, BGISEQ-500, Proton or Illumina.
  • The nucleotide sequence of linker sequence 1 is shown in SEQ ID NO. 1: AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA CAACTCCTTGGCTCACA.
  • The portion marked with a gray background in the original publication is a 10 bp sequencing tag sequence used to distinguish sequencing samples.
  • The nucleotide sequence of linker sequence 2 is shown in SEQ ID NO. 2: AGCCAAGGTCAGTAACGACATGGCTACGATCCGACTT.
  • Sequencing universal primer 1 and sequencing universal primer 2 are independently selected from the sequencing universal primers of any of the next-generation sequencing platforms BGISEQ-500, Proton or Illumina.
  • The nucleotide sequence of sequencing universal primer 1 is shown in SEQ ID NO. 3: TTGGAGCCAGGAGGTTG.
  • The nucleotide sequence of sequencing universal primer 2 is shown in SEQ ID NO. 4: GAACGACATGGCTACGACCGT.
  • The invention provides a kit for simultaneous target-region and whole-genome amplification, comprising the primer composition of the first aspect.
  • The present invention provides a method for simultaneous target-region and whole-genome amplification using the primer composition of the first aspect, comprising the steps of:
  • By using two rounds of PCR, the method of the invention effectively increases primer specificity.
  • The linker sequence carries a tag.
  • The conditions of the first round of PCR in step (2) are: pre-denaturation at 95-99 °C for 1-5 min; 15-25 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; and a final extension at 70-75 °C for 5 min.
  • More specifically, the conditions of the first round of PCR in step (2) are: pre-denaturation at 98 °C for 2 min; 20 cycles of denaturation at 98 °C for 10 s and extension at 62 °C for 2 min; and a final extension at 72 °C for 5 min.
  • The conditions of the second round of PCR in step (2) are: pre-denaturation at 95-99 °C for 1-5 min; 10-20 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; and a final extension at 70-75 °C for 5 min.
  • More specifically, the conditions of the second round of PCR in step (2) are: 98 °C for 2 min; 15 cycles of 98 °C for 10 s and 62 °C for 2 min; and 72 °C for 5 min, 1 cycle.
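The two cycling programs above can be written down as data, which makes it easy to check them against the stated ranges and to estimate run time. This is a minimal sketch: the class and field names are hypothetical, and the time estimate counts programmed holds only, ignoring ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoStepPcr:
    """A two-step PCR program; temperatures in deg C, times in seconds."""
    initial_temp: int
    initial_s: int
    denat_temp: int
    denat_s: int
    extend_temp: int
    extend_s: int
    cycles: int
    final_temp: int
    final_s: int

    def hold_seconds(self) -> int:
        """Total programmed hold time (ramp time is not included)."""
        per_cycle = self.denat_s + self.extend_s
        return self.initial_s + self.cycles * per_cycle + self.final_s

    def within_stated_ranges(self, min_cycles: int, max_cycles: int) -> bool:
        """Check the program against the temperature and time ranges given in the text."""
        return (
            95 <= self.initial_temp <= 99 and 60 <= self.initial_s <= 300
            and 95 <= self.denat_temp <= 99 and 5 <= self.denat_s <= 15
            and 55 <= self.extend_temp <= 65 and 60 <= self.extend_s <= 300
            and min_cycles <= self.cycles <= max_cycles
            and 70 <= self.final_temp <= 75 and self.final_s == 300
        )


# Specific conditions from the text: 20 cycles in round 1, 15 cycles in round 2.
ROUND_1 = TwoStepPcr(98, 120, 98, 10, 62, 120, 20, 72, 300)
ROUND_2 = TwoStepPcr(98, 120, 98, 10, 62, 120, 15, 72, 300)

for name, prog in (("round 1", ROUND_1), ("round 2", ROUND_2)):
    print(name, prog.hold_seconds(), "s")
```

With these values the programmed hold time is 3020 s for the first round and 2370 s for the second.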
  • The second round of PCR is used to amplify the whole-genome library, and those skilled in the art can adjust the ratio of the target library to the whole-genome library by changing the number of cycles in the first and second rounds of PCR.
  • Increasing the number of first-round cycles or decreasing the number of second-round cycles raises the proportion of the target library; decreasing the number of first-round cycles and increasing the number of second-round cycles lowers it, so that different data requirements can be met.
  • Step (1) includes extracting the sample, end repair and A-tailing. These are conventional techniques in the art that those skilled in the art can select as needed, and no particular limitation is imposed here.
  • Step (1) further comprises a purification step, preferably using magnetic beads.
  • Step (2) is further followed by a data analysis step.
  • The data analysis specifically includes: filtering the raw sequencing output, aligning reads to the reference genome with BWA, and performing target-region analysis and whole-genome analysis separately.
  • The whole-genome analysis comprises: using the primer sequences to remove non-specific amplification products outside the target region that were generated by the specific primers, performing uniformity analysis on the remaining data,
  • and then analyzing the whole-genome data as required, for example for chromosome number abnormalities and chromosome structural variation.
  • The target-region analysis specifically comprises mutation analysis.
  • The mutation comprises any one of a point mutation, an insertion/deletion or a gene fusion, or a combination of at least two of them.
  • The analysis of point mutations specifically includes the following steps: counting the reads in the target region, removing the specific primer sequences from those reads, analyzing and counting the primer-trimmed reads in the target region, and using samtools to analyze the base types and their proportions at the mutation site.
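The primer-trimming and base-proportion steps of the point-mutation analysis can be illustrated in plain Python. This is a stand-in sketch only: the source performs the base analysis with samtools on aligned reads, whereas the code below works on bare strings. The function names and example sequences are hypothetical.

```python
from collections import Counter


def trim_primer(read: str, primers: list[str]) -> str:
    """Remove a specific-primer sequence from the start of a read, if present."""
    for primer in primers:
        if read.startswith(primer):
            return read[len(primer):]
    return read


def base_fractions(bases: str) -> dict[str, float]:
    """Proportion of each base observed at one site (one character per covering read)."""
    counts = Counter(bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Hypothetical read and primer.
print(trim_primer("ACGTTTGA", ["ACGT"]))

# Hypothetical site covered by 1000 reads, 52 of which carry the variant base.
site = "G" * 948 + "C" * 52
print(base_fractions(site))
```

A variant fraction of a few percent, as in the hypothetical site above, is only resolvable because the target region is sequenced at high depth.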
  • The present invention provides a system for simultaneous target-region and whole-genome amplification, comprising:
  • (1) a sample processing module for extracting a sample and adding tagged linker sequences to both ends of the sample;
  • (2) a PCR amplification module, connected to the sample processing module, for performing two rounds of PCR amplification to construct a sequencing library, the first round using the first primer set and the second round using the second primer set.
  • The system further comprises a data analysis module, connected to the PCR amplification module, for sequencing the constructed library and analyzing the data.
  • The present invention provides the use of the primer composition of the first aspect for detecting abnormalities in a sample to be tested, including abnormalities in chromosome number.
  • The primer composition of the invention achieves whole-genome amplification and amplification of multiple target regions simultaneously. It requires little sample, reduces cost, is simple to operate and yields stable target data; the target-region data can be used to detect multiple mutation types; and target-region amplification and whole-genome amplification do not interfere with each other;
  • The method of the present invention can construct both the whole-genome library and the target-region library from a single sample.
  • Sequencing the whole-genome library and the target library yields data at different depths. The whole-genome data are low-depth with full coverage and can be used to detect chromosome number and structural abnormalities across the genome.
  • The target-region data are high-depth and can be used to detect various mutation types, such as point mutations and small insertions and deletions;
  • The method of the invention has great application value in the field of non-invasive prenatal testing.
  • Figure 1 is a schematic diagram of the design of specific primer 1 and specific primer 2 of the present application;
  • Figure 2 is a flow chart of the method for simultaneous target-region and whole-genome amplification of the present application;
  • Figure 3 is a flow chart of the construction of the target-region and whole-genome libraries of the present application;
  • Figure 4 shows the 2100 quality-control results for the sequencing library in the embodiment of the present application;
  • Figure 5 is a flow chart of the information analysis in an embodiment of the present application;
  • Figure 6 shows the sequencing depth analysis of the target region of the FGFR3 gene of the present application;
  • Figure 7 shows the amplification stability analysis of the FGFR3 gene across different samples of the present application;
  • Figure 8 shows the distribution of data across the whole genome for the method of the present application, where the white bars are the reference chromosome sizes and the black bars are the numbers of unique reads on each chromosome after deduplication;
  • Figure 9 shows the stability analysis of the method across the whole genome.
  • Primers were designed for the FGFR3 gene as shown in Figure 1, covering the FGFR3 gene and hotspot mutations associated with achondroplasia (ACH). The primer pool of specific primer 1 is shown in SEQ ID NO. 5-22 and listed in Table 1;
  • the primer pool of specific primer 2 is shown in SEQ ID NO. 23-40 and listed in Table 2, for a total of 18 pairs, as follows:
  • The bold portion is the linker sequence. All specific primers 2 were mixed at equimolar concentration (10 μM) to obtain the primer pool of specific primer 2, with a final pool concentration of 10 μM.
  • The bold portion of SEQ ID NO. 1 is a 10 bp sequencing tag sequence used to distinguish different samples in parallel sequencing.
  • The workflow shown in Figure 2 specifically includes the following steps:
  • End repair of the cfDNA and addition of an A base at the 3' end of the DNA strand. The end-repair and A-tailing reaction system uses the library construction kit for the combinatorial probe-anchor synthesis sequencing method from BGI (Huada Gene) (Cat. No. BOX3), as shown in Table 5:
  • The linker ligation reaction system uses the library construction kit for the combinatorial probe-anchor synthesis sequencing method from BGI (Huada Gene) (Cat. No. BOX3), as shown in Table 6:
  • Primer pool of specific primer 1 (10 μM): the sum of all primer concentrations was 10 μM.
  • The reaction conditions were as follows: pre-denaturation at 98 °C for 2 min; 20 cycles of denaturation at 98 °C for 10 s and extension at 62 °C for 2 min; final extension at 72 °C for 5 min;
  • The primer pool of specific primer 2 is as shown in Table 2 of the first embodiment. The second-round PCR system uses the library construction kit for the combinatorial probe-anchor synthesis sequencing method from BGI (Huada Gene) (Cat. No. BOX3), as shown in Table 8:
  • Primer pool of specific primer 2 (10 μM): the sum of all primer concentrations was 10 μM.
  • The reaction conditions were as follows: 98 °C for 2 min; 15 cycles of 98 °C for 10 s and 62 °C for 2 min; 72 °C for 5 min, 1 cycle;
  • The quality of the library is shown in Figure 4.
  • The library size is consistent with the size of plasma cell-free DNA for both the target region and the whole-genome region.
  • The main peak is at 245 bp, with two smaller peaks at 400 bp and 577 bp, matching the size distribution of plasma cell-free DNA.
  • Sample 2 was a positive sample carrying a point mutation in exon 16 of the FGFR3 gene (c.1138G>C).
  • The experiment correctly detected this site: the mutant base G accounted for 5.2% of all bases at that position. No mutation in the FGFR3 gene was detected in the remaining samples.
  • The stability of the chromosome-level data is shown in Figure 9.
  • The proportion of data on each chromosome was compared between two samples: the abscissa is the proportion of data on each chromosome in one sample, and the ordinate is the corresponding proportion in the other sample. R² = 0.9999, indicating good stability.
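The between-sample comparison described above amounts to computing per-chromosome read proportions for two samples and the squared correlation between them. The sketch below shows one way to do this. The read counts are invented for illustration and are not data from the patent, and the function names are hypothetical.

```python
def chromosome_proportions(read_counts: dict[str, int]) -> dict[str, float]:
    """Fraction of all counted reads that fall on each chromosome."""
    total = sum(read_counts.values())
    return {chrom: n / total for chrom, n in read_counts.items()}


def r_squared(xs: list[float], ys: list[float]) -> float:
    """Squared Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Invented read counts for two samples over three chromosomes.
sample_a = {"chr1": 8000, "chr2": 7900, "chr21": 1300}
sample_b = {"chr1": 16100, "chr2": 15700, "chr21": 2650}

prop_a = chromosome_proportions(sample_a)
prop_b = chromosome_proportions(sample_b)
chroms = sorted(prop_a)
print(r_squared([prop_a[c] for c in chroms], [prop_b[c] for c in chroms]))
```

An R² close to 1 means the two samples distribute their reads across chromosomes in nearly the same way, which is the stability property the figure is reporting.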
  • The non-invasive prenatal test results are those obtained with the NIFTY test of BGI (Huada Gene).
  • The method of the present application amplifies the whole genome and multiple target regions simultaneously, yielding low-depth whole-genome data and high-depth target-region data.
  • Across multiple target regions it shows good uniformity and stability.
  • The high-depth target-region data can detect multiple mutation types in the target regions.
  • The whole-genome data likewise show good uniformity and stability.
  • The low-depth whole-genome data can be used to detect chromosome number and structural abnormalities, and amplification of the target regions does not affect the results of genome-wide chromosome detection.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a primer composition, an amplification method and applications for simultaneous target-region and whole-genome amplification. The primer composition comprises: a first primer set, which includes a primer pool of specific primer 1 and sequencing universal primer 1; and a second primer set, which includes a primer pool of specific primer 2, sequencing universal primer 1 and sequencing universal primer 2.

Description

Primer composition, amplification method and application for simultaneous target-region and whole-genome amplification

Technical field

The invention belongs to the field of biotechnology and relates to a primer composition, an amplification method and applications for simultaneous target-region and whole-genome amplification.

Background

As an important component of genetic diagnosis, prenatal diagnosis is a key diagnostic procedure in clinical obstetrics. It assesses the genetic status of the fetus using physical, biochemical and genetic techniques, with the aim of giving families at genetic risk sufficient and reliable information to make appropriate choices during pregnancy. Prenatal diagnosis methods fall into two categories: invasive and non-invasive. Invasive methods obtain sufficient fetal genetic material for testing mainly through amniocentesis, chorionic villus sampling or umbilical vein blood sampling, but they carry risks of miscarriage (0.2-3%), chorioamnionitis, amniotic fluid leakage, bleeding and infection at the puncture site, fetal bradycardia and premature rupture of membranes. Developing non-invasive prenatal diagnosis technology is therefore a shared goal of clinicians and researchers.

In 1997, Lo et al. extracted cell-free DNA (cfDNA) from the plasma and serum of pregnant women and used Y-chromosome-specific sequences to determine fetal sex, confirming for the first time that fetal cell-free DNA is present in maternal plasma. This finding opened the way for non-invasive prenatal diagnosis. With the development of next-generation sequencing, cfDNA has been successfully applied to non-invasive prenatal screening for fetal chromosomal aneuploidy, in which fetal chromosomes are assessed by high-throughput sequencing of cell-free DNA in maternal plasma.

In addition to genetic diseases caused by chromosomal abnormalities, gene defects such as single-nucleotide substitutions, deletions and insertions, frameshift mutations and splicing mutations can also cause genetic disease. These defective genes usually originate in the parents' germ cells and are passed to the next generation, so the resulting conditions are called single-gene genetic diseases. There are many types: according to incomplete statistics, more than 6,600 single-gene diseases have been identified, and dozens of new ones are added every year as research progresses. For more than 1,000 of them the pathogenesis is reasonably well understood and clinical testing is possible, for example hemophilia, phenylketonuria, progressive muscular dystrophy and thalassemia. Single-gene diseases can be divided into five types: autosomal dominant genetic diseases (AD, such as brachydactyly and achondroplasia), autosomal recessive genetic diseases (AR, such as albinism), X-linked dominant genetic diseases (XD, such as vitamin D-resistant rickets), X-linked recessive genetic diseases (XR, such as color blindness) and Y-linked genetic diseases (YL, such as hairy ears). Prenatal diagnosis combined with intervention can currently be used to avoid the birth of children with these defects.

Chromosomal aneuploidy screening and single-gene disease screening mainly use different strategies. Chromosomal aneuploidy screening methods mainly include massively parallel shotgun sequencing (MPSS), targeted massively parallel sequencing (t-MPS) and SNP-based multiplex PCR targeted sequencing. The most widely used is MPSS, on which NIFTY, the chromosomal aneuploidy screening product of BGI (Huada Gene), is based. This method amplifies and sequences the maternal and fetal cfDNA isolated from maternal plasma to obtain the number of nucleic acid fragments mapped to each chromosome; the counts are then compared to assess fetal chromosomal aneuploidy. The method can also detect large deletions and duplications.
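The counting-and-comparison step can be sketched as follows. The source does not say how the counts are compared. The sketch assumes a z-score of the chromosome's read fraction against a set of reference samples, which is one common approach in read-count-based aneuploidy screening. The function names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def chromosome_fraction(read_counts: dict[str, int], chrom: str) -> float:
    """Share of all counted reads that map to the given chromosome."""
    return read_counts[chrom] / sum(read_counts.values())


def z_score(test_fraction: float, reference_fractions: list[float]) -> float:
    """Standardized deviation of a test sample from reference samples.

    Assumption for illustration: the reference fractions come from samples
    known to be euploid for the chromosome of interest.
    """
    return (test_fraction - mean(reference_fractions)) / stdev(reference_fractions)


# Hypothetical chr21 fractions from reference samples and one test sample.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
test_counts = {"chr21": 1390, "other": 98610}

z = z_score(chromosome_fraction(test_counts, "chr21"), reference)
print(round(z, 2))
```

A large positive z-score would suggest an excess of reads from that chromosome; the cutoff used to call a sample is a design choice that depends on the assay and is not specified in the text.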

For single-gene genetic diseases, targeted enrichment is generally used. This approach makes efficient use of precious sample material by sequencing only the specific nucleic acids most relevant to the study. Because targeted enrichment sequences specific nucleic acid fragments rather than the entire genome, it achieves far greater sequencing depth and sensitivity than whole-genome sequencing and greatly improves mutation detection. Commonly used targeted enrichment methods rely mainly on probe capture of the target sequences, for example Roche's SeqCap and Agilent's SureSelect, but they are cumbersome, time-consuming and costly, which limits the clinical adoption of single-gene disease testing.

Current screening products for hereditary diseases mainly include NIFTY, which screens for fetal chromosomal abnormalities by massively parallel sequencing; Hamony, which screens for fetal chromosomal abnormalities by deep sequencing of target regions; and Arise, which screens for fetal chromosomal abnormalities using SNP information in target regions.

NIFTY amplifies and sequences all maternal and fetal cfDNA without needing any target sequence information, which gives it a clear advantage in fetal chromosomal aneuploidy analysis. Its drawbacks are that it requires a large amount of data and does not detect mutations in specific target regions well. Hamony amplifies target regions to detect abnormalities of target chromosomes and monogenic diseases, but it screens poorly for large duplications and deletions that may occur elsewhere in the genome, and its complex library construction workflow and expensive reagents limit its clinical use. Natera's detection of chromosomal abnormalities (deletions and duplications) is limited by the SNP sites in the region; it likewise can only test target chromosomes and monogenic diseases and cannot test non-target chromosomes.

To detect chromosomal abnormality diseases and monogenic diseases simultaneously, patent application 201510794535.5 discloses a method for performing gene locus, chromosome, and linkage analysis at the same time. That method combines whole-genome amplification with high-throughput sequencing to complete several comprehensive tests in one step, avoiding the use of multiple methods and multiple steps to test separately for monogenic disease mutation sites, chromosomal diseases, and linkage. However, it performs whole-genome amplification and amplification of the target gene mutation sites on the sample separately, then mixes the two amplification products in a certain ratio before library construction and sequencing. It cannot amplify the whole genome and the target genes from the same sample aliquot in a single reaction system, so it consumes more sample material, the products are highly susceptible to contamination during mixing, and the operating requirements are demanding.

Therefore, establishing a method that can detect chromosomal abnormalities and monogenic diseases simultaneously, overcoming the large sample requirement, cumbersome operation, and high testing cost of the prior art and thereby addressing most genetic defects, is of great significance for prenatal and postnatal care in China.

Summary of the Invention

In view of the deficiencies of the prior art and practical needs, the present invention provides a primer composition for simultaneous amplification of target regions and the whole genome, together with an amplification method and applications thereof. The amplification method can both screen for chromosomal abnormalities at the whole-genome level and detect mutations occurring in the targeted regions.

To achieve this object, the present invention adopts the following technical solutions:

In a first aspect, the present invention provides a primer composition for simultaneous amplification of target regions and the whole genome, comprising:

a first primer set comprising a primer pool of specific primers 1 and sequencing universal primer 1, wherein the 3' end of specific primer 1 comprises a target-region-specific forward primer, and the 3' end of sequencing universal primer 1 comprises a sequence complementary to sequencing adapter sequence 1 at one end of the target region;

a second primer set comprising a primer pool of specific primers 2, sequencing universal primer 1, and sequencing universal primer 2, wherein the 3' end of specific primer 2 comprises a target-region-specific forward primer and its 5' end comprises the sequence of sequencing adapter sequence 2 at the other end of the target region, and sequencing universal primer 2 comprises the sequence of sequencing adapter sequence 2 at the other end of the target region.

In the present invention, the 5' end of specific primer 1 in the first primer set contains 2-10 CG bases, meaning that the total number of C and G bases is 2-10.

Preferably, the 5' end of specific primer 1 in the first primer set contains 2 CG bases.

In the present invention, the CG bases are used to balance properties such as the Tm value across different primers, so that the primers behave more similarly and amplify more uniformly, which ensures uniform and stable data across the target regions. The 5' end of specific primer 2 in the second primer set contains sequencing adapter sequence 2 so that the sequencing universal primer can anchor to it in subsequent steps.
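The effect of a 5' CG addition on Tm can be illustrated with a rough estimate. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is a textbook approximation and not a formula named in this document; the 18-nt core sequence is only an illustrative input.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


# Illustrative target-specific core and the same primer with two CG bases at the 5' end.
core = "TGGCCCCTGAGCGTCATC"
with_cg = "CG" + core

print(wallace_tm(core), wallace_tm(with_cg))
```

Under this approximation each added C or G raises the estimate by 4 °C, so the number of CG bases (2-10) gives a simple handle for bringing the primers in a pool closer to a common Tm. A nearest-neighbor model would be more accurate for real primer design.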

In the present invention, the primer pool of specific primers 1 and the primer pool of specific primers 2 refer to all primers designed against the target regions that are capable of amplifying those regions.

In the present invention, the first round of amplification with the first primer set yields target-region products carrying sequencing adapter sequence 1 at one end and specific primer 1 at the other; the original whole-genome products with adapters at both ends also remain in the system. Both products then serve as templates for the second round of amplification with the second primer set, which yields the whole-genome library. In the same round, specific primer 2 binds the target-region-specific sequence and sequencing universal primer 1 binds adapter sequence 1, giving nested specific amplification and target-region products with adapter sequences at both ends. These products are then amplified by sequencing universal primer 1 and sequencing universal primer 2 to give the target-region library.

According to the invention, the distance between specific primer 1 and specific primer 2 ranges from a 10 bp overlap to a 15 bp separation, that is, from -10 bp to 15 bp. It may be, for example, -10 bp, -9 bp, -8 bp, -7 bp, -6 bp, -5 bp, -4 bp, -3 bp, -2 bp, -1 bp, 0 bp, 1 bp, 2 bp, 3 bp, 4 bp, 5 bp, 6 bp, 7 bp, 8 bp, 9 bp, 10 bp, 11 bp, 12 bp, 13 bp, 14 bp or 15 bp, and is preferably -5 bp to 10 bp.
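The distance convention (negative for overlap, positive for a gap) can be made concrete with a small check. This is a sketch under stated assumptions: both primers lie on the same strand, coordinates are 1-based and inclusive, and the function names and example coordinates are hypothetical.

```python
def primer_gap(p1_end: int, p2_start: int) -> int:
    """Distance in bp from the last base of primer 1 to the first base of primer 2.

    Coordinates are 1-based inclusive on the same strand.
    Negative values mean the primers overlap; 0 means they are adjacent.
    """
    return p2_start - p1_end - 1


def gap_allowed(gap: int, lo: int = -10, hi: int = 15) -> bool:
    """True if the gap lies in the claimed range (default -10 bp to 15 bp)."""
    return lo <= gap <= hi


# Primer 1 ends at position 119 in each hypothetical case.
for p2_start in (115, 120, 136):
    gap = primer_gap(119, p2_start)
    print(p2_start, gap, gap_allowed(gap), gap_allowed(gap, -5, 10))
```

The second call to `gap_allowed` uses the preferred range of -5 bp to 10 bp.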

According to the invention, adapter sequence 1 and adapter sequence 2 are independently selected from the sequencing adapters of any one of the BGISEQ-100, BGISEQ-500, Proton or Illumina next-generation sequencing platforms.

According to the invention, the nucleotide sequence of adapter sequence 1 is shown in SEQ ID NO.1, as follows: AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA [10 bp tag, shown as image PCTCN2017104110-appb-000001 in the original] CAACTCCTTGGCTCACA.

The portion shown on a grey background in the original (the image placeholder above) is a 10 bp sequencing tag sequence used to distinguish sequencing samples.

According to the invention, the nucleotide sequence of adapter sequence 2 is shown in SEQ ID NO.2, as follows: AGCCAAGGTCAGTAACGACATGGCTACGATCCGACTT.

According to the invention, sequencing universal primer 1 and sequencing universal primer 2 are independently selected from the sequencing universal primers of any one of the BGISEQ-500, Proton or Illumina next-generation sequencing platforms.

According to the invention, the nucleotide sequence of sequencing universal primer 1 is shown in SEQ ID NO.3, as follows: TGTGAGCCAAGGAGTTG.
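The primer composition requires sequencing universal primer 1 to contain a sequence complementary to adapter sequence 1. A quick check, sketched below, confirms that TGTGAGCCAAGGAGTTG is the reverse complement of the 17 nt at the 3' end of SEQ ID NO.1 (CAACTCCTTGGCTCACA). The helper name is an illustrative choice.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


adapter1_3prime = "CAACTCCTTGGCTCACA"     # 3' end of SEQ ID NO.1, after the 10 bp tag
universal_primer_1 = "TGTGAGCCAAGGAGTTG"  # SEQ ID NO.3

print(reverse_complement(adapter1_3prime) == universal_primer_1)
```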

According to the invention, the nucleotide sequence of sequencing universal primer 2 is shown in SEQ ID NO.4, as follows: GAACGACATGGCTACGACCGT.

In a second aspect, the present invention provides a kit for simultaneous amplification of target regions and the whole genome, comprising the primer composition of the first aspect.

In a third aspect, the present invention provides a method for simultaneous amplification of target regions and the whole genome using the primer composition of the first aspect, comprising the following steps:

(1) adding tagged adapter sequences to both ends of the sample to be tested;

(2) performing two rounds of PCR amplification, the first round using the first primer set and the second round using the second primer set, to construct a sequencing library.

Using two rounds of PCR effectively increases primer specificity.

According to the invention, the adapter sequence carries a tag.

According to the invention, the conditions of the first round of PCR in step (2) are: initial denaturation at 95-99 °C for 1-5 min; 15-25 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; final extension at 70-75 °C for 5 min.

According to the invention, the conditions of the first round of PCR in step (2) are: initial denaturation at 98 °C for 2 min; 20 cycles of denaturation at 98 °C for 10 s and extension at 62 °C for 2 min; final extension at 72 °C for 5 min.

According to the invention, the conditions of the second round of PCR in step (2) are: initial denaturation at 95-99 °C for 1-5 min; 10-20 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; final extension at 70-75 °C for 5 min.

According to the invention, the conditions of the second round of PCR in step (2) are: 98 °C for 2 min; 15 cycles of 98 °C for 10 s and 62 °C for 2 min; 72 °C for 5 min, 1 cycle.
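The two specific programs differ only in cycle number, which is easy to see when they are written down as data. The sketch below encodes them and sums the programmed hold times; the class and field names are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """Two-step PCR program; all durations are in seconds."""
    initial_denaturation_s: int  # 98 C
    denaturation_s: int          # 98 C, per cycle
    extension_s: int             # 62 C, per cycle
    cycles: int
    final_extension_s: int       # 72 C

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding temperature ramps."""
        per_cycle = self.denaturation_s + self.extension_s
        return self.initial_denaturation_s + self.cycles * per_cycle + self.final_extension_s


ROUND_1 = PcrProgram(120, 10, 120, cycles=20, final_extension_s=300)
ROUND_2 = PcrProgram(120, 10, 120, cycles=15, final_extension_s=300)

for name, program in (("round 1", ROUND_1), ("round 2", ROUND_2)):
    print(name, program.cycles, "cycles,", round(program.hold_seconds() / 60, 1), "min")
```

Keeping the cycle count as a single field makes it straightforward to vary it within the claimed ranges (15-25 for round 1, 10-20 for round 2).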

In the present invention, because the first round of PCR amplifies the target library and the second round amplifies the whole-genome library, a person skilled in the art can adjust the ratio of target library to whole-genome library by changing the cycle numbers of the two rounds. Increasing the number of first-round cycles or decreasing the number of second-round cycles raises the proportion of the targeted library; decreasing the first-round cycles and increasing the second-round cycles lowers it. The ratio can thus be tuned to meet different data requirements.

According to the invention, step (1) is preceded by sample extraction, end repair, and A-tailing. These are routine techniques in the art; a person skilled in the art may choose among them as needed, and no particular limitation is imposed here.

According to the invention, step (1) is followed by a purification step, preferably using magnetic beads.

According to the invention, step (2) is followed by a data analysis step.

According to the invention, the data analysis specifically comprises: filtering the raw sequencing output, aligning it to the reference genome with BWA, and performing target-region analysis and whole-genome analysis separately. Preferably, the whole-genome analysis specifically comprises: using the primer sequences to remove non-specific amplification products outside the target regions caused by the specific primers, performing a uniformity analysis on the remaining data, and, as needed, running specific analyses on the whole-genome data from which non-specific products have been removed, such as analysis of chromosome number abnormalities and of chromosomal structural variation.

According to the invention, the target-region analysis specifically comprises mutation analysis.

According to the invention, the mutation comprises any one of, or a combination of at least two of, point mutations, insertions/deletions, and gene fusions.

According to the invention, the analysis of point mutations specifically comprises the following steps: collecting the reads in the target regions, removing the specific primer sequences from those reads, analyzing and counting the primer-trimmed reads in the target regions, and using samtools to analyze the base identity and proportion at each mutation site.
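The two operations in this step, primer trimming and per-site base counting, can be sketched on plain read strings. This is a simplified stand-in, not the samtools-based pipeline named above: it works on unaligned strings, assumes the primer sits at the very start of the read, and uses hypothetical primer and read sequences.

```python
from collections import Counter


def trim_primer(read: str, primers: list[str]) -> str:
    """Remove a matching primer from the 5' end of the read, if any."""
    for primer in primers:
        if read.startswith(primer):
            return read[len(primer):]
    return read


def base_fractions(reads: list[str], offset: int) -> dict[str, float]:
    """Fraction of each base at a 0-based offset across reads long enough to cover it."""
    counts = Counter(read[offset] for read in reads if len(read) > offset)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()} if total else {}


# Hypothetical data: a 4-nt "primer" and four reads that start with it.
primers = ["ACGT"]
raw_reads = ["ACGTGGA", "ACGTGCA", "ACGTGGA", "ACGTGGA"]

trimmed = [trim_primer(read, primers) for read in raw_reads]
print(trimmed)
print(base_fractions(trimmed, offset=1))
```

Trimming first matters because bases inside the primer footprint come from the synthetic oligo rather than the template, so counting them would bias the allele proportions at sites under the primer.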

According to the invention, the whole-genome analysis specifically comprises: using the primer sequences to remove non-specific amplification outside the target regions caused by the specific primers, performing a uniformity analysis on the remaining data, and, as needed, running specific analyses on the whole-genome data from which non-specific products have been removed, such as analysis of chromosome number abnormalities and of chromosomal structural variation.
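The text does not name the algorithm used for chromosome number analysis. One common count-based approach, sketched here purely as an assumption, compares the fraction of reads on a chromosome in the test sample with the same fraction in a set of euploid reference samples and reports a z-score. All numbers below are invented for illustration, and the cutoff of 3 is a widely used convention rather than a value from this document.

```python
from statistics import mean, stdev


def chromosome_fraction(read_counts: dict[str, int], chrom: str) -> float:
    """Fraction of all counted reads that map to the given chromosome."""
    return read_counts[chrom] / sum(read_counts.values())


def z_score(test_fraction: float, reference_fractions: list[float]) -> float:
    """Standard score of the test fraction against euploid reference samples."""
    return (test_fraction - mean(reference_fractions)) / stdev(reference_fractions)


# Invented chr21 read fractions for six reference samples and one test sample.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
test = 0.0137

z = z_score(test, reference)
print(f"z = {z:.2f}, flagged = {z > 3}")
```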

In a fourth aspect, the present invention provides a system for simultaneous amplification of target regions and the whole genome, comprising:

(1) a sample processing module for extracting the sample and adding tagged adapter sequences to both ends of the sample;

(2) a PCR amplification module, connected to the sample processing module, for performing two rounds of PCR amplification, the first round using the first primer set and the second round using the second primer set, to construct a sequencing library.

Preferably, the system further comprises a data analysis module, connected to the PCR amplification module, for sequencing the constructed library and analyzing the data.

In a fifth aspect, the present invention provides use of the primer composition of the first aspect for detecting mutations and chromosome number abnormalities in a sample to be tested.

Compared with the prior art, the present application has the following beneficial effects:

(1) The primer composition of the invention achieves whole-genome amplification and amplification of multiple target regions at the same time. It requires little sample, reduces cost, and is simple to operate. The resulting target-region data are uniform and stable and can be used to detect multiple mutation types, and target-region amplification and whole-genome amplification do not interfere with each other.

(2) The method of the invention constructs both the whole-genome and target-region libraries from a single sample aliquot. After sequencing, the two libraries yield data at different depths: the whole-genome data are low-depth with full coverage and can be used to detect chromosome number and structural abnormalities genome-wide, while the target-region data are high-depth and can be used to detect various mutation types such as point mutations and small insertions/deletions.

(3) The method of the invention has great application value in the field of non-invasive prenatal testing.

Brief Description of the Drawings

Figure 1 is a schematic diagram of the design of specific primer 1 and specific primer 2 of the present application;

Figure 2 is a flow chart of the method of the present application for simultaneous target-region amplification and whole-genome amplification;

Figure 3 is a flow chart of the construction of the target-region and whole-genome libraries of the present application;

Figure 4 shows the 2100 detection results for the sequencing library in an example of the present application;

Figure 5 is a flow chart of the bioinformatic analysis in an example of the present application;

Figure 6 shows the depth analysis results for the FGFR3 gene target regions of the present application;

Figure 7 shows the amplification stability analysis results for the FGFR3 gene across different samples of the present application;

Figure 8 shows the genome-wide data distribution obtained with the method of the present application, where the white bars are the reference chromosome sizes and the black bars are the numbers of reads uniquely aligned to each chromosome after deduplication;

Figure 9 shows the data stability analysis of the method of the present application across the chromosomes of the whole genome.

Detailed Description

To further explain the technical means adopted by the present invention and their effects, the invention is described further below with reference to examples. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

Where an example does not specify a technique or condition, the technique or condition described in the literature of the art, or in the product instructions, was followed. Reagents and instruments for which no manufacturer is given are conventional products commercially available through regular channels.

Example 1 Primer composition for the FGFR3 gene

Primers were designed for the FGFR3 gene as shown in Figure 1, covering the hotspot mutations in FGFR3 associated with ACH. The primer pool of specific primers 1 (SEQ ID NO.5-22) is shown in Table 1 and the primer pool of specific primers 2 (SEQ ID NO.23-40) is shown in Table 2, 18 pairs in total, as follows:

Table 1 Specific primers 1

Primer No.      Primer sequence (5'-3')
SEQ ID NO.5     CGTGGCCCCTGAGCGTCATC
SEQ ID NO.6     CGAGCCGAGGAGGAGCTGGT
SEQ ID NO.7     CGGGCTTCTTCCTGTTCATCC
SEQ ID NO.8     CGGAGATGGAGATGATGAAGATGATC
SEQ ID NO.9     CGCCTTCCCCAGTGCATCCA
SEQ ID NO.10    CGCCCAGCAGTGGGGGCTCG
SEQ ID NO.11    CGGGCCCCTGAGCGTCATCT
SEQ ID NO.12    CGTGCTGGGCAGCGACGTGG
SEQ ID NO.13    CGCCGGGACGTGCACAACCT
SEQ ID NO.14    CGACGAGGCGGGCAGTGTGT
SEQ ID NO.15    CGCCGGGACGTGCACAACCT
SEQ ID NO.16    CGCGCCAGGCCTCCTGGAGC
SEQ ID NO.17    CGACGCGTCCATGAGCTCCAACAC
SEQ ID NO.18    CGTGCTCCTGGGACGGGTGTAT
SEQ ID NO.19    CGTTTGCACACTCATGGTCCCTC
SEQ ID NO.20    CGCTGAGGAGCCCGTGTCCCCAG
SEQ ID NO.21    CGGCATCCTCAGCTACGGGGTG
SEQ ID NO.22    CGTGTCTGAGATGGAGATGATGAAG

All specific primers 1 were mixed at equimolar concentration (10 μM each) to give the primer pool of specific primers 1, with a final pool concentration of 10 μM.
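A short calculation shows what the pool concentration implies for each primer. This sketch assumes equal volumes of 18 individual 10 μM stocks are combined, so the total stays at 10 μM while each primer is diluted 18-fold; the function name is hypothetical.

```python
def per_primer_concentration(stock_um: float, n_primers: int) -> float:
    """Concentration of each primer after mixing equal volumes of n equimolar stocks."""
    if n_primers <= 0:
        raise ValueError("n_primers must be positive")
    return stock_um / n_primers


each = per_primer_concentration(10.0, 18)
print(f"each primer: {each:.3f} uM, total: {each * 18:.1f} uM")
```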

Table 2 Specific primers 2

Primer No.      Primer sequence (5'-3')
SEQ ID NO.23    GAACGACATGGCTACGATCCGACTTAGCGTCATCTGCCCCCAC
SEQ ID NO.24    GAACGACATGGCTACGATCCGACTTGTGGAGGCTGACGAGGCG
SEQ ID NO.25    GAACGACATGGCTACGATCCGACTTTGTTCATCCTGGTGGTGG
SEQ ID NO.26    GAACGACATGGCTACGATCCGACTTGAAGATGATCGGGAAACACAAAAAC
SEQ ID NO.27    GAACGACATGGCTACGATCCGACTTTGCATCCACAGGGACCTGG
SEQ ID NO.28    GAACGACATGGCTACGATCCGACTTGTGGGGGCTCGCGGACGT
SEQ ID NO.29    GAACGACATGGCTACGATCCGACTTTGCCCCCACAGAGCGCTC
SEQ ID NO.30    GAACGACATGGCTACGATCCGACTTGGAGTTCCACTGCAAGGTGT
SEQ ID NO.31    GAACGACATGGCTACGATCCGACTTGTGCACAACCTCGACTACTACAAG
SEQ ID NO.32    GAACGACATGGCTACGATCCGACTTTATGCAGGCATCCTCAGCTAC
SEQ ID NO.33    GAACGACATGGCTACGATCCGACTTGTGCACAACCTCGACTACTACAAG
SEQ ID NO.34    GAACGACATGGCTACGATCCGACTTCCACCTCGGCCCACGCTGG
SEQ ID NO.35    GAACGACATGGCTACGATCCGACTTACTGGTGCGCATCGCAAGGCT
SEQ ID NO.36    GAACGACATGGCTACGATCCGACTTGCCCCTCTCAAGGTGCCCTG
SEQ ID NO.37    GAACGACATGGCTACGATCCGACTTCTCCACTGCCAGGCTGACCCT
SEQ ID NO.38    GAACGACATGGCTACGATCCGACTTCCTGTACGTGCTGGTGGAGTA
SEQ ID NO.39    GAACGACATGGCTACGATCCGACTTGGCTTCTTCCTGTTCATCCT
SEQ ID NO.40    GAACGACATGGCTACGATCCGACTTTGATCGGGAAACACAAAAACATCATC

The adapter sequence (shown in bold in the original) is the 5' portion GAACGACATGGCTACGATCCGACTT shared by every primer in Table 2. All specific primers 2 were mixed at equimolar concentration (10 μM each) to give the primer pool of specific primers 2, with a final pool concentration of 10 μM.
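Because every specific primer 2 starts with the same 25 nt, each one can be split into its adapter part and its target-specific part programmatically. The sketch below does this for SEQ ID NO.23; the function name is an illustrative choice, and the shared prefix is taken directly from the Table 2 sequences.

```python
ADAPTER_PREFIX = "GAACGACATGGCTACGATCCGACTT"  # 5' portion shared by SEQ ID NO.23-40


def split_specific_primer_2(primer: str) -> tuple[str, str]:
    """Split a specific primer 2 into (adapter part, target-specific part)."""
    if not primer.startswith(ADAPTER_PREFIX):
        raise ValueError("primer does not start with the shared adapter portion")
    return ADAPTER_PREFIX, primer[len(ADAPTER_PREFIX):]


seq_id_23 = "GAACGACATGGCTACGATCCGACTTAGCGTCATCTGCCCCCAC"
adapter_part, specific_part = split_specific_primer_2(seq_id_23)
print(len(adapter_part), specific_part)
```

The same split is useful during analysis, since only the target-specific part needs to be trimmed from the genomic portion of a read.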

Table 3 Sequencing universal primers and adapter sequences

[Table 3 appears as images in the original: Figure PCTCN2017104110-appb-000002 and Figure PCTCN2017104110-appb-000003]

The bold portion of SEQ ID NO.1 is the 10 bp sequencing tag sequence, used to distinguish different samples in parallel sequencing.

Example 2 Simultaneous detection of FGFR3 gene mutation and chromosome number abnormality

FGFR3 gene mutation and chromosome number abnormality were detected simultaneously. The workflow is shown in Figure 2 and comprises the following steps:

(1) Sample extraction:

Plasma was collected from 10 pregnant women (from Beijing 301 Hospital; informed consent was signed by the patients, and the samples may be used for research), five each with male and female fetuses, at 12-14 weeks of gestation. One sample carried an FGFR3 gene mutation (c.1138G>C), one was trisomy 18, one was trisomy 21, and the remaining samples were normal. Sample information is given in Table 4.

Table 4: Sample information

[Table 4 appears as an image in the original: Figure PCTCN2017104110-appb-000004]

cfDNA was extracted from the 10 plasma samples with the QIAGEN cell-free DNA extraction kit (QIAGEN, Germany; Cat. No./ID: 180023) and dissolved in 40 μl of AE solution. Libraries were then prepared from the cfDNA as described below, using BGI's combinatorial probe-anchor synthesis sequencing library construction kit (Cat. No. BOX3);

(2) End repair and A-tailing

The cfDNA was end-repaired and a base A was added to the 3' end of each DNA strand. The end repair and A-tailing reaction used BGI's combinatorial probe-anchor synthesis sequencing library construction kit (Cat. No. BOX3), as shown in Table 5:

Table 5: End repair and A-tailing reaction system

[Table 5 appears as an image in the original: Figure PCTCN2017104110-appb-000005]

Reaction conditions: 37 °C for 15 min, then 65 °C for 15 min;

(3) Adapter ligation. The adapter ligation reaction used BGI's combinatorial probe-anchor synthesis sequencing library construction kit (Cat. No. BOX3), as shown in Table 6:

Table 6: Adapter ligation reaction system

[Table 6 appears as an image in the original: Figure PCTCN2017104110-appb-000006]

Reaction conditions: 23 °C for 30 min;

After the reaction, 60 μl of AXYGEN purification magnetic beads (MAG-FRAG-I-50, Corning, USA) were added for purification, and the resulting DNA was dissolved in 20 μl of distilled water.

(4) Two rounds of PCR amplification were performed as shown in Figure 3. For the first round of PCR, the primer pool of specific primers 1 was as in Table 1 of Example 1, and the PCR system used BGI's combinatorial probe-anchor synthesis sequencing library construction kit (Cat. No. BOX3), as shown in Table 7:

Table 7: PCR system

[Table 7 appears as images in the original: Figure PCTCN2017104110-appb-000007 and Figure PCTCN2017104110-appb-000008]

Note: primer pool of specific primers 1 (10 μM): the sum of the concentrations of all primers is 10 μM.

Reaction conditions: initial denaturation at 98 °C for 2 min; 20 cycles of denaturation at 98 °C for 10 s and extension at 62 °C for 2 min; final extension at 72 °C for 5 min;

After the reaction, 60 μl of AXYGEN purification magnetic beads were added for purification, and the resulting DNA was dissolved in 20 μl of distilled water;

For the second round of PCR, the primer pool of specific primers 2 was as in Table 2 of Example 1, and the PCR system used BGI's combinatorial probe-anchor synthesis sequencing library construction kit (Cat. No. BOX3), as shown in Table 8:

Table 8: PCR system

[Table 8 appears as an image in the original: Figure PCTCN2017104110-appb-000009]

Note: primer pool of specific primers 2 (10 μM): the sum of the concentrations of all primers is 10 μM.

Reaction conditions: 98 °C for 2 min; 15 cycles of 98 °C for 10 s and 62 °C for 2 min; 72 °C for 5 min, 1 cycle;

After the reaction, 60 μl of AXYGEN purification magnetic beads were added for purification, and the resulting DNA was dissolved in 20 μl of distilled water.

Library quality control is shown in Figure 4. After enrichment of the target regions and the whole genome, the library size is close to that of plasma cell-free DNA: the main peak is at 245 bp, with two minor peaks at 400 bp and 577 bp, which matches the size distribution of plasma cell-free DNA (the cfDNA fragment size plus the 84 bp BGISEQ-500 adapter sequence). Libraries that passed quality control were sequenced on a BGISEQ-500 with paired-end 50 bp reads.
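Subtracting the 84 bp of adapter from each library peak recovers the approximate size of the underlying cfDNA fragments. The sketch below does that arithmetic; the comparison with multiples of the smallest insert is an interpretation added here, not a statement from the source.

```python
ADAPTER_BP = 84                   # BGISEQ-500 adapter length given in the text
library_peaks_bp = [245, 400, 577]

# Insert size = library fragment size minus the adapter contribution.
insert_sizes_bp = [peak - ADAPTER_BP for peak in library_peaks_bp]
ratios = [round(size / insert_sizes_bp[0], 2) for size in insert_sizes_bp]

print(insert_sizes_bp)
print(ratios)
```

The inserts come out at roughly one, two and three times the smallest size, which is consistent with the main peak and the two minor peaks all originating from plasma cfDNA rather than from primer or adapter artifacts.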

(5) Data analysis. The workflow is shown in Figure 5. After filtering, the data were aligned with bwa. Reads in the target regions were counted and analyzed, and the mutation sites were then analyzed with a specific algorithm. Non-specific amplification outside the target regions caused by the specific primers was removed based on the primer sequences, and the remaining data were used for whole-genome analysis, including chromosomal abnormalities, fetal sex, and cell-free fetal DNA concentration.
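The read-splitting logic described above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's pipeline: the `Read` and `Region` types, the half-open coordinates, and the rule "an off-target read that begins with a specific-primer sequence is discarded" are all simplifications introduced here, and strand orientation is ignored.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Read:
    chrom: str
    start: int  # 0-based leftmost aligned position (hypothetical convention)
    seq: str


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps_target(read: Read, regions: Sequence[Region]) -> bool:
    read_end = read.start + len(read.seq)
    return any(
        r.chrom == read.chrom and read.start < r.end and read_end > r.start
        for r in regions
    )


def split_reads(
    reads: Iterable[Read], regions: Sequence[Region], primers: Sequence[str]
) -> Tuple[List[Read], List[Read]]:
    """Return (target_reads, genome_reads); primer-derived off-target reads are dropped."""
    target, genome = [], []
    for read in reads:
        if overlaps_target(read, regions):
            target.append(read)          # goes to mutation analysis
        elif any(read.seq.startswith(p) for p in primers):
            continue                     # non-specific product of a specific primer
        else:
            genome.append(read)          # goes to whole-genome analysis
    return target, genome


# Toy data, entirely hypothetical.
REGIONS = [Region("chr4", 90, 200)]
PRIMERS = ["ACGTAC"]
READS = [
    Read("chr4", 100, "ACGTACGGTT"),   # inside the target region
    Read("chr1", 5000, "ACGTACTTTT"),  # off-target, starts with a primer
    Read("chr2", 10, "GGGGCCCCAA"),    # ordinary genomic read
]
TARGET, GENOME = split_reads(READS, REGIONS, PRIMERS)
```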

The sequencing output data are shown in Table 9:

Table 9: Sequencing output data

Figure PCTCN2017104110-appb-000010

The sequencing type was BGISEQ-500 PE50+10. Each sample yielded more than 10 M of raw data; the mapping rate was 91.8-94.6%; the target regions accounted for 1.1-1.6% of the data; the average depth of the target regions was 5002-7576×; and the average depth outside the target regions (genome-wide) was 0.23-0.27×.
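Average depth is the number of sequenced bases falling in a region divided by the size of that region. The sketch below shows that relationship with toy numbers and then derives, from the depth ranges quoted above, bounds on how strongly the target regions are enriched relative to the rest of the genome. The function names are illustrative assumptions, and the bounds are simple ratios of the reported extremes, not per-sample values.

```python
def mean_depth(n_reads: int, read_len_bp: int, region_bp: int) -> float:
    """Average depth = total bases in the region / region size."""
    if region_bp <= 0:
        raise ValueError("region size must be positive")
    return n_reads * read_len_bp / region_bp


def fold_enrichment(target_depth: float, genome_depth: float) -> float:
    """How many times deeper the target regions are than the genome background."""
    if genome_depth <= 0:
        raise ValueError("genome depth must be positive")
    return target_depth / genome_depth


# Toy illustration of the depth formula (hypothetical numbers).
TOY_DEPTH = mean_depth(n_reads=1_000_000, read_len_bp=50, region_bp=10_000_000)

# Bounds from the ranges reported in the text (5002-7576x vs 0.23-0.27x).
ENRICH_LOW = fold_enrichment(5002, 0.27)
ENRICH_HIGH = fold_enrichment(7576, 0.23)

if __name__ == "__main__":
    print(f"toy depth: {TOY_DEPTH:.1f}x")
    print(f"enrichment between {ENRICH_LOW:.0f} and {ENRICH_HIGH:.0f} fold")
```

Under these assumptions the target regions are on the order of 10^4 times deeper than the genome-wide background.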

(1') Analysis of target region data

(a) Performance evaluation of the target regions

The performance evaluation of the target regions is shown in Figures 6-7. Figure 6 shows the depth distribution of each amplified region: 94.3% of the regions reached at least 0.1× the average depth, indicating good amplification uniformity. Figure 7 shows amplification stability: the depths of each amplified region obtained in two experiments on the same sample were compared, giving R² = 0.9934, which indicates good stability.
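A uniformity figure of this kind can be computed as the fraction of amplified regions whose depth is at least 0.1× the mean depth across regions. The following is a minimal sketch of that reading of the metric; the function name and the toy depths are assumptions made for illustration.

```python
from typing import Sequence


def uniformity(depths: Sequence[float], threshold: float = 0.1) -> float:
    """Fraction of regions with depth >= threshold x mean depth."""
    if not depths:
        raise ValueError("at least one region depth is required")
    mean = sum(depths) / len(depths)
    passing = sum(1 for d in depths if d >= threshold * mean)
    return passing / len(depths)


# Hypothetical per-amplicon depths: mean is 80, so the cutoff is 8.
TOY_DEPTHS = [100, 120, 80, 5, 95]
TOY_UNIFORMITY = uniformity(TOY_DEPTHS)  # 4 of 5 regions pass
```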

(b) Mutation analysis of the target regions

The detection results for the FGFR3-positive sample are shown in Table 10:

Table 10: Detection results for the FGFR3-positive sample

Figure PCTCN2017104110-appb-000011

Figure PCTCN2017104110-appb-000012

A total of 10 samples were tested. Sample 2 was a positive sample carrying a point mutation in exon 16 of the FGFR3 gene (c.1138G>C), and the experiment correctly detected this site: the base G reported by mutation detection accounted for 5.2% of all bases at this position. No mutation was detected in the FGFR3 gene in the remaining samples.
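The proportion reported for a site is the count of one base divided by the count of all bases observed at that position. As a plain-Python stand-in for what a pileup tool reports (it does not call samtools), the calculation can be sketched as below; the function name and the toy pileup are hypothetical and are not the FGFR3 data.

```python
from collections import Counter
from typing import Dict


def base_fractions(pileup_bases: str) -> Dict[str, float]:
    """Fraction of A, C, G and T among the bases observed at one position."""
    counts = Counter(b for b in pileup_bases.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases observed at this position")
    return {base: counts[base] / total for base in "ACGT"}


# Hypothetical position covered by 1000 reads, 52 of which carry a minority base.
TOY_PILEUP = "A" * 948 + "T" * 52
TOY_FRACTIONS = base_fractions(TOY_PILEUP)

if __name__ == "__main__":
    for base, frac in TOY_FRACTIONS.items():
        print(f"{base}: {frac:.1%}")
```

With deep coverage of the target regions, a minority base at a few percent is supported by many reads, which is why a high-depth target component is needed for this kind of call.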

(2') Whole-genome data analysis

(a) Data distribution on each chromosome

The data distribution on each chromosome is shown in Figure 8. One sample (sample 1) was selected and its data distribution across the whole genome (on each chromosome) was analyzed. After deduplication, the number of reads uniquely aligned to each chromosome was largely consistent with the size of the reference chromosome.
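One way to check that read counts track chromosome size is to compare each chromosome's share of reads with its share of the total reference length. The sketch below does this on toy values; the chromosome names, lengths and counts are hypothetical placeholders, not reference-genome figures.

```python
from typing import Dict


def shares(values: Dict[str, float]) -> Dict[str, float]:
    """Normalise a mapping so that its values sum to 1."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {key: value / total for key, value in values.items()}


def max_share_deviation(read_counts: Dict[str, int], lengths: Dict[str, int]) -> float:
    """Largest absolute gap between observed read share and length share."""
    observed = shares(read_counts)
    expected = shares(lengths)
    return max(abs(observed[c] - expected[c]) for c in lengths)


# Hypothetical three-chromosome genome.
TOY_LENGTHS = {"chrA": 200, "chrB": 100, "chrC": 100}
TOY_COUNTS = {"chrA": 5000, "chrB": 2600, "chrC": 2400}
TOY_DEVIATION = max_share_deviation(TOY_COUNTS, TOY_LENGTHS)
```

A small maximum deviation means the whole-genome component of the library is not visibly distorted toward particular chromosomes.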

(b) Stability analysis of data on chromosomes

The stability analysis of data on chromosomes is shown in Figure 9, which compares the proportion of data on each chromosome between two samples. The abscissa is the proportion of data on each chromosome for one sample, and the ordinate is the proportion for the other sample; R² = 0.9999, indicating good stability.
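An R² of this kind is the squared Pearson correlation between the two samples' per-chromosome proportions. A minimal sketch, using only the standard library and hypothetical input values:

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-chromosome proportions for two samples.
SAMPLE_A = [0.08, 0.07, 0.06, 0.05]
SAMPLE_B = [0.16, 0.14, 0.12, 0.10]  # exactly proportional to SAMPLE_A
TOY_R2 = r_squared(SAMPLE_A, SAMPLE_B)
```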

(c) Detection of chromosomal abnormalities

Trisomy detection: whether a chromosome is abnormal was calculated with a specific algorithm (Z-score; for the Z-score algorithm see Fan Jiang, Jinghui Ren, Fang Chen: Noninvasive Fetal Trisomy (NIFTY) test: an advanced noninvasive prenatal diagnosis methodology for fetal autosomal and sex chromosomal aneuploidies. BMC Medical Genomics 2012, 5:57). The detection results are shown in Table 11:

Table 11: Trisomy detection results

Figure PCTCN2017104110-appb-000013

Figure PCTCN2017104110-appb-000014

* The non-invasive prenatal testing results were obtained with BGI's NIFTY test.

A total of 10 samples were used in this experiment. Samples 7 and 8 were positive for trisomy 18 (T18) and trisomy 21 (T21), respectively, and the method correctly detected these chromosomal abnormalities; no chromosomal abnormality was detected in the remaining samples. The cell-free fetal DNA concentration differed from the non-invasive prenatal testing results by no more than ±0.4%, and fetal sex determination agreed with the non-invasive prenatal testing results. The method is therefore consistent with non-invasive prenatal testing.
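A Z-score test of the general kind cited for trisomy detection compares a chromosome's share of reads in the test sample with the mean and standard deviation of that share in euploid reference samples. The sketch below shows only this basic form. It does not reproduce the normalisation used in the cited NIFTY paper, and the cutoff of 3 and all numeric values are assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def z_score(sample_share: float, reference_shares: Sequence[float]) -> float:
    """Standardised deviation of a chromosome's read share from euploid references."""
    if len(reference_shares) < 2:
        raise ValueError("need at least two reference samples")
    mu = mean(reference_shares)
    sd = stdev(reference_shares)
    if sd == 0:
        raise ValueError("reference shares have zero variance")
    return (sample_share - mu) / sd


def exceeds_cutoff(z: float, cutoff: float = 3.0) -> bool:
    """Flag an over-represented chromosome; the cutoff is an assumed convention."""
    return z > cutoff


# Hypothetical read shares for one chromosome in five euploid references.
REFERENCE = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
Z_ELEVATED = z_score(0.0136, REFERENCE)  # share clearly above the references
Z_TYPICAL = z_score(0.0130, REFERENCE)   # share equal to the reference mean
```

Because the test depends on small shifts in read share, the per-chromosome stability reported above is what makes low-depth whole-genome data usable for it.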

In summary, the method of the present application can simultaneously amplify the whole genome and multiple target regions, yielding low-depth whole-genome data and high-depth target-region data. For multiple target regions, uniformity and stability are good, and the high-depth target-region data can be used to detect multiple mutation types in the target regions. For the whole-genome data, uniformity and stability are also good, and the low-depth whole-genome data can be used to detect abnormalities in chromosome number and structure; amplification of the target regions does not affect the results of genome-wide chromosome detection.

The applicant declares that the detailed methods of the present invention are illustrated by the above examples, but the present invention is not limited to these detailed methods; that is, it does not mean that the present invention must rely on these detailed methods to be implemented. Those skilled in the art should understand that any improvement of the present invention, equivalent replacement of the raw materials of the product of the present invention, addition of auxiliary components, selection of specific means, and the like, all fall within the scope of protection and disclosure of the present invention.

Claims (15)

1. A primer composition for simultaneous amplification of target regions and the whole genome, characterized by comprising:
a first primer set, the first primer set comprising a primer pool of specific primer 1 and sequencing universal primer 1, wherein the 3' end of specific primer 1 comprises a target-region-specific forward primer, and the 3' end of sequencing universal primer 1 comprises the complementary sequence of sequencing adapter sequence 1 at one end of the target region; and
a second primer set, the second primer set comprising a primer pool of specific primer 2, sequencing universal primer 1 and sequencing universal primer 2, wherein the 3' end of specific primer 2 comprises a target-region-specific forward primer and the 5' end comprises the sequence of sequencing adapter sequence 2 at the other end of the target region, and sequencing universal primer 2 comprises the sequence of sequencing adapter sequence 2 at the other end of the target region.

2. The primer composition according to claim 1, characterized in that the 5' end of specific primer 1 in the first primer set comprises 2-10 CG bases.

3. The primer composition according to claim 1 or 2, characterized in that the distance between specific primer 1 and specific primer 2 is -10 bp to 15 bp.

4. The primer composition according to any one of claims 1-3, characterized in that the 5' end of specific primer 1 in the first primer set comprises 2 CG bases;
preferably, the distance between specific primer 1 and specific primer 2 is -5 bp to 10 bp;
preferably, the nucleotide sequence of adapter sequence 1 is as shown in SEQ ID NO.1;
preferably, the nucleotide sequence of adapter sequence 2 is as shown in SEQ ID NO.2.

5. The primer composition according to claim 1 or 2, characterized in that sequencing universal primer 1 and sequencing universal primer 2 are independently selected from the sequencing universal primers of any second-generation sequencing platform;
preferably, the nucleotide sequence of sequencing universal primer 1 is as shown in SEQ ID NO.3;
preferably, the nucleotide sequence of sequencing universal primer 2 is as shown in SEQ ID NO.4.

6. A kit for simultaneous amplification of target regions and the whole genome, characterized in that it comprises the primer composition according to any one of claims 1-5.

7. A method for simultaneous amplification of target regions and the whole genome, characterized in that the primer composition according to any one of claims 1-5 is used, the method comprising the following steps:
(1) adding adapter sequences to both ends of the sample to be tested;
(2) performing two rounds of PCR amplification, the first round using the first primer set and the second round using the second primer set, to construct a sequencing library.

8. The method according to claim 7, characterized in that the adapter sequence carries a tag.

9. The method according to claim 7 or 8, characterized in that the conditions of the first round of PCR in step (2) are: pre-denaturation at 95-99 °C for 1-5 min; 15-25 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; extension at 70-75 °C for 5 min;
preferably, the conditions of the first round of PCR in step (2) are: pre-denaturation at 98 °C for 2 min; 20 cycles of denaturation at 98 °C for 10 s and extension at 62 °C for 2 min; extension at 72 °C for 5 min.

10. The method according to any one of claims 7-9, characterized in that the conditions of the second round of PCR in step (2) are: pre-denaturation at 95-99 °C for 1-5 min; 10-20 cycles of denaturation at 95-99 °C for 5-15 s and extension at 55-65 °C for 1-5 min; extension at 70-75 °C for 5 min;
preferably, the conditions of the second round of PCR in step (2) are: 98 °C for 2 min; 15 cycles of 98 °C for 10 s and 62 °C for 2 min; 72 °C for 5 min, 1 cycle.

11. The method according to any one of claims 7-10, characterized in that steps of sample extraction, end repair and A-tailing are further included before step (1);
preferably, a purification step is further included after step (1), preferably purification with magnetic beads.

12. The method according to any one of claims 7-11, characterized in that a data analysis step is further included after step (2);
preferably, the data analysis specifically comprises: filtering the raw sequencing output data, aligning to the reference genome with BWA, and performing target region analysis and whole-genome analysis separately;
preferably, the target region analysis comprises mutation analysis;
preferably, the mutation comprises any one of, or a combination of at least two of, point mutation, insertion/deletion and gene fusion;
preferably, the analysis of point mutations specifically comprises the following steps: counting the reads in the target regions, removing the specific primer sequences from the reads, analyzing and counting the target-region reads after removal of the specific primers, and analyzing the base type and proportion at the mutation site with samtools;
preferably, the whole-genome analysis specifically comprises: removing, based on the primer sequences, non-specific amplification outside the target regions caused by the specific primers, performing uniformity analysis on the data after removal, and performing specific analyses as required on the whole-genome data from which the non-specific products have been removed.

13. A system for simultaneous amplification of target regions and the whole genome, characterized by comprising:
(1) a sample processing module, used to extract the sample and add tagged adapter sequences to both ends of the sample;
(2) a PCR amplification module, connected to the sample processing module and used to perform two rounds of PCR amplification, the first round using the first primer set and the second round using the second primer set, to construct a sequencing library.

14. The system according to claim 13, characterized in that the system further comprises a data analysis module, connected to the PCR amplification module and used for sequencing the constructed library and analyzing the data.

15. Use of the primer composition according to any one of claims 1-5 for detecting mutations and abnormalities of chromosome number or structure in a sample to be tested.
PCT/CN2017/104110 2017-09-28 2017-09-28 A primer combination for performing simultaneous amplification of target region and whole genome, gene amplification method and application thereof Ceased WO2019061199A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/104110 WO2019061199A1 (en) 2017-09-28 2017-09-28 A primer combination for performing simultaneous amplification of target region and whole genome, gene amplification method and application thereof


Publications (1)

Publication Number Publication Date
WO2019061199A1 true WO2019061199A1 (en) 2019-04-04

Family

ID=65900279

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/104110 Ceased WO2019061199A1 (en) 2017-09-28 2017-09-28 A primer combination for performing simultaneous amplification of target region and whole genome, gene amplification method and application thereof

Country Status (1)

Country Link
WO (1) WO2019061199A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3798319A1 (en) * 2019-09-30 2021-03-31 Diagenode S.A. An improved diagnostic and/or sequencing method and kit
EP3828283A1 (en) * 2019-11-28 2021-06-02 Diagenode S.A. An improved sequencing method and kit

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120196292A1 (en) * 2008-04-23 2012-08-02 Life Technologies Corporation Sequence amplification with linear primers

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3798319A1 (en) * 2019-09-30 2021-03-31 Diagenode S.A. An improved diagnostic and/or sequencing method and kit
US11788137B2 (en) 2019-09-30 2023-10-17 Diagenode S.A. Diagnostic and/or sequencing method and kit
EP3828283A1 (en) * 2019-11-28 2021-06-02 Diagenode S.A. An improved sequencing method and kit

Similar Documents

Publication Publication Date Title
JP6997813B2 (en) Highly multiplex PCR method and composition
JP6585117B2 (en) Diagnosis of fetal chromosomal aneuploidy
CN103608818B (en) Non-invasive prenatal ploidy identification device
CN107849607B (en) Single molecule sequencing of plasma DNA
CN105648045B (en) Methods and devices for determining haplotypes of fetal target regions
CN105543339A (en) Method for simultaneously completing gene locus, chromosome and linkage analysis
CN104293916B (en) A kind of G6PD deficiency disease gene detecting kit
CN107541561B (en) Kit, device and method for improving concentration of fetal free DNA in maternal peripheral blood
TW201805429A (en) Using cell-free DNA fragment size to determine copy number variations
CN105074004A (en) Noninvasive method for detecting fetal chromosomal aneuploidy
CN110628891B (en) Method for screening embryo genetic abnormality
CN105555970B (en) Method and system for simultaneous haplotyping and chromosomal aneuploidy detection
WO2022077885A1 (en) Nucleic acid library construction method and application thereof in analysis of abnormal chromosome structure in preimplantation embryo
CN111378732B (en) Mitochondrial genome sequencing primer, kit and method
WO2020119626A1 (en) Method for non-invasive prenatal testing of fetus for genetic disease
CN118127143A (en) Device for noninvasively detecting thalassemia before delivery and application thereof
CN110527724A (en) Set of probes and application thereof
WO2019061199A1 (en) A primer combination for performing simultaneous amplification of target region and whole genome, gene amplification method and application thereof
CN105648044A (en) Method and apparatus for determining fetus target area haplotype
CN117625778B (en) Method, primer and kit for detecting multiple mutations of IKBKG genes of pigment incontinence disease
CN113046435B (en) Specific primer for preparing PCR reaction system for detecting prenatal fetal 21-trisomy syndrome
CN101962680A (en) Double PCR molecular diagnosis kit for detecting inactivation of X chromosome
CN117737227A (en) Gene detection kit and system for screening fetal ACH based on cfDNA
HK40083172A (en) Single-molecule sequencing of plasma dna
CN116790740A (en) A new diagnostic chip construction method for common deafness gene copy number detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17927481

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17927481

Country of ref document: EP

Kind code of ref document: A1