
WO2021227950A1 - Cancer prognostic method - Google Patents

Info

Publication number
WO2021227950A1
WO2021227950A1 PCT/CN2021/092132 CN2021092132W WO2021227950A1 WO 2021227950 A1 WO2021227950 A1 WO 2021227950A1 CN 2021092132 W CN2021092132 W CN 2021092132W WO 2021227950 A1 WO2021227950 A1 WO 2021227950A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
patient
methylation
tissue
tissues
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/092132
Other languages
French (fr)
Chinese (zh)
Inventor
王晨阳
林静
李冰思
揣少坤
张之宏
汉雨生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Burning Rock Dx Co Ltd
Original Assignee
Guangzhou Burning Rock Dx Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Burning Rock Dx Co Ltd
Publication of WO2021227950A1
Anticipated expiration
Current legal status: Ceased

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/118: Prognosis of disease development
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/154: Methylation markers

Definitions

  • The present disclosure generally relates to the field of biological detection and diagnosis. Specifically, it relates to a method for determining the prognosis of a patient based on methylation differences between the genomes of the patient's cancer tissue sample and adjacent (paracancerous) tissue sample. The present disclosure also relates to systems and devices for determining the prognosis of cancer patients.
  • Cancer is one of the main causes of death and disease burden.
  • Cancer prognosis is the prediction of the likely outcomes of an individual's current medical condition, and it is an important tool for improving patient diagnosis and treatment management.
  • An accurate prognosis is essential for choosing the right cancer treatment and predicting survival rates.
  • At present, patient prognosis is evaluated and predicted clinically mainly from the clinical stage and pathological analysis of the tumor, supplemented by related molecular characteristics (such as immunohistochemistry, DNA mutations, and mRNA or microRNA expression).
  • These molecular assessment methods still have great limitations.
  • Molecular markers such as the 21-gene markers in breast cancer can accurately predict a patient's risk of postoperative recurrence and bring significant clinical benefit to breast cancer patients.
  • DNA methylation variation is closely related to the occurrence of cancer. Compared with gene mutations, DNA methylation variation covers wider regions, is more stable, and occurs earlier, so it is better suited to the early detection of carcinogenesis.
  • Methods and strategies for predicting the prognosis of cancer patients using DNA methylation variation are still lacking.
  • The theory of "regional carcinogenesis" holds that, under the action of some mechanism, normal tissue gradually begins the process of carcinogenesis at the molecular level, and that this change first appears in the DNA. Based on this theory, the inventors designed a new system that uses DNA methylation detection to predict the recurrence risk of cancer patients: the malignancy density ratio (MD ratio) evaluation system.
  • MD ratio: malignancy density ratio
  • The MD ratio evaluation system is based on the theory that normal tissue in the patient is in the process of transforming from normal cells into cancer cells. Detecting methylation variation makes it possible to evaluate how far carcinogenesis has progressed and thus to predict the patient's risk of tumor recurrence. Compared with traditional testing, the MD ratio evaluation system predicts recurrence risk solely by testing tissue samples from the patient. This allows better prognosis management, avoids frequent postoperative follow-up, and is simple, efficient, and personalized, so it has better application prospects in prognosis management. Moreover, in the analysis of real samples, the system predicted recurrence more accurately than mutation detection.
  • The present disclosure relates to a method for determining the prognosis of cancer patients, the method comprising:
  • The method includes:
  • DMB: differential methylation block
  • MB: methylation block
  • $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$
  • $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$
  • Combining adjacent CpG sites in step b2 can mean, for example, combining CpG sites separated by less than 50 bp, less than 100 bp, less than 150 bp, less than 200 bp, less than 250 bp, less than 300 bp, or less than 350 bp.
  • In some embodiments, CpG sites separated by less than 50 bp or less than 100 bp are combined in step b2.
  • Determining an MB with a significant difference between $\beta_i^{(T)}$ and $\beta_i^{(A)}$ as a DMB can be, for example, determining
  • In step c1, the patient's α value is calculated by the following algorithm:
  • l(α) is the log-likelihood function of the observed data
  • $p_i^{(0)}$ is a shape parameter of the Beta-binomial distribution, with shape parameters $(p_i^{(0)}, q_i^{(0)})$, that the baseline methylation level of the i-th MB follows.
  • one or more regions of the genome are regions of the genome where there are methylation variants in the population of cancer patients.
  • the region of the genome that is known to have methylation variants in a patient population of a specific cancer type can be obtained through a public database and used in the detection method of the present disclosure.
  • the aforementioned public databases are, for example, the TCGA (The Cancer Genome Atlas) database and the GEO (Gene Expression Omnibus) database.
  • One or more regions of the genome cover at least 0.3 M (megabases), such as at least 0.3 M, at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.7 M, at least 0.8 M, at least 0.9 M, or at least 1.0 M.
  • One or more regions of the genome cover about 0.3 M to 10.0 M of the genome, such as 0.3M-5.0M, 0.3M-4.0M, 0.3M-3.0M, 0.3M-2.0M, 0.3M-1.5M, 0.3M-1.0M, 0.4M-5.0M, 0.4M-4.0M, 0.4M-3.0M, 0.4M-2.0M, 0.4M-1.5M, 0.4M-1.0M, 0.5M-5.0M, 0.5M-4.0M, 0.5M-3.0M, 0.5M-2.0M, 0.5M-1.5M, 0.5M-1.0M, 1.0M-5.0M, 1.0M-4.0M, 1.0M-3.0M, 1.0M-2.0M, or 1.0M-1.5M.
  • the above range also includes endpoint values and any subset ranges in between.
  • the cancer is a solid tumor.
  • Solid tumors include, but are not limited to, lung cancer (including small cell lung cancer, non-small cell lung cancer, lung adenocarcinoma, and lung squamous cell carcinoma), colorectal cancer, liver cancer, ovarian cancer, pancreatic cancer, gallbladder cancer, gastric cancer, esophageal cancer, kidney cancer, melanoma, breast cancer, cervical cancer, endometrial cancer, prostate cancer, bladder cancer, testicular cancer, thyroid cancer, salivary gland cancer, skin cancer, squamous cell carcinoma, neuroblastoma, glioblastoma, retinoblastoma, lymphoma (including Hodgkin's lymphoma and non-Hodgkin's lymphoma), bone cancer, myeloma, basal cell carcinoma, peritoneal cancer, choriocarcinoma, eye cancer, head and neck cancer, laryngeal cancer, oral
  • the cancer may be selected from lung cancer, colorectal cancer, liver cancer, ovarian cancer, pancreatic cancer, gallbladder cancer, gastric cancer, and esophageal cancer.
  • the cancer is a primary cancer. In other embodiments, the cancer is a secondary or metastatic cancer.
  • the cancer may be in any stage of cancer development, such as early, middle or late stages of cancer development, or the cancer may be in clinical stages I, II, III, or IV.
  • the cancer is lung cancer, such as non-small cell lung cancer (NSCLC).
  • NSCLC non-small cell lung cancer
  • one or more regions of the detected genome may include one or more regions selected from those listed in Table 1.
  • The cancer is lung cancer.
  • One or more regions of the detected genome include at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 regions selected from those listed in Table 1.
  • the one or more regions of the detected genome include all regions selected from the list in Table 1.
  • the cancer patient has undergone previous cancer treatment methods, such as surgical treatment, radiation therapy, chemotherapy, targeted drug therapy, immunotherapy, or a combination thereof.
  • the cancer tissue and para-cancerous tissue used may be tissues surgically removed from the patient.
  • Using methylation data from normal tissues includes:
  • The methylation data of normal tissues are denoted $M_i^{(0)}$ and $N_i^{(0)}$; given $N_i^{(0)}$, $M_i^{(0)}$ follows a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:
  • In some embodiments, the p-value is calculated by a Wald test in step c3. In other embodiments, the p-value is calculated by, for example, a likelihood ratio test.
  • The method is used to predict the postoperative recurrence risk and/or survival of the cancer patient.
  • Patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.
  • The present disclosure relates to a system for determining the prognosis of cancer patients, the system including:
  • The methylation sequencing module is configured to detect, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient.
  • The prognostic analysis module is configured to determine the patient's prognosis as follows:
  • The patient's prognosis is determined by mathematical modeling.
  • The system is configured to implement the prognostic method according to the first aspect of the present disclosure.
  • The prognostic analysis module is configured to determine the patient's prognosis as follows:
  • DMB: differential methylation block
  • $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$
  • $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$
  • The prognostic analysis module is configured to combine, in step a2, CpG sites separated by less than 50 bp, less than 100 bp, less than 150 bp, less than 200 bp, less than 250 bp, less than 300 bp, less than 350 bp, or less than 400 bp.
  • The prognostic analysis module is configured to combine CpG sites separated by less than 50 bp or less than 100 bp in step b2.
  • The prognostic analysis module is configured to determine
  • The prognostic analysis module is configured to calculate the patient's α value by the following algorithm in step b1:
  • l(α) is the log-likelihood function of the observed data
  • $p_i^{(0)}$ is a shape parameter of the Beta-binomial distribution, with shape parameters $(p_i^{(0)}, q_i^{(0)})$, that the baseline methylation level of the i-th MB follows.
  • The prognostic analysis module is further configured to use methylation data from normal tissues to establish the baseline methylation level, including:
  • The methylation data of normal tissues are denoted $M_i^{(0)}$ and $N_i^{(0)}$; given $N_i^{(0)}$, $M_i^{(0)}$ follows a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:
  • The system is further configured to calculate the p-value by a Wald test in step b3.
  • In other embodiments, the p-value is calculated by, for example, a likelihood ratio test.
  • The system is further configured to predict the postoperative recurrence risk and/or survival of the cancer patient.
  • Patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.
  • The present disclosure relates to a device for determining the prognosis of cancer patients, which includes:
  • a memory for storing computer program instructions
  • a processor for executing the computer program instructions
  • The device executes the method according to the first aspect of the present disclosure.
  • The present disclosure relates to a computer-readable medium storing computer program instructions, wherein the method according to the first aspect of the present disclosure is implemented when the computer program instructions are executed by a processor.
  • Figure 1 shows a line chart of the disease-free survival (DFS) of patients predicted to have a high or low risk of recurrence by using the MD ratio method to detect the genomic regions listed in Table 1 in the patient's cancer tissue and paracancerous tissue.
  • DFS: disease-free survival
  • Figure 2 shows a line chart of the disease-free survival (DFS) of patients predicted to have a high or low risk of recurrence by using the MD ratio method to detect the genomic regions listed in Table 3 in the patient's cancer tissue and paracancerous tissue.
  • Figure 3 shows a line chart of the disease-free survival (DFS) of patients predicted to have a high or low risk of recurrence by using the MD ratio method to detect the genomic regions listed in Table 4 in the patient's cancer tissue and paracancerous tissue.
  • The main strategy of the MD ratio assessment system is to sequence tissue samples for methylation and find regions of the genome where methylation status differs between the patient's cancer tissue and normal tissue (such as paracancerous tissue). On the premise that the paracancerous tissue is in the process of transforming from normal cells into cancer cells, the system calculates how similar the methylation level of the paracancerous tissue is to that of the cancer tissue in these differential regions, and from this infers the risk that normal cells in the patient will become cancerous.
  • The MD ratio assessment system for cancer recurrence risk tests the patient's cancer tissue and paracancerous tissue samples. Take lung cancer as an example.
  • The adjacent tissue is defined as tissue 5 cm outside the resection margin; in wedge resection, the adjacent tissue is defined as tissue 3 cm outside the resection margin. Pathological evaluation of the tissue cells verifies that they contain no tumor cells, and the cancer tissue and the adjacent tissue have the same cell types.
  • The sample library is prepared using the brELSA™ method (Burning Rock Biotech, Guangzhou, China), which includes the following steps: 1) DNA extraction and purification; 2) sodium bisulfite treatment; 3) single-stranded DNA amplification by DNA polymerase; 4) enrichment of the target regions shown in Table 1 (covering about 0.9 M of the human genome) using customized cancer methylation profiling RNA baits; 5) quantification of the target library by real-time PCR. Finally, sequencing was performed on an Illumina NovaSeq 6000 at an average sequencing depth of 1,000×.
  • The raw sequencing output files were analyzed with the alignment software BWA-meth and the methylation statistics software MethylDackel to obtain a methylation output file for each sample. This file contains the position of each CpG site in the capture regions and the methylation information of the reads covering that site. The number of methylated reads covering a site is recorded as M, and the number of unmethylated reads as U. Adjacent CpG sites (less than 50 bp apart) are combined, and each such set of CpG sites is called a methylation block (MB).
  • MB: methylation block
  • $N_i = M_i + U_i$
  • The analysis targets regions of the genome where methylation differs between the cancer tissue and the paracancerous tissue; such a region is defined as a differentially methylated block (DMB).
  • DMB: differentially methylated block
  • The methylation level $\beta^{(T)}$ of the patient's cancer tissue sample and the methylation level $\beta^{(A)}$ of the adjacent tissue sample are used for testing.
  • $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$
  • $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$, which will conform to
  • the MB is determined to be the patient's personalized DMB.
  • The methylation data of normal tissues are denoted $M_i^{(0)}$ and $N_i^{(0)}$; given $N_i^{(0)}$, $M_i^{(0)}$ follows a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:
  • $\beta_i^{(0)}$ represents the actual baseline methylation level. The shape parameters $\{p_i, q_i\}$ are solved by maximum likelihood estimation (MLE), and the maximum likelihood estimates of the parameters are denoted $\{p_i^{(0)}, q_i^{(0)}\}$.
  • The methylation sequencing data $(M_i^{(A)}, N_i^{(A)}, \beta_i^{(A)})$ of the adjacent tissue sample follow a mixture Beta-binomial distribution.
  • The mixture Beta-binomial distribution may be specifically expressed as:
  • $\beta_i^{(T)}$ represents the methylation level of the patient's cancer tissue, which can be estimated by the method of moments from the sequencing data of the cancer tissue sample;
  • $\beta_i^{(0)}$ represents the baseline methylation level which, as described above, follows a Beta distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$.
  • α is a ratio parameter on [0,1] that indicates the similarity between the adjacent tissue and the cancer tissue. The closer α is to 1, the closer the methylation level of the adjacent tissue is to that of the cancer tissue, and the higher the patient's inferred risk of recurrence.
  • The log-likelihood function of the observed data can be written:
  • The values of α are 0, 0.001, 0.003, 0.01, 0.003, 0.1, and each group of parameters is repeated 50 times.
  • The numerical simulation results are shown in Table 2 below.
  • The methylation levels of the genomic regions listed in Table 4 and Table 5 (which include 522 and 532 regions randomly selected from Table 1, covering 0.47 M and 0.48 M of the genome, respectively) were detected in cancer tissues and adjacent tissues.
  • The patient's p-value is calculated by the same algorithm, and patients are divided into high-risk and low-risk recurrence groups according to whether the test p-value is < 0.05.
  • The disease-free survival (DFS) results are shown in Figure 2 and Figure 3, respectively.
  • MD ratio-1 represents the result obtained by detecting the regions shown in Table 1 (corresponding to Figure 1); MD ratio-2 and MD ratio-3 represent the results obtained by detecting the regions shown in Table 3 and Table 4, respectively (corresponding to Figure 2 and Figure 3).
  • The above results indicate that the MD ratio assessment system assesses a patient's cancer recurrence risk and subsequent survival more effectively than somatic mutation testing, and plays a better role in prognosis management and clinical treatment.
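The baseline model described in the list above (methylated read counts from normal tissue following a Beta-binomial distribution whose shape parameters are obtained by maximum likelihood) can be sketched as follows. This is only an illustrative sketch: the function names, the input layout of `(M, N)` pairs, and the coarse grid search are assumptions made for the example and are not taken from the patent, which gives its formulas as images.

```python
import math
from itertools import product


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(m, n, p, q):
    """log P(M = m | N = n) for a Beta-binomial with shape parameters (p, q)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    return log_choose + log_beta(m + p, n - m + q) - log_beta(p, q)


def fit_baseline(samples, grid=None):
    """Crude grid-search MLE of (p, q) for one methylation block.

    samples: list of (M, N) pairs from normal tissue samples (hypothetical layout).
    grid: candidate values tried for both p and q (an assumption of this sketch).
    """
    grid = grid or [0.5 * k for k in range(1, 41)]  # 0.5, 1.0, ..., 20.0

    def loglik(pq):
        return sum(betabinom_logpmf(m, n, *pq) for m, n in samples)

    return max(product(grid, grid), key=loglik)
```

A real analysis would more likely use a numerical optimizer than a grid; the grid is used here only to keep the example dependency-free.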

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Immunology (AREA)
  • Epidemiology (AREA)
  • Genetics & Genomics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Analytical Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Primary Health Care (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Disclosed is a prognostic method for a cancer patient. The relapse risk and/or survival of a patient are/is predicted on the basis of a methylation difference in genomes of a cancer tissue sample and a paracancerous tissue sample of the patient. Further disclosed are a prognostic system and device for a cancer patient.

Description

Cancer prognosis method

This application claims priority to Chinese patent application No. 202010385996.8, filed on May 9, 2020 and entitled "Cancer prognosis method", the content of which is incorporated herein by reference in its entirety as a part of this application.

Technical field

The present disclosure generally relates to the field of biological detection and diagnosis. Specifically, the present disclosure relates to a method for determining the prognosis of a patient based on methylation differences between the genomes of the patient's cancer tissue sample and paracancerous tissue sample. The present disclosure also relates to systems and devices for determining the prognosis of cancer patients.

Background

Globally, cancer is one of the main causes of death and disease burden. Cancer prognosis is the prediction of the likely outcomes of an individual's current medical condition, and it is an important tool for improving patient diagnosis and treatment management. An accurate prognosis is essential for choosing the right cancer treatment and predicting survival rates. At present, patient prognosis is evaluated and predicted clinically mainly from the clinical stage and pathological analysis of the tumor, supplemented by related molecular characteristics (such as immunohistochemistry, DNA mutations, and mRNA or microRNA expression). However, these molecular assessment methods still have great limitations. On the one hand, molecular markers such as the 21-gene markers in breast cancer can accurately predict a patient's risk of postoperative recurrence, bring significant clinical benefit to breast cancer patients, and have therefore been included in breast cancer diagnosis and treatment guidelines. On the other hand, molecular markers with high accuracy and clinical utility are still very scarce, and for the vast majority of cancer types patients still need frequent imaging follow-up to monitor the course of the disease, which places a burden on patients and may delay the detection of tumor progression. There is therefore a great clinical need for molecular markers that can accurately predict patient prognosis independently of pathological stage and other clinical factors.

Many studies have shown that DNA methylation variation is closely related to the occurrence of cancer. Compared with gene mutations, DNA methylation variation covers wider regions, is more stable, and occurs earlier, so it is better suited to the early detection of carcinogenesis. However, methods and strategies for predicting the prognosis of cancer patients using DNA methylation variation are still lacking.

Summary of the invention

The theory of "regional carcinogenesis" holds that, under the action of some mechanism, normal tissue gradually begins the process of carcinogenesis at the molecular level, and that this change first appears in the DNA. Based on this theory, the inventors designed a new system that uses DNA methylation detection to predict the recurrence risk of cancer patients: the malignancy density ratio (MD ratio) evaluation system, which estimates the malignant proportion of paracancerous tissue.

The MD ratio evaluation system is based on the theory that normal tissue in the patient is in the process of transforming from normal cells into cancer cells. Detecting methylation variation makes it possible to evaluate how far carcinogenesis has progressed and thus to predict the patient's risk of tumor recurrence. Compared with traditional testing, the MD ratio evaluation system predicts recurrence risk solely by testing tissue samples from the patient. This allows better prognosis management, avoids frequent postoperative follow-up, and is simple, efficient, and personalized, so it has better application prospects in prognosis management. Moreover, in the analysis of real samples, the system predicted recurrence more accurately than mutation detection.

Accordingly, in one aspect, the present disclosure relates to a method for determining the prognosis of a cancer patient, the method comprising:

a. detecting, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient;

b. determining the difference in the methylation level between the cancer tissue and the paracancerous tissue; and

c. using the methylation difference information to determine the patient's prognosis by mathematical modeling,

wherein a smaller difference in the methylation level between the cancer tissue and the paracancerous tissue indicates a worse prognosis for the patient.

In some embodiments, the method includes:

a. detecting, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient;

b. determining the differential methylation blocks (DMBs) in the cancer tissue and the paracancerous tissue, including:

b1. for each CpG site in the detected regions, recording the number of methylated reads covering the site as M and the number of unmethylated reads as U;

b2. combining adjacent CpG sites and defining them as a methylation block (MB); for the patient's i-th MB, summing M over all CpG sites to give $M_i$ and summing U over all CpG sites to give $U_i$; the total coverage is $N_i = M_i + U_i$ and the methylation level is $\beta_i = M_i / N_i$;

b3. for the i-th MB, denoting the methylation level in the cancer tissue as $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$ and the methylation level in the paracancerous tissue as $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$, wherein an MB with a significant difference between $\beta_i^{(T)}$ and $\beta_i^{(A)}$ is determined to be a DMB;

c.对所述患者进行预后,包括:c. Prognosis of the patient, including:

c1.引入指示所述患者的癌旁组织和癌症组织的甲基化水平的相似程度的参数α并通过以下算法计算患者的α值:c1. Introduce a parameter α indicating the degree of similarity between the methylation levels of the paracancerous tissue and the cancer tissue of the patient, and calculate the patient's α value by the following algorithm:

Figure PCTCN2021092132-appb-000001
Figure PCTCN2021092132-appb-000001

Figure PCTCN2021092132-appb-000002
Figure PCTCN2021092132-appb-000002

其中0<α<1且α取值越大指示所述相似程度越高,其中f(·),g(·),h(·)分别表示癌旁组织的甲基化水平的条件分布,癌症组织的甲基化水平的先验分布和基线甲基化水平的先验分布,

Figure PCTCN2021092132-appb-000003
表示先验分布的参数族; Where 0<α<1 and a larger value of α indicates a higher degree of similarity, where f(·), g(·), h(·) respectively represent the conditional distribution of the methylation level of adjacent tissues, cancer The prior distribution of the methylation level of the tissue and the prior distribution of the baseline methylation level,
Figure PCTCN2021092132-appb-000003
Represents the parameter family of the prior distribution;

c2.在0<α<1的范围内使用极大似然估计算法计算参数估计

Figure PCTCN2021092132-appb-000004
并通过fisher信息矩阵计算参数估计的方差
Figure PCTCN2021092132-appb-000005
c2. Use maximum likelihood estimation algorithm to calculate parameter estimates in the range of 0<α<1
Figure PCTCN2021092132-appb-000004
And calculate the variance of parameter estimates through the fisher information matrix
Figure PCTCN2021092132-appb-000005

Figure PCTCN2021092132-appb-000006
Figure PCTCN2021092132-appb-000006

Figure PCTCN2021092132-appb-000007
Figure PCTCN2021092132-appb-000007

c3.计算每一位患者癌旁组织零假设α=0的p值,c3. Calculate the p value of the null hypothesis α=0 for each patient’s adjacent tissues,

其中,p值越小指示所患者的预后越差。Among them, the smaller the p value indicates the worse the prognosis of the patient.

In some embodiments, combining adjacent CpG sites in step b2 may mean, for example, combining CpG sites separated by less than 50 bp, less than 100 bp, less than 150 bp, less than 200 bp, less than 250 bp, less than 300 bp, less than 350 bp, less than 400 bp, less than 450 bp, or less than 500 bp. In some embodiments, CpG sites separated by less than 50 bp or less than 100 bp are combined in step b2.
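The grouping in step b2 can be illustrated with a minimal Python sketch. It assumes that "distance" means the gap between consecutive CpG coordinates on the same chromosome and that a block is extended as long as each new site lies closer than the threshold to the previous one; the function name `group_cpgs_into_blocks` and the example coordinates are hypothetical and do not come from the disclosure.

```python
def group_cpgs_into_blocks(positions, max_gap=50):
    """Group CpG coordinates (one chromosome) into methylation blocks.

    Assumption: a site joins the current block when it lies less than
    `max_gap` bp from the previous site; otherwise it starts a new block.
    """
    blocks = []
    for pos in sorted(positions):
        if blocks and pos - blocks[-1][-1] < max_gap:
            blocks[-1].append(pos)   # close enough: extend the current block
        else:
            blocks.append([pos])     # too far (or first site): open a new block
    return blocks


# Hypothetical CpG coordinates, grouped with the 50 bp threshold.
example_blocks = group_cpgs_into_blocks([100, 120, 160, 400, 430, 900], max_gap=50)
print(example_blocks)
```

Raising `max_gap` to 100 bp or more, as the embodiments allow, merges more sites into fewer, larger blocks.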

In some embodiments, determining MBs whose $\beta_i^{(T)}$ and $\beta_i^{(A)}$ values differ significantly as DMBs in step b3 may mean, for example, determining as DMBs those MBs for which $|\beta_i^{(T)} - \beta_i^{(A)}| > \sigma$ and

Figure PCTCN2021092132-appb-000008

where σ is a value between 0.05 and 1 and τ is a value greater than 0.1. In some embodiments, σ = 0.1 and τ = 0.4.

In some embodiments of the above method, the patient's α value is calculated in step c1 by the following algorithm:

$l(\alpha) = \sum_i \log L(\alpha; M_i^{(A)}, N_i^{(A)}, p_i^{(0)}, q_i^{(0)}, \beta_i^{(T)})$

Figure PCTCN2021092132-appb-000009

where $l(\alpha)$ is the log-likelihood function of the observed data, and $p_i^{(0)}$ and $q_i^{(0)}$ are the shape parameters of the Beta-binomial distribution $(p_i^{(0)}, q_i^{(0)})$ followed by the baseline methylation level of the i-th MB.

In some embodiments, the one or more regions of the genome are regions in which methylation variation exists in the population of cancer patients. For example, regions of the genome known to show methylation variation in the patient population of a specific cancer type can be obtained from public databases and used in the detection method of the present disclosure. For example, methylation array data for cancer tissue and paracancerous samples from various cancer patient populations can be downloaded from these public databases and used to determine the methylation signature regions closely associated with the occurrence of each cancer. The determination can be made using, for example, parametric tests (such as the t-test or linear regression) or non-parametric tests (such as the Wilcoxon rank-sum test). Examples of such public databases include TCGA (The Cancer Genome Atlas) and GEO (Gene Expression Omnibus).

In some embodiments, the one or more regions of the genome cover at least 0.3 Mb (megabases), for example at least 0.3 Mb, at least 0.4 Mb, at least 0.5 Mb, at least 0.6 Mb, at least 0.7 Mb, at least 0.8 Mb, at least 0.9 Mb, or at least 1.0 Mb.

In some embodiments, the one or more regions of the genome cover about 0.3-10.0 Mb of the genome, for example 0.3-5.0 Mb, 0.3-4.0 Mb, 0.3-3.0 Mb, 0.3-2.0 Mb, 0.3-1.5 Mb, 0.3-1.0 Mb, 0.4-5.0 Mb, 0.4-4.0 Mb, 0.4-3.0 Mb, 0.4-2.0 Mb, 0.4-1.5 Mb, 0.4-1.0 Mb, 0.5-5.0 Mb, 0.5-4.0 Mb, 0.5-3.0 Mb, 0.5-2.0 Mb, 0.5-1.5 Mb, 0.5-1.0 Mb, 1.0-5.0 Mb, 1.0-4.0 Mb, 1.0-3.0 Mb, 1.0-2.0 Mb, or 1.0-1.5 Mb. The above ranges include their endpoints and any subrange between them.

In some embodiments, the cancer is a solid tumor. Examples of solid tumors include, but are not limited to, lung cancer (including small cell lung cancer, non-small cell lung cancer, lung adenocarcinoma, and lung squamous cell carcinoma), colorectal cancer, liver cancer, ovarian cancer, pancreatic cancer, gallbladder cancer, gastric cancer, esophageal cancer, kidney cancer, melanoma, breast cancer, cervical cancer, endometrial cancer, prostate cancer, bladder cancer, testicular cancer, thyroid cancer, salivary gland cancer, skin cancer, squamous cell carcinoma, neuroblastoma, glioblastoma, retinoblastoma, lymphoma (including Hodgkin lymphoma and non-Hodgkin lymphoma), bone cancer, myeloma, basal cell carcinoma, peritoneal cancer, choriocarcinoma, eye cancer, head and neck cancer, laryngeal cancer, oral cancer, and rhabdomyosarcoma.

In some embodiments, the cancer may be selected from lung cancer, colorectal cancer, liver cancer, ovarian cancer, pancreatic cancer, gallbladder cancer, gastric cancer, and esophageal cancer.

In some embodiments, the cancer is a primary cancer. In other embodiments, the cancer is a secondary or metastatic cancer. The cancer may be at any stage of development, for example an early, middle, or late stage, or the cancer may be at clinical stage I, II, III, or IV.

In some embodiments, the cancer is lung cancer, for example non-small cell lung cancer (NSCLC). Where the cancer is lung cancer, the one or more detected regions of the genome may include one or more regions selected from those listed in Table 1.

Table 1.

Figure PCTCN2021092132-appb-000010

Figure PCTCN2021092132-appb-000011

Figure PCTCN2021092132-appb-000012

Figure PCTCN2021092132-appb-000013

Figure PCTCN2021092132-appb-000014

Figure PCTCN2021092132-appb-000015

Figure PCTCN2021092132-appb-000016

Figure PCTCN2021092132-appb-000017

Figure PCTCN2021092132-appb-000018

Figure PCTCN2021092132-appb-000019

Figure PCTCN2021092132-appb-000020

Figure PCTCN2021092132-appb-000021

Figure PCTCN2021092132-appb-000022

Figure PCTCN2021092132-appb-000023

In some embodiments, the cancer is lung cancer, and the one or more detected regions of the genome include at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 regions selected from those listed in Table 1. In some embodiments, the one or more detected regions of the genome include all the regions listed in Table 1.

In some embodiments, the cancer patient has previously undergone cancer treatment, for example surgery, radiotherapy, chemotherapy, targeted drug therapy, immunotherapy, or a combination thereof.

In some embodiments of the above method, the cancer tissue and paracancerous tissue used may be tissue surgically resected from the patient.

In some embodiments of the above method, the baseline methylation level is established using methylation data from normal tissue (for example, normal tissue from healthy donors), including:

for the i-th MB, denoting the methylation data of the normal tissue as $M_i^{(0)}$ and $N_i^{(0)}$, and assuming that, given $N_i^{(0)}$, $M_i^{(0)}$ follows a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:

$M_i^{(0)} \mid N_i^{(0)}, \beta_i^{(0)} \sim \mathrm{Binomial}(N_i^{(0)}, \beta_i^{(0)})$

$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$,

and calculating the parameters $p_i^{(0)}$ and $q_i^{(0)}$ by maximum likelihood.
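As a rough illustration of the baseline fit, the sketch below writes out the Beta-binomial log-likelihood for one MB across several normal samples and maximizes it with a crude coordinate search on the log scale. The disclosure only states that maximum likelihood is used; the optimizer, the function names, and the example counts are assumptions made for illustration, and a production analysis would more likely use a proper numerical optimizer.

```python
import math


def betabinom_logpmf(m, n, p, q):
    """Log-probability of m methylated reads out of n under Beta-binomial(p, q)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
    log_beta_post = math.lgamma(m + p) + math.lgamma(n - m + q) - math.lgamma(n + p + q)
    log_beta_prior = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return log_choose + log_beta_post - log_beta_prior


def baseline_loglik(samples, p, q):
    """Sum of log-probabilities over normal samples [(M, N), ...] for one MB."""
    return sum(betabinom_logpmf(m, n, p, q) for m, n in samples)


def fit_baseline(samples, start=(1.0, 1.0), rounds=60):
    """Crude coordinate search for (p, q) on the log scale (illustrative only)."""
    log_p, log_q = math.log(start[0]), math.log(start[1])
    best = baseline_loglik(samples, math.exp(log_p), math.exp(log_q))
    step = 1.0
    for _ in range(rounds):
        improved = False
        for d_p, d_q in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = baseline_loglik(samples, math.exp(log_p + d_p), math.exp(log_q + d_q))
            if cand > best:
                best, log_p, log_q, improved = cand, log_p + d_p, log_q + d_q, True
        if not improved:
            step /= 2.0              # no move helped: refine the step size
    return math.exp(log_p), math.exp(log_q)


# Hypothetical (M, N) counts for one MB in five normal samples.
normal_samples = [(1, 100), (10, 100), (3, 100), (7, 100), (2, 100)]
p_hat, q_hat = fit_baseline(normal_samples)
```

The ratio `p_hat / (p_hat + q_hat)` is the fitted mean baseline methylation level of the block, and larger `p_hat + q_hat` corresponds to less sample-to-sample variation.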

In some embodiments of the above method, the p-value is calculated in step c3 by a Wald test. In other embodiments, the p-value is calculated by, for example, a likelihood ratio test.

In some embodiments of the above method, the method is used to predict the postoperative recurrence risk and/or survival of the cancer patient. In some embodiments, patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.

In a second aspect, the present disclosure relates to a system for prognosing a cancer patient, the system comprising:

a methylation detection module; and

a prognostic analysis module,

wherein the methylation detection module is configured to detect, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient, and the prognostic analysis module is configured to prognose the patient by:

a. determining the difference in the methylation level between the cancer tissue and the paracancerous tissue; and

b. using the information on the difference in methylation level to prognose the patient by mathematical modeling,

wherein a smaller difference in the methylation level between the cancer tissue and the paracancerous tissue indicates a worse prognosis for the patient.

In some embodiments, the system is configured to carry out the prognostic method according to the first aspect of the present disclosure.

In some embodiments, the prognostic analysis module is configured to prognose the patient by:

a. determining differentially methylated blocks (DMBs) in the cancer tissue and the paracancerous tissue, including:

a1. for each CpG site within the detected regions, recording the number of methylated reads covering the site as M and the number of unmethylated reads as U;

a2. combining adjacent CpG sites and defining each combination as a methylation block (MB); for the i-th MB of the patient, summing M over all CpG sites to give $M_i$ and summing U over all CpG sites to give $U_i$, so that the total coverage is $N_i = M_i + U_i$ and the methylation level is $\beta_i = M_i / N_i$;

a3. for the i-th MB, denoting the methylation level in the cancer tissue as $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$ and the methylation level in the paracancerous tissue as $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$, wherein an MB whose $\beta_i^{(T)}$ and $\beta_i^{(A)}$ values differ significantly is determined to be a DMB;

b. prognosing the patient, including:

b1. introducing a parameter α that indicates the degree of similarity between the methylation levels of the patient's paracancerous tissue and cancer tissue, and calculating the patient's α value by the following algorithm:

Figure PCTCN2021092132-appb-000024

Figure PCTCN2021092132-appb-000025

where $0 < \alpha < 1$ and a larger α indicates a higher degree of similarity; f(·), g(·), and h(·) denote, respectively, the conditional distribution of the methylation level of the paracancerous tissue, the prior distribution of the methylation level of the cancer tissue, and the prior distribution of the baseline methylation level; and [Figure PCTCN2021092132-appb-000026] denotes the parameter family of the prior distributions;

b2. using maximum likelihood estimation over the range $0 < \alpha < 1$ to calculate the parameter estimate [Figure PCTCN2021092132-appb-000027], and calculating the variance of the parameter estimate [Figure PCTCN2021092132-appb-000028] from the Fisher information matrix:

Figure PCTCN2021092132-appb-000029

Figure PCTCN2021092132-appb-000030

b3. calculating, for each patient, the p-value of the null hypothesis α = 0 for the paracancerous tissue,

wherein a smaller p-value indicates a worse prognosis for the patient.

In some embodiments, the prognostic analysis module is configured to combine, in step a2, CpG sites separated by less than 50 bp, less than 100 bp, less than 150 bp, less than 200 bp, less than 250 bp, less than 300 bp, less than 350 bp, less than 400 bp, less than 450 bp, or less than 500 bp. In some embodiments, the prognostic analysis module is configured to combine, in step a2, CpG sites separated by less than 50 bp or less than 100 bp.

In some embodiments, the prognostic analysis module is configured to determine as DMBs, in step a3, those MBs for which $|\beta_i^{(T)} - \beta_i^{(A)}| > \sigma$ and

Figure PCTCN2021092132-appb-000031

where σ is a value between 0.05 and 1 and τ is a value greater than 0.1. In some embodiments, σ = 0.1 and τ = 0.4.

In some embodiments, the prognostic analysis module is configured to calculate the patient's α value in step b1 by the following algorithm:

$l(\alpha) = \sum_i \log L(\alpha; M_i^{(A)}, N_i^{(A)}, p_i^{(0)}, q_i^{(0)}, \beta_i^{(T)})$

Figure PCTCN2021092132-appb-000032

where $l(\alpha)$ is the log-likelihood function of the observed data, and $p_i^{(0)}$ and $q_i^{(0)}$ are the shape parameters of the Beta-binomial distribution $(p_i^{(0)}, q_i^{(0)})$ followed by the baseline methylation level of the i-th MB.

In some embodiments, the prognostic analysis module is further configured to establish the baseline methylation level using methylation data from normal tissue, including:

for the i-th MB, denoting the methylation data of the normal tissue as $M_i^{(0)}$ and $N_i^{(0)}$, and assuming that, given $N_i^{(0)}$, $M_i^{(0)}$ follows a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:

$M_i^{(0)} \mid N_i^{(0)}, \beta_i^{(0)} \sim \mathrm{Binomial}(N_i^{(0)}, \beta_i^{(0)})$

$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$,

and calculating the parameters $p_i^{(0)}$ and $q_i^{(0)}$ by maximum likelihood.

In some embodiments of the above system, the system is further configured to calculate the p-value in step b3 by a Wald test. In other embodiments, the p-value is calculated by, for example, a likelihood ratio test.

In some embodiments of the above system, the system is further configured to predict the postoperative recurrence risk and/or survival of the cancer patient. In some embodiments, patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.

In a third aspect, the present disclosure relates to a device for prognosing a cancer patient, comprising:

a memory for storing computer program instructions; and

a processor for executing the computer program instructions,

wherein, when the computer program instructions are executed by the processor, the device performs the method according to the first aspect of the present disclosure.

In a fourth aspect, the present disclosure relates to a computer-readable medium storing computer program instructions which, when executed by a processor, implement the method according to the first aspect of the present disclosure.

Description of the drawings

Figure 1 is a line graph of disease-free survival (DFS) for patients predicted by the MD ratio method to be at high or low risk of recurrence, based on detection of the genomic regions listed in Table 1 in the patients' cancer tissue and paracancerous tissue.

Figure 2 is a line graph of disease-free survival (DFS) for patients predicted by the MD ratio method to be at high or low risk of recurrence, based on detection of the genomic regions listed in Table 3 in the patients' cancer tissue and paracancerous tissue.

Figure 3 is a line graph of disease-free survival (DFS) for patients predicted by the MD ratio method to be at high or low risk of recurrence, based on detection of the genomic regions listed in Table 4 in the patients' cancer tissue and paracancerous tissue.

Detailed description

The present invention is further described below with reference to specific examples, from which its advantages and features will become clearer. These examples are illustrative only and do not limit the scope of the invention in any way. Those skilled in the art will understand that the details and form of the technical solution of the invention may be modified or substituted without departing from its spirit and scope, and that such modifications and substitutions fall within the scope of protection of the invention.

Data preparation

The main strategy of the MD ratio assessment system is to sequence the methylation of tissue samples and look for regions of the genome whose methylation status differs between the patient's cancer tissue and normal tissue (for example, paracancerous tissue). On the premise that paracancerous tissue is in the process of transforming from normal cells into cancer cells, the system calculates how closely the methylation level of the paracancerous tissue approximates that of the cancer tissue in the differential regions, and from this infers the risk that normal cells in the patient will become cancerous.

The MD ratio assessment system for cancer recurrence risk tests samples of the patient's cancer tissue and paracancerous tissue. Taking lung cancer as an example, in lobectomy and segmentectomy, paracancerous tissue is defined as tissue more than 5 cm from the resection margin; in wedge resection, paracancerous tissue is defined as tissue more than 3 cm from the resection margin and is verified by pathological assessment to contain no tumor cells. The cancer tissue and paracancerous tissue are of the same cell type. In addition, 68 normal tissue samples from healthy donors (age range 30-86 years; median age 57.5 years; 35 men and 33 women) were used to construct the baseline; these samples were likewise verified by pathological assessment to contain no tumor cells and to match the cell type of the cancer tissue.

Sample libraries were prepared using the brELSA™ method (Burning Rock Biotech, Guangzhou, China), which includes the following steps: 1) DNA extraction and purification; 2) sodium bisulfite treatment; 3) amplification of single-stranded DNA by DNA polymerase; 4) enrichment of the target regions shown in Table 1 (covering about 0.9 Mb of the human genome) using custom cancer methylation profiling RNA baits; 5) quantification of the target library by real-time PCR. Finally, sequencing was performed on an Illumina NovaSeq 6000 sequencer at an average sequencing depth of 1000×.

The raw sequencing output files were analyzed with the sequence alignment software BWA-meth and the methylation statistics software MethylDackel to obtain a methylation output file for each sample. This file contains the position of each CpG site within the specifically captured regions and the methylation information of the reads covering that site. The number of methylated reads covering a site is recorded as M, and the number of unmethylated reads as U. Adjacent CpG sites (separated by less than 50 bp) are combined, and the resulting set of CpG sites is called a methylation block (MB). For the i-th MB of the patient, M is summed over all sites to give $M_i$, U is summed over all sites to give $U_i$, the total coverage is denoted $N_i$ ($N_i = M_i + U_i$), and the methylation level is denoted $\beta_i$ (the moment estimate of $\beta_i$ is given in Figure PCTCN2021092132-appb-000033).
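The per-block quantities defined above can be computed as in the following minimal sketch. It assumes the per-site methylated and unmethylated read counts have already been extracted from the methylation output file; the function name `summarize_block` and the example counts are hypothetical.

```python
def summarize_block(cpg_counts):
    """Aggregate per-CpG read counts of one methylation block.

    cpg_counts: list of (methylated_reads, unmethylated_reads), one per CpG site.
    Returns (M_i, U_i, N_i, beta_i), with beta_i = M_i / N_i.
    """
    m_total = sum(m for m, _ in cpg_counts)
    u_total = sum(u for _, u in cpg_counts)
    n_total = m_total + u_total
    # A block with no covering reads has no defined methylation level.
    beta = m_total / n_total if n_total else float("nan")
    return m_total, u_total, n_total, beta


# Hypothetical block with three CpG sites.
m_i, u_i, n_i, beta_i = summarize_block([(30, 70), (50, 50), (20, 80)])
```

Pooling reads over the block, rather than averaging per-site ratios, weights each site by its coverage, which matches the definition $\beta_i = M_i / N_i$.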

Differentially methylated region screening, observed-data modeling, and parameter estimation

To measure the degree to which a patient's paracancerous tissue resembles the cancer tissue and differs from normal tissue, the analysis focuses on regions of the genome whose methylation differs between the cancer tissue and the paracancerous tissue. These are defined as differentially methylated blocks (DMBs).

To screen for the DMBs of an individual cancer patient, the methylation level $\beta^{(T)}$ of the patient's cancer tissue sample is tested against the methylation level $\beta^{(A)}$ of the paracancerous tissue sample. For the i-th MB, the methylation level in the cancer tissue is denoted $\beta_i^{(T)} = M_i^{(T)} / N_i^{(T)}$ and the methylation level in the paracancerous tissue is denoted $\beta_i^{(A)} = M_i^{(A)} / N_i^{(A)}$. MBs that satisfy $|\beta_i^{(T)} - \beta_i^{(A)}| > 0.1$ and

Figure PCTCN2021092132-appb-000034

are determined to be the patient's personalized DMBs.

To build the corresponding statistical model, normal tissue samples (as described in Example 1) are first used to establish a methylation baseline. For the i-th MB, the methylation data of the normal tissue are denoted $M_i^{(0)}$ and $N_i^{(0)}$, and, given $N_i^{(0)}$, $M_i^{(0)}$ is assumed to follow a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:

$M_i^{(0)} \mid N_i^{(0)}, \beta_i^{(0)} \sim \mathrm{Binomial}(N_i^{(0)}, \beta_i^{(0)})$

$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$,

where $\beta_i^{(0)}$ denotes the true baseline methylation level. The shape parameters $\{p_i, q_i\}$ are solved by maximum likelihood estimation (MLE), and the resulting estimates are denoted $\{p_i^{(0)}, q_i^{(0)}\}$.

For a cancer patient, the methylation sequencing data of the paracancerous tissue sample, $(M_i^{(A)}, N_i^{(A)}, \beta_i^{(A)})$, are assumed to follow a mixed Beta-binomial distribution, which can be written as:

$M_i^{(A)} \mid N_i^{(A)}, \beta_i^{(A)} \sim \mathrm{Binomial}(N_i^{(A)}, \beta_i^{(A)})$

$\beta_i^{(A)} = \alpha \beta_i^{(T)} + (1 - \alpha) \beta_i^{(0)}$

$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$

where $\beta_i^{(T)}$ denotes the methylation level of the patient's cancer tissue, which can be replaced by the moment estimate from the sequencing data of the cancer tissue sample, [Figure PCTCN2021092132-appb-000035]; $\beta_i^{(0)}$ denotes the baseline methylation level, which, as described above, follows a Beta distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$. α is a proportion parameter on [0, 1] that represents the similarity between the paracancerous tissue and the cancer tissue. The closer α is to 1, the closer the methylation level of the paracancerous tissue is to that of the cancer tissue, and the higher the inferred risk of recurrence for the patient.

From the above distributions, the log-likelihood function of the observed data can be written as:

$l(\alpha) = \sum_i \log L(\alpha; M_i^{(A)}, N_i^{(A)}, p_i^{(0)}, q_i^{(0)}, \beta_i^{(T)})$

Figure PCTCN2021092132-appb-000036

Figure PCTCN2021092132-appb-000037
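Because the explicit form of $L$ is only available as a figure, the sketch below evaluates a likelihood derived directly from the stated hierarchical model: the binomial probability of $M_i^{(A)}$ with success probability $\alpha \beta_i^{(T)} + (1-\alpha) b$, integrated over $b \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$. This marginalization, the midpoint-rule integration, and all function names are assumptions for illustration and may differ from the expression in the figures.

```python
import math


def log_binom_pmf(m, n, prob):
    """Log binomial probability, with prob clipped away from 0 and 1."""
    prob = min(max(prob, 1e-12), 1.0 - 1e-12)
    return (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
            + m * math.log(prob) + (n - m) * math.log(1.0 - prob))


def log_beta_pdf(x, p, q):
    """Log density of Beta(p, q) at x in (0, 1)."""
    return ((p - 1.0) * math.log(x) + (q - 1.0) * math.log(1.0 - x)
            + math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q))


def block_loglik(alpha, m_a, n_a, p0, q0, beta_t, nodes=2000):
    """log L for one block: integrate the baseline level b over its Beta prior."""
    h = 1.0 / nodes
    logs = []
    for k in range(nodes):
        b = (k + 0.5) * h                                # midpoint of the k-th cell
        beta_a = alpha * beta_t + (1.0 - alpha) * b      # mixture methylation level
        logs.append(log_binom_pmf(m_a, n_a, beta_a) + log_beta_pdf(b, p0, q0))
    top = max(logs)                                      # log-sum-exp for stability
    return top + math.log(sum(math.exp(v - top) for v in logs) * h)


def total_loglik(alpha, blocks):
    """l(alpha): sum over blocks given as (M_A, N_A, p0, q0, beta_T) tuples."""
    return sum(block_loglik(alpha, *blk) for blk in blocks)
```

With α = 0 the block likelihood reduces to the Beta-binomial baseline model, and with α = 1 it reduces to a plain binomial at the tumor methylation level, which gives two simple sanity checks for any implementation.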

参数α满足:The parameter α satisfies:

0<α<10<α<1

通过极大似然算法计算参数α,在0<α<1的范围内使用拟牛顿算法计算参数估计

Figure PCTCN2021092132-appb-000038
并通过fisher信息矩阵计算参数估计的方差
Figure PCTCN2021092132-appb-000039
Calculate the parameter α through the maximum likelihood algorithm, and use the quasi-Newton algorithm to calculate the parameter estimation in the range of 0<α<1
Figure PCTCN2021092132-appb-000038
And calculate the variance of parameter estimates through the fisher information matrix
Figure PCTCN2021092132-appb-000039

Figure PCTCN2021092132-appb-000040
Figure PCTCN2021092132-appb-000040

Figure PCTCN2021092132-appb-000041
Figure PCTCN2021092132-appb-000041

通过癌旁组织与癌症组织的相似性来推测癌症的复发风险:对参数α的零假设:α=0进行统计推断。利用Wald检验来推断零假设:Infer the risk of cancer recurrence through the similarity between adjacent tissues and cancer tissues: make statistical inferences on the null hypothesis of parameter α: α=0. Use Wald test to infer the null hypothesis:

Figure PCTCN2021092132-appb-000042
Figure PCTCN2021092132-appb-000042

Under a chi-square distribution with 1 degree of freedom, the p-value for $\alpha=0$ can be computed for each patient's paracancerous tissue. The smaller the p-value, the more similar the patient's paracancerous tissue is to the cancer tissue, and the greater the risk of future cancer recurrence.
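As a minimal sketch of this test, the snippet below assumes the Wald statistic takes its standard form, $W=\hat{\alpha}^2/\widehat{\mathrm{Var}}(\hat{\alpha})$. The patent's own expression is in the figure above and is not reproduced here. For one degree of freedom, the chi-square upper-tail probability equals `erfc(sqrt(W/2))`, so only the Python standard library is needed. The function names and the `threshold` parameter are hypothetical.

```python
import math

def wald_p_value(alpha_hat, var_alpha_hat):
    """Wald test of H0: alpha = 0 against a chi-square with 1 degree of freedom."""
    w = alpha_hat ** 2 / var_alpha_hat
    # For 1 degree of freedom, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p

def risk_group(p_value, threshold=0.05):
    """Label a patient as high or low recurrence risk from the p-value."""
    return "high" if p_value < threshold else "low"
```

For example, `wald_p_value(0.02, 0.0001)` gives a statistic of about 4 and a p-value below 0.05, which `risk_group` labels as `"high"`.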

Numerical simulation results for the cancer recurrence risk MD ratio

A beta-binomial mixture model was used to generate simulated data, with the parameters set to $N_i=1000$, $m_i=1000$, $p_i^{(0)}=11$ and $q_i^{(0)}=383$, where the values of $p_i^{(0)}$ and $q_i^{(0)}$ were taken from the maximum likelihood estimates obtained on normal tissue samples. $\alpha$ was set to 0, 0.001, 0.003, 0.01, 0.003 and 0.1 in turn, and each parameter setting was repeated 50 times. The numerical simulation results are shown in Table 2 below.
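A data generator of this kind might look like the following sketch. It assumes the paracancerous level of each block is the mixture $\alpha\beta^{(T)}+(1-\alpha)\beta_i^{(0)}$ with $\beta_i^{(0)}$ drawn from Beta(11, 383), and it reads m as the number of blocks per simulated patient. Both are interpretations, not statements from the text. The cancer-tissue level `beta_t=0.5` is a purely hypothetical value, since the simulation settings above do not state one.

```python
import random

def simulate_patient(alpha, m=1000, n=1000, p0=11.0, q0=383.0,
                     beta_t=0.5, seed=0):
    """Return a list of (M_A, N_A) pairs, one per simulated methylation block.

    beta_t is a hypothetical cancer-tissue methylation level shared by all blocks.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        beta0 = rng.betavariate(p0, q0)                  # baseline level of this block
        beta_a = alpha * beta_t + (1.0 - alpha) * beta0  # assumed mixture form
        # Binomial draw written as n Bernoulli trials (standard library only).
        m_a = sum(rng.random() < beta_a for _ in range(n))
        data.append((m_a, n))
    return data
```

Calling `simulate_patient(0.01, seed=k)` for 50 different seeds `k` would mirror the 50 repetitions per parameter setting.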

Table 2.

Figure PCTCN2021092132-appb-000043

Here, Bias denotes the mean of

Figure PCTCN2021092132-appb-000044

MSE denotes the mean of

Figure PCTCN2021092132-appb-000045

Std denotes the standard deviation of

Figure PCTCN2021092132-appb-000046

Wald denotes the median of the Wald test statistic, and Power denotes the probability that p-value < 0.05. The table shows that the error of the maximum likelihood algorithm is very small; the false positive rate is 0.06 and the test power is 1, in line with expectations.

Example 2. Real-sample results for the cancer recurrence risk MD ratio

Cancer tissue samples and paracancerous tissue samples from 39 patients with stage IA non-small cell lung cancer (age range 40-82 years; median age 61 years; 24 men and 15 women) were tested using the MD ratio evaluation system described above. The methylation levels of the genomic regions listed in Table 1 were measured in the cancer tissue and the paracancerous tissue, and each patient's p-value was calculated by the above algorithm. Using p-value < 0.05 as the criterion, the patients were divided into two groups, high risk of recurrence and low risk of recurrence; their disease-free survival (DFS) results are shown in Figure 1.

As Figure 1 shows, high-risk patients relapsed sooner than low-risk patients, and the difference in DFS between the two groups was statistically significant (p-value = 0.039). These results demonstrate that the MD ratio evaluation system can be used to accurately predict the recurrence risk and subsequent survival of cancer patients.

To further verify the prognostic performance of the MD ratio evaluation system in cancer patients, the methylation levels of the genomic regions listed in Table 3 and Table 4 were measured in the cancer tissue and the paracancerous tissue. These tables comprise 522 and 532 regions randomly selected from Table 1, covering about 0.47 Mb and about 0.48 Mb of the genome, respectively. Each patient's p-value was calculated by the same algorithm, and using p-value < 0.05 as the criterion the patients were divided into high risk of recurrence and low risk of recurrence; their disease-free survival (DFS) results are shown in Figure 2 and Figure 3, respectively.

Table 3.

Figure PCTCN2021092132-appb-000047

Figure PCTCN2021092132-appb-000048

Figure PCTCN2021092132-appb-000049

Figure PCTCN2021092132-appb-000050

Figure PCTCN2021092132-appb-000051

Figure PCTCN2021092132-appb-000052

Figure PCTCN2021092132-appb-000053

Table 4.

Figure PCTCN2021092132-appb-000054

Figure PCTCN2021092132-appb-000055

Figure PCTCN2021092132-appb-000056

Figure PCTCN2021092132-appb-000057

Figure PCTCN2021092132-appb-000058

Figure PCTCN2021092132-appb-000059

Figure PCTCN2021092132-appb-000060

To compare the effectiveness of the MD ratio evaluation system with conventional tumor driver gene testing for prognosing cancer patients, somatic mutation testing was used to assess the correlation of the main mutation types, EGFR 19del and EGFR L858R, with DFS, and this was compared with the correlation between the MD ratio results and DFS. The results are shown in Table 5, where MD ratio-1 denotes the result obtained by testing the regions shown in Table 1 (corresponding to Figure 1), and MD ratio-2 and MD ratio-3 denote the results obtained by testing the regions shown in Table 3 and Table 4, respectively (corresponding to Figure 2 and Figure 3).

Table 5.

                p-value    hazard ratio (HR)
EGFR L858R      0.230      0.464
EGFR 19del      0.130      2.442
MD ratio-1      0.039      4.692
MD ratio-2      0.030      3.401
MD ratio-3      0.018      4.453

These results indicate that the MD ratio evaluation system assesses a patient's cancer recurrence risk and subsequent survival more effectively than somatic mutation testing, and can play a better role in the patient's prognosis management and clinical treatment.

Claims (10)

1. A method for prognosing a cancer patient, the method comprising:
a. detecting, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient;
b. determining the difference in the methylation level between the cancer tissue and the paracancerous tissue; and
c. using the information on the difference in the methylation level to prognose the patient by mathematical modeling,
wherein a smaller difference in the methylation level between the cancer tissue and the paracancerous tissue indicates a worse prognosis for the patient.

2. The method of claim 1, comprising:
a. detecting, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient;
b. determining differentially methylated blocks (DMBs) in the cancer tissue and the paracancerous tissue, comprising:
b1. for each CpG site in the detected regions, denoting the number of methylated reads covering the site as M and the number of unmethylated reads as U;
b2. combining adjacent CpG sites and defining them as a methylation block (MB); summing M over all CpG sites in the patient's i-th MB, denoted $M_i$; summing all U, denoted $U_i$; the total coverage being $N_i=M_i+U_i$ and the methylation level being $\beta_i=M_i/N_i$;
b3. for the i-th MB, denoting the methylation level in the cancer tissue as $\beta_i^{(T)}=M_i^{(T)}/N_i^{(T)}$ and the methylation level in the paracancerous tissue as $\beta_i^{(A)}=M_i^{(A)}/N_i^{(A)}$, wherein an MB with a significant difference between $\beta_i^{(T)}$ and $\beta_i^{(A)}$ is determined to be a DMB;
c. prognosing the patient, comprising:
c1. introducing a parameter $\alpha$ indicating the degree of similarity between the methylation levels of the patient's paracancerous tissue and cancer tissue, and computing the patient's $\alpha$ value by the following algorithm:
Figure PCTCN2021092132-appb-100001
Figure PCTCN2021092132-appb-100002
wherein $0<\alpha<1$ and a larger value of $\alpha$ indicates a higher degree of similarity, wherein f(·), g(·) and h(·) denote, respectively, the conditional distribution of the methylation level of the paracancerous tissue, the prior distribution of the methylation level of the cancer tissue, and the prior distribution of the baseline methylation level, and
Figure PCTCN2021092132-appb-100003
denotes the parameter family of the prior distributions;
c2. within the range $0<\alpha<1$, using a maximum likelihood estimation algorithm to compute the parameter estimate
Figure PCTCN2021092132-appb-100004
and computing the variance of the parameter estimate
Figure PCTCN2021092132-appb-100005
from the Fisher information matrix:
Figure PCTCN2021092132-appb-100006
Figure PCTCN2021092132-appb-100007
c3. computing, for each patient's paracancerous tissue, the p-value of the null hypothesis $\alpha=0$,
wherein a smaller p-value indicates a worse prognosis for the patient.
3. The method of claim 2, optionally comprising any one or more of the following features:
(1) in step b2, CpG sites separated by a distance of less than 50 bp or less than 100 bp are combined;
(2) in step b3, an MB with $|\beta_i^{(T)}-\beta_i^{(A)}|>\sigma$ and
Figure PCTCN2021092132-appb-100008
is determined to be a DMB, wherein $\sigma$ is a value between 0.05 and 1 and $\tau$ is a value greater than 0.1; preferably, $\sigma=0.1$ and $\tau=0.4$;
(3) in step c1, the patient's $\alpha$ value is computed by the following algorithm:
Figure PCTCN2021092132-appb-100009
Figure PCTCN2021092132-appb-100010
wherein $l(\alpha)$ is the log-likelihood function of the observed data, and $p_i^{(0)}$, $q_i^{(0)}$ are the shape parameters of the Beta-binomial distribution $(p_i^{(0)}, q_i^{(0)})$ followed by the baseline methylation level of the i-th MB;
(4) in step c3, the p-value is computed by a Wald test;
(5) patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.
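The block construction of steps b1 and b2 and the first DMB criterion of feature (2) can be sketched as follows. The sketch assumes that "distance" means the gap between consecutive CpG positions on the same chromosome, and it implements only the $|\beta_i^{(T)}-\beta_i^{(A)}|>\sigma$ condition; the second condition involving $\tau$ is given in a figure and is not reproduced. The function names and the input layout are hypothetical.

```python
def build_blocks(sites, max_gap=50):
    """Merge nearby CpG sites into methylation blocks (MBs).

    sites: iterable of (position, M, U) tuples on one chromosome, where M and U
    are the methylated and unmethylated read counts at that CpG site.
    Returns one (M_i, U_i, N_i, beta_i) tuple per block.
    """
    blocks = []
    prev_pos = None
    for pos, m, u in sorted(sites):
        if prev_pos is not None and pos - prev_pos < max_gap:
            blocks[-1][0] += m  # extend the current block
            blocks[-1][1] += u
        else:
            blocks.append([m, u])  # start a new block
        prev_pos = pos
    result = []
    for m, u in blocks:
        n = m + u
        beta = m / n if n else float("nan")  # undefined when coverage is zero
        result.append((m, u, n, beta))
    return result

def exceeds_sigma(beta_t, beta_a, sigma=0.1):
    """First DMB criterion only: |beta_T - beta_A| > sigma."""
    return abs(beta_t - beta_a) > sigma
```

Applying `build_blocks` to the same set of CpG positions in the cancer and paracancerous samples keeps the blocks aligned, so that `exceeds_sigma` compares like with like.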
4. The method of any one of claims 1-3, optionally comprising any one or more of the following features:
(1) the one or more regions of the genome are regions of the genome in which methylation variation exists in the population of cancer patients;
(2) the one or more regions of the genome cover at least 0.3 Mb (megabases) of the genome, for example 0.3 Mb to 5 Mb;
(3) the method is used to predict the postoperative recurrence risk and/or survival of the cancer patient.

5. The method of any one of claims 1-4, wherein the cancer is selected from lung cancer, colorectal cancer, liver cancer, ovarian cancer, pancreatic cancer, gallbladder cancer, gastric cancer and esophageal cancer; preferably, wherein the cancer is lung cancer, for example non-small cell lung cancer (NSCLC); preferably, wherein the one or more regions of the genome include one or more regions selected from those listed in Table 1; preferably, wherein the cancer tissue and the paracancerous tissue are tissues surgically removed from the patient.
6. The method of any one of claims 2-5, wherein the baseline methylation level is established using methylation data from normal tissue, comprising:
for the i-th MB, denoting the methylation data of the normal tissue as $M_i^{(0)}$ and $N_i^{(0)}$, and taking $M_i^{(0)}$, given $N_i^{(0)}$, to follow a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:
$M_i^{(0)} \mid N_i^{(0)}, \beta_i^{(0)} \sim \mathrm{Binomial}(N_i^{(0)}, \beta_i^{(0)})$
$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$,
and computing the parameters $p_i^{(0)}$ and $q_i^{(0)}$ by a maximum likelihood algorithm.

7. A system for prognosing a cancer patient, the system comprising:
a methylation detection module; and
a prognostic analysis module,
wherein the methylation sequencing module is configured to detect, by high-throughput sequencing, the methylation level in one or more regions of the genome of cancer tissue and paracancerous tissue from the patient, and the prognostic analysis module is configured to prognose the patient by the following method:
a. determining the difference in the methylation level between the cancer tissue and the paracancerous tissue; and
b. using the information on the difference in the methylation level to prognose the patient by mathematical modeling,
wherein a smaller difference in the methylation level between the cancer tissue and the paracancerous tissue indicates a worse prognosis for the patient.
8. The system of claim 7, wherein the prognostic analysis module is configured to prognose the patient by the following method:
a. determining differentially methylated blocks (DMBs) in the cancer tissue and the paracancerous tissue, comprising:
a1. for each CpG site in the detected regions, denoting the number of methylated reads covering the site as M and the number of unmethylated reads as U;
a2. combining adjacent CpG sites and defining them as a methylation block (MB); summing M over all CpG sites in the patient's i-th MB, denoted $M_i$; summing all U, denoted $U_i$; the total coverage being $N_i=M_i+U_i$ and the methylation level being $\beta_i=M_i/N_i$;
a3. for the i-th MB, denoting the methylation level in the cancer tissue as $\beta_i^{(T)}=M_i^{(T)}/N_i^{(T)}$ and the methylation level in the paracancerous tissue as $\beta_i^{(A)}=M_i^{(A)}/N_i^{(A)}$, wherein an MB with a significant difference between $\beta_i^{(T)}$ and $\beta_i^{(A)}$ is determined to be a DMB;
b. prognosing the patient, comprising:
b1. introducing a parameter $\alpha$ indicating the degree of similarity between the methylation levels of the patient's paracancerous tissue and cancer tissue, and computing the patient's $\alpha$ value by the following algorithm:
Figure PCTCN2021092132-appb-100011
wherein $0<\alpha<1$ and a larger value of $\alpha$ indicates a higher degree of similarity, wherein f(·), g(·) and h(·) denote, respectively, the conditional distribution of the methylation level of the paracancerous tissue, the prior distribution of the methylation level of the cancer tissue, and the prior distribution of the baseline methylation level, and
Figure PCTCN2021092132-appb-100012
denotes the parameter family of the prior distributions;
b2. within the range $0<\alpha<1$, using a maximum likelihood estimation algorithm to compute the parameter estimate
Figure PCTCN2021092132-appb-100013
and computing the variance of the parameter estimate
Figure PCTCN2021092132-appb-100014
from the Fisher information matrix:
Figure PCTCN2021092132-appb-100015
Figure PCTCN2021092132-appb-100016
b3. computing, for each patient's paracancerous tissue, the p-value of the null hypothesis $\alpha=0$,
wherein a smaller p-value indicates a worse prognosis for the patient;
preferably, the prognostic analysis module is configured to combine, in step a2, CpG sites separated by a distance of less than 50 bp or less than 100 bp;
preferably, the prognostic analysis module is configured to determine, in step a3, an MB with $|\beta_i^{(T)}-\beta_i^{(A)}|>\sigma$ and
Figure PCTCN2021092132-appb-100017
to be a DMB, wherein $\sigma$ is a value between 0.05 and 1 and $\tau$ is a value greater than 0.1; preferably, wherein $\sigma=0.1$ and $\tau=0.4$;
preferably, the prognostic analysis module is configured to compute, in step b1, the patient's $\alpha$ value by the following algorithm:
Figure PCTCN2021092132-appb-100018
Figure PCTCN2021092132-appb-100019
wherein $l(\alpha)$ is the log-likelihood function of the observed data, and $p_i^{(0)}$, $q_i^{(0)}$ are the shape parameters of the Beta-binomial distribution $(p_i^{(0)}, q_i^{(0)})$ followed by the baseline methylation level of the i-th MB;
preferably, the prognostic analysis module is further configured to establish the baseline methylation level using methylation data from normal tissue, comprising:
for the i-th MB, denoting the methylation data of the normal tissue as $M_i^{(0)}$ and $N_i^{(0)}$, and taking $M_i^{(0)}$, given $N_i^{(0)}$, to follow a Beta-binomial distribution with shape parameters $(p_i^{(0)}, q_i^{(0)})$:
$M_i^{(0)} \mid N_i^{(0)}, \beta_i^{(0)} \sim \mathrm{Binomial}(N_i^{(0)}, \beta_i^{(0)})$
$\beta_i^{(0)} \sim \mathrm{Beta}(p_i^{(0)}, q_i^{(0)})$,
and computing the parameters $p_i^{(0)}$ and $q_i^{(0)}$ by a maximum likelihood algorithm;
preferably, the prognostic analysis module is configured to compute the p-value in step b3 by a Wald test;
preferably, the system is used to predict the postoperative recurrence risk and/or survival of the cancer patient;
preferably, patients with p < 0.05 are identified as having a high risk of recurrence and/or low postoperative survival.
9. A device for prognosing a cancer patient, comprising:
a memory for storing computer program instructions; and
a processor for executing the computer program instructions,
wherein, when the computer program instructions are executed by the processor, the device performs the method of any one of claims 1-6.

10. A computer-readable medium storing computer program instructions, wherein the method of any one of claims 1-6 is implemented when the computer program instructions are executed by a processor.
PCT/CN2021/092132 2020-05-09 2021-05-07 Cancer prognostic method Ceased WO2021227950A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010385996.8 2020-05-09
CN202010385996 2020-05-09

Publications (1)

Publication Number Publication Date
WO2021227950A1 true WO2021227950A1 (en) 2021-11-18

Family

ID=77132558

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/092132 Ceased WO2021227950A1 (en) 2020-05-09 2021-05-07 Cancer prognostic method

Country Status (2)

Country Link
CN (2) CN113234825B (en)
WO (1) WO2021227950A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024007205A1 (en) * 2022-07-06 2024-01-11 何肇基 Method and system for establishing indicator for assessing degree of malignancy of tissue microenvironment, and method and system for using indicator for assessing degree of malignancy of tissue microenvironment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5760247B2 (en) * 2005-12-28 2015-08-05 国立大学法人名古屋大学 Composition and method for predicting postoperative prognosis or metastatic potential of cancer patients
US20120004855A1 (en) * 2008-12-23 2012-01-05 Koninklijke Philips Electronics N.V. Methylation biomarkers for predicting relapse free survival
CN105849282A (en) * 2014-01-03 2016-08-10 深圳华大基因研究院 A minimally invasive approach to postoperative monitoring of cancer patients
CN107326065B (en) * 2016-04-29 2022-07-29 博尔诚(北京)科技有限公司 Screening method and application of gene marker
CN107034295B (en) * 2017-06-05 2021-04-06 天津医科大学肿瘤医院 DNA methylation index for early diagnosis and risk evaluation of cancer and application thereof
CN110257524A (en) * 2019-08-01 2019-09-20 浙江大学 It is a kind of distinguish colorectal cancer cancerous tissue and Carcinoma side normal tissue colorectal cancer discrimination model and its construction method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108064314A (en) * 2015-01-18 2018-05-22 加利福尼亚大学董事会 Method and system for determining cancer status
CN108779487A (en) * 2015-11-16 2018-11-09 普罗格尼迪公司 Nucleic acid for detecting methylation state and method
CN107247873A (en) * 2017-03-29 2017-10-13 电子科技大学 A kind of recognition methods of differential methylation site
CN111094590A (en) * 2017-07-12 2020-05-01 大学健康网络 Cancer detection and classification using methylation component analysis
CN107451420A (en) * 2017-07-26 2017-12-08 同济大学 The differential methylation parser of purity effect is considered based on DNA methylation data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CURTIUS KIT, WRIGHT NICHOLAS A., GRAHAM TREVOR A.: "An evolutionary perspective on field cancerization", NATURE REVIEWS CANCER, NATURE PUB. GROUP, LONDON, vol. 18, no. 1, 1 January 2018 (2018-01-01), London , pages 19 - 32, XP055866546, ISSN: 1474-175X, DOI: 10.1038/nrc.2017.102 *

Also Published As

Publication number Publication date
CN119193839A (en) 2024-12-27
CN113234825B (en) 2024-11-19
CN113234825A (en) 2021-08-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21803354

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21803354

Country of ref document: EP

Kind code of ref document: A1