
WO2024178248A1 - Pan-cancer early detection and cfDNA methylation in the setting of minimal residual disease - Google Patents


Info

Publication number
WO2024178248A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
target regions
samples
methylation
regions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2024/016941
Other languages
English (en)
Inventor
Bodour Salhia
David Buckley
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Southern California USC
Original Assignee
University of Southern California USC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Southern California USC
Priority to AU2024226284A
Publication of WO2024178248A1
Anticipated expiration
Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/40 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers

Definitions

  • Cancer is the second leading cause of death in children aged 1-14 years in the United States, with approximately 11,000 new cases and 1,200 deaths annually.
  • The 5-year overall survival rate for pediatric cancer has risen dramatically in recent years, from 58% in 1975 to just under 85% in 2020.
  • Neuroblastoma, the third most common form of pediatric cancer after leukemia and central nervous system (CNS) tumors, has a 5-year overall survival rate of 75%, which plummets to 20% after the first disease recurrence.
  • the improvement in overall survival for metastatic pediatric cancers has been mixed over the last few decades, with neuroblastoma showing improvement in prognosis due to treatment advances, while others like rhabdomyosarcoma and Ewing sarcoma have had minimal improvement.
  • a key factor limiting progress in this is the rarity of many of these cancers, which limits research opportunities and hampers clinical trials.
  • Compared to adult-onset cancer, pediatric cancers typically have much lower mutational burden. While pediatric cancer genomes are characteristically "quiet" with a low mutational burden, the epigenome of many pediatric cancers appears "loud", with many driver mutations occurring in chromatin modifiers, such as SMARCB1 in malignant rhabdoid tumors, alongside widespread DNA methylation and histone marker changes. In light of this, targeting epigenetic modulators has been explored as a potential therapeutic option for multiple pediatric cancers.
  • chromatin modifiers such as SMARCB1 in malignant rhabdoid tumors
  • DNA methylation changes are widespread and common in cancer and are among the earliest aberrations to occur in tumorigenesis.
  • cancer methylomes display genome-wide hypomethylation and focal hypermethylation, particularly at promoter loci.
  • The effectiveness of targeting DNA demethyltransferase inhibitors has been explored in multiple cancers - notably neuroblastoma, Ewing sarcoma, and AML.
  • WGBS: whole genome bisulfite sequencing
  • a method for determining whether a subject is likely to have or develop a pediatric cancer, adult cancer, or Minimal Residual Disease (MRD) comprises the steps of: a) training a machine learning model to detect the pediatric cancer, the adult cancer, or the MRD, wherein the machine learning model is trained using target regions from a plurality of cancer samples and corresponding target regions from non-cancerous samples, wherein the cancer samples comprise at least two different cancer types, wherein the machine learning model is configured to identify the pediatric cancer, the adult cancer, or the MRD based on a comparison of a methylation pattern of target regions of the cancer samples with a methylation pattern of corresponding target regions of the non-cancerous samples; b) determining a methylation pattern of target regions of a deoxyribonucleic acid (DNA) sample obtained from the subject; c) applying the trained machine learning model to the methylation pattern of the target regions of the DNA obtained from the subject; and d) determining that the subject has or does not have the pediatric cancer, the adult cancer, or the MRD based on an output of the machine learning model.
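Steps a) to d) of the method can be illustrated with a minimal sketch. The figure descriptions on this page mention a random forest classifier; to stay dependency-free, the sketch below swaps in a much simpler nearest-centroid classifier, which is an assumption made purely for illustration. The function names (`train_centroids`, `predict`) and all beta values are hypothetical and do not come from the application.

```python
from statistics import mean

def train_centroids(samples, labels):
    """Step a: compute the mean methylation (beta) vector of each class."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, lab in zip(samples, labels) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, pattern):
    """Steps c-d: return the class whose centroid is closest to the pattern."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pattern))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical beta values (0..1) over three target regions.
# The cancer rows stand for two different cancer types, as the method requires.
train = [
    [0.90, 0.80, 0.10],  # cancer, type 1
    [0.85, 0.90, 0.20],  # cancer, type 2
    [0.20, 0.10, 0.80],  # non-cancerous
    [0.10, 0.20, 0.90],  # non-cancerous
]
labels = ["cancer", "cancer", "normal", "normal"]
model = train_centroids(train, labels)

# Step b: methylation pattern measured in the subject's DNA (hypothetical).
subject_pattern = [0.80, 0.70, 0.30]
call = predict(model, subject_pattern)
```

With these toy numbers, `call` is `"cancer"` because the subject's pattern lies closer to the cancer centroid than to the normal one. A real model would be trained on many regions and samples and validated on held-out data.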
  • DNA: deoxyribonucleic acid
  • the methylation pattern of each of the plurality of target regions is determined using DNA methylation analysis, wherein the DNA methylation analysis comprises one or more of whole genome bisulfite sequencing (WGBS), reduced representation bisulfite sequencing (RRBS), targeted bisulfite sequencing, hybridization probe capture, methylation bead arrays, and enzymatic methyl-sequence conversion.
  • the methylation pattern of the plurality of target regions is determined using hybridization probe capture after whole genome bisulfite sequencing.
  • the hybridization probe capture comprises one or more probes that hybridize to the one or more target genomic regions, wherein each of the one or more probes comprises ribonucleic acid or deoxyribonucleic acid, and optionally, wherein each of the one or more probes comprises an affinity tag selected from the group consisting of biotin and streptavidin.
  • the target regions of steps a-c comprise about 30% to about 50% of the target regions of Table 1, about 50% to about 70% of the target regions of Table 1, about 70% to about 90% of the target regions of Table 1, about 90% to about 95% of the target regions of Table 1, or the plurality of target regions comprises greater than about 95% of the target regions of Table 1.
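The tiers recited above amount to asking what fraction of a reference region list a given panel covers. A minimal sketch, assuming regions are identified by simple string names; Table 1 is not reproduced on this page, so the region names below are hypothetical, and the word "about" is not modeled (exact bounds are used).

```python
def panel_coverage(used_regions, table_regions):
    """Fraction of the reference regions that are present in the used panel."""
    table = set(table_regions)
    return len(table & set(used_regions)) / len(table)

def coverage_tier(fraction):
    """Map a coverage fraction to the tiers recited in the text."""
    if fraction > 0.95:
        return ">95%"
    tiers = [(0.30, 0.50, "30-50%"), (0.50, 0.70, "50-70%"),
             (0.70, 0.90, "70-90%"), (0.90, 0.95, "90-95%")]
    for low, high, name in tiers:
        if low <= fraction <= high:
            return name
    return "below 30%"

# Hypothetical stand-in for Table 1: ten named regions, six of them used.
table_1 = [f"region_{i}" for i in range(10)]
panel = table_1[:6]
fraction = panel_coverage(panel, table_1)
tier = coverage_tier(fraction)
```

Here `fraction` is 0.6, which falls in the "50-70%" tier. A value sitting exactly on a shared boundary (for example 0.50) is assigned to the lower tier in this sketch, which is an arbitrary choice.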
  • FIG. 1A-E Analysis of differential DNA methylation patterns in pediatric cancers.
  • A Hierarchical clustering of beta values across 183 highly variable DMRs in 31 tumor samples. Row annotation bar denotes diagnosis; column annotation bar denotes DMR cluster as determined by k-means clustering.
  • B Uniform manifold approximation and projection (UMAP) per sample based on the 183 most variable DMRs as in C.
  • C Heatmap of beta values of 166/183 regions in C (17 regions from C were dropped due to limitations of 450k array) from 526 TARGET samples and 17 POETIC samples using.
  • D UMAP of TARGET & POETIC samples across regions in E by diagnosis and (E) source.
  • FIG. 2A-C WGBS analysis reveals pan pediatric cancer DNA methylation profile.
  • B Box plots of average beta values for hypo and hypermethylated mDMRs. Tumor/normal comparisons are statistically significant (p < 0.0001; Wilcoxon rank sum test).
  • C Receiver operating characteristic (ROC) curve of random forest classifier model constructed using mDMRs.
  • ROC curve annotated with area under the curve (AUC). Significance annotation: ‘ns’: p > 0.05, ‘*’: p < 0.05, ‘**’: p < 0.01, ‘***’: p < 0.001, ‘****’: p < 0.0001.
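For readers unfamiliar with the AUC annotation, the area under a ROC curve equals the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the scores and labels are hypothetical, and this is not claimed to be how the values in the figures were produced.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 = tumor, 0 = normal.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

In this toy example five of the six tumor/normal pairs are ranked correctly, so `auc` is 5/6. The pairwise form is quadratic in sample count, which is fine for illustration but slow for large cohorts.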
  • FIG. 3 mDMR differential methylation in TARGET and ENCODE datasets. Plots show mean beta values across hypermethylated (left) and hypomethylated (right) mDMRs in MRT, NBL, OS, and WT tumor samples from the TARGET database. Normal tissues from POETIC and normal tissues from ENCODE were used as controls; all comparisons were statistically significant (Wilcoxon p-value < 0.0001) and consistent with the directionality observed in POETIC samples. Significance annotation: ‘ns’: p > 0.05, ‘*’: p < 0.05, ‘**’: p < 0.01, ‘***’: p < 0.001, ‘****’: p < 0.0001.
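The significance annotation used in the figure legends is a simple threshold lookup. A minimal sketch follows; the function name is hypothetical, and because the legend does not say how a p-value of exactly 0.05 is labeled, treating it as ‘ns’ is an assumption.

```python
def significance_label(p):
    """Map a p-value to the star annotation used in the figure legends."""
    if p < 0.0001:
        return "****"
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumption: p == 0.05 is also reported as not significant

annotations = [significance_label(p) for p in (0.2, 0.03, 0.005, 0.0005, 0.00001)]
```

The thresholds are checked from the strictest to the loosest so that each p-value receives the largest number of stars it qualifies for.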
  • FIG. 4A-C mDMRs detected in multiple adult cancers from TCGA.
  • A ROC curves from TCGA 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array). Plots annotated with TCGA cancer code and AUC.
  • B Graphical representation of AUC in A with 95% CI indicated as error bars. See the GDC website for study abbreviation disambiguation (gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations).
  • C Tumor and normal mean beta values across hypermethylated (top) and hypomethylated (bottom) mDMRs in all 14 adult cancer types. All comparisons are statistically significant (Wilcoxon p-value < 0.0001), except the PAAD hypomethylated comparison, and consistent with the directionality observed in POETIC samples. Significance annotation: ‘ns’: p > 0.05, ‘*’: p < 0.05, ‘**’: p < 0.01, ‘***’: p < 0.001, ‘****’: p < 0.0001.
  • FIG. 5A-C Overlapping DMRs between plasma and tissue.
  • A Total DMRs called in cell-free DNA (cfDNA). Stacked histograms displaying number of DMRs per sample or
  • B percent of DMRs called in cfDNA that overlap with at least one DMR called in patient-matched tumor tissue. Light grey represents the hypermethylated regions and the dark grey represents the hypomethylated regions.
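The quantity in panel B, the percent of cfDNA DMRs that overlap at least one tumor DMR, can be sketched as an interval-overlap count. The sketch assumes each DMR is a `(chromosome, start, end)` tuple with half-open coordinates; that representation and the coordinates below are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def percent_overlapping(cfdna_dmrs, tumor_dmrs):
    """Percent of cfDNA DMRs that overlap at least one tumor DMR."""
    if not cfdna_dmrs:
        return 0.0
    hits = sum(any(overlaps(c, t) for t in tumor_dmrs) for c in cfdna_dmrs)
    return 100.0 * hits / len(cfdna_dmrs)

# Hypothetical DMR calls for one patient.
cfdna = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200), ("chr3", 10, 20)]
tumor = [("chr1", 150, 250), ("chr2", 300, 400), ("chr3", 0, 15)]
pct = percent_overlapping(cfdna, tumor)
```

Two of the four cfDNA DMRs overlap a tumor DMR here, so `pct` is 50.0. The all-pairs comparison is adequate for small lists; genome-scale data would normally use sorted intervals or an interval tree.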
  • FIG. 6A-B Common features identified in solid tumors serve as biomarkers in plasma.
  • A/B Mean beta from cfDNA across 402 hypomethylated and 905 hypermethylated mDMRs identified in tumor samples. All regions across all healthy and disease plasma samples are represented.
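A mean beta across a set of regions can be sketched as follows. The sketch assumes sequencing-style data in which the beta value of a region is the fraction of methylated reads among all reads covering it; the page does not spell out this formula, and the read counts are hypothetical.

```python
def beta_value(methylated, unmethylated):
    """Beta = methylated reads / total reads; None when the region has no coverage."""
    total = methylated + unmethylated
    return methylated / total if total else None

def mean_beta(region_counts):
    """Average beta over the regions that have coverage."""
    betas = [beta_value(m, u) for m, u in region_counts]
    covered = [b for b in betas if b is not None]
    return sum(covered) / len(covered) if covered else None

# Hypothetical (methylated, unmethylated) read counts per mDMR in one plasma sample.
sample = [(8, 2), (5, 5), (0, 0), (1, 9)]
sample_mean = mean_beta(sample)
```

Regions without any reads are skipped rather than counted as zero, since an uncovered region carries no information about methylation.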
  • FIG. 7A-D Targeted methylation analysis of hypomethylated mDMRs in an independent cohort.
  • A Mean beta across 402 hypomethylated mDMRs in 44 additional tissue samples from 6 tumor types acquired from CHLA compared to normal tissue samples. All comparisons are statistically significant (Wilcoxon rank-sum test; significance annotation: ‘ns’: p > 0.05, ‘*’: p < 0.05, ‘**’: p < 0.01, ‘***’: p < 0.001, ‘****’: p < 0.0001).
  • B Heatmap of beta values within mDMRs in POETIC and CHLA cohorts. Annotation bars show diagnosis, tumor/normal status, and source for each sample. UMAPs generated using mean beta values across all 402 colored by source (C) and cancer type (D).
  • FIG. 8A-B Number of DMRs called in WGBS by sample and cancer type.
  • A Number of DMRs per sample. Stacked histograms displaying number of DMRs per sample. Light grey represents the hypermethylated regions and the dark grey represents the hypomethylated regions. Chart is subdivided by tumor type: embryonal rhabdomyosarcoma (ERMS), neuroblastoma (NBL), osteosarcoma (OS), hepatoblastoma (HB), malignant rhabdoid tumor (MRT), and fibrolamellar hepatocellular carcinoma (FHC); all other cancer types were grouped as ‘Other’.
  • B Stacked Histogram displaying average number of DMRs per cancer type, where n denotes number of samples per tumor type.
  • FIG. 9A-B MRT Methylation profiles.
  • A Number of DMRs called in P01-019 (MRT). Shade delineates hypermethylated regions (light grey) from hypomethylated regions (dark grey). Each sample annotated with percent of DMR calls that are hypermethylated.
  • B Genome-wide beta value of TARGET samples shows MRT hypermethylation.
  • FIG. 10 UMAP of DMRs identified in Figure 1c by k-means cluster. Each point represents an individual DMR as detailed in Figure 1c. Clusters on column annotation of Figure 1c are ‘K-Means Cluster’.
  • FIG. 11A-B mDMRs detected in multiple stage I adult cancers from TCGA.
  • A ROC curves from TCGA 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array). Plots annotated with TCGA cancer code and AUC.
  • B Graphical representation of AUC in A with 95% CI indicated as error bars. See the GDC website for study abbreviation disambiguation (gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations).
  • FIG. 12A-B mDMRs detected in multiple stage II adult cancers from TCGA.
  • A ROC curves from TCGA 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array). Plots annotated with TCGA cancer code and AUC.
  • B Graphical representation of AUC in A with 95% CI indicated as error bars. See the GDC website for study abbreviation disambiguation (gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations).
  • FIG. 13A-B mDMRs detected in multiple stage III adult cancers from TCGA.
  • A ROC curves from TCGA 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array). Plots annotated with TCGA cancer code and AUC.
  • B Graphical representation of AUC in A with 95% CI indicated as error bars. See the GDC website for study abbreviation disambiguation (gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations).
  • FIG. 14A-B mDMRs detected in multiple stage IV adult cancers from TCGA.
  • A ROC curves from TCGA 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array). Plots annotated with TCGA cancer code and AUC.
  • B Graphical representation of AUC in A with 95% CI indicated as error bars. See the GDC website for study abbreviation disambiguation (gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations).
  • FIG. 15 ROC curves from mDMRs in CNS tumor (Capper et al.) dataset.
  • ROC curves from Capper et al. 450K DNA methylation data using a random forest model trained on 422 of the 905 pediatric cancer mDMRs derived by WGBS (subset of 422 regions used due to the limitations of the 450K array).
  • Each plot represents one methylation class from Capper et al.
  • FIG. 16 Cell free DNA yield from plasma. DNA yield in pg/ul per plasma sample. Vertically subdivided by diagnosis.
  • minimal focal regions that were differentially methylated across samples in multiple cancer types, which we termed minimally differentially methylated regions (mDMRs). These methylation changes were also observed in 518 pediatric and 6426 adult cancer samples accessed from publicly available databases, and in 44 pediatric cancer samples we analyzed using a targeted hybridization probe capture assay. Finally, we found that these methylation changes were detectable in cfDNA and could serve as potential cfDNA methylation biomarkers.
  • references in the specification to "one embodiment”, “an embodiment”, etc., indicate that the embodiment described may include a particular aspect, feature, structure, moiety, or characteristic, but not every embodiment necessarily includes that aspect, feature, structure, moiety, or characteristic. Moreover, such phrases may, but do not necessarily, refer to the same embodiment referred to in other portions of the specification. Further, when a particular aspect, feature, structure, moiety, or characteristic is described in connection with an embodiment, it is within the knowledge of one skilled in the art to affect or connect such aspect, feature, structure, moiety, or characteristic with other embodiments, whether or not explicitly described.
  • The terms “about” and “approximately” are used interchangeably. Both terms can refer to a variation of ±5%, ±10%, ±20%, or ±25% of the value specified. For example, “about 50” percent can in some embodiments carry a variation from 45 to 55 percent, or as otherwise defined by a particular claim.
  • the term “about” can include one or two integers greater than and/or less than a recited integer at each end of the range.
  • the terms “about” and “approximately” are intended to include values, e.g., weight percentages, proximate to the recited range that are equivalent in terms of the functionality of the individual ingredient, composition, or embodiment.
  • The terms “about” and “approximately” can also modify the endpoints of a recited range as discussed above in this paragraph.
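As a worked illustration of the variation described above, the sketch below computes the bounds implied by “about” for a given relative variation. Reading the example “about 50 percent” as 45 to 55 percent in terms of a ±10% relative variation is an assumption; the function name is hypothetical.

```python
def about_range(value, variation_percent=10):
    """Return (low, high) for 'about value' under a relative variation in percent."""
    delta = abs(value) * variation_percent / 100
    return value - delta, value + delta

# "about 50" under the variations listed in the text.
bounds = {pct: about_range(50, pct) for pct in (5, 10, 20, 25)}
```

With a 10% variation, `about_range(50, 10)` gives `(45.0, 55.0)`, matching the example in the text; a 25% variation widens this to `(37.5, 62.5)`.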
  • ranges recited herein also encompass any and all possible sub-ranges and combinations of sub-ranges thereof, as well as the individual values making up the range, particularly integer values. It is therefore understood that each unit between two particular units is also disclosed. For example, if 10 to 15 is disclosed, then 11, 12, 13, and 14 are also disclosed, individually, and as part of a range.
  • a recited range e.g., weight percentages or carbon groups
  • any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, or tenths.
  • each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc.
  • all language such as “up to”, “at least”, “greater than”, “less than”, “more than”, “or more”, and the like include the number recited and such terms refer to ranges that can be subsequently broken down into sub-ranges as discussed above.
  • all ratios recited herein also include all sub-ratios falling within the broader ratio. Accordingly, specific values recited for radicals, substituents, and ranges, are for illustration only; they do not exclude other defined values or other values within defined ranges for radicals and substituents. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.
  • a range such as “number1” to “number2” implies a continuous range of numbers that includes the whole numbers and fractional numbers.
  • 1 to 10 means 1, 2, 3, 4, 5, ... 9, 10. It also means 1.0, 1.1, 1.2, 1.3, ..., 9.8, 9.9, 10.0, and also means 1.01, 1.02, 1.03, and so on.
  • if the variable disclosed is a number less than “number10”, it implies a continuous range that includes whole numbers and fractional numbers less than number10, as discussed above.
  • if the variable disclosed is a number greater than “number10”, it implies a continuous range that includes whole numbers and fractional numbers greater than number10.
  • The term “substantially” as used herein is a broad term and is used in its ordinary sense, including, without limitation, being largely but not necessarily wholly that which is specified.
  • the term could refer to a numerical value that may not be 100% the full numerical value.
  • the full numerical value may be less by about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 15%, or about 20%.
  • “a portion of” or “a portion thereof” means consecutive nucleotides of the sequence of said particular region.
  • a portion according to the invention can comprise or consist of at least 15 or 20 consecutive nucleotides, preferably at least 100, 200, 300, 500 or 700 consecutive nucleotides, and more preferably at least 1, 2, 3, 4 or 5 consecutive kb of said particular region.
  • a portion can comprise or consist of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 consecutive kb of said particular region.
  • contacting refers to the act of touching, making contact, or of bringing to immediate or close proximity, including at the cellular or molecular level, for example, to bring about a physiological reaction, a chemical reaction, or a physical change, e.g., in a solution, in a reaction mixture, in vitro, or in vivo.
  • an “effective amount” refers to an amount effective to treat a disease, disorder, and/or condition, or to bring about a recited effect.
  • an effective amount can be an amount effective to reduce the progression or severity of the condition or symptoms being treated. Determination of a therapeutically effective amount is well within the capacity of persons skilled in the art.
  • the term "effective amount” is intended to include an amount of a compound described herein, or an amount of a combination of compounds described herein, e.g., that is effective to treat or prevent a disease or disorder, or to treat the symptoms of the disease or disorder, in a host.
  • an “effective amount” generally means an amount that provides the desired effect.
  • an “effective amount” or “therapeutically effective amount,” as used herein, refer to a sufficient amount of an agent or a composition or combination of compositions being administered which will relieve to some extent one or more of the symptoms of the disease or condition being treated. The result can be reduction and/or alleviation of the signs, symptoms, or causes of a disease, or any other desired alteration of a biological system.
  • an “effective amount” for therapeutic uses is the amount of the composition comprising a compound as disclosed herein required to provide a clinically significant decrease in disease symptoms.
  • An appropriate "effective" amount in any individual case may be determined using techniques, such as a dose escalation study. The dose could be administered in one or more administrations.
  • the precise determination of what would be considered an effective dose may be based on factors individual to each patient, including, but not limited to, the patient's age, size, type or extent of disease, stage of the disease, route of administration of the compositions, the type or extent of supplemental therapy used, ongoing disease process and type of treatment desired (e.g., aggressive vs. conventional treatment).
  • treating include (i) preventing a disease, pathologic or medical condition from occurring (e.g., prophylaxis); (ii) inhibiting the disease, pathologic or medical condition or arresting its development; (iii) relieving the disease, pathologic or medical condition; and/or (iv) diminishing symptoms associated with the disease, pathologic or medical condition.
  • the terms “treat”, “treatment”, and “treating” can extend to prophylaxis and can include prevent, prevention, preventing, lowering, stopping, or reversing the progression or severity of the condition or symptoms being treated.
  • the term “treatment” can include medical, therapeutic, and/or prophylactic administration, as appropriate.
  • “subject” or “patient” means an individual having symptoms of, or at risk for, a disease or other malignancy.
  • a patient may be human or non-human and may include, for example, animal strains or species used as “model systems” for research purposes, such as a mouse model as described herein.
  • patient may include either adults or juveniles (e.g., children).
  • patient may mean any living organism, preferably a mammal (e.g., human or non-human) that may benefit from the administration of compositions contemplated herein.
  • mammals include, but are not limited to, any member of the Mammalian class: humans, non-human primates such as chimpanzees, and other apes and monkey species; farm animals such as cattle, horses, sheep, goats, swine; domestic animals such as rabbits, dogs, and cats; laboratory animals including rodents, such as rats, mice and guinea pigs, and the like.
  • non-mammals include, but are not limited to, birds, fish, and the like.
  • the mammal is a human.
  • the terms “providing”, “administering,” “introducing,” are used interchangeably herein and refer to the placement of a compound of the disclosure into a subject by a method or route that results in at least partial localization of the compound to a desired site.
  • the compound can be administered by any appropriate route that results in delivery to a desired location in the subject.
  • inhibitor refers to the slowing, halting, or reversing the growth or progression of a disease, infection, condition, or group of cells.
  • the inhibition can be greater than about 20%, 40%, 60%, 80%, 90%, 95%, or 99%, for example, compared to the growth or progression that occurs in the absence of the treatment or contacting.
  • amplicon refers to nucleic acid products resulting from the amplification of a target nucleic acid sequence. Amplification is often performed by PCR. Amplicons can range in size from 20 base pairs to 1 000 base pairs in the case of long-range PCR but are more commonly 100-1000 base pairs for bisulfite-treated DNA used for methylation analysis.
  • The term “amplification” refers to an increase in the number of copies of a nucleic acid molecule.
  • the resulting amplification products are called “amplicons.”
  • Amplification of a nucleic acid molecule refers to use of a technique that increases the number of copies of a nucleic acid molecule in a sample.
  • An example of amplification is the polymerase chain reaction (PCR), in which a sample is contacted with a pair of oligonucleotide primers under conditions that allow for the hybridization of the primers to a nucleic acid template in the sample.
  • PCR: polymerase chain reaction
  • The product of amplification can be characterized by such techniques as electrophoresis, restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic acid sequencing.
  • the methods provided herein can include a step of producing an amplified nucleic acid under isothermal or thermal variable conditions.
  • biological sample refers to a sample obtained from an individual.
  • biological samples include all clinical samples containing genomic DNA (such as cell-free genomic DNA) useful for cancer diagnosis and prognosis, including, but not limited to, cells, tissues, and bodily fluids, such as: blood, derivatives and fractions of blood (such as serum or plasma), buccal epithelium, saliva, urine, stools, bronchial aspirates, sputum, biopsy (such as tumor biopsy), and CVS samples.
  • a “biological sample” obtained or derived from an individual includes any such sample that has been processed in any suitable manner (for example, processed to isolate genomic DNA for bisulfite treatment) after being obtained from the individual.
  • bisulfite treatment refers to the treatment of DNA with bisulfite or a salt thereof, such as sodium bisulfite (NaHSO₃).
  • Bisulfite reacts readily with the 5,6-double bond of cytosine, but poorly with methylated cytosine.
  • Cytosine reacts with the bisulfite ion to form a sulfonated cytosine reaction intermediate which is susceptible to deamination, giving rise to a sulfonated uracil.
  • The sulfonate group can be removed under alkaline conditions, resulting in the formation of uracil.
  • Uracil is recognized as a thymine by polymerases and amplification will result in an adenine-thymine base pair instead of a cytosine-guanine base pair.
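  • The conversion chemistry described above can be sketched in silico. The following is a minimal illustration, not part of the disclosure: the function name `bisulfite_convert` and the 0-based `methylated_positions` argument are hypothetical, and the sketch models only the net read-out after amplification (unmethylated C read as T, methylated C read as C).

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulfite treatment and amplification.

    Unmethylated cytosine is deaminated to uracil and read as thymine;
    methylated cytosine (0-based indices in methylated_positions) stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")  # unmethylated C -> U -> read as T
        else:
            out.append(base)  # methylated C and all other bases are unchanged
    return "".join(out)


original = "ACGTCCGA"
# Hypothetical methylation call: only the cytosine at index 1 is methylated.
converted = bisulfite_convert(original, methylated_positions={1})
print(converted)  # ACGTTTGA
```

  • Comparing `converted` with `original` shows that a cytosine survives only where the template carried 5-methylcytosine.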
  • cancer refers to a biological condition in which a malignant tumor or other neoplasm has undergone characteristic anaplasia with loss of differentiation, increased rate of growth, invasion of surrounding tissue, and which is capable of metastasis.
  • a neoplasm is a new and abnormal growth, particularly a new growth of tissue or cells in which the growth is uncontrolled and progressive.
  • a tumor is an example of a neoplasm.
  • types of cancer include lung cancer, stomach cancer, colon cancer, breast cancer, uterine cancer, bladder, head and neck, kidney, liver, ovarian, pancreas, prostate, and rectum cancer.
  • polynucleotide and “nucleic acid” are used interchangeably and mean at least two or more ribo- or deoxy-ribo nucleic acid base pairs (nucleotides) which are linked through a phosphoester bond or equivalent.
  • the nucleic acid includes polynucleotide and polynucleoside.
  • the nucleic acid includes a single molecule, a double molecule, a triple molecule, a circular molecule, or a linear molecule. Examples of the nucleic acid include, but are not limited to, RNA, DNA, cDNA, a genomic nucleic acid, a naturally existing nucleic acid, and a non-natural nucleic acid such as a synthetic nucleic acid.
  • Short nucleic acids and polynucleotides are commonly called “oligonucleotides” or “probes” of single-stranded or double-stranded DNA.
  • the repeating units in DNA polymers are four different nucleotides, each of which comprises one of the four bases, adenine, guanine, cytosine, and thymine bound to a deoxyribose sugar to which a phosphate group is attached.
  • Triplets of nucleotides, referred to as codons, code for each amino acid in a polypeptide, or for a stop signal.
  • the term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the DNA sequence is transcribed.
  • cell-free DNA refers to DNA which is no longer fully contained within an intact cell, for example DNA found in plasma or serum.
  • target nucleic acid molecule refers to a nucleic acid molecule whose detection, amplification, quantitation, qualitative detection, or a combination thereof, is intended.
  • the nucleic acid molecule need not be in a purified form.
  • Various other nucleic acid molecules can also be present with the target nucleic acid molecule.
  • the target nucleic acid molecule can be a specific nucleic acid molecule of which the amplification and/or evaluation of methylation status is intended. Purification or isolation of the target nucleic acid molecule, if needed, can be conducted by methods known to those in the art, such as by using a commercially available purification kit or the like.
  • methylation level refers to the state of methylation (methylated or not methylated) of the cytosine nucleotide of one or more CpG sites within a genomic sequence.
  • the terms “hypomethylated” and “hypermethylated” refer to a methylation status of a DNA molecule containing multiple CpG sites (e.g., more than 3, 4, 5, 6, 7, 8, 9, 10, etc.) where a high percentage of the CpG sites (e.g., more than 80%, 85%, 90%, or 95%, or any other percentage within the range of 50%-100%) are unmethylated or methylated, respectively.
  • The term “CpG site” refers to a di-nucleotide DNA sequence comprising a cytosine followed by a guanine in the 5' to 3' direction.
  • The cytosine nucleotides of CpG sites in genomic DNA are the target of intracellular methyltransferases and can have a methylation status of methylated or not methylated.
  • Reference to “methylated CpG site” or similar language refers to a CpG site in genomic DNA having a 5-methylcytosine nucleotide.
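  • As an illustration of the CpG-site and hypo-/hypermethylation definitions above, the sketch below locates CpG dinucleotides and labels a DNA molecule from the methylation states of its CpG sites. The function names, the default of more than 3 sites, and the 80% threshold are assumptions picked from the example values in the text, not a prescribed implementation.

```python
def cpg_positions(seq):
    """0-based indices of the cytosine in every CpG (C followed by G, 5' to 3')."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def classify_molecule(states, min_sites=3, threshold=0.8):
    """Label a molecule from its CpG states (True = methylated).

    Requires more than min_sites CpG sites; a molecule is hypermethylated
    when more than `threshold` of its sites are methylated, and
    hypomethylated when more than `threshold` are unmethylated.
    """
    n = len(states)
    if n <= min_sites:
        return "too few CpG sites"
    methylated = sum(1 for s in states if s)
    if methylated > threshold * n:
        return "hypermethylated"
    if (n - methylated) > threshold * n:
        return "hypomethylated"
    return "intermediate"


print(cpg_positions("ACGTCGCG"))            # [1, 4, 6]
print(classify_molecule([True] * 5))        # hypermethylated
print(classify_molecule([False] * 5))       # hypomethylated
```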
  • sequence identity or “identity” in the context of two nucleic acid or polypeptide sequences makes reference to a specified percentage of residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window, as measured by sequence comparison algorithms or by visual inspection.
  • percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule.
  • sequences differ in conservative substitutions the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution.
  • Sequences that differ by such conservative substitutions are said to have “sequence similarity” or “similarity.” Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, Calif.).
  • percentage of sequence identity means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
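  • The percentage calculation just described can be written out directly. This is a minimal sketch under the assumption that the two sequences are already optimally aligned and padded with `-` for gaps; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Matched positions are divided by the total number of positions in the
    comparison window (gap columns count toward the window length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Four of six window positions match (one mismatch, one gap column).
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # 66.7
```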
  • substantially identical in the context of a peptide indicates that a peptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, or 94%, or even 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window.
  • optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch (Needleman and Wunsch, JMB, 48, 443 (1970)).
  • a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
  • embodiment of the invention also provides nucleic acid molecules and peptides that are substantially identical to the nucleic acid molecules and peptides presented herein.
  • sequence comparison typically one sequence acts as a reference sequence to which test sequences are compared.
  • test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated.
  • sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
  • The term “multiplex” refers to the use of more than one pair of primers intended to amplify multiple target gene segments simultaneously within a single tube. In this manner, all the primers may be contained within one tube to which a sample is introduced or positioned. All desired influenza virus and control gene segments are then amplified via the plurality of forward and reverse primers within the tube.
  • The term “complement” as used herein means the complementary sequence to a nucleic acid according to standard Watson/Crick base pairing rules.
  • a complement sequence can also be a sequence of RNA complementary to the DNA sequence or its complement sequence and can also be a cDNA.
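  • For illustration, Watson/Crick complementation of a DNA sequence, including the RNA complement mentioned above, can be sketched as follows. The helper names are hypothetical and only the four unambiguous bases are handled.

```python
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def complement(seq):
    """Base-by-base DNA complement, in the same orientation as the input."""
    return seq.translate(_DNA_COMPLEMENT)


def reverse_complement(seq):
    """Complementary DNA strand written 5' to 3'."""
    return complement(seq)[::-1]


def rna_complement(seq):
    """RNA complementary to an uppercase DNA sequence (A pairs with U)."""
    return seq.upper().translate(_RNA_COMPLEMENT)


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
print(rna_complement("ATGC"))      # UACG
```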
  • substantially complementary means that two sequences hybridize under stringent hybridization conditions. The skilled artisan will understand that substantially complementary sequences need not hybridize along their entire length. In particular, substantially complementary sequences comprise a contiguous sequence of bases that do not hybridize to a target or marker sequence, positioned 3' or 5' to a contiguous sequence of bases that hybridize under stringent hybridization conditions to a target or marker sequence.
  • Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
  • the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex structure, three or more strands forming a multistranded complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
  • Examples of stringent hybridization conditions include incubation temperatures of about 25° C. to about 37° C.; hybridization buffer concentrations of about 6×SSC to about 10×SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4×SSC to about 8×SSC.
  • Examples of moderate hybridization conditions include incubation temperatures of about 40° C. to about 50° C.; buffer concentrations of about 9×SSC to about 2×SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5×SSC to about 2×SSC.
  • Examples of high stringency conditions include incubation temperatures of about 55° C.
  • reference genome refers to any particular known, sequenced or characterized genome, whether partial or complete, of any organism or virus that may be used to reference identified sequences from a subject. Exemplary reference genomes used for human subjects as well as many other organisms are provided in the on-line genome browser hosted by the National Center for Biotechnology Information (“NCBI”) or the University of California, Santa Cruz (UCSC).
  • a “genome” refers to the complete genetic information of an organism or virus, expressed in nucleic acid sequences.
  • a reference sequence or reference genome often is an assembled or partially assembled genomic sequence from an individual or multiple individuals.
  • a reference genome is an assembled or partially assembled genomic sequence from one or more human individuals.
  • The reference genome can be viewed as a representative example of a species' set of genes.
  • a reference genome comprises sequences assigned to chromosomes.
  • One exemplary human reference genome is GRCh37 (UCSC equivalent: hg19).
  • normal reference standard intends a control level, degree, or range of DNA methylation at a particular genomic region or gene in a sample that is not associated with cancer.
  • normal reference cutoff value refers to a control threshold level of DNA methylation at a particular genomic region or gene or a differential methylation value (DMV).
  • DNA methylation levels enriched above the normal reference cutoff value are associated with having or developing cancer.
  • DNA methylation levels at or below the normal reference cutoff value are associated with not having or developing cancer.
  • Detecting refers to determining the presence and/or degree of methylation in a nucleic acid of interest in a sample. Detection does not require the method to provide 100% sensitivity and/or 100% specificity.
  • RT-PCR refers to reverse transcription polymerase chain reaction and is used to detect specific RNA, in this case specific gene segments of the influenza virus genome, such as by reverse transcribing the RNA of interest into its DNA complement through the use of reverse transcriptase.
  • the newly synthesized cDNA can be amplified using traditional PCR.
  • the RT-PCR provided herein is by a one-step approach, wherein the entire reaction from cDNA synthesis to PCR amplification occurs in a single tube.
  • the process described herein is compatible with a two-step reaction, which requires that the reverse transcriptase reaction and PCR amplification be performed in separate tubes.
  • Real-Time PCR Current Technology and Applications, Logan, Edwards, and Saunders eds., Caister Academic Press, 2009; Bustin A-Z of Quantitative PCR (IUL Biotechnology, No. 5).
  • a “fragment” of DNA refers to a piece of cell-free DNA that is about 10 bp, about 20 bp, about 30 bp, about 40 bp, about 50 bp, about 60 bp, about 70 bp, about 80 bp, about 90 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 160 bp, about 170 bp, about
  • DNA fragments are about 100 bp to about 200 bp, about 120 bp to about 180 bp, or about 140 bp to about 160 bp.
  • The term “neoadjuvant treatment” refers to treatment (such as chemotherapy or hormone therapy) administered before primary cancer treatment (such as surgery) to enhance the outcome of primary treatment.
  • chemotherapy refers to the treatment of cancer with an antitumor or chemotherapeutic agent as part of a standardized regimen. Chemotherapy may be given with a curative intent or it may aim to prolong life or to palliate symptoms. It may be used in conjunction with other cancer treatments, such as radiation therapy or surgery.
  • The term “methylation” refers to the addition of a methyl group to the 5-carbon of the cytosine base in a deoxyribonucleic acid sequence of CpG within a genome.
  • neighboring CpG site refers to the collection of CpG sites within a genomic feature or over a short genetic distance.
  • the genomic feature may be a promoter, an enhancer, an exon, an intron, a 5 '-untranslated region (UTR), a 3'-UTR, a gene body, a stem cell associated region, a CpG island, a CpG shelf, a CpG shore, a LINE, a SINE, or an LTR.
  • the short genetic distance may be 10 bp, 11 bp, 12 bp, 13 bp, 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp.
  • MRD Minimal Residual Disease
  • the disclosure provides for assays and various methods for detecting differences in methylation patterns of a target region of DNA.
  • the differences in methylation patterns of the target regions of the sample (e.g., cfDNA, genomic DNA isolated from tissue)
  • the methylation pattern of the target region of DNA in a sample may be analyzed using a trained machine learning model that is trained using a defined set of target regions of DNA of cancerous and non-cancerous control samples.
  • a test sample from a subject may be processed to determine the presence or absence of a pediatric cancer, an adult cancer, or MRD according to the following steps: i) extracting nucleic acids (e.g., DNA) from a test sample from a subject suspected of having a pediatric cancer, adult cancer, or MRD; ii) converting unmethylated cytosines of the extracted nucleic acids to uracil (e.g., via bisulfite conversion); iii) preparing a library of bisulfite converted DNA; iv) enriching target nucleic acids by hybridizing the bisulfite converted DNA with hybridization probes; v) generating sequence reads of the enriched nucleic acids comprising the target regions; vi) aligning the sequence reads of the target regions with corresponding target regions of a reference genome (e.g., using Bismark); vii) performing differentially methylated region analysis (e.g., using Metilene) of the test sample compared to a normal sample or pool of normal samples to
  • a method for determining whether a subject is likely to have or develop a pediatric cancer, an adult cancer, and/or Minimal Residual Disease (MRD) comprises the steps of: training a machine learning model to detect the pediatric cancer, the adult cancer, or the MRD, wherein the machine learning model is trained using target regions from a plurality of cancerous samples and corresponding target regions from non-cancerous samples, wherein the cancer samples comprise at least two different cancer types, wherein the machine learning model is configured to identify the pediatric cancer, the adult cancer, or the MRD based on a comparison of a methylation pattern of target regions of the cancerous samples compared to a methylation pattern of corresponding target regions of the non-cancerous samples; determining a methylation pattern of target regions of a deoxyribonucleic acid (DNA) sample obtained from the subject; applying the trained machine learning model to the methylation pattern of the target regions of the DNA obtained from the subject; and determining that the subject has or does not have the pediatric cancer, the adult cancer,
  • Embodiments of the disclosure may comprise the steps of bisulfite conversion of the nucleic acids from a DNA sample of a subject using, for example, Whole Genome Bisulfite Sequencing (WGBS) or hybrid probe capture; next generation sequencing the converted and/or enriched nucleic acids; collecting the methylation data from the targeted regions (e.g., the target regions listed in Table 1); and using a trained machine learning model to determine, for example, the presence or absence of a pediatric cancer, a recurrence of a pediatric cancer, and/or the presence of an adult cancer or recurrence of an adult cancer.
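  • The train-then-apply flow described above can be sketched with a deliberately simple stand-in model. The disclosure does not specify the model type at this point, so the nearest-centroid classifier below is an assumption chosen for brevity; the function names are hypothetical, and the methylation values are toy numbers for illustration only, not data from the disclosure.

```python
def train_centroids(samples, labels):
    """Average the per-region methylation vectors of each class."""
    sums, counts = {}, {}
    for vec, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, value in enumerate(vec):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }


def predict(centroids, vec):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda label: dist2(centroids[label]))


# Toy training set: each vector holds methylation fractions for three
# hypothetical target regions (values invented for illustration).
train_x = [
    [0.90, 0.80, 0.10],  # cancerous sample, cancer type A
    [0.80, 0.90, 0.20],  # cancerous sample, cancer type B
    [0.10, 0.20, 0.80],  # non-cancerous control
    [0.20, 0.10, 0.90],  # non-cancerous control
]
train_y = ["cancer", "cancer", "non-cancer", "non-cancer"]

model = train_centroids(train_x, train_y)
print(predict(model, [0.85, 0.70, 0.15]))  # cancer
```

  • In practice the feature vector would hold one methylation value per target region, and any supervised model could replace the centroid rule without changing the surrounding steps.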
  • the method used to determine the methylation pattern of the one or more target nucleic acids includes methylation sequencing.
  • the methylation pattern of CpG sites within the target regions listed in Table 1 may be detected using DNA methylation sequencing.
  • DNA methylation sequencing can involve, for example, treating DNA from a sample with bisulfite to convert unmethylated cytosine to uracil followed by amplification (such as PCR amplification) of a target nucleic acid within the treated genomic DNA, and sequencing of the resulting amplicon. Sequencing produces nucleotide reads that may be aligned to a genomic reference sequence that may be used to quantitate methylation levels of all the CpGs within an amplicon.
  • Cytosines in non-CpG context may be used to track bisulfite conversion efficiency for each individual sample.
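  • A per-sample conversion-efficiency check of this kind might look like the sketch below. It assumes a gapless alignment of one read to its reference segment on the same strand, and that non-CpG cytosines are unmethylated; the function name `conversion_efficiency` is hypothetical.

```python
def conversion_efficiency(reference, read):
    """Fraction of non-CpG reference cytosines that appear as T in the read.

    Returns None when the read covers no informative non-CpG cytosine.
    """
    ref, obs = reference.upper(), read.upper()
    if len(ref) != len(obs):
        raise ValueError("reference and read must be the same length")
    converted = total = 0
    for i, base in enumerate(ref):
        if base != "C":
            continue
        if i + 1 >= len(ref):
            continue  # context of the last base is unknown, so skip it
        if ref[i + 1] == "G":
            continue  # CpG context: may be genuinely methylated
        if obs[i] == "T":
            converted += 1
            total += 1
        elif obs[i] == "C":
            total += 1  # unconverted non-CpG cytosine
    return converted / total if total else None


# Non-CpG cytosines at indices 1, 6 and 7; two of the three were converted.
print(conversion_efficiency("ACATCGCCTA", "ATATCGTCTA"))
```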
  • the procedure is both time and cost-effective, as multiple samples may be sequenced in parallel using a 96 well plate and generates reproducible measurements of methylation when assayed in independent experiments.
  • Nucleic acid molecules may be subjected to conditions sufficient to convert unmethylated cytosines in the nucleic acid molecules to uracils (e.g., subsequent to extraction from a sample). For example, to detect DNA methylation, certain embodiments provide for first converting the DNA to be analyzed so that the unmethylated cytosine is converted to uracil.
  • a chemical reagent that selectively modifies either the methylated or non-methylated form of CpG dinucleotide motifs may be used. Suitable chemical reagents include hydrazine and bisulphite ions and the like.
  • isolated DNA is treated with sodium bisulfite (NaHSO₃) which converts unmethylated cytosine to uracil, while methylated cytosines are maintained.
  • Cytosine reacts with the bisulfite ion to form a sulfonated cytosine reaction intermediate that is susceptible to deamination, giving rise to a sulfonated uracil.
  • the sulfonated group can be removed under alkaline conditions, resulting in the formation of uracil.
  • the nucleotide conversion results in a change in the sequence of the original DNA. It is general knowledge that the resulting uracil has the base pairing behavior of thymine, which differs from cytosine base pairing behavior. To that end, uracil is recognized as a thymine by DNA polymerase. Therefore, after PCR or sequencing, the resultant product contains cytosine only at the position where 5-methylcytosine occurs in the starting template DNA. This makes the discrimination between unmethylated and methylated cytosine possible.
  • Nucleic acid molecules may also be subjected to further processing including other derivatization processes (e.g., to incorporate, modify, and/or delete one or more sequences, tags, or labels).
  • functional sequences e.g., sequencing adapters, flow cell adapters, sequencing primers, etc.
  • derivatives of nucleic acid molecules from a sample may comprise processed nucleic acid molecules including bisulfite-modified nucleic acid molecules, reverse-transcribed nucleic acid molecules, tagged nucleic acid molecules, barcoded nucleic acid molecules, and other modified nucleic acid molecules.
  • methylation pattern of a target region may be determined using one or more of hybrid probe capture, targeted bisulfite amplicon sequencing, bisulfite DNA treatment, WGBS, bisulfite conversion combined with bisulfite restriction analysis (COBRA), bisulfite PCR, bisulfite modification, bisulfite pyrosequencing, methylated CpG island amplification, CpG binding column based isolation of CpG islands, CpG island arrays with differential methylation hybridization, high performance liquid chromatography, DNA methyltransferase assay, methylation sensitive PCR, cloning differentially methylated sequences, methylation detection following restriction, restriction landmark genomic scanning, methylation sensitive restriction fingerprinting, or Southern blot analysis.
  • the method used to determine the methylation level of the one or more target regions in the DNA is WGBS (Cokus, et al. 2008. Nature 452(7184): 215-219; Lister, et al. 2009. Nature 462(7271): 315-322; Harris, et al. 2010. Nat Biotechnol 28(10): 1097-1105).
  • DNA methylation detection methods include hybrid probe capture (REF), methylation-specific enzyme digestion (Singer-Sam et al., Nucleic Acids Res. 18(3): 687, 1990; Taylor et al., Leukemia 15(4): 583-9, 2001), methylation-specific PCR (MSP or MSPCR) (Herman et al., Proc Natl Acad Sci USA 93(18): 9821-6, 1996), methylation-sensitive single nucleotide primer extension (MS-SnuPE) (Gonzalgo et al., Nucleic Acids Res.
  • the methylation levels may be determined using one or more DNA methylation sequencing assays with or without bisulfite treatment of DNA.
  • Reduced Representation Bisulfite Sequencing (RRBS) involves treating nucleic acid with bisulfite to convert all unmethylated cytosines into uracil, followed by restriction enzyme digestion (for example, by an enzyme that recognizes a site that includes a CG sequence such as MspI) and complete fragment sequencing after coupling with an adapter ligand.
  • the selection of the restriction enzyme enriches for fragments from CpG-dense regions, reducing the number of redundant sequences that can map to multiple positions in the genome during the analysis.
  • RRBS reduces the sample complexity of the nucleic acid sample by selecting a subset (e.g., by size selection using preparative gel electrophoresis) of restriction fragments for sequencing.
  • each fragment produced by restriction enzyme digestion contains information on DNA methylation for at least one CpG dinucleotide. Therefore, RRBS enriches the sample in promoters, CpG islands, and other genomic features with a high frequency of restriction enzyme cleavage sites in these regions and, thus, provides an assay to assess the methylation status of one or more genomic loci.
  • a typical protocol for RRBS comprises the steps of digesting a sample of nucleic acid with a restriction enzyme such as MspI, filling in overhangs and A-tailing, ligating adapters, bisulfite conversion, and PCR. See, for example, Gu et al. (2010), Nat Methods 7: 133-6; Meissner et al. (2005), Nucleic Acids Res. 33: 5868-77.
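  • The digestion and size-selection steps of RRBS can be sketched in silico. MspI recognizes CCGG and cuts after the first cytosine (C^CGG), so every internal fragment end carries a CpG. The function names and the size window used below are hypothetical, and only the top strand is modeled.

```python
def mspi_digest(seq):
    """Cut a sequence at every MspI site (C^CGG) and return the fragments."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find("CCGG")
    while pos != -1:
        cut = pos + 1  # cleavage falls between the first C and CGG
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find("CCGG", pos + 1)
    fragments.append(seq[start:])
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep fragments within an inclusive length window (stand-in for gel selection)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


pieces = mspi_digest("AACCGGTTTTCCGGAA")
print(pieces)                     # ['AAC', 'CGGTTTTC', 'CGGAA']
print(size_select(pieces, 5, 8))  # ['CGGTTTTC', 'CGGAA']
```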
  • identifying the presence of a pediatric cancer, an adult cancer, or MRD in a subject may comprise using hybrid capture probes configured to selectively enrich nucleic acid molecules (e.g., DNA or RNA molecules) or sequences thereof.
  • Such probes may be pull-down probes (e.g., bait sets).
  • Selectively enriched nucleic acid molecules or sequences thereof may correspond to one or more target regions in the methylation profile of the data set.
  • the presence of particular sequences, modifications (e.g., methylation states), deletions, additions, single nucleotide polymorphisms, copy number variations, or other features in the selectively enriched nucleic acid molecules or sequences thereof may be indicative of a presence and/or recurrence of a pediatric cancer.
  • the probes may be selective for a subset of certain target regions of Table 1 in the DNA sample and/or for differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites).
  • the probes may be configured to selectively enrich nucleic acid molecules (e.g., DNA or RNA molecules) or sequences thereof corresponding to a plurality of target nucleic acids or target genomic sequences, such as the subset of the one or more genomic regions in the cell-free biological sample and/or differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites).
  • the probes may be nucleic acid molecules (e.g., DNA or RNA molecules) having sequence complementarity with target nucleic acid sequences. These nucleic acid molecules may be primers or enrichment sequences.
  • the assaying of the nucleic acid molecules of the sample (e.g., cell-free biological sample) using probes that are selected for target nucleic acid sequences may comprise use of array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., DNA sequencing or RNA sequencing).
  • the number of target nucleic acid sequences selectively enriched using such a scheme may comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14.
  • target nucleic acid sequences include those associated with the target regions included in Table 1.
  • the target region includes all the sequences of Table 1.
  • the hybridization probes comprise an assay panel and are complementary to one or more target regions of Table 1, and may comprise at least 50, 60, 70, 80, 90,
  • the hybridization probes comprise an assay panel and may comprise at least 1,000, 2,000, 2,500, 5,000, 6,000, 7,500, 10,000, 15,000, 20,000, 25,000 or 50,000 different pairs of probes.
  • an assay panel may include at least 100, 120, 140, 160, 180, 200, 240, 300, or 400 different probes.
  • an assay panel may include at least 1,000, 2,000, 5,000, 10,000, 12,000, 15,000, 20,000, 30,000, 40,000, 50,000, or 100,000 different probes.
  • the number of probes is sufficient to overlap substantially all of the target regions of interest.
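  • One way to obtain a probe set that overlaps an entire target region is to tile fixed-length probes at a regular step. The sketch below is an assumption for illustration: the 120-base probe length, the 60-base step, and the function name `tile_probes` are hypothetical and are not taken from the disclosure.

```python
def tile_probes(region_start, region_end, probe_len=120, step=60):
    """Half-open (start, end) probe coordinates that jointly cover the region.

    A step no larger than the probe length guarantees that consecutive
    probes leave no uncovered gap.
    """
    if region_end <= region_start:
        raise ValueError("region must have positive length")
    if not 0 < step <= probe_len:
        raise ValueError("step must be positive and no larger than probe_len")
    probes = []
    pos = region_start
    while True:
        probes.append((pos, pos + probe_len))
        if pos + probe_len >= region_end:
            break  # the last probe reaches the end of the region
        pos += step
    return probes


print(tile_probes(1000, 1300))
# [(1000, 1120), (1060, 1180), (1120, 1240), (1180, 1300)]
```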
  • the one or more probes comprises deoxyribonucleic acid and/or ribonucleic acid. In some embodiments, each of the one or more probes comprises an affinity tag selected from the group consisting of biotin and streptavidin.
  • the methylation sequencing of the plurality of target regions uses one or more of whole genome sequencing, wherein the whole genome sequencing comprises one or more of whole genome bisulfite sequencing (WGBS), Reduced Representation Bisulfite sequencing (RRBS), Targeted bisulfite sequencing, Hybridization Probe capture, Methylation bead arrays, and Enzymatic methyl-sequence conversion.
  • Nucleic acid molecules e.g., extracted cfDNA
  • Sequencing reads may be aligned with and/or analyzed with regard to a reference genome. Based at least in part on sequencing reads, an absolute amount or relative amount of nucleic acid molecules (including an absolute or relative level of methylation within said molecules) corresponding to one or more genomic regions may be measured. Alternatively, sequencing reads may not be used to determine an amount or relative amount of nucleic acid molecules.
  • a data set comprising a genomic profile (e.g., methylation profile) of one or more genomic regions of a sample may be generated based at least in part on sequencing reads. Sequencing reads may be processed to identify methylation patterns of the target regions of the DNA in a sample.
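  • Deriving a methylation profile from bisulfite sequencing reads can be illustrated as follows. The sketch assumes gapless reads already aligned to the same strand of the reference segment, where a C observed at a reference CpG counts as methylated and a T counts as unmethylated; the function name `cpg_methylation_levels` is hypothetical and real aligners handle strands, gaps, and quality filtering.

```python
def cpg_methylation_levels(reference, reads):
    """Map each reference CpG position (0-based) to its methylated fraction.

    Positions with no informative read are mapped to None.
    """
    ref = reference.upper()
    levels = {}
    for i in range(len(ref) - 1):
        if ref[i:i + 2] != "CG":
            continue
        methylated = unmethylated = 0
        for read in reads:
            base = read[i].upper() if i < len(read) else "N"
            if base == "C":
                methylated += 1      # protected from conversion
            elif base == "T":
                unmethylated += 1    # converted, so originally unmethylated
        covered = methylated + unmethylated
        levels[i] = methylated / covered if covered else None
    return levels


reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
print(cpg_methylation_levels("ACGTTCGA", reads))  # {1: 0.75, 5: 0.25}
```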
  • Sequence identification may be performed by sequencing, array hybridization (e.g., Affymetrix), or nucleic acid amplification (e.g., PCR), for example.
  • Sequencing may be performed by any suitable sequencing methods, such as massively parallel sequencing (MPS), paired-end sequencing, high-throughput sequencing, next-generation sequencing (NGS), shotgun sequencing, single-molecule sequencing, nanopore sequencing, nanopore sequencing with direct detection or inference of methylation status, semiconductor sequencing, pyrosequencing, sequencing-by-synthesis (SBS), sequencing-by-ligation, sequencing-by-hybridization, and RNA-Seq (Illumina).
  • Sequencing and/or preparing a nucleic acid sample for sequencing may comprise performing one or more nucleic acid reactions such as one or more nucleic acid amplification processes (e.g., of DNA or RNA molecules).
  • Nucleic acid amplification may comprise, for example, reverse transcription, primer extension, asymmetric amplification, rolling circle amplification, ligase chain reaction, polymerase chain reaction (PCR), and multiple displacement amplification.
  • PCR methods include digital PCR (dPCR), emulsion PCR (ePCR), quantitative PCR (qPCR), real-time PCR (RT-PCR), hot start PCR, multiplex PCR, asymmetric PCR, nested PCR, and assembly PCR.
  • a suitable number of rounds of nucleic acid amplification may be performed to sufficiently amplify an initial amount of nucleic acid molecule (e.g., DNA molecule) or derivative thereof to a desired input quantity for subsequent sequencing.
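  • As a rough illustration of choosing a number of amplification rounds, the sketch below assumes an idealized model in which each cycle multiplies the template amount by (1 + efficiency). The function name `cycles_needed`, the nanogram units, and the efficiency values are assumptions for illustration; real reactions plateau and would need empirical calibration.

```python
def cycles_needed(initial_ng, target_ng, efficiency=1.0):
    """Smallest cycle count n with initial_ng * (1 + efficiency)**n >= target_ng."""
    if initial_ng <= 0 or target_ng <= 0:
        raise ValueError("amounts must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    amount, cycles = initial_ng, 0
    while amount < target_ng:
        amount *= 1 + efficiency  # idealized per-cycle gain
        cycles += 1
    return cycles

perfect = cycles_needed(10, 500)          # perfect doubling each cycle
realistic = cycles_needed(10, 500, 0.9)   # assumed 90% per-cycle efficiency
```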
  • the PCR may be used for global amplification of nucleic acid molecules. This may comprise using adapter sequences that may be first ligated to different molecules followed by PCR amplification using universal primers.
  • PCR may be performed using any of a number of commercial kits, e.g., provided by Life Technologies, Affymetrix, Promega, Qiagen, etc.
  • Nucleic acid amplification may comprise targeted amplification of one or more genetic loci, genomic regions, cfDNA target regions, or differentially methylated regions (e.g., CpG sites, CpA sites, CpT sites, and/or CpC sites), and in particular, the target regions listed in Table 1.
  • nucleic acid amplification is performed after bisulfite conversion.
  • Nucleic acid amplification may comprise the use of one or more primers, probes, enzymes (e.g., polymerases), buffers, and deoxyribonucleotides.
  • Nucleic acid amplification may be isothermal or may comprise thermal cycling. Thermal cycling may involve changing a temperature associated with various processes of nucleic acid amplification including, for example, initialization, denaturation, annealing, and extension. Sequencing may comprise use of simultaneous reverse transcription (RT) and PCR, such as a OneStep RT-PCR kit protocol by Qiagen, NEB, Thermo Fisher Scientific, or BioRad.
  • Nucleic acid molecules or derivatives thereof may be labeled or tagged, e.g., with identifiable tags, to allow for multiplexing of a plurality of samples. For example, every nucleic acid molecule or derivative thereof associated with a given sample or subject may be tagged or labeled (e.g., with a barcode such as a nucleic acid barcode sequence or a fluorescent label). Nucleic acid molecules or derivatives thereof associated with other samples or subjects may be tagged or labeled with different tags or labels such that nucleic acid molecules or derivatives thereof may be associated with the sample or subject from which they derive.
  • Such tagging or labeling also facilitates multiplexing such that nucleic acid molecules or derivatives thereof from multiple samples and/or subjects may be analyzed (e.g., sequenced) at the same time.
  • Any number of samples may be multiplexed.
  • A multiplexed reaction may contain nucleic acid molecules or derivatives thereof from at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more than 100 initial samples.
  • Such samples may be derived from the same or different subjects.
  • A plurality of samples may be tagged with sample barcodes (e.g., nucleic acid barcode sequences) such that each nucleic acid molecule (e.g., DNA molecule) or derivative thereof may be traced back to the sample (and/or the subject) from which the nucleic acid molecule originated.
  • Sample barcodes may permit samples from multiple subjects to be differentiated from one another, which may permit sequences in such samples to be identified simultaneously, such as in a pool.
  • Tags, labels, and/or barcodes may be attached to nucleic acid molecules or derivatives thereof by ligation, primer extension, nucleic acid amplification, or another process.
  • nucleic acid molecules or derivatives thereof of a particular sample may be tagged, labeled, or barcoded with different tags, labels, or barcodes (e.g., unique molecular identifiers) such that different nucleic acid molecules or derivatives thereof deriving from the same sample may be differentially tagged, labeled, or barcoded.
  • nucleic acid molecules or derivatives thereof from a given sample may be labeled with both different labels and identical labels, such that each nucleic acid molecule or derivative thereof associated with the sample includes both a unique label and a shared label.
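  • The combination of a shared sample label and a unique per-molecule label can be sketched as a simple demultiplexing step. The read layout below (a 4-base sample barcode, then a 4-base unique molecular identifier, then the insert), the function name `demultiplex`, and all sequences are hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_len=4, umi_len=4):
    """Group reads by sample (shared barcode) and by molecule (unique UMI)."""
    by_sample = defaultdict(lambda: defaultdict(list))
    unassigned = []
    for read in reads:
        barcode = read[:barcode_len]
        umi = read[barcode_len:barcode_len + umi_len]
        insert = read[barcode_len + umi_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)  # barcode not in the sample sheet
        else:
            by_sample[sample][umi].append(insert)
    return by_sample, unassigned

# Hypothetical pooled reads from two samples plus one unreadable barcode.
reads = [
    "ACGTAAAATTGGCC",
    "ACGTAAAATTGGCC",   # same sample and UMI: duplicate of one molecule
    "ACGTCCCCTTGGCA",
    "TTAGGGGGACGTAC",
    "NNNNAAAATTTT",
]
barcodes = {"ACGT": "sample_1", "TTAG": "sample_2"}
by_sample, unassigned = demultiplex(reads, barcodes)
```

  • Reads sharing both the sample barcode and the unique identifier are grouped together, which is what allows duplicates of a single original molecule to be collapsed downstream.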
  • The sequencing reads of the target regions may be aligned to corresponding target regions of a reference genome (e.g., GRCh37), for example using the Bismark software.
  • The methylation level of the target region may be compared to a methylation level of a target region of a normal sample or pool of normal samples to identify hypomethylated target regions and hypermethylated target regions (i.e., differentially methylated regions). This process may be facilitated through the use of a methylation calling program such as Metilene.
  • Each of the hypomethylated regions and the hypermethylated regions may be examined separately for the presence or absence of a cancer using the machine learning model.
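  • The comparison against a normal pool can be sketched with a simple difference-of-means rule. This is not Metilene's algorithm; the function name `split_dmrs`, the `min_delta` cutoff of 0.2, and the methylation levels are assumptions used only to show how regions might be separated into hypermethylated and hypomethylated sets.

```python
from statistics import mean

def split_dmrs(sample, normals, min_delta=0.2):
    """Split regions into (hypermethylated, hypomethylated) lists.

    sample:  {region: methylation fraction} for the test sample
    normals: list of {region: methylation fraction} for normal samples
    """
    hyper, hypo = [], []
    for region, level in sample.items():
        baseline = [n[region] for n in normals if region in n]
        if not baseline:
            continue  # no normal reference for this region
        delta = level - mean(baseline)
        if delta >= min_delta:
            hyper.append(region)
        elif delta <= -min_delta:
            hypo.append(region)
    return hyper, hypo

# Hypothetical methylation fractions.
sample = {"R1": 0.9, "R2": 0.1, "R3": 0.5}
normals = [
    {"R1": 0.2, "R2": 0.8, "R3": 0.45},
    {"R1": 0.3, "R2": 0.7, "R3": 0.55},
]
hyper, hypo = split_dmrs(sample, normals)
```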
  • Target regions analyzed for methylation patterns.
  • Target regions correspond to chromosomes, start, and stop positions corresponding to the human reference genome
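  • A target region defined by chromosome, start, and stop can be represented directly in code. The sketch below assumes whitespace-separated text with inclusive coordinates; the class name `TargetRegion`, the helper `parse_regions`, and the coordinates are hypothetical and are not entries from Table 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TargetRegion:
    chrom: str
    start: int
    stop: int

    def contains(self, chrom, pos):
        """True if the position lies inside this region (inclusive bounds)."""
        return chrom == self.chrom and self.start <= pos <= self.stop

def parse_regions(lines):
    """Parse 'chrom start stop' lines, skipping blanks and '#' comments."""
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, stop = line.split()[:3]
        region = TargetRegion(chrom, int(start), int(stop))
        if region.start > region.stop:
            raise ValueError(f"start after stop: {line}")
        regions.append(region)
    return regions

# Hypothetical coordinates for illustration.
regions = parse_regions(["# chrom start stop", "chr1\t1000\t1500", "chr7 200 450", ""])
```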
  • sequence reads may be aligned to one or more reference genomes (e.g., a human genome).
  • The aligned sequence reads may be quantified at one or more genomic loci or target regions to generate the data set comprising the methylation pattern profile of one or more target regions of the cell-free biological sample. Quantification of sequences may be expressed as un-normalized or normalized values.
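  • One possible normalization, shown here only as an assumption since the text does not name a scheme, is to scale per-region read counts to counts per million aligned reads so that samples sequenced to different depths can be compared. The function name `counts_per_million` and the counts are hypothetical.

```python
def counts_per_million(region_counts, total_aligned_reads):
    """Scale raw per-region read counts to counts per million aligned reads."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    return {
        region: count * 1_000_000 / total_aligned_reads
        for region, count in region_counts.items()
    }

raw = {"R1": 50, "R2": 150}                 # un-normalized read counts
cpm = counts_per_million(raw, 2_000_000)    # normalized values
```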
  • Alignment of bisulfite-converted DNA is performed using a software program such as Bismark (Krueger et al. (2011) Bioinformatics, 27(11): 1571-1572). Bismark performs both read mapping and methylation calling in a single step, and its output discriminates between cytosines in CpG, CHG, and CHH contexts. Bismark is released under the GNU GPLv3+ license.
  • the source code is freely available at bioinformatics.bbsrc.ac.uk/projects/bismark/.
  • Differential methylation is calculated for specific loci/regions using, for example, one or more publicly available programs to analyze and/or determine methylation levels of a target polynucleotide region.
  • The method used to analyze and/or determine methylation levels of a target polynucleotide region includes Metilene (Juhling et al., Genome Res., 2016; 26(2): 256-262) or GenomeStudio Software available online from Illumina, Inc. Other methods of determining differentially methylated target polynucleotide regions are described in Hovestadt et al., 2014; Nature, 510(7506), 537-541.
  • The target regions that are examined to determine the presence or absence of a pediatric cancer in a subject comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the target regions listed in Table 1.
  • The target regions that are examined to determine the severity of a pediatric cancer (i.e., stage I, stage II, stage III, or stage IV cancer) in a subject comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the target regions listed in Table 1.
  • The target genomic regions that are examined to determine the presence or absence of an adult cancer in a subject comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the target regions listed in Table 1.
  • The target regions that are examined to determine the severity of an adult cancer (i.e., stage I, stage II, stage III, or stage IV cancer) in a subject comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the target regions listed in Table 1.
  • Target genomic regions that are examined to determine the presence of MRD in a subject comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the target regions listed in Table 1.
  • The biological sample may be collected through a standard biopsy or a liquid biopsy.
  • the biopsy is a liquid biopsy and the cfDNA may be collected from whole blood, plasma, serum, or urine.
  • An amount of sample, such as whole blood, may include an amount of about 50 µL to about 5 mL, about 100 µL to about 5 mL, about 150 µL to about 5 mL, about 200 µL to about 5 mL, about 250 µL to about 5 mL, about 300 µL to about 5 mL, about 350 µL to about 5 mL, about 400 µL to about 5 mL, about 450 µL to about 5 mL, about 500 µL to about 5 mL, about 550 µL to about 5 mL, about 600 µL to about 5 mL, about 700 µL to about 5 mL, about 750 µL to about 5 mL, about 800 µL to about 5 mL
  • an amount of sample such as whole blood, may include an amount of about 5 mL to about 10 mL.
  • Isolation and extraction of DNA, and in particular, cfDNA may be performed through collection of bodily fluids using a variety of techniques. In some cases, collection may comprise aspiration of a bodily fluid from a subject using a syringe. In other cases, collection may comprise pipetting or direct collection of fluid into a collecting vessel. Methods for isolating DNA or other nucleic acids from tissue samples are well-known in the art.
  • cfDNA may be isolated and extracted using a variety of techniques known to a person of ordinary skill in the art.
  • Cell-free nucleic acid may be isolated, extracted, and prepared using commercially available kits such as the Qiagen QIAamp® Circulating Nucleic Acid Kit protocol.
  • Qiagen Qubit™ dsDNA HS Assay kit protocol, Agilent™ DNA 1000 kit, or TruSeq™ Sequencing Library Preparation; Low-Throughput (LT) protocol.
  • cfDNA may be extracted and isolated from bodily fluids through a partitioning step in which cfDNAs, as found in solution, are separated from cells and other non-soluble components of the bodily fluid. Partitioning may include, but is not limited to, techniques such as centrifugation or filtration. In other cases, cells may not be partitioned from cfDNA first, but rather lysed. For instance, the genomic DNA of intact cells may be partitioned through selective precipitation.
  • The pediatric cancer or adult cancer is osteosarcoma, medulloblastomas, ependymomas, optical nerve gliomas, brain stem glioma, oligodendrogliomas, gangliogliomas, pineal region tumors, or hepatoblastoma.
  • The cancers are recurrent cancers (e.g., recurrent ovarian cancer).
  • The pediatric or adult cancer is a central nervous system cancer such as anaplastic pilocytic astrocytoma; atypical teratoid/rhabdoid tumor, subclass MYC; atypical teratoid/rhabdoid tumor, subclass SHH; atypical teratoid/rhabdoid tumor, subclass TYR; cerebellar liponeurocytoma; CNS Ewing sarcoma family tumor with CIC alteration; CNS high grade neuroepithelial tumor with BCOR alteration; CNS high grade neuroepithelial tumor with MN1 alteration; CNS neuroblastoma with FOXR2 activation; diffuse leptomeningeal glioneuronal tumor; diffuse midline glioma H3 K27M mutant; embryonal tumor with multilayered rosettes; ependymoma, posterior fossa group A; ependymoma, RELA fusion; esthesioneur
  • The pediatric or adult cancer is Bladder Urothelial Carcinoma, Breast invasive carcinoma, Colon adenocarcinoma, Esophageal carcinoma, Head and Neck squamous cell carcinoma, Kidney renal clear cell carcinoma, Kidney renal papillary cell carcinoma, Liver hepatocellular carcinoma, Lung adenocarcinoma, Lung squamous cell carcinoma, Pancreatic adenocarcinoma, Prostate adenocarcinoma, Thyroid carcinoma, or Uterine Corpus Endometrial Carcinoma.
  • the cancer is breast cancer.
  • the cancers are recurrent cancers (e.g., recurrent colon adenocarcinoma, recurrent esophageal carcinoma, recurrent lung adenocarcinoma, etc.).
  • a clinical procedure or cancer therapy can be administered to the subject.
  • Exemplary therapies or procedures include but are not limited to surgery, radiation therapy, chemotherapy, hormone therapy, targeted therapy, and/or administration of an effective amount of one or more therapeutic agents: angiogenesis inhibitors, such as angiostatin K1-3, DL-α-Difluoromethyl-ornithine, endostatin, fumagillin, genistein, minocycline, staurosporine, and (±)-thalidomide; DNA intercalator/cross-linkers, such as Bleomycin, Carboplatin, Carmustine.
  • DNA synthesis inhibitors such as (±)-Amethopterin (Methotrexate), 3-Amino-1,2,4-benzotriazine 1,4-dioxide, Aminopterin, Cytosine β-D-arabinofuranoside, 5-Fluoro-5'-deoxyuridine, 5-Fluorouracil, Ganciclovir, Hydroxyurea, and Mitomycin C; DNA-RNA transcription regulators, such as Actinomycin D, Daunorubicin, Doxorubicin, Homoharringtonine, and Idarubicin; enzyme inhibitors, such as S(+)-Camptothecin, Curcumin, (-)-Deguelin, 5,6-Dichlorobenzimidazole 1-β-D-ribof
  • Fostriecin, Hispidin, 2-Imino-1-imidazolidineacetic acid (Cyclocreatine), Mevinolin, Trichostatin A, Tyrphostin AG 34, and Tyrphostin AG 879; gene regulators, such as 5-Aza-2'-deoxycytidine, 5-Azacytidine, Cholecalciferol (Vitamin D3), 4-Hydroxytamoxifen, Melatonin, Mifepristone, Raloxifene, all trans-Retinal (Vitamin A aldehyde), Retinoic acid, all trans (Vitamin A acid), 9-cis-Retinoic Acid, 13-cis-Retinoic acid, Retinol (Vitamin A), Tamoxifen, and Troglitazone; microtubule inhibitors, such as Colchicine, Dolastatin 15, Nocodazole, Paclitaxel, Podophyllotoxin
  • the antitumor agent may be a neoantigen.
  • Neoantigens are tumor-associated peptides that serve as active pharmaceutical ingredients of vaccine compositions which stimulate antitumor responses and are described in US Pub. No. 2011/0293637, which is incorporated by reference herein in its entirety.
  • The antitumor agent may be a monoclonal antibody such as rituximab, alemtuzumab, Ipilimumab, Bevacizumab, Cetuximab, panitumumab, and trastuzumab, Vemurafenib, imatinib mesylate, erlotinib, gefitinib, Vismodegib, 90Y-ibritumomab tiuxetan, 131I-tositumomab, ado-trastuzumab emtansine, lapatinib, pertuzumab.
  • ado-trastuzumab emtansine, regorafenib, sunitinib, Denosumab, sorafenib, pazopanib, axitinib, dasatinib, nilotinib, bosutinib, ofatumumab, obinutuzumab, ibrutinib, idelalisib, crizotinib, erlotinib (Tarceva®), afatinib dimaleate, ceritinib, Tositumomab and 131I-tositumomab, ibritumomab tiuxetan, brentuximab vedotin, bortezomib, siltuximab, trametinib, dabrafenib, pembrolizumab, carfilzomib, Ramucirumab, Caboz
  • The antitumor agent may be IFN-α, IL-2, Aldesleukin, Erythropoietin, granulocyte-macrophage colony-stimulating factor (GM-CSF), or granulocyte colony-stimulating factor.
  • The antitumor agent may be a targeted therapy such as toremifene, fulvestrant, anastrozole, exemestane, letrozole, ziv-aflibercept, Alitretinoin, temsirolimus, Tretinoin, denileukin diftitox, vorinostat, or romidepsin.
  • The antitumor agent may be a checkpoint inhibitor such as an inhibitor of the programmed death-1 (PD-1) pathway, for example an anti-PD1 antibody (Nivolumab).
  • the inhibitor may be an anti-cytotoxic T-lymphocyte-associated antigen (CTLA-4) antibody.
  • the inhibitor may target another member of the CD28 CTLA4 Ig superfamily such as BTLA, LAG3, ICOS, PDL1 or KIR.
  • A checkpoint inhibitor may target a member of the TNFR superfamily such as CD40, OX40, CD137, GITR, CD27 or TIM-3.
  • The antitumor agent may be an epigenetic targeted drug such as HDAC inhibitors, kinase inhibitors, DNA methyltransferase inhibitors, histone demethylase inhibitors, or histone methylation inhibitors.
  • the epigenetic drugs may be Azacitidine, Decitabine, Vorinostat, Romidepsin, or Ruxolitinib.
  • A method of treatment of a pediatric cancer, an adult cancer, or MRD may include administration of an effective amount of a suitable substance able to target intracellular proteins, small molecules, or nucleic acid molecules alone or in combination with an appropriate carrier or vehicle, including, but not limited to, an antibody or functional fragment thereof (e.g., Fab', F(ab')2, Fab, Fv, rIgG, and scFv fragments and genetically engineered or otherwise modified forms of immunoglobulins such as intrabodies and chimeric antibodies), small molecule inhibitors of the protein, chimeric proteins or peptides, gene therapy for inhibition of transcription, or an RNA interference (RNAi)-related molecule or morpholino molecule able to inhibit gene expression and/or translation.
  • The inhibitor is an RNAi-related molecule such as an siRNA or an shRNA for inhibition of translation.
  • An RNA interference (RNAi) molecule is a small nucleic acid molecule, such as a short interfering RNA (siRNA), a double-stranded RNA (dsRNA), a micro-RNA (miRNA), or a short hairpin RNA (shRNA) molecule, that complementarily binds to a portion of a target gene or mRNA so as to provide for decreased levels of expression of the target.
  • A suitable pharmaceutical composition comprising one or more of the agents described herein is administered and dosed in accordance with good medical practice, taking into account the clinical condition of the individual patient, the site and method of administration, scheduling of administration, patient age, sex, body weight, and other factors known to medical practitioners.
  • the therapeutically effective amount for purposes herein is thus determined by such considerations as are known in the art.
  • an effective amount of the pharmaceutical composition is that amount necessary to provide a therapeutically effective decrease in the expression of the targeted gene.
  • The amount of the pharmaceutical composition should be effective to achieve improvement including but not limited to total prevention and to improved survival rate or more rapid recovery, or improvement or elimination of symptoms associated with the chronic inflammatory conditions being treated and other indicators as are selected as appropriate measures by those skilled in the art.
  • a suitable single dose size is a dose that is capable of preventing or alleviating (reducing or eliminating) a symptom in a patient when administered one or more times over a suitable time period.
  • One of skill in the art can readily determine appropriate single dose sizes for systemic administration based on the size of the patient and the route of administration.
  • the pharmaceutical compositions can be formulated according to known methods for preparing pharmaceutically useful compositions.
  • pharmaceutically acceptable carrier means any of the standard pharmaceutically acceptable carriers.
  • The pharmaceutically acceptable carrier can include diluents, adjuvants, and vehicles, as well as implant carriers, and inert, nontoxic solid or liquid fillers, diluents, or encapsulating material that does not react with the active ingredients of the technology. Examples include, but are not limited to, phosphate buffered saline, physiological saline, water, and emulsions, such as oil/water emulsions.
  • the carrier can be a solvent or dispersing medium containing, for example, ethanol, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils.
  • Formulations suitable for parenteral administration include, for example, aqueous sterile injection solutions, which may contain antioxidants, buffers, bacteriostats, and solutes which render the formulation isotonic with the blood of the intended recipient; and aqueous and nonaqueous sterile suspensions which may include suspending agents and thickening agents.
  • The formulations may be presented in unit-dose or multi-dose containers, for example sealed ampoules and vials, and may be stored in a freeze-dried (lyophilized) condition requiring only the addition of the sterile liquid carrier, for example, water for injections, prior to use.
  • Extemporaneous injection solutions and suspensions may be prepared from sterile powder, granules, tablets, etc.
  • the formulations of the subject technology can include other agents conventional in the art having regard to the type of formulation in question.
  • The disclosure also provides for assay panels (nucleic acid hybridization probes or sets of probes) comprising a plurality of polynucleotide probes, wherein each of the polynucleotide probes is configured to hybridize to a bisulfite-converted fragment obtained from processing of DNA, or more preferably, cfDNA molecules, from a subject, wherein each of the cfDNA molecules corresponds to, is derived from, or includes the one or more target regions selected from Table 1.
  • the methods described herein also may be implemented by use of computer systems.
  • any of the steps described above for evaluating sequence reads to determine methylation status of a CpG site may be performed by means of software components loaded into a computer or other information appliance or digital device.
  • The computer, appliance or device may then perform all or some of the above-described steps to assist the analysis of values associated with the methylation of one or more CpG sites, or for comparing such associated values.
  • the above features embodied in one or more computer programs may be performed by one or more computers running such programs.
  • various aspects of the methods disclosed herein can be implemented using computer-based calculations, machine learning (e.g., support vector machine (SVM), Lasso, Generalized Linear Model (GLM), Gradient Boosted Model (GBM), Extreme Gradient Boosting (XGB), Elastic-Net Regularized Generalized Linear Models (Glmnet), Random Forest, Gradient boosting (on random forest), C5.0 decision trees), and other software tools, or combinations thereof.
  • a methylation status for a CpG site can be assigned by a computer based on an underlying sequence read of an amplicon from a sequencing assay.
  • a methylation value for a DNA region or portion thereof can be compared by a computer to a threshold value, as described herein.
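  • The two computer-implemented steps above can be sketched together. After bisulfite conversion an unmethylated cytosine is read as T while a methylated cytosine remains C, so a CpG can be called by comparing the read base with the reference. The sketch assumes a forward-strand read aligned without gaps at a known offset; the function names `call_cpgs` and `exceeds_threshold`, the sequences, and the threshold are hypothetical. Dedicated callers such as Bismark handle strands, gaps, and non-CpG contexts.

```python
def call_cpgs(reference, read, offset=0):
    """Return {reference_position: is_methylated} for CpGs covered by the read."""
    calls = {}
    for i, base in enumerate(read):
        pos = offset + i
        if pos + 1 < len(reference) and reference[pos:pos + 2] == "CG":
            if base == "C":
                calls[pos] = True    # protected from conversion: methylated
            elif base == "T":
                calls[pos] = False   # converted C -> T: unmethylated
    return calls

def exceeds_threshold(calls, threshold):
    """Compare the methylated fraction of a region to a threshold value."""
    if not calls:
        return None  # no informative CpGs in the region
    return sum(calls.values()) / len(calls) >= threshold

reference = "ACGTTCGACGA"   # CpGs at positions 1, 5, and 8 (0-based)
read = "ACGTTTGACGA"        # the CpG at position 5 reads as T
calls = call_cpgs(reference, read)
above = exceeds_threshold(calls, 0.5)
```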
  • The tools are advantageously provided in the form of computer programs that are executable by a general-purpose computer system of conventional design.
  • The method used to analyze and/or determine methylation levels of a target polynucleotide region includes Metilene (Juhling et al., Genome Res., 2016; 26(2): 256-262) or GenomeStudio Software available online from Illumina, Inc., or as described in Hovestadt et al., 2014; Nature, 510(7506), 537-541.
  • Methylation data may be further processed by algorithms and/or software to determine the differential values (i.e., differential methylation value) and identify differentially methylated regions (DMRs). Differential methylation value may be calculated by methods known in the art (see, e.g., Hovestadt et al. (2014)).
  • Metilene, a software program for calling differentially methylated regions, may be used to identify differentially methylated regions within whole genome and targeted sequencing data.
  • The methylation data may be divided into hypomethylated target regions, hypermethylated target regions, or a combination of both hypomethylated and hypermethylated regions.
  • the machine learning model may be trained using hypomethylated target regions, hypermethylated target regions, or both.
  • methods of identifying a pediatric cancer, an adult cancer, or MRD in a subject may comprise the use of a machine learning model.
  • The machine learning model may be a trained algorithm.
  • The machine learning model may be trained on one or more features and, once trained, be used to process a data set generated via assaying nucleic acid molecules in a sample (e.g., a cell-free biological sample), which data set comprises a methylation profile of one or more target genomic regions of the biological sample. Examples of the use and training of machine learning models are described, for example, in PCT Pub. No. WO2022/178108 to Salhia et al.; WO2019/178277 to Gross et al.; U.S. Pat. Pub No.
  • The machine learning model is trained using samples of DNA comprising the target regions of Table 1 taken from subjects having a known cancer. In some embodiments, the machine learning model is trained using samples of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more cancers.
  • the target regions of the DNA isolated from a subject may be divided into hypomethylated target regions, hypermethylated target regions, or a combination thereof, and, for example, individually analyzed using the machine learning model.
  • The target regions of DNA of the subject may be analyzed using a machine learning model trained solely on hypermethylated regions or hypomethylated regions.
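  • Training separate models on hypermethylated and hypomethylated regions can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-centroid classifier in plain Python, which is not one of the model types named in this disclosure and is swapped in only to keep the example self-contained. The function names, region names, labels, and methylation values are all hypothetical.

```python
def train_centroids(samples, labels, regions):
    """Mean methylation per region for each class label."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        vec = sums.setdefault(label, [0.0] * len(regions))
        for i, region in enumerate(regions):
            vec[i] += sample[region]
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in vec] for label, vec in sums.items()}

def predict(centroids, sample, regions):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((sample[r] - c) ** 2 for r, c in zip(regions, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical training data: R1/R2 hypermethylated and R3 hypomethylated in cancer.
hyper_regions, hypo_regions = ["R1", "R2"], ["R3"]
train = [
    {"R1": 0.9, "R2": 0.8, "R3": 0.1},
    {"R1": 0.8, "R2": 0.9, "R3": 0.2},
    {"R1": 0.1, "R2": 0.2, "R3": 0.9},
    {"R1": 0.2, "R2": 0.1, "R3": 0.8},
]
labels = ["cancer", "cancer", "normal", "normal"]

hyper_model = train_centroids(train, labels, hyper_regions)  # hyper-only model
hypo_model = train_centroids(train, labels, hypo_regions)    # hypo-only model

query = {"R1": 0.85, "R2": 0.7, "R3": 0.15}
hyper_call = predict(hyper_model, query, hyper_regions)
hypo_call = predict(hypo_model, query, hypo_regions)
```

  • Each model sees only its own subset of regions, so the two calls can be inspected individually or combined.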
  • A computer comprising at least one processor may be configured to receive a plurality of sequencing results from the DNA methylation sequencing reactions (e.g., after WGBS) that may comprise the methylation pattern of one or more target regions disclosed herein (e.g., Table 1) from a patient having a mass or other tumor (e.g., DNA isolated from the mass or tumor, or DNA isolated from cfDNA from the person having the mass or tumor) or otherwise suspected of having a cancer.
  • The methylation pattern of the received sequence reads may be determined through sequence alignment with a reference genome and, for example, using Metilene or other commercially available product, identifying differentially methylated regions between the sample and a normal sample or pool of normal samples.
  • The processor may be any type of processor, such as, for example, any type of general-purpose microprocessor or microcontroller (e.g., an Intel™ x86, PowerPC™, ARM™ processor, or the like), a digital signal processing (DSP) processor, an integrated circuit, a field programmable gate array (FPGA), or any combination thereof.
  • The machine learning model used to detect the pediatric cancer, adult cancer, or MRD analyzes methylation patterns of a plurality of target regions of cancerous samples as compared to methylation patterns of a plurality of target regions of non-cancerous samples.
  • The pediatric cancer, adult cancer, and/or the MRD signature is detected by determining and analyzing a methylation pattern of a plurality of target regions of both cancerous and non-cancerous samples wherein the plurality of target regions comprise at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%.
  • the signature is developed by analyzing and comparing all the target regions of Table 1.
  • the machine learning model is trained using target regions from a plurality of cancerous samples and corresponding target regions from non-cancerous samples, wherein the cancer samples comprise at least two different cancer types, wherein the detection of the pediatric cancer, the adult cancer, or the MRD is based on a comparison of a methylation pattern of target regions of the cancerous samples compared to a methylation pattern of corresponding target regions of the non-cancerous samples.
  • The machine learning model is trained by aligning sequence reads of target regions of a plurality of cancer samples to the corresponding regions of a reference genome; comparing the methylation of the target regions of the plurality of cancer samples to a normal sample or pool of normal samples to identify the target regions that are differentially methylated; identifying the differentially methylated regions that are common between the cancer samples; and using the common or shared differentially methylated regions to train the machine learning model to distinguish cancerous versus non-cancerous samples.
  • A target region of a sample having a methylation level that differs from the methylation level of the corresponding target region of a normal sample is a differentially methylated region (DMR).
  • sequence reads are aligned after whole genome sequencing (e.g., WGBS, RRBS).
  • Training of the machine learning model includes identifying the minimum number of differentially methylated regions that are shared across the plurality of cancer samples.
  • The minimally differentially methylated regions (mDMRs) are common to about 40% of the cancer samples, about 45% of the cancer samples, about 50% of the cancer samples, about 55% of the cancer samples, about 60% of the cancer samples, about 65% of the cancer samples, about 70% of the cancer samples, about 75% of the cancer samples, about 80% of the cancer samples, about 85% of the cancer samples, about 90% of the cancer samples, about 95% of the cancer samples, or greater than 95% of the cancer samples.
  • the mDMRs are common or shared between about 70% of the cancers from which the target samples are derived, about 75% of the cancers from which the target samples are derived, about 80% of the cancers from which the target samples are derived, about 85% of the cancers from which the target samples are derived, about 90% of the cancers from which the target samples are derived, about 95% of the cancers from which the target samples are derived, or about 100% of the cancers from which the target samples are derived.
  • training a machine learning model comprises the steps of i) receiving methylation sequencing reads of a plurality of test samples including cancerous samples and non-cancerous samples to obtain a methylation pattern of target regions of the plurality of test samples, wherein the cancerous samples comprise at least two different cancer types; ii) aligning the target regions of the plurality of test samples with a reference genome, wherein each of the target regions of the plurality of test samples is aligned with a corresponding target region of the reference genome; iii) performing differentially methylated region analysis (e.g., using Metilene) of the test sample compared to a normal sample or pool of normal samples to identify differentially methylated regions between the test (cancer) samples and the normal samples; vi) defining the minimally differentially methylated regions (mDMRs) across test samples; and vii) building a classifier model with mDMRs using machine learning tools to distinguish between cancerous and non-cancerous samples using the mDMRs wherein an
  • a target region may be assigned a methylation value based on, for example, a positive number if the target region has a methylated CpG compared to the corresponding position of the reference genome (i.e., the CpG of the reference genome is unmethylated), and a negative number if the target region has an unmethylated CpG compared to the corresponding position of the reference genome (i.e., the CpG of the reference genome is methylated).
  • a hypermethylated target region may have a positive overall score and a hypomethylated target region may have a negative overall score.
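The scoring rule described in the two items above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the weights of +1 and -1, the function names, and the "unchanged" label for a score of zero are assumptions, since the text only specifies a positive number for a gain of methylation and a negative number for a loss.

```python
from typing import Sequence


def region_methylation_score(sample_calls: Sequence[bool],
                             reference_calls: Sequence[bool]) -> int:
    """Sum per-CpG scores over one target region.

    Each element is True if the CpG is methylated. A CpG scores +1 when it is
    methylated in the sample but not in the reference, -1 in the opposite
    case, and 0 when both agree (the +1/-1 weights are an assumption).
    """
    if len(sample_calls) != len(reference_calls):
        raise ValueError("sample and reference must cover the same CpGs")
    score = 0
    for sample_meth, ref_meth in zip(sample_calls, reference_calls):
        if sample_meth and not ref_meth:
            score += 1
        elif ref_meth and not sample_meth:
            score -= 1
    return score


def classify_region(score: int) -> str:
    """Map an overall score to a label; 'unchanged' is a hypothetical label."""
    if score > 0:
        return "hypermethylated"
    if score < 0:
        return "hypomethylated"
    return "unchanged"
```

A region with more gained than lost CpG methylation therefore receives a positive overall score and is labeled hypermethylated, matching the convention stated above.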
  • the output of the machine learning model comprises an Area Under the Curve (AUC) value for a test sample.
  • AUC Area Under the Curve
  • an AUC of 0.8 or greater, 0.85 or greater, 0.9 or greater, 0.95 or greater, 0.96 or greater, 0.97 or greater, 0.98 or greater, or 0.99 or greater indicates the presence of a pediatric cancer, adult cancer, or MRD.
  • the first set of target regions, the second set of target regions, and the corresponding target regions comprise the same target regions of Table 1.
  • the cancerous samples used to train the machine learning model comprise stage I-IV cancer samples, such as, for example, metastatic breast cancer.
  • the cancerous samples used to train the machine learning model are from stage I or stage II cancer samples. The methylation pattern of the cancerous samples may then be compared to the methylation pattern of the non-cancerous samples.
  • Any of the computer-readable media herein can be non-transitory (e.g., volatile memory such as DRAM or SRAM, nonvolatile memory such as magnetic storage, optical storage, or the like) and/or tangible. Any of the storing actions described herein can be implemented by storing in one or more computer-readable media (e.g., computer-readable storage media or other tangible media). Any of the things (e.g., data created and used during implementation) described as stored can be stored in one or more computer-readable media (e.g., computer-readable storage media or other tangible media). Computer-readable media can be limited to implementations not consisting of a signal.
  • kits for detecting a cancer comprising reagents for carrying out the aforementioned methods, and instructions for detecting the cancer signals.
  • Reagents may include, for example, primer sets, PCR reaction components, a plurality of probe sets complementary to target regions of Table 1, sequencing reagents, and optionally, a solid support for said probes (e.g., a glass slide or chip, surface of a bead, surface of a matrix, etc.).
  • MRT malignant rhabdoid tumors
  • Table 3 DMR summaries from tissue samples. The average, median, minimum, and maximum number of DMRs by cancer type. The number of cases for each type is also indicated. The average number of CpGs, median width for the DMRs, and locations with respect to CpG islands, shores, shelves and open sea are also indicated.
  • IPA identified genes with DMRs in this cluster to be associated with PI3K signaling, DNA double-strand break repair, and G2/M checkpoint regulation (Table 7). Table 4. IPA results from cluster 1 genes. Top canonical pathways associated with genes in cluster
  • Table 7 IPA results from cluster 4 genes. Top canonical pathways associated with genes in cluster .
  • Methylation beta values were also extracted from 518 TARGET samples (Table 8) across 166 of the 183 highly variable DMRs which overlapped at least one probe on the HM450 methylation array. Analysis of TARGET data also demonstrated strong separation by tumor type (Figure 1C-D). POETIC samples clustered with TARGET samples according to tumor type as seen by hierarchical clustering and UMAP analyses (Figure 1E). It is also important to note that TARGET samples were predominantly collected from the primary tumors (Table 8), while POETIC cases were all collected from patients with recurrent metastatic disease, indicating that methylation profiles of recurrent tumors are more similar to those of primary tumors than they are different. The co-clustering of POETIC recurrent samples with TARGET samples was also observed when each tumor type was analyzed separately (data not shown).
  • Table 8 TARGET sample summary. Table shows the number of samples for each cancer type accessed from the TARGET database.
  • mDMRs minimally differentially methylated regions
  • each CpG site was scored based on the number of samples with a DMR call which included that CpG (separately for hypomethylated and hypermethylated DMR calls).
  • Mean beta values across hypomethylated mDMRs were 0.406 in tumor tissue compared with 0.732 in normal tissue.
  • Beta values in hypermethylated mDMRs were 0.643 in tumor tissue compared with 0.308 in normal tissues ( Figure 2B).
  • a random forest classifier was built, based on methylation patterns of the mDMR set, to determine the utility of using the data for tumor detection.
  • the cross-validated receiver operating characteristic (ROC) of methylation in mDMRs had an area under the curve (AUC) of 0.95, indicating that the mDMRs were capable of differentiating tumor from normal (Figure 2C).
  • Table 9 ENCODE sample summary. Table shows the number of samples for each tissue type accessed from the ENCODE database.
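For readers unfamiliar with the metric, the AUC reported above can be computed directly from classifier scores. The sketch below is an illustration under stated assumptions: it does not reproduce the random forest or the cross-validation described in the text, and the function name and toy inputs are hypothetical. It uses the pairwise (Mann-Whitney) definition of AUC, i.e. the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for tumor, 0 for normal. scores: classifier output, higher
    meaning more tumor-like. Ties between a tumor and a normal score count
    as half a win.
    """
    tumor = [s for label, s in zip(labels, scores) if label == 1]
    normal = [s for label, s in zip(labels, scores) if label == 0]
    if not tumor or not normal:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))
```

An AUC of 1.0 means every tumor sample outscored every normal sample, while 0.5 corresponds to chance-level separation.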
  • IPA found that genes associated with the selected mDMR were significantly associated with the regulation of epithelial-mesenchymal transition (EMT), NANOG signaling, TGF-signaling, and TREM1 signaling (Table 11). Taken together, these results show that the selected mDMRs may represent a pan-pediatric cancer signature, which is associated with broad-spanning cancer-specific pathways.
  • Table 11. IPA results from hypermethylated mDMRs. Top canonical pathways associated with genes in hypermethylated mDMRs.
  • Table 12 TCGA sample summary. Table shows the number of samples for each cancer type (tumor and adjacent normal) accessed from the TCGA database. mDMRs in pediatric cancer are also detected in adult cancers. To determine the generalizability of the 905 mDMRs across a broad set of adult tumor types, we used HM450 data from TCGA to assess the tumor/normal classifier. Four hundred and twenty two of 905 mDMRs overlapped with at least one probe on the HM450 array, and we built a separate random forest classifier based on this reduced set of regions. ROC curves (and AUCs) were calculated for 14 different adult solid tumors (6426 samples, Table 14) ( Figure 4A-B).
  • CMOS methylation classes - esthesioneuroblastoma
  • ENB esthesioneuroblastoma
  • CN NBL - methylation classes of NBL that arise in olfactory nerves and other neural crest cells within the CNS, respectively.
  • Cell-free DNA methylation reflects tumor tissue methylation patterns.
  • Cell-free DNA methylation is gaining widespread acceptance as an emerging biomarker for liquid biopsies.
  • cfDNA extracted from these samples had an average yield of 20.4 ng/ml (3-2 - 87.5 ng) (Fig. 16).
  • DMRs were called between cfDNA from cancer patients and the three healthy cfDNA samples and filtered in the same way described above for gDNA from tissue. The median number of DMRs was 42,935 per sample (Table 13).
  • Table 13 DMR summaries from plasma samples. The average, median, minimum, and maximum number of DMRs by cancer type. The number of cases for each type is also indicated. The average number of CpGs, median width for the DMRs, and locations with respect to CpG islands, shores, shelves and open sea are also indicated.
  • Pan-pediatric mDMRs detectable in cfDNA.
  • 905 mDMRs (402 hypomethylated and 503 hypermethylated) across tumor types (Figure 2) could differentiate between tumor and normal in cfDNA
  • Mean methylation for hypomethylated mDMRs identified in tissue was significantly lower in cfDNA cancer samples compared to normal with an average methylation difference of 0.10 (p < 0.0001) (Figure 6A).
  • Methylation deserts were characterized by DNA methylation beta value averages less than 0.2, which is significantly lower than that seen in traditional PMDs which typically have a beta value of 0.7 (Lister et al., Nature 462, 315-322 (2009)). PMDs and methylation deserts were observed in cfDNA in P01-010, P01-029, and P01-036.
  • WGBS Copy Number Estimation by WGBS in gDNA from tumor tissue.
  • WGBS is primarily used to measure DNA methylation across the entire genome but can also be used reliably in determining copy number variants (CNVs).
  • CNVs copy number variants
  • AE anaplastic ependymoma
  • N-MYC amplifications in NBL occur in 20-25% of NBLs and are associated with poor prognosis.
  • SMARCB1 homozygous loss of SMARCB1 in the 2 MRT samples, which is a known driver mutation for that tumor type.
  • MMP11 CNVs have been reported in NBL and WT in the catalogue of somatic mutations in cancer (COSMIC); however, MMP11 CNVs in MRT have not been reported.
  • Detection of CNVs in cfDNA is less sensitive than cfDNA methylation.
  • WGBS data to analyze CNVs in cfDNA. Unlike many known CNVs found in tissue gDNA, CNVs were largely not observable in cfDNA. Five samples from three patients represented notable exceptions, where CNV calls were broadly consistent with CNV detection from tissue gDNA. In one such example (P01-020), the 1q and 8q gains observed in 2 tissue samples were observed in cfDNA. Blood from this patient was taken at the time the first tumor sample (P01-020-T1) was collected and both samples had a similar CNV profile.
  • the second tumor sample (P01-020-T2) was collected at a later date and showed a 13q loss not seen in the first sample (T1) or its cfDNA.
  • P01-036 NBL
  • the cfDNA CNV calls also closely matched tumor sample CNV data, with gains on 1q, 2p, 7, 9q, 12q, 13q, 17q and losses on 1p, 3p, 4q/p, 11q, 17q, and 19q.
  • the cfDNA was able to resolve some CNVs not seen in the tissue, namely losses in 10p and 15q. This was also true for P01-029 where there were CNVs in plasma not seen in gDNA.
  • Unlike adult cancers, most cases of pediatric cancer can be traced to a single genetic driver, and these genetic alterations tend to be highly tumor-specific. For example, the loss of SMARCB1 specifically leads to the development of MRT. Specific gene fusions cause alveolar rhabdomyosarcoma (PAX3/7-FOXO1), Ewing’s sarcoma (EWS-FLI1), and CML (BCR-ABL1). Mutations to certain genes such as RB are specific to retinoblastoma and OS, while mutations in TP53 are extremely common in multiple pediatric cancers.
  • PAX3/7-FOXO1 alveolar rhabdomyosarcoma
  • EWS-FLI1 Ewing’s sarcoma
  • CML BCR-ABL1
  • WGBS is the gold standard assay for methylation evaluation as it evaluates every CpG in the genome, but we also used it for copy number estimation, maximizing data generation from each sample and enabling multi-omic analysis from the same sample and aliquot, which also minimizes sampling bias and reduces cost. This is particularly useful when dealing with rare tumor types or sample types with very little available DNA (such as cfDNA).
  • CNV analysis from WGBS identified both large-scale alterations and focal gains/losses such as MYCN gain in NBL, SMARCB1 loss in MRT, and PTEN loss in ERMS.
  • this disclosure provides a comprehensive analysis of multiple pediatric cancers using WGBS.
  • WGBS pan-cancer methylation signature
  • cfDNA common to multiple pediatric cancer types, including extremely rare neoplasms such as DSRCT and MRT.
  • WGBS to detect CNVs, in order to directly compare CNV and methylation detection from the same sample and aliquot.
  • DNA methylation was superior to CNV at detecting tumor-specific signal in cfDNA.
  • the pan-cancer cfDNA methylation signature in this study has potential utility in minimal residual disease monitoring and early detection and warrants further investigation in both pediatric and adult cancer.
  • Sample collection Samples were obtained under written informed parental consent from the Pediatric Oncology Experimental Therapeutics Investigators' Consortium (POETIC) at Memorial Sloan Kettering Cancer Center (New York, USA) from patients with a wide range of recurrent pediatric cancers (Table 2). Tissue samples were flash-frozen after resection. Peripheral blood samples were drawn pre-operatively in EDTA purple-top tubes and the plasma was harvested.
  • POETIC Pediatric Oncology Experimental Therapeutics Investigators' Consortium
  • Table 2 recurrent pediatric cancers
  • Table 2A Patient summary. All unique samples from the POETIC cohort used for WGBS. The diagnosis for each patient is indicated, along with the sample type and resection location where applicable.
  • NBL, n = 9
  • OS osteosarcoma
  • HB hepatoblastoma
  • Table 2B Patient summary. All unique cases from the POETIC cohort with demographic information, diagnosis, and sample source. Number of tumor tissue, normal tissue and plasma samples analyzed per patient is indicated.
  • Genomic (g)DNA and total RNA were extracted from flash-frozen normal or tumor tissue using an AllPrep DNA/RNA Mini kit (Qiagen) according to manufacturer’s recommendations. Briefly, tissues were homogenized using a Bullet Blender homogenizer (Next Advance) for 5 minutes at full speed with a mixture of 0.9-2.0 mm RNase-free stainless-steel beads. Homogenates were passed through the QIAshredder (Qiagen) to remove any remaining particulate matter. Plasma was isolated from whole blood by spinning it at 300 g for 20 minutes. Cell-free (cf)DNA was extracted from the plasma using the QIAamp DNA Blood Maxi kit (Qiagen) according to the manufacturer’s recommendations.
  • Quantity and purity of the isolated gDNA was determined by Qubit dsDNA High Sensitivity fluorometric assay (Invitrogen). cfDNA was quantitated using the TapeStation High Sensitivity D1000 assay according to the manufacturer’s protocol. Extracted gDNA and cfDNA were used for whole genome bisulfite sequencing analysis (WGBS) as described, for example, in Legendre et al., Clin Epigenetics 7, 100 (2015). Directional, bisulfite-converted libraries for paired-end sequencing were prepared using the Ovation Ultralow Methyl-Seq Library System (NuGen), using the manufacturer’s suggested protocol. Bisulfite conversion was performed using the EpiTect Fast DNA Bisulfite Kit (Qiagen).
  • Post-library QC was performed on the 4200 TapeStation using High Sensitivity D1000 ScreenTapes (Agilent). Paired-end sequencing was performed on the Illumina NovaSeq 6000 platform using the S2 or S4 flow cell for a total read length of 2x150 bp. Paired-end sequencing on bisulfite treated gDNA and cfDNA was performed. Tissue samples were sequenced by Macrogen on HiSeq X (Illumina). Read pairs were processed through our alignment and methylation calling pipeline which uses Babraham Bioinformatics’ Bismark alignment software (Krueger et al., Bioinformatics 27, 1571-1572 (2011)) for read mapping and methylation evaluation. Sequencing of cfDNA was done at USC on Illumina’s NovaSeq 6000 using S2 chips. All reads were mapped to hg19.
  • DMR calling Differentially methylated regions (DMRs) were evaluated using the DMR caller Metilene (Juhling et al., Genome Res 26, 256-262 (2016)). Tumor samples were compared with their patient-matched adjacent normal sample where possible. For tumor samples without a matched normal, a pool of normal samples from other patients with the same diagnosis was used. For unique cancer types with no matched normal sample, a pool of all normal samples was used. DMRs were filtered to include those with a Mann-Whitney p-value < 0.05 and an
  • Partially methylated domain/hypomethylated domain analysis Partially methylated domains (PMDs) are hypomethylated domains with an intermediate level of methylation spanning several kilobases to a few megabases. PMDs were called using MethPipe (Song et al. PLoS One 8, e81148 (2013)), based on methylation and coverage information from Bismark. We modified methPipe’s code to generate additional metrics, namely mean methylation across each PMD and the standard deviation of beta values within each PMD call. Analysis and plotting of PMD regions were performed in R.
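The two additional per-PMD metrics mentioned above (mean methylation and standard deviation of beta values) are simple to compute once PMD coordinates and per-CpG beta values are available. The sketch below is a standalone illustration and is not the modified MethPipe code: the function name and tuple layouts are hypothetical, coordinates are treated as inclusive, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics
from typing import List, Optional, Tuple

Region = Tuple[str, int, int]   # (chromosome, start, end), inclusive
CpG = Tuple[str, int, float]    # (chromosome, position, beta value)


def pmd_summary(pmd: Region, cpgs: List[CpG]) -> Optional[Tuple[float, float]]:
    """Return (mean beta, standard deviation of beta) for CpGs inside a PMD.

    Returns None when no CpG falls inside the region. With a single CpG the
    standard deviation is reported as 0.0.
    """
    chrom, start, end = pmd
    betas = [beta for c, pos, beta in cpgs
             if c == chrom and start <= pos <= end]
    if not betas:
        return None
    mean = statistics.fmean(betas)
    sd = statistics.stdev(betas) if len(betas) > 1 else 0.0
    return mean, sd
```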
  • CpG minimally differentially methylated regions We identified a consensus DMR set with substantial enrichment of hypomethylated and hypermethylated DMRs across all samples. Each CpG location was scored by the number of samples with a DMR overlapping this locus (separately for hypo- and hyper-methylated DMRs). These CpGs were filtered to include those with a count of 22 or more samples (70% of samples) and filtered CpGs within 500 bases of each other were combined to form separate regions, termed minimally differentially methylated regions (mDMRs).
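The mDMR construction described above (count samples per CpG, keep CpGs reaching the sample threshold, merge kept CpGs lying within 500 bases of each other) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the data layout, and inclusive coordinates are hypothetical, and the naive overlap search is written for clarity rather than genome-scale speed. It would be run once on hypomethylated DMR calls and once on hypermethylated calls.

```python
from collections import Counter
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), inclusive


def build_mdmrs(cpg_positions: Dict[str, List[int]],
                dmrs_by_sample: Dict[str, List[Region]],
                min_samples: int,
                max_gap: int = 500) -> List[Region]:
    """Derive consensus regions from per-sample DMR calls."""
    counts: Counter = Counter()
    for regions in dmrs_by_sample.values():
        covered = set()
        for chrom, start, end in regions:
            for pos in cpg_positions.get(chrom, []):
                if start <= pos <= end:
                    covered.add((chrom, pos))
        # A sample contributes at most one count per CpG.
        counts.update(covered)

    # Keep CpGs supported by enough samples, in genomic order.
    kept = sorted(site for site, n in counts.items() if n >= min_samples)

    # Merge kept CpGs that lie within max_gap bases of the previous one.
    merged: List[Region] = []
    for chrom, pos in kept:
        if merged and merged[-1][0] == chrom and pos - merged[-1][2] <= max_gap:
            merged[-1] = (chrom, merged[-1][1], pos)
        else:
            merged.append((chrom, pos, pos))
    return merged
```

With the threshold given in the text, `min_samples` would be 22 and `max_gap` 500.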
  • Copy number analysis Copy number variants (CNVs) were called on the WGBS data using the R package QDNAseq (Scheinin et al., Genome Res 24, 2022-2032 (2014)), which uses read counts within fixed sized bins; we selected a 30 kb bin size for evaluation of genome wide copy number variants, and a 1 kb, 5 kb, or 10 kb bin size for focal amplifications such as MYCN gain in neuroblastoma.
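The idea of read counts within fixed-size bins can be illustrated with a short sketch. This is not QDNAseq (an R package) and omits the corrections and segmentation a real copy number caller applies; the function names, the median normalization, and the pseudocount are assumptions made for illustration only.

```python
import math
import statistics
from collections import Counter
from typing import Dict, Iterable, Tuple

Bin = Tuple[str, int]  # (chromosome, bin index)


def bin_read_counts(read_starts: Iterable[Tuple[str, int]],
                    bin_size: int) -> Counter:
    """Count reads per fixed-size bin from (chromosome, 0-based start) pairs."""
    counts: Counter = Counter()
    for chrom, pos in read_starts:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def log2_ratio_to_median(counts: Dict[Bin, int],
                         pseudocount: float = 0.5) -> Dict[Bin, float]:
    """Log2 ratio of each bin's count to the median bin count.

    Values near 0 suggest a neutral copy number, positive values a gain,
    and negative values a loss. Bins without reads are absent from the input.
    """
    median = statistics.median(counts.values())
    return {b: math.log2((c + pseudocount) / (median + pseudocount))
            for b, c in counts.items()}
```

A larger bin size (such as the 30 kb used genome-wide in the text) averages over more reads per bin and is less noisy, while smaller bins give the resolution needed for focal events.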
  • Ingenuity pathway analysis In order to investigate biological pathways associated with DMRs we used Ingenuity Pathway Analysis (IPA, Qiagen). DMRs were annotated with the nearest gene and distance to the nearest gene. The resulting gene list was used as IPA input. We ran pathway enrichment analysis using default IPA settings.
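The annotation step above (nearest gene and distance to it) can be sketched as below. This is an illustration only: the text does not state how distance was measured, so the sketch assumes the gap between the DMR and the gene body, with 0 for overlap, and the function name and gene records are hypothetical.

```python
from typing import List, Optional, Tuple

Region = Tuple[str, int, int]      # (chromosome, start, end), inclusive
Gene = Tuple[str, str, int, int]   # (name, chromosome, start, end)


def nearest_gene(region: Region, genes: List[Gene]) -> Optional[Tuple[str, int]]:
    """Return (gene name, distance in bases) for the closest gene.

    Distance is 0 when the region overlaps the gene. Returns None when no
    gene is annotated on the region's chromosome.
    """
    chrom, start, end = region
    best: Optional[Tuple[str, int]] = None
    for name, g_chrom, g_start, g_end in genes:
        if g_chrom != chrom:
            continue
        if g_end < start:
            dist = start - g_end      # gene lies upstream of the region
        elif g_start > end:
            dist = g_start - end      # gene lies downstream of the region
        else:
            dist = 0                  # region overlaps the gene
        if best is None or dist < best[1]:
            best = (name, dist)
    return best
```

The unique gene names collected this way would form the input list for pathway enrichment.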
  • Hybridization probe capture Hybridization probe capture, assay design, sequencing, and bioinformatic analysis were performed as previously described (Buckley et al., NAR Genom Bioinform 4, lqac099 (2022); Buckley et al. Clin Cancer Res, 2023 Dec 15:29(24):5196-5206).

Abstract

Methods for determining whether a subject is likely to have or to develop a pediatric cancer, an adult cancer, or minimal residual disease (MRD) using a machine learning model designed to detect said cancers.
PCT/US2024/016941 2023-02-22 2024-02-22 Pan-cancer early detection and MRD cfDNA methylation Ceased WO2024178248A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2024226284A AU2024226284A1 (en) 2023-02-22 2024-02-22 Pan-cancer early detection and mrd cfdna methylation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363486379P 2023-02-22 2023-02-22
US63/486,379 2023-02-22

Publications (1)

Publication Number Publication Date
WO2024178248A1 (fr) 2024-08-29

Family

ID=92501711

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/016941 Ceased WO2024178248A1 (fr) 2023-02-22 2024-02-22 Pan-cancer early detection and MRD cfDNA methylation

Country Status (2)

Country Link
AU (1) AU2024226284A1 (fr)
WO (1) WO2024178248A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110201520A1 (en) * 2005-05-02 2011-08-18 Toray Industries, Inc. Composition and method for diagnosing esophageal cancer and metastasis of esophageal cancer
US20210017609A1 (en) * 2018-04-02 2021-01-21 GRAIL, Inc Methylation markers and targeted methylation probe panel
US20210388445A1 (en) * 2018-04-12 2021-12-16 Singlera Genomics, Inc. Compositions and methods for cancer and neoplasia assessment
WO2022133315A1 (fr) * 2020-12-17 2022-06-23 President And Fellows Of Harvard College Methods for detecting cancer using extra-embryonically methylated CpG islands

Also Published As

Publication number Publication date
AU2024226284A1 (en) 2025-08-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24761012

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: AU2024226284

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 2024226284

Country of ref document: AU

Date of ref document: 20240222

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2024761012

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2024761012

Country of ref document: EP

Effective date: 20250922
