WO2025160265A1 - Systems and methods of fragmentomics analysis in cancer - Google Patents
Systems and methods of fragmentomics analysis in cancer
- Publication number
- WO2025160265A1 (PCT/US2025/012735)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cancer
- nucleosome
- subject
- sample
- sequencing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/30—Unsupervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/40—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/60—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
- G16H40/67—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/154—Methylation markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- Cancer is a leading cause of death worldwide. Detection of cancer, determination of cancer stage, determination of cancer type or subtype, or both, determination of prognosis of cancer, and prediction of treatment response, for example, may be critical for providing treatment and improving patient outcomes. Cancer may have genetic aberrations, and in some cases these aberrations may differ depending on a variety of factors or characteristics of the cancer. Detection and identification of these genetic aberrations may be important for detection of cancer, determination of cancer stage, determination of cancer type or subtype, or both, determination of prognosis of cancer, and prediction of treatment response, for example.
- Nucleosome profiling can be used to detect and identify genetic aberrations in samples of subjects having a disease, such as cancer.
- Current approaches utilizing nucleosome profiling may be neither accurate nor specific enough to identify a subtype of a disease such as cancer.
- Current approaches utilizing nucleosome profiling may also be unable to utilize nucleosome profiling to predict prognosis or forecast progression or treatment response of a subject reliably and accurately.
- Applicant has recognized that nucleosome profiling can be analyzed and processed using methods and systems disclosed herein to achieve disease forecasting and disease subtyping utilizing nucleosome profiling with reliability and accuracy.
- a method for determining a disease forecast of a subject having cancer comprising: generating a nucleosome profile relating to a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to a selected biomarker; determining a nucleosome profiling abnormality score, comprising relating the nucleosome profile of the subject sample to one or more reference nucleosome profiles; generating the disease forecast based at least in part on the nucleosome profiling abnormality score, wherein the disease forecast comprises one or more disease forecast characteristics.
- the one or more disease forecast characteristics comprise a cancer prognosis relating to the subject.
- the disease forecast characteristics comprise one or more of: an estimated survival time of the subject without a treatment intervention, an estimated survival time of the subject with a treatment intervention, determination of a type of the cancer, determination of a subtype of the cancer, determination of one or more clinical outcomes, or predicted treatment response of the subject to one or more treatments, or any combination thereof.
- the biological sample comprises cell-free deoxyribonucleic acid (cfDNA) molecules.
- the biological sample comprises one or more of: a plasma sample, a serum sample, a red blood cell sample, a urine sample, a saliva sample, pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any derivative thereof, and any combination thereof.
- the biological sample comprises the plasma sample.
- the biological sample comprises the urine sample.
- the cfDNA molecules are obtained or derived from a single biological sample of the subject. In some embodiments, the cfDNA molecules are obtained or derived from different biological samples of the subject. In some embodiments, the biological sample is obtained or derived from the subject using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, a cell-free deoxyribonucleic acid (DNA) collection tube, another blood collection tube, or a CTC collection tube.
- the method further comprises assaying the biological sample to generate the nucleosome profile.
- assaying the biological sample comprises subjecting said biological sample to conditions that are sufficient to isolate, enrich, or extract the cfDNA molecules.
- the method further comprises fractionating a whole blood sample of the subject to obtain the cfDNA molecules.
- assaying the biological sample further comprises assaying the cfDNA molecules using nucleic acid sequencing to produce nucleic acid sequencing reads.
- the nucleic acid sequencing further comprises DNA sequencing.
- the DNA sequencing comprises one or more of: next-generation sequencing, whole genome sequencing, low-pass sequencing, targeted sequencing, whole exome sequencing, methylation-aware sequencing, or bisulfite sequencing, or a combination thereof.
- the DNA sequencing comprises low-pass whole genome sequencing.
- the DNA sequencing comprises whole exome sequencing.
- the DNA sequencing further comprises nucleic acid amplification.
- the nucleic acid amplification comprises polymerase chain reaction (PCR) or isothermal amplification.
- at least one of the cfDNA molecules is assayed using a polymerase chain reaction (PCR) assay, a microarray, or an isothermal amplification.
- the type of the cancer of the subject comprises one or more of: lung cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
- the type of the cancer of the subject comprises prostate cancer.
- the subtype of the prostate cancer comprises one or more of: hormone sensitive prostate cancer (HSPC), castration-resistant prostate cancer (CRPC), androgen receptor-dependent prostate cancer (ARPC), metastatic prostate cancer, metastatic castration-resistant prostate cancer (mCRPC), neuroendocrine prostate cancer (NEPC), or any combination thereof.
- the type of cancer comprises breast cancer.
- the subtype of the breast cancer comprises estrogen receptor-positive (ER+) breast cancer or estrogen receptor-negative (ER-) breast cancer.
- the subject is asymptomatic for the cancer.
- the selected biomarker comprises a selected binding site. In some embodiments, the selected biomarker comprises a transcription factor binding site. In some embodiments, the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
- the nucleosome profile is generated based at least in part on distance of one or more selected subsets of the cfDNA molecules to the selected biomarker. In some embodiments, the distance comprises the number of base pairs between each of the one or more selected subsets of the cfDNA molecules and the selected biomarker. In some embodiments, the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules associated with the distance of the one or more selected subsets of the cfDNA molecules from the selected biomarker.
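A minimal sketch of how such a distance-resolved profile could be tabulated is given below. It is an illustration only, not the method of the disclosure: the function name `nucleosome_profile`, the use of fragment midpoints as a proxy for coverage, the 1000 bp window, and the 100 bp bin size are all assumptions introduced here.

```python
from collections import Counter


def nucleosome_profile(fragments, sites, window=1000, bin_size=100):
    """Tabulate fragment counts by signed distance (bp) to the nearest site.

    fragments: iterable of (start, end) coordinates on a single chromosome.
    sites: list of binding-site center coordinates on the same chromosome.
    window, bin_size: hypothetical parameters chosen for illustration.
    Returns a dict mapping the lower edge of each distance bin to a count.
    """
    counts = Counter()
    for start, end in fragments:
        midpoint = (start + end) // 2
        # Nearest selected biomarker site to this fragment (assumed rule).
        nearest = min(sites, key=lambda s: abs(midpoint - s))
        distance = midpoint - nearest  # signed distance in base pairs
        if -window <= distance < window:
            # Floor division keeps negative distances in the correct bin.
            counts[(distance // bin_size) * bin_size] += 1
    return dict(counts)


if __name__ == "__main__":
    frags = [(900, 1066), (1000, 1166), (1200, 1366), (5000, 5100)]
    print(nucleosome_profile(frags, sites=[1000]))
```

Here the count per distance bin stands in for the coverage associated with each distance; a real pipeline would typically normalize by sequencing depth and aggregate across many sites.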
- determining the nucleosome profiling abnormality score further comprises comparing the coverage of the one or more selected subsets of the cfDNA molecules to one or more coverage values of the one or more reference nucleosome profiles. In some embodiments, determining the nucleosome profiling abnormality score further comprises determining a Z-score of the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, determining the nucleosome profiling abnormality score further comprises mapping the coverage Z-score of the one or more selected subsets of the cfDNA molecules to a coverage Z-score of each of the one or more reference nucleosome profiles.
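The Z-score comparison described above can be sketched as follows. This is a hedged illustration under stated assumptions: the per-bin Z-score uses the sample standard deviation of the reference profiles, and the aggregation into a single score (mean absolute Z-score) is a hypothetical choice, since the text does not specify how the score is combined across bins.

```python
from statistics import mean, stdev


def coverage_z_scores(sample_profile, reference_profiles):
    """Per-bin Z-score of sample coverage relative to reference profiles.

    sample_profile: dict mapping distance bin -> coverage value.
    reference_profiles: list of dicts with the same structure.
    Bins with fewer than two reference values or zero spread are skipped.
    """
    z_scores = {}
    for bin_key, value in sample_profile.items():
        ref_values = [ref[bin_key] for ref in reference_profiles if bin_key in ref]
        if len(ref_values) < 2:
            continue
        spread = stdev(ref_values)
        if spread == 0:
            continue
        z_scores[bin_key] = (value - mean(ref_values)) / spread
    return z_scores


def abnormality_score(z_scores):
    """Mean absolute Z-score across bins (assumed aggregation rule)."""
    if not z_scores:
        return 0.0
    return sum(abs(z) for z in z_scores.values()) / len(z_scores)


if __name__ == "__main__":
    references = [{0: 10.0, 100: 20.0}, {0: 12.0, 100: 20.0}, {0: 14.0, 100: 20.0}]
    sample = {0: 16.0, 100: 20.0}
    z = coverage_z_scores(sample, references)
    print(z, abnormality_score(z))
```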
- the one or more selected subsets of the cfDNA molecules comprise fragments of the cfDNA molecules.
- generating the disease forecast further comprises mapping the nucleosome profiling abnormality score to the one or more disease forecast characteristics.
- the one or more reference nucleosome profiles are associated with one or more disease forecast characteristics. In some embodiments, the one or more reference nucleosome profiles are associated with a likelihood of occurrence of one or more disease forecast characteristics. In some embodiments, one or more of the reference nucleosome profiles comprise data from a sample not having the disease (normal). In some embodiments, the method further comprises mapping DNA methylation patterns of the biological sample. In some embodiments, the method further comprises associating the mapped DNA methylation patterns with the nucleosome profile data of the biological sample. In some embodiments, the one or more selected subsets of the cfDNA molecules are selected based at least in part on DNA methylation data mapped to the cfDNA molecules.
- the method further comprises generating the disease forecast based at least in part on copy number variation data, or sequencing mutation data, or both, associated with the sample.
- a method for determining a subtype of a cancer of a subject comprising: generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to one or more selected biomarkers; mapping one or more characteristics of the nucleosome profile of the biological sample to one or more corresponding characteristics of one or more reference nucleosome profiles to generate a selected subset of reference nucleosome profiles; obtaining data relating to the cancer subtypes associated with each reference nucleosome profile of the selected subset of reference nucleosome profiles; and determining the subtype of the cancer of the subject based at least in part on the cancer subtypes associated with each of the reference nucleosome profiles.
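One way to picture the subtype determination above is a nearest-reference lookup followed by a vote. The sketch below is an assumption-laden illustration, not the claimed method: representing each profile as a numeric feature vector, selecting the `k` closest references by Euclidean distance, and deciding by majority vote are all choices made here for concreteness.

```python
import math
from collections import Counter


def determine_subtype(sample_features, references, k=3):
    """Pick a subtype from the k reference profiles closest to the sample.

    sample_features: list of floats summarizing the sample's profile.
    references: list of (feature_vector, subtype_label) pairs.
    k: hypothetical size of the selected subset of reference profiles.
    Returns (subtype_label, selected_references).
    """
    ranked = sorted(references, key=lambda ref: math.dist(sample_features, ref[0]))
    selected = ranked[:k]
    # Majority vote over the subtypes of the selected references (assumed rule).
    votes = Counter(label for _, label in selected)
    return votes.most_common(1)[0][0], selected


if __name__ == "__main__":
    refs = [
        ([1.0, 0.1], "ARPC"),
        ([0.9, 0.3], "ARPC"),
        ([0.1, 1.0], "NEPC"),
        ([0.2, 0.9], "NEPC"),
        ([0.8, 0.2], "ARPC"),
    ]
    subtype, chosen = determine_subtype([1.0, 0.2], refs)
    print(subtype)
```

The feature values in the usage example are made up; in practice they might be abnormality scores at different biomarkers, such as ARBS and ASCL-1 binding sites.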
- the method further comprises determining a tumor fraction of the cfDNA molecules.
- the method further comprises associating the tumor fraction with a nucleosome profiling abnormality score.
- the nucleosome profiling abnormality score is associated with the one or more selected biomarkers.
- the one or more selected biomarkers comprise a binding site associated with a type of the cancer.
- the selected biomarker comprises a transcription factor binding site associated with a type of the cancer.
- the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
- At least a portion of the reference nucleosome profiles comprise nondisease profiles (normal). In some embodiments, at least a portion of the reference nucleosome profiles comprise a cancer condition having a cancer type matching the type of the cancer of the subject.
- a method for estimating a response of a subject having cancer to one or more treatments comprising: generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject; determining one or more characteristics of the cancer of the subject based at least in part on the nucleosome profile of the subject; and generating a treatment response determination for the subject based at least in part on the one or more characteristics of the cancer of the subject.
- the treatment response determination comprises one or more of: a treatment plan, a value representing likelihood of one or more treatment responses, a binary treatment response indicator, a probability value for each treatment response, or any combination thereof.
- the one or more characteristics of the cancer of the subject comprises a cancer type, a cancer subtype, an estimate of cancer progression, a prognosis of the subject without treatment intervention, a prognosis of the subject with treatment intervention, or any combination thereof.
- a system for determining a disease forecast of a subject having cancer comprising: a memory; and one or more processors configured to execute machine-readable instructions which, when executed, cause the one or more processors to perform a method comprising: generating a nucleosome profile of a biological sample obtained or derived from the subject, mapping the nucleosome profile to one or more reference nucleosome profiles, determining one or more disease forecast characteristics of the biological sample based at least in part on the mapping, and generating the disease forecast based at least in part on the one or more disease forecast characteristics.
- the one or more disease forecast characteristics comprise one or more of: a type of the cancer, a subtype of the cancer, a prognosis of the subject, an estimated survival time of the subject without treatment intervention, an estimated survival time of the subject with treatment intervention, an estimation of the subject’s response to one or more treatments, or any combination thereof.
- FIG. 1 illustrates an exemplary nucleosome profiling method.
- FIGs. 2A-2D show exemplary data relating to nucleosome profiling comparisons between samples having various disease conditions and normal samples.
- FIGs. 3A-3D illustrate exemplary data relating to androgen receptor binding site (ARBS) nucleosome profiling between samples having various disease conditions and normal samples.
- FIGs. 4A-4D show exemplary data relating to ASCL-1 biomarker nucleosome profiling between samples having various disease conditions and normal samples.
- FIGs. 5A-5C illustrate exemplary data relating to DNA methylation and ARBS nucleosome profiling in samples having various disease conditions and normal samples.
- FIGs. 6A-6E show exemplary data relating to clinical outcomes associated with ARBS nucleosome profiling abnormality score of samples having various disease conditions and various treatment conditions.
- FIGs. 7A-7D illustrate exemplary data relating to ASCL-1 binding sites and ARBS nucleosome profiling in samples having various disease conditions and normal samples, as well as tumor fraction analysis of these data.
- FIGs. 8A-8D show exemplary data relating to ARBS and ASCL-1 nucleosome profiling and scoring for samples having various disease conditions and normal samples.
- FIGs. 9A-9D illustrate exemplary data comparing ARBS and ASCL-1 nucleosome profiling values for samples of different subjects with various disease conditions and at various time points pretreatment and post-treatment.
- FIGs. 10A-10C show exemplary data comparing ARBS, ASCL-1, and estrogen receptor binding sites (ERBS) methylation data.
- FIGs. 11A-11D illustrate exemplary data comparing nucleosome profiling for various samples having an estrogen-receptor positive (ER+) disease condition and various samples having an estrogen-receptor negative (ER-) disease condition.
- FIGs. 12A-12C show exemplary data relating to DNA methylation associated with ARBS nucleosome profiling data in samples having various disease conditions and normal samples.
- FIGs. 13A-13D illustrate exemplary data relating to DNA methylation associated with fragment-level ARBS nucleosome profiling data in fragments of samples having various disease conditions and fragments of normal samples.
- FIG. 14 illustrates a non-limiting example of a computing device; in this case, a device with one or more processors, memory, storage, and a network interface, per one or more embodiments herein.
- the phrases “at least one,” “one or more,” and “and/or” are open-ended expressions that are both conjunctive and disjunctive in operation.
- each of the expressions “at least one of A, B and C,” “at least one of A, B, or C,” “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
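The seven cases covered by these expressions are simply the non-empty subsets of {A, B, C}. As a small illustration (the helper name `at_least_one_of` is introduced here and is not part of the disclosure), they can be enumerated as:

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the given items."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    for combo in at_least_one_of(["A", "B", "C"]):
        print(" and ".join(combo))
```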
- the phrase “at most three” can mean less than one, one, two, or three.
- subject may be used interchangeably and refer to humans, as well as non-human mammals (e.g., non-human primates, canines, equines, felines, porcines, bovines, ungulates, lagomorphs, rodents, and the like).
- the subject can be a human (e.g., adult male, adult female, adolescent male, adolescent female, male child, female child) under the care of a physician or other health worker in a hospital, as an outpatient, or other clinical context.
- treatment refers to an approach for obtaining beneficial or desired results with respect to a disease, disorder, or medical condition including, but not limited to, a therapeutic benefit and/or a prophylactic benefit.
- treatment or treating involves administering a therapeutic to a subject.
- a therapeutic benefit may include the eradication or amelioration of the underlying disorder being treated.
- a therapeutic benefit may be achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder, such as observing an improvement in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder.
- the method can comprise generating a nucleosome profile relating to a biological sample obtained or derived from the subject.
- the nucleosome profile can be generated relative to a selected biomarker.
- the method can further comprise determining a nucleosome profiling abnormality score.
- determining a nucleosome profiling abnormality score can comprise relating the nucleosome profile of the subject sample to one or more reference nucleosome profiles.
- the method can further comprise generating the disease forecast based at least in part on the nucleosome profiling abnormality score.
- the disease forecast can comprise one or more disease forecast characteristics.
- the one or more disease forecast characteristics can comprise a cancer prognosis relating to the subject.
- the prognosis may comprise expected progression-free survival (PFS), overall survival (OS), or other metrics relating to the severity or survivability of a cancer.
- the disease forecast characteristics can comprise one or more of: an estimated survival time of the subject without a treatment intervention, an estimated survival time of the subject with a treatment intervention, determination of a type of the cancer, determination of a subtype of the cancer, determination of one or more clinical outcomes, or predicted treatment response of the subject to one or more treatments, or any combination thereof.
- the biological sample can comprise cell-free deoxyribonucleic acid (cfDNA) molecules.
- the biological sample may comprise nucleic acids.
- the nucleic acid may be a DNA (e.g., double-stranded DNA, single-stranded DNA, single-stranded DNA hairpins, cDNA, circulating tumor DNA (ctDNA), cell-free DNA (cfDNA)), or DNA otherwise derived from RNA.
- the biological sample may contain or be derived from a biological fluid.
- the biological sample can comprise one or more of: a plasma sample, a serum sample, a buffy coat sample, a red blood cell sample, a urine sample, a saliva sample, tissue biopsy, pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any derivative thereof, and any combination thereof.
- the biological sample can comprise the plasma sample.
- the biological sample can comprise the urine sample.
- the cfDNA molecules can be obtained or derived from a single biological sample of the subject. In some embodiments, the cfDNA molecules can be obtained or derived from different biological samples of the subject. In some embodiments, the biological sample can be obtained or derived from the subject using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, a cell-free deoxyribonucleic acid (DNA) collection tube, another blood collection tube, or a CTC collection tube.
- the collection tube may comprise additional reagents for stabilizing the nucleic acid molecules or blood cells.
- the collection tube may allow the nucleic acid or blood cells to remain stable so as to minimize degradation of the biological sample prior to assaying.
- the additional reagents may comprise buffer salts or chelators.
- the biological sample may be obtained or derived from a subject at various times. The biological sample may be obtained or derived from a subject prior to the subject receiving a therapy for cancer. The biological sample may be obtained or derived from a subject while the subject is receiving a therapy for cancer. The biological sample may be obtained or derived from a subject after the subject has received a therapy for cancer. The biological sample may be collected over 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more time points.
- the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more hour period.
- the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more day period.
- the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
- the time points may occur over a 1, 2, 3, 4, 5, 6,
- the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more month period.
- the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more year period.
- the method can further comprise assaying the biological sample to generate the nucleosome profile.
- assaying the biological sample can comprise subjecting the biological sample to conditions sufficient to isolate, enrich, or extract the cfDNA molecules.
- the methods disclosed herein may comprise conducting one or more enrichment reactions on one or more nucleic acid molecules in a sample.
- the enrichment reactions may comprise contacting a sample with one or more beads or bead sets.
- the enrichment reactions may comprise one or more hybridization reactions.
- the enrichment reactions may comprise contacting a sample with one or more capture probes or bait molecules that hybridize to a nucleic acid molecule of the biological sample.
- the enrichment reaction may comprise differential amplification of a set of nucleic acid molecules.
- the enrichment reaction may enrich for a plurality of genetic loci or sequences corresponding to genetic loci.
- the enrichment reactions may comprise the use of primers or probes that may have complementarity to a sequence that is to be enriched (or to sequences upstream or downstream of it).
- a capture probe may comprise sequence complementarity to a set of genomic loci and allow the enrichment of the genomic loci.
- the enrichment reactions may comprise a plurality of probes or primers.
- a plurality of probes may comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, or 180 different probes.
- the methods disclosed herein may comprise conducting one or more isolation or purification reactions on one or more nucleic acid molecules in a sample.
- the isolation or purification reactions may comprise contacting a sample with one or more beads or bead sets.
- the isolation or purification reaction may comprise one or more hybridization reactions, enrichment reactions, amplification reactions, sequencing reactions, or a combination thereof.
- the isolation or purification reaction may comprise the use of one or more separators.
- the one or more separators may comprise a magnetic separator.
- the isolation or purification reaction may comprise separating bead bound nucleic acid molecules from bead free nucleic acid molecules.
- the isolation or purification reaction may comprise separating capture probe hybridized nucleic acid molecules from capture probe free nucleic acid molecules.
- the isolation reactions may comprise removing or separating a group of nucleic acid molecules from another group of nucleic acids.
- the methods disclosed herein may comprise conducting extraction reactions on one or more nucleic acids in a biological sample.
- the extraction reactions may lyse cells or disrupt nucleic acid interactions with the cell such that the nucleic acids may be isolated, purified, enriched or subjected to other reactions.
- the methods disclosed herein may comprise amplification or extension reactions.
- the amplification reactions may comprise polymerase chain reaction.
- the amplification reaction may comprise PCR-based amplifications, non-PCR based amplifications, or a combination thereof.
- the one or more PCR-based amplifications may comprise PCR, qPCR, nested PCR, linear amplification, or a combination thereof.
- the one or more non-PCR based amplifications may comprise multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, circle-to-circle amplification, or a combination thereof.
- the amplification reactions may comprise an isothermal amplification.
- the method disclosed herein may comprise a barcoding reaction.
- a barcoding reaction may comprise the addition of a barcode or tag to the nucleic acid.
- the barcode may be a molecular barcode or a sample barcode.
- a barcode nucleic acid may comprise a barcode sequence, which may be a degenerate n-mer. The sequence may be randomly generated or may be designed so as to yield a specific barcode sequence.
- the barcode nucleic acid may be added to a sample so as to label the nucleic acid molecules in the sample.
- the barcodes may be specific to a sample. For example, a plurality of barcode nucleic acids may be added to a sample in which the barcode sequence is the same.
- nucleic acids originating from the same sample may have the same barcode sequence, which may allow a nucleic acid to be identified as belonging to a particular or given sample.
- a molecular barcode may also be used such that each molecule (or each of a plurality of molecules) in the same volume has a different molecular barcode.
- This barcode may be subjected to amplification such that all amplicons derived from a molecule have the same barcode. In this way, amplicons originating from the same molecule may be identified.
- the sequence reads may be processed based on the barcode sequences. For example, the processing may reduce errors or allow a molecule to be tracked.
- Barcode sequences may be appended or otherwise added or incorporated into a sequence by various reactions, for example an amplification, extension, or ligation reaction, and may be performed enzymatically using a nucleic acid polymerase or ligase.
- the ligation may be an overhang or blunt-end ligation, and the barcodes may comprise complementarity to the nucleic acids to be barcoded. This complementarity may be to a sequence derived from the sample from the subject or to a constant sequence generated via a reaction performed on the nucleic acids in the sample.
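The sample-barcode and molecular-barcode bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the read layout (sample barcode, then molecular barcode, then insert), the barcode lengths, the function name `group_reads_by_barcode`, and the toy reads are all hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, sample_len=4, umi_len=6):
    """Group reads by (sample barcode, molecular barcode).

    Assumed (hypothetical) read layout:
    [sample barcode][molecular barcode][insert sequence].
    """
    groups = defaultdict(list)
    for read in reads:
        sample = read[:sample_len]
        umi = read[sample_len:sample_len + umi_len]
        insert = read[sample_len + umi_len:]
        groups[(sample, umi)].append(insert)
    return dict(groups)


# Toy reads: two amplicons of one molecule, one other molecule from the
# same sample, and one molecule from a second sample.
reads = [
    "ACGT" + "AAAAAA" + "TTGACC",
    "ACGT" + "AAAAAA" + "TTGACC",
    "ACGT" + "CCCCCC" + "GGATCA",
    "TGCA" + "AAAAAA" + "CCGTAA",
]
groups = group_reads_by_barcode(reads)
for (sample, umi), inserts in groups.items():
    print(sample, umi, len(inserts))
```

Reads sharing both barcodes are treated as amplicons of one parent molecule, while the sample barcode alone assigns a read to its sample.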
- the biological sample may comprise multiple components.
- the biological sample may be a whole blood sample.
- the biological sample may be subjected to reactions so as to separate or fractionate the biological sample.
- a whole blood sample may be fractionated and cell-free nucleic acids may be obtained.
- the whole blood sample may be fractionated using centrifugation such that blood cells may be separated from the plasma (which may contain cell free nucleic acid).
- a sample may be subjected to multiple rounds of separation or fractionation.
- the method can further comprise fractionating a whole blood sample of the subject to obtain the cfDNA molecules.
- assaying the biological sample can further comprise assaying the cfDNA molecules using nucleic acid sequencing to produce nucleic acid sequencing reads.
- sequencing reactions that may be used include capillary sequencing, next generation sequencing, Sanger sequencing, sequencing by synthesis, single molecule nanopore sequencing, sequencing by ligation, sequencing by hybridization, sequencing by nanopore current restriction, or a combination thereof.
- Sequencing by synthesis may comprise reversible terminator sequencing, processive single molecule sequencing, sequential nucleotide flow sequencing, or a combination thereof.
- Sequential nucleotide flow sequencing may comprise pyrosequencing, pH-mediated sequencing, semiconductor sequencing or a combination thereof.
- the sequencing reactions may comprise whole genome sequencing, whole exome sequencing, low-pass whole genome sequencing, targeted sequencing, methylation-aware sequencing, enzymatic methylation sequencing, or bisulfite methylation sequencing.
- the sequencing reaction may be a transcriptome sequencing, mRNA-seq, totalRNA-seq, smallRNA-seq, exosome sequencing, or combinations thereof. Combinations of sequencing reactions may be used in the methods described elsewhere herein.
- a sample may be subjected to whole genome sequencing and whole transcriptome sequencing.
- where the samples comprise multiple types of nucleic acids (e.g., RNA and DNA), sequencing reactions specific to DNA or RNA may be used so as to obtain sequence reads relating to each nucleic acid type.
- the sequencing of nucleic acids may generate sequencing read data.
- the sequencing reads may be processed so as to generate data of improved quality.
- the sequencing reads may be generated with a quality score.
- the quality score may indicate an accuracy of a sequence read or a level of signal above a noise threshold for a given base call.
- the quality scores may be used for filtering sequencing reads. For example, sequencing reads may be removed that do not meet a particular quality score threshold.
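As a minimal sketch of quality-score filtering, the example below assumes Phred-scaled scores (where a score Q corresponds to an error probability of 10^(-Q/10)) encoded as ASCII characters with an offset of 33, as in common FASTQ files. The disclosure does not specify the scoring scheme, so the encoding, the mean-quality criterion, the threshold of 20, and the function names are illustrative assumptions.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the implied base-call error probability."""
    return 10 ** (-q / 10)


def mean_quality(quality_string, offset=33):
    """Mean Phred score of a read, assuming ASCII encoding with the given offset."""
    scores = [ord(c) - offset for c in quality_string]
    return sum(scores) / len(scores) if scores else 0.0


def filter_reads(reads, min_mean_q=20):
    """Keep (sequence, quality) pairs whose mean quality meets the threshold."""
    return [(seq, qual) for seq, qual in reads if mean_quality(qual) >= min_mean_q]


# Toy reads: high quality, very low quality, and exactly at the threshold.
reads = [("ACGT", "IIII"), ("ACGT", "!!!!"), ("GGCC", "5555")]
kept = filter_reads(reads)
print(kept)
```

Other criteria, such as per-base masking or a minimum-quality cutoff, could be substituted for the mean; the choice here is only for brevity.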
- the sequencing reads may be processed so as to generate a consensus sequence or consensus base call.
- a given nucleic acid (or nucleic acid fragment) may be sequenced, and errors in the sequence may be generated due to reactions prior to or during sequencing. For example, amplification or PCR may generate errors in amplicons such that the sequences are not identical to a parent sequence.
- error correction may include identifying sequence reads that do not corroborate with other sequences from a same sample or same original parent molecules.
- the use of barcodes may allow the identification of a same parent molecule or a same sample.
- the sequence reads may be processed by performing single-strand consensus calling or double-strand consensus calling, thereby reducing or suppressing error.
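One simple way to picture consensus calling is a per-position majority vote over reads that share a molecular barcode. The sketch below is a single-strand-style illustration only: it assumes equal-length, already aligned reads from one parent molecule, and the agreement threshold `min_fraction` and the function name are hypothetical choices rather than parameters from the disclosure.

```python
from collections import Counter


def consensus_sequence(reads, min_fraction=0.6):
    """Majority-vote consensus over equal-length reads from one parent molecule.

    Positions where the most common base falls below min_fraction are
    reported as 'N' rather than called.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads in a family must have the same length")
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


# Toy read family: the third read carries a likely amplification error.
family = ["ACGTAC", "ACGTAC", "ACCTAC"]
print(consensus_sequence(family))
```

A double-strand variant would additionally require agreement between the consensus sequences of the two complementary strands of the same parent molecule.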
- the nucleic acid sequencing can further comprise DNA sequencing.
- the DNA sequencing can comprise one or more of: next-generation sequencing, whole genome sequencing, low-pass sequencing, targeted sequencing, whole exome sequencing, methylation-aware sequencing, or bisulfite sequencing, or a combination thereof.
- the DNA sequencing can comprise low-pass whole genome sequencing.
- the DNA sequencing can comprise whole exome sequencing.
- the DNA sequencing can further comprise nucleic acid amplification.
- the nucleic acid amplification can comprise polymerase chain reaction (PCR) or isothermal amplification.
- at least one of the cfDNA molecules can be assayed using a polymerase chain reaction (PCR) assay, a microarray, or an isothermal amplification.
- the subject may be suspected of suffering from a cancer.
- the cancer may be specific to, or may originate from, an organ or other area of the subject.
- the type of the cancer of the subject can comprise one or more of: lung cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
- the type of the cancer of the subject can comprise prostate cancer.
- the subtype of the prostate cancer can comprise one or more of: hormone sensitive prostate cancer (HSPC), castration-resistant prostate cancer (CRPC), androgen receptor-dependent prostate cancer (ARPC), metastatic prostate cancer, metastatic castration-resistant prostate cancer (mCRPC), neuroendocrine prostate cancer (NEPC), or any combination thereof.
- the type of cancer can comprise breast cancer.
- the subtype of the breast cancer can comprise estrogen receptor-positive (ER+) breast cancer or estrogen receptor-negative (ER-) breast cancer.
- the subject can be asymptomatic for the cancer.
- the cancer may comprise one or more biomarkers that are specific to a particular cancer type or subtype, or both.
- the specific biomarkers may indicate a presence of a particular cancer type or subtype.
- a biomarker may indicate that a castrate-resistant prostate cancer is present.
- the identification of the presence of a type or subtype of cancer may allow the determination of a treatment option or recommendation.
- the subject may be asymptomatic for cancer.
- the cancer may not exhibit any symptoms and the subject may be unaware of the presence of cancer.
- the subject may be suspected of having cancer.
- the cancer type or subtype may be unknown.
- the methods and systems described herein may allow a cancer to be identified at an earlier stage than otherwise.
- the methods and systems described herein may allow a type or subtype of cancer to be identified at an earlier stage than otherwise.
- the identification of the presence of a type or subtype of the cancer at an earlier stage may allow a treatment option or recommendation to be determined at an earlier stage, and may allow the treatment option or recommendation to be more targeted to the cancer type or subtype, and may allow the subject to have an improved prognosis.
- the selected biomarker can comprise a selected binding site.
- the selected biomarker can comprise a transcription factor binding site.
- the selected biomarker can comprise one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor (ASCL- ) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
- the nucleosome profile can be generated based at least in part on distance of one or more selected subsets of the cfDNA molecules to the selected biomarker.
- the distance can comprise the number of base pairs between each of the one or more selected subsets of the cfDNA molecules and the selected biomarker.
- the nucleosome profile can be further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules.
- the nucleosome profile can be further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules associated with the distance of the one or more selected subsets of the cfDNA molecules from the selected biomarker. In some embodiments, determining the nucleosome profiling abnormality score can further comprise comparing the coverage of the one or more selected subsets of the cfDNA molecules to one or more coverage values of the one or more reference nucleosome profiles. In some embodiments, determining the nucleosome profiling abnormality score can further comprise determining a Z-score of the coverage of the one or more selected subsets of the cfDNA molecules.
- determining the nucleosome profiling abnormality score can further comprise mapping the coverage Z-score of the one or more selected subsets of the cfDNA molecules to a coverage Z-score of each of the one or more reference nucleosome profiles.
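The relationship between fragment distance, coverage, and a coverage Z-score can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: fragments are half-open `(start, end)` intervals on a single chromosome, binding sites are given as center positions, the "central coverage" feature is a hypothetical summary statistic, and the reference values are made up. It is not the scoring procedure of the disclosure.

```python
from statistics import mean, stdev


def coverage_profile(fragments, sites, window=5):
    """Fragment coverage at each offset from -window to +window around the sites."""
    profile = [0] * (2 * window + 1)
    for site in sites:
        for start, end in fragments:
            lo = max(start, site - window)
            hi = min(end, site + window + 1)
            for pos in range(lo, hi):
                profile[pos - (site - window)] += 1
    return profile


def central_coverage(profile, flank=1):
    """Mean coverage near the site center, normalized by the mean over the window."""
    center = len(profile) // 2
    overall = mean(profile)
    if overall == 0:
        return 0.0
    return mean(profile[center - flank:center + flank + 1]) / overall


def z_score(value, reference_values):
    """Z-score of a sample value against two or more reference values."""
    sigma = stdev(reference_values)
    if sigma == 0:
        raise ValueError("reference values have zero variance")
    return (value - mean(reference_values)) / sigma


# Toy data: coverage is depleted at the site center (position 5).
fragments = [(0, 4), (7, 11), (8, 12)]
profile = coverage_profile(fragments, sites=[5], window=5)
feature = central_coverage(profile)
z = z_score(feature, reference_values=[0.9, 1.0, 1.1])
print(profile, feature, round(z, 3))
```

In this toy case, a strongly negative Z-score reflects lower central coverage in the sample than in the hypothetical reference profiles.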
- the one or more selected subsets of the cfDNA molecules can comprise fragments of the cfDNA molecules.
- generating the disease forecast can further comprise mapping the nucleosome profiling abnormality score to the one or more disease forecast characteristics.
- the one or more reference nucleosome profiles can be associated with one or more disease forecast characteristics. In some embodiments, the one or more reference nucleosome profiles can be associated with a likelihood of occurrence of one or more disease forecast characteristics. In some embodiments, one or more of the reference nucleosome profiles can comprise data from a sample not having the disease (normal). In some embodiments, the method can further comprise mapping DNA methylation patterns of the biological sample.
- the method can further comprise associating the mapped DNA methylation patterns with the nucleosome profile data of the biological sample.
- the one or more selected subsets of the cfDNA molecules can be selected based at least in part on DNA methylation data mapped to the cfDNA molecules.
- the method can further comprise generating the disease forecast based at least in part on copy number variation data, or sequencing mutation data, or both, associated with the sample.
- a clinical intervention or a therapy may be identified based at least in part on the identification of the type or subtype of the cancer, or on one or more disease forecast characteristics of the subject.
- the clinical intervention may be a plurality of clinical interventions.
- the therapy may be a plurality of therapies.
- the therapy may be selected from a plurality of clinical interventions.
- the clinical intervention or therapy may be a surgical resection, chemotherapy, radiotherapy, immunotherapy, adjuvant therapy, neoadjuvant therapy, androgen deprivation therapy, or any combination thereof.
- the clinical intervention or therapy may be administered to the subject.
- the method can comprise generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject.
- the nucleosome profile can be generated relative to one or more selected biomarkers.
- the method can further comprise mapping one or more characteristics of the nucleosome profile of the biological sample to one or more corresponding characteristics of one or more reference nucleosome profiles to generate a selected subset of reference nucleosome profiles.
- the method can further comprise obtaining data relating to the cancer subtypes associated with each reference nucleosome profile of the selected subset of reference nucleosome profiles. In some embodiments, the method can further comprise determining the subtype of the cancer of the subject based at least in part on the cancer subtypes associated with each of the reference nucleosome profiles.
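The selection of a subset of reference profiles followed by a subtype determination can be pictured as a nearest-neighbor lookup with a majority vote. This is a sketch under assumptions, not the disclosed procedure: Euclidean distance, the value of `k`, the dictionary layout, and the toy profile values and labels are all hypothetical.

```python
from collections import Counter
from math import dist


def nearest_references(sample_profile, references, k=3):
    """Return the k reference profiles closest to the sample (Euclidean distance)."""
    ranked = sorted(references, key=lambda ref: dist(sample_profile, ref["profile"]))
    return ranked[:k]


def vote_subtype(selected):
    """Most common subtype label among the selected reference profiles."""
    counts = Counter(ref["subtype"] for ref in selected)
    return counts.most_common(1)[0][0]


# Made-up reference profiles (three coverage bins each) with subtype labels.
references = [
    {"profile": [1.0, 0.2, 1.0], "subtype": "ARPC"},
    {"profile": [0.9, 0.3, 1.1], "subtype": "ARPC"},
    {"profile": [1.0, 1.0, 1.0], "subtype": "NEPC"},
    {"profile": [1.1, 0.9, 1.0], "subtype": "NEPC"},
    {"profile": [1.0, 1.0, 0.9], "subtype": "normal"},
]
sample = [1.0, 0.25, 1.05]
selected = nearest_references(sample, references, k=3)
print(vote_subtype(selected))
```

A trained model could replace the vote; the point of the sketch is only the two-step structure of selecting references and then reading off their associated subtypes.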
- the method can further comprise determining a tumor fraction of the cfDNA molecules.
- the method can further comprise associating the tumor fraction with a nucleosome profiling abnormality score.
- the nucleosome profiling abnormality score can be associated with the one or more selected biomarkers.
- the one or more selected biomarkers can comprise a binding site associated with a type of the cancer.
- the selected biomarker can comprise a transcription factor binding site associated with a type of the cancer.
- the selected biomarker can comprise one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor (ASCL- ) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
- At least a portion of the reference nucleosome profiles can comprise non-disease profiles (normal). In some embodiments, at least a portion of the reference nucleosome profiles can comprise a cancer condition having a cancer type matching the type of the cancer of the subject.
- the method can comprise generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject. In some embodiments, the method can further comprise determining one or more characteristics of the cancer of the subject based at least in part on the nucleosome profile of the subject. In some embodiments, the method can further comprise generating a treatment response determination for the subject based at least in part on the one or more characteristics of the cancer of the subject.
- the treatment response determination can comprise one or more of: a treatment plan, a value representing likelihood of one or more treatment responses, a binary treatment response indicator, a probability value for each treatment response, or any combination thereof.
- the one or more characteristics of the cancer of the subject can comprise a cancer type, a cancer subtype, an estimate of cancer progression, a prognosis of the subject without treatment intervention, a prognosis of the subject with treatment intervention, or any combination thereof.
- the system can comprise a memory and one or more processors.
- the one or more processors can be configured to execute machine- readable instructions which, when executed, cause the one or more processors to perform a method.
- the method performed by the one or more processors can comprise generating a nucleosome profile of a biological sample obtained or derived from the subject.
- the method performed by the one or more processors can further comprise mapping the nucleosome profile to one or more reference nucleosome profiles.
- the method performed by the one or more processors can further comprise determining one or more disease forecast characteristics of the biological sample. In some cases, the one or more disease forecast characteristics can be determined based at least in part on the mapping. In some embodiments, the method performed by the one or more processors can further comprise generating the disease forecast based at least in part on the one or more disease forecast characteristics.
- the one or more disease forecast characteristics can comprise one or more of: a type of the cancer, a subtype of the cancer, a prognosis of the subject, an estimated survival time of the subject without treatment intervention, an estimated survival time of the subject with treatment intervention, an estimation of the subject’s response to one or more treatments, or any combination thereof.
- the nucleosome profile of the subject, or one or more reference nucleosome profiles, or both are processed using one or more algorithms.
- the one or more processors can further comprise one or more software modules or models configured to operate utilizing the one or more algorithms.
- the one or more algorithms can comprise machine learning (ML) or artificial intelligence (AI) algorithms.
- the AI/ML algorithms may be trained algorithms. The trained algorithms may utilize the selected biomarkers, or the nucleosome profile of the sample, or both, as an input.
- the AI/ML algorithms can generate an output relating to the one or more disease forecast characteristics of a cancer.
- the output may be specific to a type of cancer or subtype of cancer.
- the output can comprise determining the type of cancer or the subtype of cancer of the sample. For example, the output may indicate the presence of a castrate-resistant prostate cancer. For example, the output may indicate the presence of ER- or ER+ breast cancer.
- the trained algorithm may be trained on multiple samples.
- the trained algorithm may be trained using at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more independent training samples.
- the trained algorithm may be trained using no more than 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or fewer independent training samples.
- the training samples may be associated with a presence or an absence of a type or subtype of the cancer.
- the training samples may be associated with a prognosis of the cancer.
- the training samples may be associated with cancer that is resistant to a particular drug or treatment.
- the training samples may be associated with cancer that is responsive to a particular drug or treatment.
- An individual training sample may be positive for a particular type or subtype of cancer.
- An individual training sample may be negative for a particular type or subtype of cancer.
- the trained algorithm may be able to determine a type or subtype of cancer, determine a probability of recurrence or relapse of a cancer, or determine whether a cancer comprising a set of biomarkers may be resistant or responsive to a treatment.
- the training sample may be associated with additional clinical health data of a subject.
- additional clinical health data may comprise the gender, weight, height, or levels of metabolites or antibodies of the subject.
- Additional clinical health data may comprise an indication of other diseases, disorders, or disease conditions.
- the trained algorithms may be trained using multiple sets of training samples.
- the sets may comprise training samples as described elsewhere herein.
- the training may be performed using a first set of independent training samples associated with a presence of the type or subtype of cancer and a second set of independent training samples associated with an absence of the type or subtype of cancer.
- a first set may be associated with a nucleosome profile and a second set may be associated with a different nucleosome profile.
- a first set may be associated with a prognosis and a second set may be associated with a different prognosis.
- a first set may be associated with one or more disease forecast characteristics, and a second set may be associated with different one or more disease forecast characteristics.
- a first set may be associated with a resistance to a treatment, and a second set may be associated with a responsiveness to the same treatment.
- the trained algorithm may also process additional clinical health data of the subject.
- additional clinical health data may comprise the gender, weight, height, or levels of metabolites or antibodies of the subject.
- Additional clinical health data may comprise an indication of other diseases, disorders, or disease conditions that the subject may suffer from.
- the trained algorithm may output one or more disease forecast estimations or one or more cancer type or subtype determinations that may be different from the output of an algorithm that does not process additional clinical health data.
- the trained algorithm may be an unsupervised machine learning algorithm.
- the unsupervised machine learning algorithm may utilize cluster analysis to identify attributes of interest.
- the trained algorithm may be a supervised machine learning algorithm.
- training data may be input to the algorithm so as to generate an expected or desired output.
- the supervised learning algorithm may comprise a deep learning algorithm, a support vector machine (SVM), a neural network, or Random Forest algorithms.
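To make the supervised setting concrete, the sketch below trains on a first set of samples labeled as having a cancer subtype and a second set labeled as lacking it, then classifies a new profile. A deliberately simple nearest-centroid classifier stands in for the SVM, neural network, or Random Forest models named above; it is a swapped-in technique chosen so the example has no dependencies. The feature vectors, labels, and function names are hypothetical.

```python
from math import dist


def fit_centroids(features, labels):
    """Compute the mean feature vector (centroid) for each class label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Made-up two-feature training samples for each class.
positive = [[0.2, 1.0], [0.3, 1.1]]   # subtype present
negative = [[1.0, 1.0], [0.9, 0.9]]   # subtype absent
features = positive + negative
labels = ["present"] * len(positive) + ["absent"] * len(negative)

centroids = fit_centroids(features, labels)
print(predict(centroids, [0.25, 1.0]))
```

In practice the same fit-then-predict structure would apply to a more expressive model, with held-out samples used to estimate performance.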
- the trained algorithm may be able to identify relationships of nucleosome profiles to particular cancer prognoses or types or subtypes. Without the trained algorithm, it may otherwise be difficult to identify relationships of the nucleosome profiles to cancer types or subtypes. Without the trained algorithm, it may otherwise be difficult to identify relationships of the nucleosome profiles to prognosis or disease forecast characteristics.
- the systems and methods may achieve an accuracy, sensitivity, or specificity of determination of type or subtype of cancer or prognosis or disease forecast.
- the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at an accuracy of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
- the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a sensitivity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
- the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a specificity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
- the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a positive predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
- the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a negative predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
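For reference, the five performance measures listed above follow from the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) under their standard definitions. The sketch below computes them; the function name and the example counts are hypothetical and do not represent results from the disclosure.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the positive and negative predictive values depend on the prevalence of the condition in the evaluated cohort.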
- performing nucleosome profiling at methylation-loss regions can increase the sensitivity and specificity of the assays.
- performing nucleosome profiling at methylation-loss regions can increase the sensitivity and specificity of the assays by isolating and eliminating fragments from analysis that do not show methylation changes between cancer samples and normal samples.
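Restricting the analysis to fragments at methylation-loss regions can be sketched as an interval-overlap filter. The example assumes that the methylation-loss regions have already been identified by some upstream comparison of cancer and normal samples, and that fragments and regions are half-open `(start, end)` intervals on the same chromosome. The coordinates and function names are hypothetical.

```python
def overlaps(fragment, region):
    """True if two half-open (start, end) intervals share at least one base."""
    return fragment[0] < region[1] and region[0] < fragment[1]


def select_fragments(fragments, methylation_loss_regions):
    """Keep only fragments overlapping at least one methylation-loss region."""
    return [
        fragment
        for fragment in fragments
        if any(overlaps(fragment, region) for region in methylation_loss_regions)
    ]


# Made-up coordinates: the middle fragment lies outside every region.
fragments = [(100, 267), (400, 560), (900, 1066)]
regions = [(250, 320), (1000, 1200)]
selected = select_fragments(fragments, regions)
print(selected)
```

For genome-scale data, a sorted sweep or an interval tree would replace the nested scan, which is quadratic in the worst case.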
- FIG. 14 shows a computer system 1401 that is programmed or otherwise configured to perform analysis or steps of the methods, for example determine a likelihood of the presence of a cancer based on a set of biomarkers of an individual or run an algorithm.
- the computer system 1401 can regulate various aspects of methods and systems of the present disclosure, such as, for example, performing an algorithm, inputting training data, analyzing sets of biomarkers, or outputting a result for the user as to the presence or absence of cancer.
- the computer system 1401 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device.
- the electronic device can be a mobile electronic device.
- the computer system 1401 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1405, which can be a single core or multi core processor, or a plurality of processors for parallel processing.
- the computer system 1401 also includes memory or memory location 1410 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1415 (e.g., hard disk), communication interface 1420 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1425, such as cache, other memory, data storage and/or electronic display adapters.
- the memory 1410, storage unit 1415, interface 1420 and peripheral devices 1425 are in communication with the CPU 1405 through a communication bus (solid lines), such as a motherboard.
- the storage unit 1415 can be a data storage unit (or data repository) for storing data.
- the computer system 1401 can be operatively coupled to a computer network (“network”) 1430 with the aid of the communication interface 1420.
- the network 1430 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
- the network 1430 in some cases is a telecommunication and/or data network.
- the network 1430 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
- the network 1430 in some cases with the aid of the computer system 1401, can implement a peer-to-peer network, which may enable devices coupled to the computer system 1401 to behave as a client or a server.
- the CPU 1405 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
- the instructions may be stored in a memory location, such as the memory 1410.
- the instructions can be directed to the CPU 1405, which can subsequently program or otherwise configure the CPU 1405 to implement methods of the present disclosure. Examples of operations performed by the CPU 1405 can include fetch, decode, execute, and writeback.
- the CPU 1405 can be part of a circuit, such as an integrated circuit.
- One or more other components of the system 1401 can be included in the circuit.
- the circuit is an application specific integrated circuit (ASIC).
- the storage unit 1415 can store files, such as drivers, libraries and saved programs.
- the storage unit 1415 can store user data, e.g., user preferences and user programs.
- the computer system 1401 in some cases can include one or more additional data storage units that are external to the computer system 1401, such as located on a remote server that is in communication with the computer system 1401 through an intranet or the Internet.
- the computer system 1401 can communicate with one or more remote computer systems through the network 1430.
- the computer system 1401 can communicate with a remote computer system of a user (e.g., a medical professional or patient).
- remote computer systems include personal computers (e.g., portable PC), slate or tablet PC’s (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, Smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants.
- the user can access the computer system 1401 via the network 1430.
- Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1401, such as, for example, on the memory 1410 or electronic storage unit 1415.
- the machine executable or machine readable code can be provided in the form of software.
- the code can be executed by the processor 1405.
- the code can be retrieved from the storage unit 1415 and stored on the memory 1410 for ready access by the processor 1405.
- the electronic storage unit 1415 can be precluded, and machine-executable instructions are stored on memory 1410.
- the code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime.
- the code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
- aspects of the systems and methods provided herein can be embodied in programming.
- Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
- Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
- “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
- another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
- Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings.
- Volatile storage media include dynamic memory, such as main memory of such a computer platform.
- Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a bus within a computer system.
- Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications.
- Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data.
- Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
- the computer system 1401 can include or be in communication with an electronic display 1435 that comprises a user interface (UI) 1440 for providing, for example, an input of biomarkers or sequencing data, or a visual output relating to a detection, diagnosis, or prognosis.
- Examples of UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
- Methods and systems of the present disclosure can be implemented by way of one or more algorithms.
- An algorithm can be implemented by way of software upon execution by the central processing unit 1405.
- the algorithm can, for example, determine a presence or absence of a cancer or cancer parameter based on a set of input sequencing data from a sample derived from a subject.
- ARBS Sites
- nucleosome profiling was utilized for disease forecasting by assaying samples as illustrated in the general exemplary method shown in FIG. 1.
- a sample such as blood-derived plasma was assayed using plasma isolation and DNA extraction methods.
- Library preparation was then performed.
- Sequencing was performed on the extracted DNA of the sample.
- the sample was further analyzed using a Low-Pass Whole Genome Sequencing (LP-WGS) pipeline to generate genome-wide copy number burden classification and nucleosome profiling.
- nucleosome profiling at androgen receptor (AR) binding sites was compared for samples of patients, with some patients having metastatic castration-resistant prostate cancer (mCRPC), and nucleosome profiles at the AR binding sites of normal patients not having cancer. Distance from AR binding sites, measured as number of base pairs, was correlated with coverage of the cfDNA fragments of the sample.
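The relationship between distance from a binding site and cfDNA fragment coverage can be illustrated with a minimal sketch. The function name `coverage_profile`, the window and bin sizes, and the use of fragment midpoints on a single chromosome are all assumptions made for illustration; the source does not specify how coverage was tabulated.

```python
from collections import Counter

def coverage_profile(fragment_midpoints, site_centers, half_window=500, bin_size=50):
    """Count fragment midpoints by signed distance (bp) to each site center.

    Hypothetical helper: every (fragment, site) pair whose offset falls inside
    the window contributes one count. Counts are divided by the mean bin count
    so that a flat profile has a value of 1.0 in every bin.
    """
    counts = Counter()
    for mid in fragment_midpoints:
        for center in site_centers:
            offset = mid - center
            if -half_window <= offset < half_window:
                counts[(offset + half_window) // bin_size] += 1
    n_bins = (2 * half_window) // bin_size
    raw = [counts.get(i, 0) for i in range(n_bins)]
    mean_count = sum(raw) / n_bins
    return [c / mean_count if mean_count else 0.0 for c in raw]
```

A dip in the bins nearest distance zero would correspond to reduced coverage at the site center, which is the feature compared between cancer and normal samples in the text.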
- nucleosome profiling at estrogen receptor (ER) binding sites was compared for samples of the same patients having metastatic castration-resistant prostate cancer (mCRPC), and nucleosome profiles at the ER binding sites of normal patients not having cancer.
- Nucleosome profiling was also performed at ER binding sites, as shown in FIG. 2C, comparing patients with estrogen receptor positive (ER+) breast cancer to nucleosome profiles at the ER binding sites of normal patients not having cancer.
- the normal patients were a normal plasma background control.
- Nucleosome profiling was performed for larger groups of subjects as well. Nucleosome profiling was performed on 103 subjects having mCRPC and 69 normal subjects. For example, FIG. 3A illustrates comparisons of overall ARBS nucleosome profiling between 103 samples from subjects having mCRPC and 69 samples from normal subjects not having cancer.
- Nucleosome profiling abnormality scores were generated for normal plasma background samples and mCRPC samples. The nucleosome profiling abnormality scores were plotted as shown in FIG. 3B. The nucleosome profiling abnormality scores were obtained by quantifying sample-level ARBS scores. This quantification was performed by comparing each sample's ARBS centric fragment coverage (±60 bp) with the normal plasma background centric fragment coverage (±60 bp). Z-scores were calculated for the centric fragment coverage of each sample at the ARBS and of each normal plasma background sample. The sample ARBS Z-scores were compared with the Z-scores of the normal samples to generate the nucleosome profiling abnormality scores.
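One way to picture the Z-score step is the sketch below. It is a simplification under stated assumptions, not the applicant's implementation: `central_coverage` and `abnormality_score` are hypothetical names, a 60 bp half-width around the site center is assumed, and the score is taken directly as the Z-score of a sample's central coverage against the normal plasma background distribution.

```python
from statistics import mean, stdev

def central_coverage(profile, positions, half_width=60):
    """Mean coverage over positions within half_width bp of the site center.

    `positions` holds the signed distance (bp) of each profile value from the
    binding site center. The 60 bp half-width is an assumed default.
    """
    values = [c for p, c in zip(positions, profile) if abs(p) <= half_width]
    return mean(values)

def abnormality_score(sample_central, normal_centrals):
    """Z-score of a sample's central coverage against normal background samples.

    Assumption: a negative value means lower central coverage than normal.
    The sign convention used in the source is not stated.
    """
    mu = mean(normal_centrals)
    sigma = stdev(normal_centrals)
    return (sample_central - mu) / sigma
```

For example, a sample whose central coverage is 0.7 against a background of 1.0, 1.1, and 0.9 sits about three standard deviations below the normal mean.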
- nucleosome profiling abnormality scores were then correlated with LP-WGS copy number burden scores (CNB) as illustrated in FIG. 3C to generate an estimation of tumor fraction.
- a value indicating a threshold for cancer detection can be generated from the nucleosome profiling abnormality scores correlated with LP-WGS CNB scores.
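The correlation between abnormality scores and an LP-WGS-derived tumor fraction could be used as sketched below. This is an illustrative assumption only: the source does not state the model, so an ordinary least-squares line is swapped in here, and `fit_line` and `estimate_tumor_fraction` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_tumor_fraction(score, slope, intercept):
    """Map an abnormality score to a tumor fraction, clipped to [0, 1]."""
    return min(1.0, max(0.0, slope * score + intercept))
```

In use, `fit_line` would be trained on paired abnormality scores and tumor fractions from samples with both measurements, and a detection threshold could then be chosen as the score at which the fitted tumor fraction crosses a chosen limit.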
- FIG. 4A shows estimated tumor fraction generated from an LP-WGS CNV profile from a first subject having mCRPC.
- FIG. 4B shows estimated tumor fraction generated from an LP-WGS CNV profile from a second subject having mCRPC.
- nucleosome profiling at ASCL-1 binding sites was mapped for the samples as distance from ASCL-1 binding sites (in base pairs) against coverage of cfDNA fragments. As illustrated in FIG. 4C, nucleosome profiling was compared between a first sample and a normal plasma background at ASCL-1 binding sites. As illustrated in FIG. 4D, nucleosome profiling was compared between a second sample and a normal plasma background at ASCL-1 binding sites.
- DNA methylation was also mapped to the samples. As shown in FIG. 5A, AR binding sites were mapped to the DNA methylation beta value distribution in normal plasma samples. As shown in FIG. 5B, AR binding sites were mapped to the DNA methylation beta value distribution in samples having mCRPC, in this case 16 mCRPC prostate cancer samples.
- ARBS nucleosome profiling abnormality scores were taken and mapped to clinical outcomes for mCRPC prostate cancer samples.
- FIG. 6A illustrates plotted ARBS nucleosome profiling abnormality scores grouped by subjects with response or no response to a treatment or selected treatments.
- FIG. 6B shows mCRPC grouped data of ARBS nucleosome profiling abnormality scores mapped to overall survival time.
- Example 2 Nucleosome Profiling for Disease Forecasting Using cfDNA Fragmentomics in Neuroendocrine Prostate Cancer (NEPC)
- nucleosome profiling was performed at various selected biomarkers comprising transcription factor binding sites, such as AR binding sites and ASCL-1 binding sites in NEPC samples.
- the Griffin framework was applied to classify tumor subtypes using nucleosome profiling of cancer-specific transcription factor binding sites (TFBS) and tumor subtype-specific chromatin accessibility regions from low-pass whole genome sequencing data of cfDNA.
- AR and ASCL-1 binding sites were utilized for cancer subtyping to distinguish between androgen receptor-dependent prostate cancer (ARPC) and NEPC.
- ASCL1 binding sites together with AR binding sites were used for prostate cancer subtyping to distinguish between androgen receptor dependent prostate cancer (ARPC) and neuroendocrine prostate cancer (NEPC).
- ER and ERBB2 were used for breast cancer subtyping to distinguish between ER positive and ER negative tumor subtypes.
- nucleosome profiling at AR binding sites was compared between NEPC subjects and normal plasma samples.
- nucleosome profiling at ASCL-1 binding sites was compared between NEPC subjects and normal plasma background samples.
- Sample-level ARBS and ASCL-1 binding site abnormality scores were quantified by comparing centric fragment coverage for each NEPC sample (±60 bp) to the centric fragment coverage for each normal sample (±60 bp).
- ARBS binding site abnormality scores were simulated in vitro as associated with different tumor fractions by blending NEPC tumor and normal samples having various different titrations.
- ASCL-1 binding site abnormality scores were simulated in vitro as associated with different tumor fractions by blending NEPC tumor and normal samples having various different titrations.
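A computational analogue of the titration described above can be sketched as a weighted mix of a tumor coverage profile and a normal coverage profile. This is an assumption for illustration: the source describes blending samples, not profiles, and `blend_profiles` is a hypothetical name.

```python
def blend_profiles(tumor_profile, normal_profile, tumor_fraction):
    """Linear mix of two coverage profiles at a given tumor fraction.

    Assumes both profiles use the same bins and the same normalization.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    if len(tumor_profile) != len(normal_profile):
        raise ValueError("profiles must have the same number of bins")
    return [
        tumor_fraction * t + (1.0 - tumor_fraction) * n
        for t, n in zip(tumor_profile, normal_profile)
    ]
```

Scoring the blended profile at each titration level would give a curve of abnormality score against tumor fraction, which is the relationship the titration is meant to expose.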
- Nucleosome profiling was performed on over 1000 mCRPC samples and 42 normal plasma background samples as illustrated in the overall ARBS nucleosome profiling data shown in FIG. 8A. As illustrated in FIG. 8B, overall ASCL-1 nucleosome profiling was also compared for the over 1000 mCRPC samples and 42 normal plasma background samples.
- ARBS and ASCL-1 binding site abnormality scores were mapped for mCRPC samples.
- ARBS binding site abnormality scores were mapped with LP-WGS inferred tumor fraction values.
- nucleosome profiling was performed at ASCL-1 binding sites for 19 samples having mCRPC and compared with normal plasma background samples.
- tumor fraction score was mapped with ASCL-1 binding site abnormality scores for the 19 samples having mCRPC and normal plasma background samples.
- ARBS binding site abnormality scores were mapped with ASCL-1 binding site abnormality scores for the 19 samples having mCRPC.
- nucleosome profiling was performed at ASCL-1 binding sites for the same patient before treatment with chemotherapy and after treatment with chemotherapy, and these conditions were mapped.
- DNA methylation was mapped to AR binding sites, and DNA methylation beta value distribution of normal plasma samples was plotted.
- DNA methylation was mapped to ASCL-1 binding sites, and DNA methylation beta value distribution of normal plasma samples was plotted.
- estrogen receptor (ER) binding site DNA methylation was mapped and the DNA methylation beta value distribution plotted for normal plasma samples.
- estrogen receptor positive (ER+) ATAC nucleosome profiling was performed with ER+ breast cancer samples and 42 normal plasma background samples.
- estrogen receptor negative (ER-) ATAC nucleosome profiling was performed with ER- breast cancer samples and 42 normal plasma background samples.
- tumor fraction was plotted and compared for ER+ breast cancer and ER- breast cancer groups.
- copy number burden score was plotted and compared for ER+ breast cancer and ER- breast cancer groups.
- AR binding site nucleosome profiling abnormality score was plotted with inferred ARBS scores derived from DNA methylation profiles analyzed using LP-WGS analysis.
- AR binding sites were partitioned into two groups, where the first group were hypo-methylated in prostate cancer samples, and the second group did not have significant methylation changes compared with normal plasma background.
- ARBS nucleosome profiling signals of the two groups were compared for the same LP-WGS samples.
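The partition of binding sites into a hypo-methylated group and an unchanged group can be sketched as follows. The function name `partition_sites`, the dictionary inputs, and the 0.2 beta-value drop used as the cutoff are assumptions; the source does not state how significance of a methylation change was determined.

```python
def partition_sites(cancer_beta, normal_beta, min_drop=0.2):
    """Split sites by the drop in methylation beta value from normal to cancer.

    cancer_beta and normal_beta map a site identifier to a mean beta value
    in [0, 1]. Sites whose beta value falls by at least min_drop are treated
    as hypo-methylated; all others are treated as unchanged.
    """
    hypo_methylated, unchanged = [], []
    for site, normal_value in normal_beta.items():
        delta = cancer_beta[site] - normal_value
        if delta <= -min_drop:
            hypo_methylated.append(site)
        else:
            unchanged.append(site)
    return hypo_methylated, unchanged
```

Nucleosome profiles can then be computed separately over each returned group for the same sample, mirroring the two-group comparison described above.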
- the nucleosome profiling of the first sample is illustrated in FIG. 13A.
- the nucleosome profiling of the second sample is illustrated in FIG. 13B.
- the nucleosome profiling of the third sample is illustrated in FIG. 13C.
- the nucleosome profiling of the fourth sample is illustrated in FIG. 13D.
- nucleosome profiling can be applied on genome-wide DNA methylation profiles. This could provide horizontal beta information at a fragment level.
- hypo-methylated fragments can be selected first and isolated to use for nucleosome profiling analysis, and then optionally normalized.
Abstract
Disclosed herein are systems and methods directed to determining a disease forecast of a subject having cancer. In some cases, the methods can include generating a nucleosome profile relating to a biological sample obtained or derived from the subject relative to a selected biomarker. In some cases, the methods can include determining a nucleosome profiling abnormality score based on the nucleosome profile of the subject sample and one or more reference nucleosome profiles. In some cases, the method can include generating the disease forecast based at least in part on the nucleosome profiling abnormality score. In some cases, the systems disclosed herein can include a memory and one or more processors configured to perform a method including generating the nucleosome profile.
Description
SYSTEMS AND METHODS OF FRAGMENTOMICS ANALYSIS IN CANCER
CROSS-REFERENCE
[0001] This application claims the benefit of United States Provisional Patent Application No. 63/624,537, filed January 24, 2024, and United States Provisional Patent Application No. 63/631,139, filed April 8, 2024, which are incorporated herein by reference in their entirety.
BACKGROUND
[0002] Cancer is a leading cause of death worldwide. Detection of cancer, determination of cancer stage, determination of cancer type or subtype, or both, determination of prognosis of cancer, and prediction of treatment response, for example, may be critical for providing treatment and improving patient outcomes. Cancer may have genetic aberrations, in some cases genetic aberrations that may differ depending on a variety of factors or characteristics of the cancer. Detection and identification of these genetic aberrations may be important for detection of cancer, determination of cancer stage, determination of cancer type or subtype, or both, determination of prognosis of cancer, and prediction of treatment response, for example.
SUMMARY
[0003] Nucleosome profiling can be used to detect and identify genetic aberrations in samples of subjects having a disease, such as cancer. Current approaches utilizing nucleosome profiling may be neither accurate nor specific enough to identify a subtype of a disease such as cancer. Current approaches may also be unable to use nucleosome profiling to predict prognosis or forecast progression or treatment response of a subject reliably and accurately. Applicant has recognized that nucleosome profiling can be analyzed and processed using methods and systems disclosed herein to achieve disease forecasting and disease subtyping with reliability and accuracy.
[0004] Disclosed herein are methods and systems for determining disease forecasts and disease identification and grouping based at least in part on nucleosome profiling analysis.
[0005] In an aspect, disclosed herein is a method for determining a disease forecast of a subject having cancer, comprising: generating a nucleosome profile relating to a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to a selected biomarker; determining a nucleosome profiling abnormality score, comprising relating the nucleosome profile of the subject sample to one or more reference nucleosome profiles; generating the disease forecast based at least in part on the nucleosome profiling abnormality score, wherein the disease forecast comprises one or more disease forecast characteristics.
[0006] In some embodiments, the one or more disease forecast characteristics comprise a cancer prognosis relating to the subject. In some embodiments, the disease forecast characteristics comprise one or more of: an estimated survival time of the subject without a treatment intervention, an estimated survival time of the subject with a treatment intervention, determination of a type of the cancer, determination of a subtype of the cancer, determination of one or more clinical outcomes, or predicted treatment response of the subject to one or more treatments, or any combination thereof.
[0007] In some embodiments, the biological sample comprises cell-free deoxyribonucleic acid (cfDNA) molecules. In some embodiments, the biological sample comprises one or more of: a plasma sample, a serum sample, a red blood cell sample, a urine sample, a saliva sample, pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any derivative thereof, and any combination thereof. In some embodiments, the biological sample comprises the plasma sample. In some embodiments, the biological sample comprises the urine sample.
[0008] In some embodiments, the cfDNA molecules are obtained or derived from a single biological sample of the subject. In some embodiments, the cfDNA molecules are obtained or derived from different biological samples of the subject. In some embodiments, the biological sample is obtained or derived from the subject using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, a cell-free deoxyribonucleic acid (DNA) collection tube, another blood collection tube, or a CTC collection tube.
[0009] In some embodiments, the method further comprises assaying the biological sample to generate the nucleosome profile. In some embodiments, assaying the biological sample comprises subjecting said biological sample to conditions that are sufficient to isolate, enrich, or extract the cfDNA molecules. In some embodiments, the method further comprises fractionating a whole blood sample of the subject to obtain the cfDNA molecules. In some embodiments, assaying the biological sample further comprises assaying the cfDNA molecules using nucleic acid sequencing to produce nucleic acid sequencing reads.
[0010] In some embodiments, the nucleic acid sequencing further comprises DNA sequencing. In some embodiments, the DNA sequencing comprises one or more of: next-generation sequencing, whole genome sequencing, low-pass sequencing, targeted sequencing, whole exome sequencing, methylation-aware sequencing, or bisulfite sequencing, or a combination thereof. In some embodiments, the DNA sequencing comprises low-pass whole genome sequencing. In some embodiments, the DNA sequencing comprises whole exome sequencing. In some embodiments, the DNA sequencing further comprises nucleic acid amplification. In some embodiments, the nucleic acid amplification comprises polymerase chain reaction (PCR) or isothermal amplification. In some embodiments, at least one of the cfDNA molecules is assayed using a polymerase chain reaction (PCR) assay, a microarray, or an isothermal amplification.
[0011] In some embodiments, the type of the cancer of the subject comprises one or more of: lung cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof. In some embodiments, the type of the cancer of the subject comprises prostate cancer. In some embodiments, the subtype of the prostate cancer comprises one or more of: hormone sensitive prostate cancer (HSPC), castration-resistant prostate cancer (CRPC), androgen receptor-dependent prostate cancer (ARPC), metastatic prostate cancer, metastatic castration-resistant prostate cancer (mCRPC), neuroendocrine prostate cancer (NEPC), or any combination thereof. In some embodiments, the type of cancer comprises breast cancer. In some embodiments, the subtype of the breast cancer comprises estrogen receptor-positive (ER+) breast cancer or estrogen receptor-negative (ER-) breast cancer. In some embodiments, the subject is asymptomatic for the cancer.
[0012] In some embodiments, the selected biomarker comprises a selected binding site. In some embodiments, the selected biomarker comprises a transcription factor binding site. In some embodiments, the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
[0013] In some embodiments, the nucleosome profile is generated based at least in part on distance of one or more selected subsets of the cfDNA molecules to the selected biomarker. In some embodiments, the distance comprises the number of base pairs between each of the one or more selected subsets of the cfDNA molecules and the selected biomarker. In some embodiments, the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules associated with the distance of the one or more selected subsets of the cfDNA molecules from the selected biomarker. In some embodiments, determining the nucleosome profiling abnormality score further comprises comparing the coverage of the one or more selected subsets of the cfDNA molecules to one or more coverage values of the one or more reference nucleosome profiles. In some embodiments, determining the nucleosome profiling abnormality score further comprises determining a Z-score of the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, determining the nucleosome profiling abnormality score further comprises mapping the coverage Z-score of the one or more selected subsets of the cfDNA molecules to a coverage Z-score of each of the one or more reference nucleosome profiles.
[0014] In some embodiments, the one or more selected subsets of the cfDNA molecules comprise fragments of the cfDNA molecules.
[0015] In some embodiments, generating the disease forecast further comprises mapping the nucleosome profiling abnormality score to the one or more disease forecast characteristics.
[0016] In some embodiments, the one or more reference nucleosome profiles are associated with one or more disease forecast characteristics. In some embodiments, the one or more reference nucleosome profiles are associated with a likelihood of occurrence of one or more disease forecast characteristics. In some embodiments, one or more of the reference nucleosome profiles comprise data from a sample not having the disease (normal). In some embodiments, the method further comprises mapping DNA methylation patterns of the biological sample.
[0017] In some embodiments, the method further comprises associating the mapped DNA methylation patterns with the nucleosome profile data of the biological sample. In some embodiments, the one or more selected subsets of the cfDNA molecules are selected based at least in part on DNA methylation data mapped to the cfDNA molecules.
[0018] In some embodiments, the method further comprises generating the disease forecast based at least in part on copy number variation data, or sequencing mutation data, or both, associated with the sample.
[0019] In yet another aspect, disclosed herein is a method for determining a subtype of a cancer of a subject, comprising: generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to one or more selected biomarkers; mapping one or more characteristics of the nucleosome profile of the biological sample to one or more corresponding characteristics of one or more reference nucleosome profiles to generate a selected subset of reference nucleosome profiles; obtaining data relating to the cancer subtypes associated with each reference nucleosome profile of the selected subset of reference nucleosome profiles; and determining the subtype of the cancer of the subject based at least in part on the cancer subtypes associated with each of the reference nucleosome profiles.
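The mapping of a sample profile to a selected subset of reference profiles, followed by a subtype call from the subtypes attached to that subset, can be pictured with the sketch below. It is one possible reading under stated assumptions, not the claimed method: a k-nearest-neighbor vote with Euclidean distance is swapped in for the unspecified mapping, and `nearest_subtype` and the value of `k` are hypothetical.

```python
import math
from collections import Counter

def nearest_subtype(sample_profile, references, k=3):
    """Call a subtype by majority vote among the k closest reference profiles.

    `references` is a list of (profile, subtype_label) pairs whose profiles
    have the same length as sample_profile.
    """
    ranked = sorted(references, key=lambda ref: math.dist(sample_profile, ref[0]))
    selected = ranked[:k]  # the selected subset of reference profiles
    votes = Counter(label for _, label in selected)
    return votes.most_common(1)[0][0]
```

Here each profile could hold, for example, abnormality scores at AR and ASCL-1 binding sites, so that ARPC-like and NEPC-like references separate along those two values.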
[0020] In some embodiments, the method further comprises determining a tumor fraction of the cfDNA molecules.
[0021] In some embodiments, the method further comprises associating the tumor fraction with a nucleosome profiling abnormality score. In some embodiments, the nucleosome profiling abnormality score is associated with the one or more selected biomarkers. In some embodiments, the one or more selected biomarkers comprise a binding site associated with a type of the cancer.
[0022] In some embodiments, the selected biomarker comprises a transcription factor binding site associated with a type of the cancer. In some embodiments, the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
[0023] In some embodiments, at least a portion of the reference nucleosome profiles comprise non-disease profiles (normal). In some embodiments, at least a portion of the reference nucleosome profiles comprise a cancer condition having a cancer type matching the type of the cancer of the subject.
[0024] In yet another aspect, disclosed herein is a method for estimating a response of a subject having cancer to one or more treatments, comprising: generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject; determining one or more characteristics of the cancer of the subject based at least in part on the nucleosome profile of the subject; and generating a treatment response determination for the subject based at least in part on the one or more characteristics of the cancer of the subject.
[0025] In some embodiments, the treatment response determination comprises one or more of: a treatment plan, a value representing likelihood of one or more treatment responses, a binary treatment response indicator, a probability value for each treatment response, or any combination thereof.
[0026] In some embodiments, the one or more characteristics of the cancer of the subject comprises a cancer type, a cancer subtype, an estimate of cancer progression, a prognosis of the subject without treatment intervention, a prognosis of the subject with treatment intervention, or any combination thereof.
[0027] In yet another aspect, disclosed herein is a system for determining a disease forecast of a subject having cancer, the system comprising: a memory; and one or more processors configured to execute machine-readable instructions which, when executed, cause the one or more processors to perform a method comprising: generating a nucleosome profile of a biological sample obtained or derived from the subject, mapping the nucleosome profile to one or more reference nucleosome profiles, determining one or more disease forecast characteristics of the biological sample based at least in part on the mapping, and generating the disease forecast based at least in part on the one or more disease forecast characteristics.
[0028] In some embodiments, the one or more disease forecast characteristics comprise one or more of: a type of the cancer, a subtype of the cancer, a prognosis of the subject, an estimated survival time of the subject without treatment intervention, an estimated survival time of the subject with treatment intervention, an estimation of the subject’s response to one or more treatments, or any combination thereof.
[0029] Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
INCORPORATION BY REFERENCE
[0030] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the disclosure contained in the specification, the specification is intended to supersede and/or take precedence over any such contradictory material.
BRIEF DESCRIPTION OF THE DRAWINGS
[0031] The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings (also “Figure” and “FIG.” herein), of which:
[0032] FIG. 1 illustrates an exemplary nucleosome profiling method.
[0033] FIGs. 2A-2D show exemplary data relating to nucleosome profiling comparisons between samples having various disease conditions and normal samples.
[0034] FIGs. 3A-3D illustrate exemplary data relating to androgen receptor binding site (ARBS) nucleosome profiling between samples having various disease conditions and normal samples.
[0035] FIGs. 4A-4D show exemplary data relating to ASCL-1 biomarker nucleosome profiling between samples having various disease conditions and normal samples.
[0036] FIGs. 5A-5C illustrate exemplary data relating to DNA methylation and ARBS nucleosome profiling in samples having various disease conditions and normal samples.
[0037] FIGs. 6A-6E show exemplary data relating to clinical outcomes associated with ARBS nucleosome profiling abnormality score of samples having various disease conditions and various treatment conditions.
[0038] FIGs. 7A-7D illustrate exemplary data relating to ASCL-1 binding sites and ARBS nucleosome profiling in samples having various disease conditions and normal samples, as well as tumor fraction analysis of these data.
[0039] FIGs. 8A-8D show exemplary data relating to ARBS and ASCL-1 nucleosome profiling and scoring for samples having various disease conditions and normal samples.
[0040] FIGs. 9A-9D illustrate exemplary data comparing ARBS and ASCL-1 nucleosome profiling values for samples of different subjects with various disease conditions and at various time points pre-treatment and post-treatment.
[0041] FIGs. 10A-10C show exemplary data comparing ARBS, ASCL-1, and estrogen receptor binding sites (ERBS) methylation data.
[0042] FIGs. 11A-11D illustrate exemplary data comparing nucleosome profiling for various samples having an estrogen-receptor positive (ER+) disease condition and various samples having an estrogen-receptor negative (ER-) disease condition.
[0043] FIGs. 12A-12C show exemplary data relating to DNA methylation associated with ARBS nucleosome profiling data in samples having various disease conditions and normal samples.
[0044] FIGs. 13A-13D illustrate exemplary data relating to DNA methylation associated with fragment-level ARBS nucleosome profiling data in fragments of samples having various disease conditions and fragments of normal samples.
[0045] FIG. 14 illustrates a non-limiting example of a computing device; in this case, a device with one or more processors, memory, storage, and a network interface, per one or more embodiments herein.
DETAILED DESCRIPTION
[0046] While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.
Terms and Definitions
[0047] As used herein, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.
[0048] As used herein, the phrases “at least one,” “one or more,” and “and/or” are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions “at least one of A, B and C,” “at least one of A, B, or C,” “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together. As used herein, the phrase “at most three” can mean less than one, one, two, or three.
[0049] Reference throughout this specification to “some embodiments,” “further embodiments,” or “a particular embodiment,” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrase “in some embodiments,” or “in further embodiments,” or “in a particular embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0050] The terms "subject," "individual," and "patient" may be used interchangeably and refer to humans, as well as non-human mammals (e.g., non-human primates, canines, equines, felines, porcines, bovines, ungulates, lagomorphs, rodents, and the like). In various embodiments, the subject can be a human (e.g., adult male, adult female, adolescent male, adolescent female, male child, female child) under the care of a physician or other health worker in a hospital, as an outpatient, or other clinical context.
[0051] As used herein, “treatment” or “treating” refers to an approach for obtaining beneficial or desired results with respect to a disease, disorder, or medical condition including, but not limited to, a therapeutic benefit and/or a prophylactic benefit. In certain embodiments, treatment or treating involves administering a therapeutic to a subject. A therapeutic benefit may include the eradication or amelioration of the underlying disorder being treated. Also, a therapeutic benefit may be achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder, such as observing an improvement in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder.
Methods of Disease Forecasting
[0052] In an aspect, disclosed herein is a method for determining a disease forecast of a subject having cancer. In some embodiments, the method can comprise generating a nucleosome profile relating to a biological sample obtained or derived from the subject. In some cases, the nucleosome profile can be generated relative to a selected biomarker. In some embodiments, the method can further comprise determining a nucleosome profiling abnormality score. In some cases, determining a nucleosome profiling abnormality score can comprise relating the nucleosome profile of the subject sample to one or more reference nucleosome profiles. In some embodiments, the method can further comprise generating the disease forecast based at least in part on the nucleosome profiling abnormality score. In some cases, the disease forecast can comprise one or more disease forecast characteristics.
[0053] In some embodiments, the one or more disease forecast characteristics can comprise a cancer prognosis relating to the subject. The prognosis may comprise expected progression-free survival (PFS), overall survival (OS), or other metrics relating the severity or survivability of a cancer. In some embodiments, the disease forecast characteristics can comprise one or more of: an estimated survival time of the subject without a treatment intervention, an estimated survival time of the subject with a treatment intervention, determination of a type of the cancer, determination of a subtype of the cancer, determination of one or more clinical outcomes, or predicted treatment response of the subject to one or more treatments, or any combination thereof.
[0054] In some embodiments, the biological sample can comprise cell-free deoxyribonucleic acid (cfDNA) molecules. The biological sample may comprise nucleic acids. The nucleic acid may be a DNA (e.g., double-stranded DNA, single-stranded DNA, single-stranded DNA hairpins, cDNA, circulating tumor DNA (ctDNA), cell-free DNA (cfDNA)), or DNA otherwise derived from RNA. The biological sample may contain or be derived from a biological fluid. In some embodiments, the biological sample can comprise one or more of: a plasma sample, a serum sample, a buffy coat sample, a red blood cell sample, a urine sample, a saliva sample, tissue biopsy, pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any derivative thereof, and any combination thereof. In some embodiments, the biological sample can comprise the plasma sample. In some embodiments, the biological sample can comprise the urine sample.
[0055] In some embodiments, the cfDNA molecules can be obtained or derived from a single biological sample of the subject. In some embodiments, the cfDNA molecules can be obtained or derived from different biological samples of the subject. In some embodiments, the biological sample can be obtained or derived from the subject using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, a cell-free deoxyribonucleic acid (DNA) collection tube, another blood collection tube, or a CTC collection tube. The collection tube may comprise additional reagents for stabilizing the nucleic acid molecules or blood cells. The collection tube may allow the nucleic acid or blood cells to remain stable so as to minimize degradation of the biological sample prior to assaying. The additional reagents may comprise buffer salts or chelators.
[0056] The biological sample may be obtained or derived from a subject at various times. The biological sample may be obtained or derived from a subject prior to the subject receiving a therapy for cancer. The biological sample may be obtained or derived from a subject while the subject is receiving a therapy for cancer. The biological sample may be obtained or derived from a subject after the subject has received a therapy for cancer. The biological sample may be collected over 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or more time points. The time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more hour period. The time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more day period. The time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more week period. The time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more month period. The time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more year period.
[0057] In some embodiments, the method can further comprise assaying the biological sample to generate the nucleosome profile. In some embodiments, assaying the biological sample can comprise subjecting the biological sample to conditions sufficient to isolate, enrich, or extract the cfDNA molecules.
[0058] The methods disclosed herein may comprise conducting one or more enrichment reactions on one or more nucleic acid molecules in a sample. The enrichment reactions may comprise contacting a sample with one or more beads or bead sets. The enrichment reactions may comprise one or more hybridization reactions. For example, the enrichment reactions may comprise contacting a sample with one or more capture probes or bait molecules that hybridize to a nucleic acid molecule of the biological sample. The enrichment reaction may comprise differential amplification of a set of nucleic acid molecules. The enrichment reaction may enrich for a plurality of genetic loci or sequences corresponding to genetic loci. The enrichment reactions may comprise the use of primers or probes that may have complementarity to sequences (or sequences upstream or downstream) of a sequence that is to be enriched. For example, a capture probe may comprise sequence complementarity to a set of genomic loci and allow the enrichment of the genomic loci. The enrichment reactions may comprise a plurality of probes or primers. A plurality of probes may comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, or 180 different probes.
[0059] The methods disclosed herein may comprise conducting one or more isolation or purification reactions on one or more nucleic acid molecules in a sample. The isolation or purification reactions may comprise contacting a sample with one or more beads or bead sets. The isolation or purification reaction may comprise one or more hybridization reactions, enrichment reactions, amplification reactions, sequencing reactions, or a combination thereof. The isolation or purification reaction may comprise the use of one or more separators. The one or more separators may comprise a magnetic separator. The isolation or purification reaction may comprise separating bead-bound nucleic acid molecules from bead-free nucleic acid molecules. The isolation or purification reaction may comprise separating capture probe-hybridized nucleic acid molecules from capture probe-free nucleic acid molecules. The isolation reactions may comprise removing or separating a group of nucleic acid molecules from another group of nucleic acids.
[0060] The methods disclosed herein may comprise conducting extraction reactions on one or more nucleic acids in a biological sample. The extraction reactions may lyse cells or disrupt nucleic acid interactions with the cell such that the nucleic acids may be isolated, purified, enriched, or subjected to other reactions.
[0061] The methods disclosed herein may comprise amplification or extension reactions. The amplification reactions may comprise polymerase chain reaction. The amplification reaction may comprise PCR-based amplifications, non-PCR based amplifications, or a combination thereof. The one or more PCR-based amplifications may comprise PCR, qPCR, nested PCR, linear amplification, or a combination thereof. The one or more non-PCR based amplifications may comprise multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, circle-to-circle amplification, or a combination thereof. The amplification reactions may comprise an isothermal amplification.
[0062] The method disclosed herein may comprise a barcoding reaction. A barcoding reaction may comprise the addition of a barcode or tag to the nucleic acid. The barcode may be a molecular barcode or a sample barcode. For example, a barcode nucleic acid may comprise a barcode sequence which may be a degenerate n-mer. The sequence may be randomly generated or generated so as to synthesize a specific barcode sequence. The barcode nucleic acid may be added to a sample so as to label the nucleic acid molecules in the sample. The barcodes may be specific to a sample. For example, a plurality of barcode nucleic acids may be added to a sample in which the barcode sequence is the same. Upon barcoding of the nucleic acids, those originating from a same sample may have a same barcode sequence, which may allow a nucleic acid to be identified as belonging to a particular or given sample. A molecular barcode may also be used such that each molecule (or a plurality of molecules) in a same volume has a different molecular barcode. This barcode may be subjected to amplification such that all amplicons derived from a molecule have the same barcode. In this way, molecules originating from a same molecule may be identified. The sequence reads may be processed based on the barcode sequences. For example, the processing may reduce errors or allow a molecule to be tracked. Barcode sequences may be appended or otherwise added or incorporated into a sequence by various reactions, for example an amplification, extension, or ligation reaction, and the addition may be performed enzymatically using a nucleic acid polymerase or ligase. The ligation may be an overhang or blunt end ligation and the barcodes may comprise complementarity to nucleic acids to be barcoded. This complementarity may be to a sequence derived from the sample from the subject or to a constant sequence generated via a reaction performed on the nucleic acids in the sample.
[0063] In some cases, the biological sample may comprise multiple components. For example, the biological sample may be a whole blood sample. The biological sample may be subjected to reactions so as to separate or fractionate the biological sample. For example, a whole blood sample may be fractionated and cell-free nucleic acids may be obtained. The whole blood sample may be fractionated using centrifugation such that blood cells may be separated from the plasma (which may contain cell-free nucleic acid). A sample may be subjected to multiple rounds of separation or fractionation. In some embodiments, the method can further comprise fractionating a whole blood sample of the subject to obtain the cfDNA molecules.
[0064] In some embodiments, assaying the biological sample can further comprise assaying the cfDNA molecules using nucleic acid sequencing to produce nucleic acid sequencing reads. Examples of a sequencing reaction that may be used include capillary sequencing, next generation sequencing, Sanger sequencing, sequencing by synthesis, single molecule nanopore sequencing, sequencing by ligation, sequencing by hybridization, sequencing by nanopore current restriction, or a combination thereof. Sequencing by synthesis may comprise reversible terminator sequencing, processive single molecule sequencing, sequential nucleotide flow sequencing, or a combination thereof. Sequential nucleotide flow sequencing may comprise pyrosequencing, pH-mediated sequencing, semiconductor sequencing, or a combination thereof. The sequencing reactions may comprise whole genome sequencing, whole exome sequencing, low-pass whole genome sequencing, targeted sequencing, methylation-aware sequencing, enzymatic methylation sequencing, or bisulfite methylation sequencing. The sequencing reaction may be transcriptome sequencing, mRNA-seq, totalRNA-seq, smallRNA-seq, exosome sequencing, or combinations thereof. Combinations of sequencing reactions may be used in the methods described elsewhere herein. For example, a sample may be subjected to whole genome sequencing and whole transcriptome sequencing. As the samples may comprise multiple types of nucleic acids (e.g., RNA and DNA), sequencing reactions specific to DNA or RNA may be used so as to obtain sequence reads relating to the nucleic acid type.
[0065] The sequencing of nucleic acids may generate sequencing read data. The sequencing reads may be processed so as to generate data of improved quality. The sequencing reads may be generated with a quality score. The quality score may indicate an accuracy of a sequence read or a level of signal above a noise threshold for a given base call. The quality scores may be used for filtering sequencing reads. For example, sequencing reads that do not meet a particular quality score threshold may be removed. The sequencing reads may be processed so as to generate a consensus sequence or consensus base call. A given nucleic acid (or nucleic acid fragment) may be sequenced, and errors in the sequence may be generated due to reactions prior to or during sequencing. For example, amplification or PCR may generate errors in amplicons such that the sequences are not identical to a parent sequence. Using sample barcodes or molecular barcodes, error correction may be performed. Error correction may include identifying sequence reads that are not corroborated by other sequences from a same sample or same original parent molecule. The use of barcodes may allow the identification of a same parent or sample. Additionally, the sequence reads may be processed by performing single-strand consensus calling or double-strand consensus calling, thereby reducing or suppressing error.
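By way of non-limiting illustration only, the quality filtering and barcode-based consensus calling described above may be sketched as follows. The read representation (a dictionary with hypothetical keys "barcode", "seq", and "quals"), the function names, and the mean-quality threshold of 30 are assumptions made for this sketch and are not specified by the disclosure; a simple per-position majority vote stands in for whichever consensus procedure a given embodiment may use.

```python
from collections import Counter, defaultdict


def filter_by_quality(reads, min_mean_quality=30):
    """Keep reads whose mean per-base quality meets the (assumed) threshold."""
    return [
        r for r in reads
        if r["quals"] and sum(r["quals"]) / len(r["quals"]) >= min_mean_quality
    ]


def consensus_by_barcode(reads):
    """Group reads by molecular barcode and majority-vote each base position.

    Reads sharing a barcode are treated as copies of one parent molecule.
    Ties are resolved in favor of the base seen first, which is an arbitrary
    choice made for this sketch.
    """
    families = defaultdict(list)
    for r in reads:
        families[r["barcode"]].append(r["seq"])

    consensus = {}
    for barcode, seqs in families.items():
        length = min(len(s) for s in seqs)  # compare only shared positions
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


if __name__ == "__main__":
    example_reads = [
        {"barcode": "AAT", "seq": "ACGT", "quals": [35, 36, 37, 38]},
        {"barcode": "AAT", "seq": "ACGT", "quals": [35, 35, 35, 35]},
        {"barcode": "AAT", "seq": "ACTT", "quals": [32, 31, 30, 33]},
        {"barcode": "GGC", "seq": "TTTT", "quals": [10, 12, 11, 9]},
    ]
    print(consensus_by_barcode(filter_by_quality(example_reads)))
```

In this sketch the low-quality read is discarded before grouping, and the single discordant base in the remaining barcode family is outvoted by the two concordant copies.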
[0066] In some embodiments, the nucleic acid sequencing can further comprise DNA sequencing. In some embodiments, the DNA sequencing can comprise one or more of: next-generation sequencing, whole genome sequencing, low-pass sequencing, targeted sequencing, whole exome sequencing, methylation-aware sequencing, or bisulfite sequencing, or a combination thereof. In some embodiments, the DNA sequencing can comprise low-pass whole genome sequencing. In some embodiments, the DNA sequencing can comprise whole exome sequencing. In some embodiments, the DNA sequencing can further comprise nucleic acid amplification. In some embodiments, the nucleic acid amplification can comprise polymerase chain reaction (PCR) or isothermal amplification. In some embodiments, at least one of the cfDNA molecules can be assayed using a polymerase chain reaction (PCR) assay, a microarray, or an isothermal amplification.
[0067] The subject may be suspected of suffering from a cancer. The cancer may be specific to or originate from an organ or other area of the subject. In some embodiments, the type of the cancer of the subject can comprise one or more of: lung cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof. In some embodiments, the type of the cancer of the subject can comprise prostate cancer. In some embodiments, the subtype of the prostate cancer can comprise one or more of: hormone sensitive prostate cancer (HSPC), castration-resistant prostate cancer (CRPC), androgen receptor-dependent prostate cancer (ARPC), metastatic prostate cancer, metastatic castration-resistant prostate cancer (mCRPC), neuroendocrine prostate cancer (NEPC), or any combination thereof. In some embodiments, the type of cancer can comprise breast cancer. In some embodiments, the subtype of the breast cancer can comprise estrogen receptor-positive (ER+) breast cancer or estrogen receptor-negative (ER-) breast cancer. In some embodiments, the subject can be asymptomatic for the cancer. The cancer may comprise one or more biomarkers that are specific to a particular cancer type or subtype, or both. The specific biomarkers may indicate a presence of a particular cancer type or subtype. For example, a biomarker may indicate that a castration-resistant prostate cancer is present. The identification of the presence of a type or subtype of cancer may allow the determination of a treatment option or recommendation.
[0068] In some cases, the subject may be asymptomatic for cancer. For example, the cancer may not exhibit any symptoms and the subject may be unaware of the presence of cancer. In some cases, the subject may be suspected of having cancer. In some cases, the cancer type or subtype may be unknown. The methods and systems described herein may allow a cancer to be identified at an earlier stage than otherwise. The methods and systems described herein may allow a type or subtype of cancer to be identified at an earlier stage than otherwise. The identification of the presence of a type or subtype of the cancer at an earlier stage may allow a treatment option or recommendation to be determined at an earlier stage, and may allow the treatment option or recommendation to be more targeted to the cancer type or subtype, and may allow the subject to have an improved prognosis.
[0069] In some embodiments, the selected biomarker can comprise a selected binding site. In some embodiments, the selected biomarker can comprise a transcription factor binding site. In some embodiments, the selected biomarker can comprise one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
[0070] In some embodiments, the nucleosome profile can be generated based at least in part on distance of one or more selected subsets of the cfDNA molecules to the selected biomarker. In some embodiments, the distance can comprise the number of base pairs between each of the one or more selected subsets of the cfDNA molecules and the selected biomarker. In some embodiments, the nucleosome profile can be further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, the nucleosome profile can be further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules associated with the distance of the one or more selected subsets of the cfDNA molecules from the selected biomarker. In some embodiments, determining the nucleosome profiling abnormality score can further comprise comparing the coverage of the one or more selected subsets of the cfDNA molecules to one or more coverage values of the one or more reference nucleosome profiles. In some embodiments, determining the nucleosome profiling abnormality score can further comprise determining a Z-score of the coverage of the one or more selected subsets of the cfDNA molecules. In some embodiments, determining the nucleosome profiling abnormality score can further comprise mapping the coverage Z-score of the one or more selected subsets of the cfDNA molecules to a coverage Z-score of each of the one or more reference nucleosome profiles.
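By way of non-limiting illustration only, one way of relating coverage to distance from a selected biomarker and of computing coverage Z-scores against reference profiles may be sketched as follows. The use of fragment midpoints, the window and bin sizes, and the function names are assumptions made for this sketch; the disclosure does not prescribe them. The Z-score is computed per distance bin as the sample value minus the mean of the reference values, divided by their sample standard deviation.

```python
from statistics import mean, stdev


def coverage_profile(fragments, site_positions, window=1000, bin_size=100):
    """Count fragment midpoints in signed-distance bins around binding sites.

    fragments: list of (start, end) coordinates on a single chromosome.
    site_positions: list of binding-site center coordinates.
    Returns one count per bin, ordered from -window to +window base pairs.
    """
    counts = [0] * ((2 * window) // bin_size)
    for start, end in fragments:
        midpoint = (start + end) // 2
        for site in site_positions:
            distance = midpoint - site  # signed distance in base pairs
            if -window <= distance < window:
                counts[(distance + window) // bin_size] += 1
    return counts


def coverage_z_scores(sample_profile, reference_profiles):
    """Z-score each bin of the sample against the same bin of the references.

    Requires at least two reference profiles. A bin with zero spread across
    the references is assigned 0.0 (an arbitrary convention for this sketch).
    """
    z_scores = []
    for i, value in enumerate(sample_profile):
        ref_values = [ref[i] for ref in reference_profiles]
        spread = stdev(ref_values)
        z_scores.append((value - mean(ref_values)) / spread if spread else 0.0)
    return z_scores


if __name__ == "__main__":
    profile = coverage_profile([(0, 100), (100, 200)], [150], window=100, bin_size=50)
    print(profile)
    print(coverage_z_scores([1.0, 5.0], [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]))
```

A summary of the per-bin Z-scores (for example, the value at the bins nearest the binding site) could then serve as an input to an abnormality score; which summary is used is left open here.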
[0071] In some embodiments, the one or more selected subsets of the cfDNA molecules can comprise fragments of the cfDNA molecules.
[0072] In some embodiments, generating the disease forecast can further comprise mapping the nucleosome profiling abnormality score to the one or more disease forecast characteristics.
[0073] In some embodiments, the one or more reference nucleosome profiles can be associated with one or more disease forecast characteristics. In some embodiments, the one or more reference nucleosome profiles can be associated with a likelihood of occurrence of one or more disease forecast characteristics. In some embodiments, one or more of the reference nucleosome profiles can comprise data from a sample not having the disease (normal). In some embodiments, the method can further comprise mapping DNA methylation patterns of the biological sample.
[0074] In some embodiments, the method can further comprise associating the mapped DNA methylation patterns with the nucleosome profile data of the biological sample. In some embodiments, the one or more selected subsets of the cfDNA molecules can be selected based at least in part on DNA methylation data mapped to the cfDNA molecules.
[0075] In some embodiments, the method can further comprise generating the disease forecast based at least in part on copy number variation data, or sequencing mutation data, or both, associated with the sample.
[0076] In some cases, a clinical intervention or a therapy may be identified at least in part based on the identification of the type or subtype of the cancer, or one or more disease forecast characteristics of the subject. The clinical intervention may be a plurality of clinical interventions. The therapy may be a plurality of therapies. The therapy may be selected from a plurality of clinical interventions. The clinical intervention or therapy may be a surgical resection, chemotherapy, radiotherapy, immunotherapy, adjuvant therapy, neoadjuvant therapy, androgen deprivation therapy, or any combination thereof. In some cases, the clinical intervention or therapy may be administered to the subject.
Methods of Determining Disease Subtype
[0077] In yet another aspect, disclosed herein is a method for determining a subtype of a cancer of a subject. In some embodiments, the method can comprise generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject. In some cases, the nucleosome profile can be generated relative to one or more selected biomarkers. In some embodiments, the method can further comprise mapping one or more characteristics of the nucleosome profile of the biological sample to one or more corresponding characteristics of one or more reference nucleosome profiles to generate a selected subset of reference nucleosome profiles.
[0078] In some embodiments, the method can further comprise obtaining data relating to the cancer subtypes associated with each reference nucleosome profile of the selected subset of reference nucleosome profiles. In some embodiments, the method can further comprise determining the subtype of the cancer of the subject based at least in part on the cancer subtypes associated with each of the reference nucleosome profiles.
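By way of non-limiting illustration only, the mapping of a sample nucleosome profile to a selected subset of reference nucleosome profiles, followed by subtype determination from the subtypes of that subset, may be sketched as a nearest-neighbor vote. Euclidean distance, the choice of k, the dictionary keys, and the numeric profile values below are all assumptions made for this sketch; the subtype labels ARPC and NEPC are used only as example labels.

```python
from collections import Counter
from math import dist


def select_nearest_references(sample_profile, references, k=3):
    """Return the k reference profiles closest to the sample profile.

    Each reference is a dict with a numeric "profile" list (same length as
    the sample profile) and a "subtype" label.
    """
    ranked = sorted(references, key=lambda ref: dist(sample_profile, ref["profile"]))
    return ranked[:k]


def vote_subtype(selected_references):
    """Pick the most common subtype in the subset and its share of the votes."""
    votes = Counter(ref["subtype"] for ref in selected_references)
    subtype, count = votes.most_common(1)[0]
    return subtype, count / len(selected_references)


if __name__ == "__main__":
    # Hypothetical reference profiles; values are illustrative only.
    references = [
        {"profile": [1.0, 0.2, 1.0], "subtype": "ARPC"},
        {"profile": [1.0, 0.3, 1.0], "subtype": "ARPC"},
        {"profile": [1.0, 0.9, 1.0], "subtype": "NEPC"},
        {"profile": [1.0, 1.0, 1.0], "subtype": "NEPC"},
    ]
    subset = select_nearest_references([1.0, 0.25, 1.0], references, k=3)
    print(vote_subtype(subset))
```

The vote share returned alongside the subtype gives a rough indication of how consistently the selected references agree; other similarity measures or weighting schemes could be substituted without changing the overall flow.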
[0079] In some embodiments, the method can further comprise determining a tumor fraction of the cfDNA molecules.
[0080] In some embodiments, the method can further comprise associating the tumor fraction with a nucleosome profiling abnormality score. In some embodiments, the nucleosome profiling abnormality score can be associated with the one or more selected biomarkers. In some embodiments, the one or more selected biomarkers can comprise a binding site associated with a type of the cancer.
[0081] In some embodiments, the selected biomarker can comprise a transcription factor binding site associated with a type of the cancer. In some embodiments, the selected biomarker can comprise one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
[0082] In some embodiments, at least a portion of the reference nucleosome profiles can comprise non-disease profiles (normal). In some embodiments, at least a portion of the reference nucleosome profiles can comprise a cancer condition having a cancer type matching the type of the cancer of the subject.
Methods for Estimating Treatment Response
[0083] In yet another aspect, disclosed herein is a method for estimating a response of a subject having cancer to one or more treatments. In some embodiments, the method can comprise generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject. In some embodiments, the method can further comprise determining one or more characteristics of the cancer of the subject based at least in part on the nucleosome profile of the subject. In some embodiments, the method can further comprise generating a treatment response determination for the subject based at least in part on the one or more characteristics of the cancer of the subject.
[0084] In some embodiments, the treatment response determination can comprise one or more of: a treatment plan, a value representing likelihood of one or more treatment responses, a binary treatment response indicator, a probability value for each treatment response, or any combination thereof.
[0085] In some embodiments, the one or more characteristics of the cancer of the subject can comprise a cancer type, a cancer subtype, an estimate of cancer progression, a prognosis of the subject without treatment intervention, a prognosis of the subject with treatment intervention, or any combination thereof.
Systems for Determining Disease Forecast Characteristics
[0086] In yet another aspect, disclosed herein is a system for determining a disease forecast of a subject having cancer. In some embodiments, the system can comprise a memory and one or more processors. In some embodiments, the one or more processors can be configured to execute machine-readable instructions which, when executed, cause the one or more processors to perform a method. In some embodiments, the method performed by the one or more processors can comprise generating a nucleosome profile of a biological sample obtained or derived from the subject. In some embodiments, the method performed by the one or more processors can further comprise mapping the nucleosome profile to one or more reference nucleosome profiles. In some embodiments, the method performed by the one or more processors can further comprise determining one or more disease forecast characteristics of the biological sample. In some cases, the one or more disease forecast characteristics can be determined based at least in part on the mapping. In some embodiments, the method performed by the one or more processors can further comprise generating the disease forecast based at least in part on the one or more disease forecast characteristics.
[0087] In some embodiments, the one or more disease forecast characteristics can comprise one or more of: a type of the cancer, a subtype of the cancer, a prognosis of the subject, an estimated survival time of the subject without treatment intervention, an estimated survival time of the subject with treatment intervention, an estimation of the subject’s response to one or more treatments, or any combination thereof.
[0088] In various aspects, the nucleosome profile of the subject, or one or more reference nucleosome profiles, or both, are processed using one or more algorithms. In some embodiments, the one or more processors can further comprise one or more software modules or models configured to operate utilizing the one or more algorithms. The one or more algorithms can comprise machine learning (ML) or artificial intelligence (AI) algorithms. The AI/ML algorithms may be trained algorithms. The trained algorithms may utilize the selected biomarkers, or the nucleosome profile of the sample, or both, as an input. The AI/ML algorithms can generate an output relating to the one or more disease forecast characteristics of a cancer. The output may be specific to a type of cancer or subtype of cancer. The output can comprise determining the type of cancer or the subtype of cancer of the sample. For example, the output may indicate the presence of a castrate-resistant prostate cancer. For example, the output may indicate the presence of ER- or ER+ breast cancer.
[0089] The trained algorithm may be trained on multiple samples. For example, the trained algorithm may be trained using at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or more independent training samples. The trained algorithm may be trained using no more than 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or fewer independent training samples. The training samples may be associated with a presence or an absence of a type or subtype of the cancer. The training samples may be associated with a prognosis of the cancer. The training samples may be associated with cancer that is resistant to a particular drug or treatment. The training samples may be associated with cancer that is responsive to a particular drug or treatment. An individual training sample may be positive for a particular type or subtype of cancer. An individual training sample may be negative for a particular type or subtype of cancer. By using training samples, the trained algorithm may be able to determine a type or subtype of cancer, determine a probability of recurrence or relapse of a cancer, or determine whether a cancer comprising a set of biomarkers may be resistant or responsive to a treatment. The training sample may be associated with additional clinical health data of a subject. For example, additional clinical health data may comprise the gender, weight, height, or levels of metabolites or antibodies of the subject. Additional clinical health data may comprise indication of other diseases, disorders, or disease conditions.
[0090] The trained algorithms may be trained using multiple sets of training samples. The sets may comprise training samples as described elsewhere herein. For example, the training may be performed using a first set of independent training samples associated with a presence of the type or subtype of cancer and a second set of independent training samples associated with an absence of the type or subtype of cancer. Similarly, a first set may be associated with a nucleosome profile and a second set may be associated with a different nucleosome profile. Similarly, a first set may be associated with a prognosis and a second set may be associated with a different prognosis. Similarly, a first set may be associated with one or more disease forecast characteristics, and a second set may be associated with different one or more disease forecast characteristics. Similarly, a first set may be associated with a resistance to a treatment, and a second set may be associated with a responsiveness to the same treatment.
[0091] The trained algorithm may also process additional clinical health data of the subject. For example, additional clinical health data may comprise the gender, weight, height, or levels of metabolites or antibodies of the subject. Additional clinical health data may comprise indication of other diseases, disorders, or disease conditions that the subject may suffer from. By using the additional clinical health data in conjunction with the biomarkers, the trained algorithm may output one or more disease forecast estimations or one or more cancer type or subtype determinations that may be different from the output of an algorithm that does not process additional clinical health data.
[0092] The trained algorithm may be an unsupervised machine learning algorithm. For example, the unsupervised machine learning algorithm may utilize cluster analysis to identify attributes of interest. The trained algorithm may be a supervised machine learning algorithm. For example, the algorithm may be provided with training data so as to generate an expected or desired output. The supervised learning algorithm may comprise a deep learning algorithm, a support vector machine (SVM), a neural network, or Random Forest algorithms. Via the machine learning algorithm, the trained algorithm may be able to identify relationships of nucleosome profiles to particular cancer prognoses or types or subtypes. Without the trained algorithm, it may otherwise be difficult to identify relationships of the nucleosome profiles to cancer types or subtypes. Without the trained algorithm, it may otherwise be difficult to identify relationships of the nucleosome profiles to prognosis or disease forecast characteristics.
[0093] In various aspects, the systems and methods may comprise an accuracy, sensitivity, specificity, positive predictive value, or negative predictive value of determination of type or subtype of cancer or prognosis or disease forecast. For example, the methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at an accuracy of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%. The methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a sensitivity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%. The methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a specificity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%. The methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a positive predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%. The methods or systems may comprise determination of type or subtype of cancer or prognosis or disease forecast in the subject at a negative predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
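For reference, the performance measures recited above follow the standard definitions in terms of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The sketch below computes them from confusion-matrix counts; the function name and the counts are hypothetical and are not results from the disclosure.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive and one
    negative sample, and at least one positive and one negative call).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, tn=60, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```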
[0094] In some embodiments, performing nucleosome profiling at methylation-loss regions can increase the sensitivity and specificity of the assays. In some embodiments, performing nucleosome profiling at methylation-loss regions can increase the sensitivity and specificity of the assays by isolating and eliminating fragments from analysis that do not show methylation changes between cancer samples and normal samples.
Computer Control Systems
[0095] The present disclosure provides computer systems that are programmed to implement methods of the disclosure. FIG. 14 shows a computer system 1401 that is programmed or otherwise configured to perform analysis or steps of the methods, for example, to determine a likelihood of the presence of a cancer based on a set of biomarkers of an individual or to run an algorithm. The computer system 1401 can regulate various aspects of methods and systems of the present disclosure, such as, for example, performing an algorithm, inputting training data, analyzing sets of biomarkers, or outputting a result for the user as to the presence or absence of cancer. The computer system 1401 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device can be a mobile electronic device.
[0096] The computer system 1401 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1405, which can be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 1401 also includes memory or memory location 1410 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1415 (e.g., hard disk), communication interface 1420 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1425, such as cache, other memory, data storage and/or electronic display adapters. The memory 1410, storage unit 1415, interface 1420 and peripheral devices 1425 are in communication with the CPU 1405 through a communication bus (solid lines), such as a motherboard. The storage unit 1415 can be a data storage unit (or data repository) for storing data. The computer system 1401 can be operatively coupled to a computer network (“network”) 1430 with the aid of the communication interface 1420. The network 1430 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet. The network 1430 in some cases is a telecommunication and/or data network. The network 1430 can include one or more computer servers, which can enable distributed computing, such as cloud computing. The network 1430, in some cases with the aid of the computer system 1401, can implement a peer-to-peer network, which may enable devices coupled to the computer system 1401 to behave as a client or a server.
[0097] The CPU 1405 can execute a sequence of machine-readable instructions, which can be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 1410. The instructions can be directed to the CPU 1405, which can subsequently program or otherwise configure the CPU 1405 to implement methods of the present disclosure. Examples of operations performed by the CPU 1405 can include fetch, decode, execute, and writeback.
[0098] The CPU 1405 can be part of a circuit, such as an integrated circuit. One or more other components of the system 1401 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).
[0099] The storage unit 1415 can store files, such as drivers, libraries and saved programs. The storage unit 1415 can store user data, e.g., user preferences and user programs. The computer system 1401 in some cases can include one or more additional data storage units that are external to the computer system 1401, such as located on a remote server that is in communication with the computer system 1401 through an intranet or the Internet.
[0100] The computer system 1401 can communicate with one or more remote computer systems through the network 1430. For instance, the computer system 1401 can communicate with a remote computer system of a user (e.g., a medical professional or patient). Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, smartphones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants. The user can access the computer system 1401 via the network 1430.
[0101] Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1401, such as, for example, on the memory 1410 or electronic storage unit 1415. The machine executable or machine readable code can be provided in the form of software. During use, the code can be executed by the processor 1405. In some cases, the code can be retrieved from the storage unit 1415 and stored on the memory 1410 for ready access by the processor 1405. In some situations, the electronic storage unit 1415 can be precluded, and machine-executable instructions are stored on memory 1410.
[0102] The code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime. The code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
[0103] Aspects of the systems and methods provided herein, such as the computer system 1401, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.
[0104] Hence, a machine readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
[0105] The computer system 1401 can include or be in communication with an electronic display 1435 that comprises a user interface (UI) 1440 for providing, for example, an input of biomarkers or sequencing data, or a visual output relating to a detection, diagnosis, or prognosis. Examples of UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
[0106] Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 1405. The algorithm can, for example, determine a presence or absence of a cancer or cancer parameter based on a set of input sequencing data from a sample derived from a subject.
EXAMPLES
[0107] The following examples are provided to further illustrate some embodiments of the present disclosure, but are not intended to limit the scope of the disclosure; it will be understood by their exemplary nature that other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.
Example 1: Nucleosome Profiling for Disease Forecasting at Androgen Receptor Binding Sites (ARBS)
[0108] Using methods and systems of the present disclosure, nucleosome profiling was utilized for disease forecasting by assaying samples as illustrated in the general exemplary method shown in FIG. 1. As shown in FIG. 1, a sample such as blood-derived plasma was assayed using plasma isolation and DNA extraction methods. Library preparation was then performed. Sequencing was performed on the extracted DNA of the sample. The sample was further analyzed using a Low-Pass Whole Genome Sequencing (LP-WGS) pipeline to generate genome-wide copy number burden classification and nucleosome profiling.
[0109] As illustrated in FIG. 2A, nucleosome profiling at androgen receptor (AR) binding sites was compared for samples of patients, with some patients having metastatic castration-resistant prostate cancer (mCRPC), and nucleosome profiles at the AR binding sites of normal patients not having cancer. Distance from AR binding sites, measured as number of base pairs, was correlated with coverage of the cfDNA fragments of the sample.
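The relationship between distance from a binding site and cfDNA fragment coverage can be illustrated with a minimal sketch that averages fragment depth at each base-pair offset around a set of sites. The function name, the half-open interval convention, the single-chromosome simplification, and the toy coordinates are assumptions of this sketch; it is not the LP-WGS pipeline of the disclosure and omits steps such as GC correction and normalization.

```python
def site_coverage_profile(fragments, sites, window=5):
    """Mean fragment depth at each offset (in bp) from a set of binding sites.

    fragments: list of (start, end) half-open intervals on one chromosome.
    sites: non-empty list of binding-site positions on the same chromosome.
    window: number of base pairs profiled on each side of a site.
    """
    depth = {offset: 0 for offset in range(-window, window + 1)}
    for site in sites:
        for start, end in fragments:
            # Clip the fragment to the window around this site.
            lo = max(start, site - window)
            hi = min(end, site + window + 1)
            for pos in range(lo, hi):
                depth[pos - site] += 1
    return {offset: count / len(sites) for offset, count in depth.items()}


# Toy example: two fragments flank a site at position 100, leaving the
# site itself uncovered, which is the kind of central dip a profile shows.
fragments = [(90, 98), (103, 110)]
profile = site_coverage_profile(fragments, sites=[100], window=5)
for offset in sorted(profile):
    print(offset, profile[offset])
```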
[0110] As shown in FIG. 2B, nucleosome profiling at estrogen receptor (ER) binding sites was compared for samples of the same patients having metastatic castration-resistant prostate cancer (mCRPC), and nucleosome profiles at the ER binding sites of normal patients not having cancer.
[0111] Nucleosome profiling was also performed at ER binding sites, as shown in FIG. 2C, comparing patients with estrogen receptor positive (ER+) breast cancer to nucleosome profiles at the ER binding sites of normal patients not having cancer. The normal patients were a normal plasma background control.
[0112] Nucleosome profiling was performed for larger groups of subjects as well. Nucleosome profiling was performed on 103 subjects having mCRPC and 69 normal subjects. For example, FIG. 3A illustrates comparisons of overall ARBS nucleosome profiling between 103 samples from subjects having mCRPC and 69 samples from normal subjects not having cancer.
[0113] Nucleosome profiling abnormality scores were generated for normal plasma background samples and mCRPC samples. The nucleosome profiling abnormality scores were plotted as shown in FIG. 3B. The nucleosome profiling abnormality scores were obtained by quantifying sample-level ARBS scores. This quantification was performed by comparing the ARBS centric fragment coverage (±60bp) of each sample with the centric fragment coverage (±60bp) of the normal plasma background. Z-scores were calculated for the centric fragment coverage of each ARBS sample and of each normal plasma background sample. The Z-scores of the ARBS samples were compared with the Z-scores of the normal samples to generate the nucleosome profiling abnormality scores.
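One possible reading of the Z-score quantification described above is sketched below: the centric coverage of a sample (mean coverage within ±60 bp of the site) is standardized against the distribution of centric coverage in the normal plasma background. The helper names, the use of the sample standard deviation, the sign convention (a higher score meaning lower central coverage than normal), and the toy values are assumptions of this sketch, not details given in the disclosure.

```python
from statistics import mean, stdev


def central_coverage(profile, half_width=60):
    """Mean coverage over offsets within +/- half_width bp of the site.

    profile: dict mapping offset (bp) to normalized coverage.
    """
    return mean(cov for offset, cov in profile.items() if abs(offset) <= half_width)


def abnormality_score(sample_central, normal_centrals):
    """Z-score of a sample's central coverage against normal background.

    The sign is flipped so that depleted central coverage gives a positive
    score (an assumed convention). Requires at least two normal samples.
    """
    mu = mean(normal_centrals)
    sd = stdev(normal_centrals)
    return (mu - sample_central) / sd


# Hypothetical values for illustration only.
toy_profile = {-120: 1.0, -60: 0.8, 0: 0.6, 60: 0.8, 120: 1.0}
normal_centrals = [0.98, 1.00, 1.02]
score = abnormality_score(0.90, normal_centrals)
print(round(central_coverage(toy_profile), 3), round(score, 2))
```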
[0114] The nucleosome profiling abnormality scores were then correlated with LP-WGS copy number burden scores (CNB) as illustrated in FIG. 3C to generate an estimation of tumor fraction.
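The disclosure does not state which correlation statistic was used. As one common choice, a Pearson correlation coefficient between abnormality scores and CNB scores can be computed as sketched below; the function name and the paired values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes at least two points and non-zero variance in both sequences.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical paired scores for five samples (illustration only).
abnormality_scores = [0.5, 1.2, 2.8, 4.1, 5.0]
cnb_scores = [0.02, 0.05, 0.12, 0.20, 0.22]
r = pearson_r(abnormality_scores, cnb_scores)
print(round(r, 3))
```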
[0115] As illustrated in FIG. 3D, a value indicating a threshold for cancer detection can be generated from the nucleosome profiling abnormality scores correlated with LP-WGS CNB scores.
[0116] Nucleosome profiles at various transcription factor binding sites comprising the selected biomarkers were compared and correlated with LP-WGS copy number variation profiles (CNV) for various subjects having mCRPC. The transcription factor binding sites were ASCL-1 binding sites. FIG. 4A shows estimated tumor fraction generated from an LP-WGS CNV profile from a first subject having mCRPC. FIG. 4B shows estimated tumor fraction generated from an LP-WGS CNV profile from a second subject having mCRPC.
[0117] Nucleosome profiling at ASCL-1 binding sites was mapped between samples as distance from ASCL-1 binding sites (in base pairs) corresponding to coverage of cfDNA fragments. As illustrated in FIG. 4C, nucleosome profiling was compared between a first sample and a normal plasma background at ASCL-1 binding sites. As illustrated in FIG. 4D, nucleosome profiling was compared between a second sample and a normal plasma background at ASCL-1 binding sites.
[0118] DNA methylation was also mapped to the samples. As shown in FIG. 5A, AR binding sites were mapped to DNA methylation beta value distribution in normal plasma samples. As shown in FIG. 5B, AR binding sites were mapped to DNA methylation beta value distribution in samples having mCRPC, in this case 16 mCRPC prostate cancer samples.
[0119] As illustrated in FIG. 5C, DNA methylation beta values and ARBS nucleosome profiling abnormality scores were mapped together for mCRPC samples.
[0120] ARBS nucleosome profiling abnormality scores were taken and mapped to clinical outcomes for mCRPC prostate cancer samples. For example, FIG. 6A illustrates plotted ARBS nucleosome profiling abnormality scores grouped by subjects with response or no response to a treatment or selected treatments. Similarly, FIG. 6B shows mCRPC grouped data of ARBS nucleosome profiling abnormality scores mapped to overall survival time.
[0121] Data before and after treatment for mCRPC subjects were also recorded and analyzed. As illustrated in FIG. 6C, LP-WGS CNV profiles of an mCRPC patient were mapped before treatment with androgen receptor pathway inhibitors (ARPIs). As illustrated in FIG. 6D, LP-WGS CNV profiles of an mCRPC patient were mapped after treatment with androgen receptor pathway inhibitors (ARPIs). Nucleosome profiling of the patient was performed at AR binding sites before and after ARPI treatment and these data were mapped together as illustrated in FIG. 6E.
Example 2: Nucleosome Profiling for Disease Forecasting Using cfDNA Fragmentomics in Neuroendocrine Prostate Cancer (NEPC)
[0122] Using methods and systems of the present disclosure, nucleosome profiling was performed at various selected biomarkers comprising transcription factor binding sites, such as AR binding sites and ASCL-1 binding sites in NEPC samples. The Griffin framework was applied to classify tumor subtypes using nucleosome profiling of cancer-specific transcription factor binding sites (TFBS) and tumor subtype-specific chromatin accessibility regions from low-pass whole genome sequencing data of cfDNA. ASCL-1 binding sites together with AR binding sites were utilized for prostate cancer subtyping to distinguish between androgen receptor-dependent prostate cancer (ARPC) and neuroendocrine prostate cancer (NEPC). ER and ERBB2 were used for breast cancer subtyping to distinguish between ER positive and ER negative tumor subtypes.
[0123] As illustrated in FIG. 7A, nucleosome profiling at AR binding sites was compared between NEPC subjects and normal plasma samples. As illustrated in FIG. 7B, nucleosome profiling at ASCL-1 binding sites was compared between NEPC subjects and normal plasma background samples. Sample-level ARBS and ASCL-1 binding site abnormality scores were quantified by comparing centric fragment coverage for each NEPC sample (±60bp) to the centric fragment coverage for each normal sample (±60bp).
[0124] As illustrated in FIG. 7C, ARBS binding site abnormality scores were simulated in vitro as associated with different tumor fractions by blending NEPC tumor and normal samples at various titrations. As shown in FIG. 7D, ASCL-1 binding site abnormality scores were simulated in vitro as associated with different tumor fractions by blending NEPC tumor and normal samples at various titrations.
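The blending described above was performed on samples in vitro. A purely computational analogue, offered here only as an illustrative assumption, is a linear mixture of a tumor coverage profile and a normal coverage profile at a chosen tumor fraction. The function name, the linear mixing model, and the toy profiles are not taken from the disclosure.

```python
def blend_profiles(tumor, normal, tumor_fraction):
    """Linear mixture of two coverage profiles of equal length.

    tumor_fraction: proportion of signal contributed by the tumor profile,
    between 0.0 (all normal) and 1.0 (all tumor).
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    if len(tumor) != len(normal):
        raise ValueError("profiles must have the same length")
    return [
        tumor_fraction * t + (1.0 - tumor_fraction) * n
        for t, n in zip(tumor, normal)
    ]


# Hypothetical normalized coverage at five offsets; index 2 is the site.
tumor_profile = [1.0, 0.8, 0.5, 0.8, 1.0]
normal_profile = [1.0, 1.0, 1.0, 1.0, 1.0]
for fraction in (0.0, 0.1, 0.5, 1.0):
    blended = blend_profiles(tumor_profile, normal_profile, fraction)
    print(fraction, round(blended[2], 3))
```

Under this mixing model the central dip deepens in proportion to the tumor fraction, so an abnormality score computed from central coverage would be expected to rise with the titration level.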
[0125] Nucleosome profiling was performed on over 1000 mCRPC samples and 42 normal plasma background samples as illustrated in the overall ARBS nucleosome profiling data shown in FIG. 8A. As illustrated in FIG. 8B, overall ASCL-1 nucleosome profiling was also compared for the over 1000 mCRPC samples and 42 normal plasma background samples.
[0126] As illustrated in FIG. 8C, ARBS and ASCL-1 binding site abnormality scores were mapped for mCRPC samples. As shown in FIG. 8D, ARBS binding site abnormality scores were mapped with LP-WGS inferred tumor fraction values.
[0127] As shown in FIG. 9A, nucleosome profiling was performed at ASCL-1 binding sites for 19 samples having mCRPC and compared with normal plasma background samples. As shown in FIG. 9B, tumor fraction score was mapped with ASCL-1 binding site abnormality scores for the 19 samples having mCRPC and normal plasma background samples.
[0128] As shown in FIG. 9C, ARBS binding site abnormality scores were mapped with ASCL-1 binding site abnormality scores for the 19 samples having mCRPC.
[0129] As illustrated in FIG. 9D, nucleosome profiling was performed at ASCL-1 binding sites for the same patient before treatment with chemotherapy and after treatment with chemotherapy, and these conditions were mapped.
[0130] As shown in FIG. 10A, DNA methylation was mapped to AR binding sites, and DNA methylation beta value distribution of normal plasma samples was plotted. As illustrated in FIG. 10B, DNA methylation was mapped to ASCL-1 binding sites, and DNA methylation beta value distribution of normal plasma samples was plotted.
[0131] As shown in FIG. 10C, estrogen receptor (ER) binding site DNA methylation was mapped and the DNA methylation beta value distribution plotted for normal plasma samples.
[0132] As shown in FIG. 11A, estrogen receptor positive (ER+) ATAC nucleosome profiling was performed with ER+ breast cancer samples and 42 normal plasma background samples. As shown in FIG. 11B, estrogen receptor negative (ER-) ATAC nucleosome profiling was performed with ER- breast cancer samples and 42 normal plasma background samples.
[0133] As illustrated in FIG. 11C, tumor fraction was plotted and compared for ER+ breast cancer and ER- breast cancer groups. As illustrated in FIG. 11D, copy number burden score was plotted and compared for ER+ breast cancer and ER- breast cancer groups.
[0134] As illustrated in FIG. 12A, AR binding site nucleosome profiling abnormality score was plotted with inferred ARBS scores derived from DNA methylation profiles analyzed using LP-WGS analysis.
[0135] As illustrated in FIG. 12B, DNA methylation and ARBS nucleosome profiling abnormality scores were mapped for 54 mCRPC samples.
[0136] As illustrated in FIG. 12C, AR binding site methylation beta values were mapped in two mCRPC subjects and normal plasma samples.
[0137] As illustrated in FIGs. 13A-13D, AR binding sites were partitioned into two groups, where the first group were hypo-methylated in prostate cancer samples, and the second group did not have significant methylation changes compared with normal plasma background. ARBS nucleosome profiling signals of the two groups were compared for the same LP-WGS samples. The nucleosome profiling of the first sample is illustrated in FIG. 13A. The nucleosome profiling of the second sample is illustrated in FIG. 13B. The nucleosome profiling of the third sample is illustrated in FIG. 13C. The nucleosome profiling of the fourth sample is illustrated in FIG. 13D.
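The partitioning of binding sites described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the function name `partition_sites`, the use of per-site mean beta values, and the fixed 0.2 threshold (standing in for a proper significance test) are all hypothetical.

```python
from statistics import mean

def partition_sites(cancer_beta, normal_beta, min_drop=0.2):
    """Split binding sites into hypo-methylated and unchanged groups.

    cancer_beta / normal_beta: dict mapping a site id to a list of
    methylation beta values (0.0-1.0) across samples.
    min_drop is a hypothetical threshold on the drop in mean beta.
    """
    hypo, unchanged = [], []
    for site, betas in cancer_beta.items():
        # Positive drop means the site is less methylated in cancer.
        drop = mean(normal_beta[site]) - mean(betas)
        (hypo if drop >= min_drop else unchanged).append(site)
    return hypo, unchanged

# Toy data (invented for illustration only).
cancer = {"site1": [0.10, 0.20], "site2": [0.80, 0.90]}
normal = {"site1": [0.80, 0.90], "site2": [0.85, 0.90]}
hypo_sites, unchanged_sites = partition_sites(cancer, normal)
```

The nucleosome profiling signal could then be computed separately for each group and compared within the same sample.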
[0138] Additionally, nucleosome profiling can be applied on genome-wide DNA methylation profiles. This could provide horizontal beta information at a fragment level. For any TFBS of interest, hypo-methylated fragments can be selected first and isolated to use for nucleosome profiling analysis, and then optionally normalized.
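As a rough sketch of the fragment-level selection just described, the example below keeps only hypo-methylated fragments, builds a coverage profile around one binding site, and optionally normalizes it. The names `coverage_profile` and `normalize`, the fragment tuple layout, the 0.3 beta cutoff, and mean-coverage normalization are assumptions made for illustration; the disclosure does not specify them.

```python
def coverage_profile(fragments, site_center, window=1000, max_beta=0.3):
    """Per-base coverage around one binding site from hypo-methylated fragments.

    fragments: iterable of (start, end, beta) with half-open coordinates
    [start, end) and a fragment-level mean beta value.
    Returns a list of length 2*window + 1 indexed by offset from site_center.
    """
    profile = [0] * (2 * window + 1)
    lo, hi = site_center - window, site_center + window
    for start, end, beta in fragments:
        if beta > max_beta:  # keep hypo-methylated fragments only
            continue
        # Count only the bases that fall inside the window.
        for pos in range(max(start, lo), min(end, hi + 1)):
            profile[pos - lo] += 1
    return profile

def normalize(profile):
    """Optional step: scale by mean coverage so samples are comparable."""
    avg = sum(profile) / len(profile)
    return [c / avg for c in profile] if avg else profile

# Toy fragments (invented): the third is too methylated and is skipped.
frags = [(95, 105, 0.1), (100, 110, 0.2), (90, 120, 0.9)]
profile = coverage_profile(frags, site_center=100, window=5)
```

In practice the profile would be aggregated over many binding sites before comparison with reference profiles; a single site is used here only to keep the example small.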
[0139] While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the present disclosure may be employed in practicing the present disclosure. It is intended that the following claims define the scope of the present disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.
Claims
1. A method for determining a disease forecast of a subject having cancer, comprising: a. generating a nucleosome profile relating to a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to a selected biomarker; b. determining a nucleosome profiling abnormality score, comprising relating the nucleosome profile of the subject sample to one or more reference nucleosome profiles; c. generating the disease forecast based at least in part on the nucleosome profiling abnormality score, wherein the disease forecast comprises one or more disease forecast characteristics.
2. The method of claim 1, wherein the one or more disease forecast characteristics comprise a cancer prognosis relating to the subject.
3. The method of claim 1 or 2, wherein the disease forecast characteristics comprise one or more of: an estimated survival time of the subject without a treatment intervention, an estimated survival time of the subject with a treatment intervention, determination of a type of the cancer, determination of a subtype of the cancer, determination of one or more clinical outcomes, or predicted treatment response of the subject to one or more treatments, or any combination thereof.
4. The method of any one of claims 1-3, wherein the biological sample comprises cell-free deoxyribonucleic acid (cfDNA) molecules.
5. The method of any one of claims 1-4, wherein the biological sample comprises one or more of: a plasma sample, a serum sample, a red blood cell sample, a urine sample, a saliva sample, pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any derivative thereof, and any combination thereof.
6. The method of any one of claims 1-5, wherein the biological sample comprises the plasma sample.
7. The method of any one of claims 1-5, wherein the biological sample comprises the urine sample.
8. The method of any one of claims 4-7, wherein the cfDNA molecules are obtained or derived from a single biological sample of the subject.
9. The method of any one of claims 4-7, wherein the cfDNA molecules are obtained or derived from different biological samples of the subject.
10. The method of any one of claims 1-9, wherein the biological sample is obtained or derived from the subject using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, or a cell-free deoxyribonucleic acid (DNA) collection tube, other blood collection tube, and CTC collection tubes.
11. The method of any one of claims 1-10, further comprising assaying the biological sample to generate the nucleosome profile.
12. The method of claim 11, wherein assaying the biological sample comprises subjecting said biological sample to conditions that are sufficient to isolate, enrich, or extract the cfDNA molecules.
13. The method of claim 11 or 12, further comprising fractionating a whole blood sample of the subject to obtain the cfDNA molecules.
14. The method of any one of claims 11-13, wherein assaying the biological sample further comprises assaying the cfDNA molecules using nucleic acid sequencing to produce nucleic acid sequencing reads.
15. The method of claim 14, wherein the nucleic acid sequencing further comprises DNA sequencing.
16. The method of claim 15, wherein the DNA sequencing comprises one or more of: next-generation sequencing, whole genome sequencing, low-pass sequencing, targeted sequencing, whole exome sequencing, methylation-aware sequencing, or bisulfite sequencing, or a combination thereof.
17. The method of claim 15, wherein the DNA sequencing comprises low-pass whole genome sequencing.
18. The method of claim 15, wherein the DNA sequencing comprises whole exome sequencing.
19. The method of claim 15, wherein the DNA sequencing further comprises nucleic acid amplification.
20. The method of claim 19, wherein the nucleic acid amplification comprises polymerase chain reaction (PCR) or isothermal amplification.
21. The method of any one of claims 11-20, wherein at least one of the cfDNA molecules is assayed using a polymerase chain reaction (PCR) assay, microarray, or an isothermal amplification.
22. The method of any one of claims 2-21, wherein the type of the cancer of the subject comprises one or more of: lung cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
23. The method of any one of claims 2-21, wherein the type of the cancer of the subject comprises prostate cancer.
24. The method of claim 23, wherein the subtype of the prostate cancer comprises one or more of: hormone sensitive prostate cancer (HSPC), castration-resistant prostate cancer (CRPC), androgen receptor-dependent prostate cancer (ARPC), metastatic prostate cancer, metastatic castration-resistant prostate cancer (mCRPC), neuroendocrine prostate cancer (NEPC), or any combination thereof.
25. The method of any one of claims 2-21, wherein the type of cancer comprises breast cancer.
26. The method of claim 25, wherein the subtype of the breast cancer comprises estrogen receptor-positive (ER+) breast cancer or estrogen receptor-negative (ER-) breast cancer.
27. The method of any one of claims 1-26, wherein the subject is asymptomatic for the cancer.
28. The method of any one of claims 1-27, wherein the selected biomarker comprises a selected binding site.
29. The method of any one of claims 1-28, wherein the selected biomarker comprises a transcription factor binding site.
30. The method of claim 28 or 29, wherein the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
31. The method of any one of claims 4-30, wherein the nucleosome profile is generated based at least in part on distance of one or more selected subsets of the cfDNA molecules to the selected biomarker.
32. The method of claim 31, wherein the distance comprises the number of base pairs between each of the one or more selected subsets of the cfDNA molecules and the selected biomarker.
33. The method of claim 31 or 32, wherein the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules.
34. The method of claim 33, wherein the nucleosome profile is further generated based at least in part on the coverage of the one or more selected subsets of the cfDNA molecules associated with the distance of the one or more selected subsets of the cfDNA molecules from the selected biomarker.
35. The method of claim 33 or 34, wherein determining the nucleosome profiling abnormality score further comprises comparing the coverage of the one or more selected subsets of the cfDNA molecules to one or more coverage values of the one or more reference nucleosome profiles.
36. The method of claim 35, wherein determining the nucleosome profiling abnormality score further comprises determining a Z-score of the coverage of the one or more selected subsets of the cfDNA molecules.
37. The method of claim 36, wherein determining the nucleosome profiling abnormality score further comprises mapping the coverage Z-score of the one or more selected subsets of the cfDNA molecules to a coverage Z-score of each of the one or more reference nucleosome profiles.
38. The method of any one of claims 31-37, wherein the one or more selected subsets of the cfDNA molecules comprise fragments of the cfDNA molecules.
39. The method of any one of claims 1-38, wherein generating the disease forecast further comprises mapping the nucleosome profiling abnormality score to the one or more disease forecast characteristics.
40. The method of any one of claims 1-39, wherein the one or more reference nucleosome profiles are associated with one or more disease forecast characteristics.
41. The method of claim 40, wherein the one or more reference nucleosome profiles are associated with a likelihood of occurrence of one or more disease forecast characteristics.
42. The method of any one of claims 1-41, wherein one or more of the reference nucleosome profiles comprise data from a sample not having the disease (normal).
43. The method of any one of claims 1-42, further comprising mapping DNA methylation patterns of the biological sample.
44. The method of claim 43, further comprising associating the mapped DNA methylation patterns with the nucleosome profile data of the biological sample.
45. The method of any one of claims 31-44, wherein the one or more selected subsets of the cfDNA molecules are selected based at least in part on DNA methylation data mapped to the cfDNA molecules.
46. The method of any one of claims 1-45, further comprising generating the disease forecast based at least in part on copy number variation data, or sequencing mutation data, or both, associated with the sample.
47. A method for determining a subtype of a cancer of a subject, comprising:
a. generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject, wherein the nucleosome profile is generated relative to one or more selected biomarkers; b. mapping one or more characteristics of the nucleosome profile of the biological sample to one or more corresponding characteristics of one or more reference nucleosome profiles to generate a selected subset of reference nucleosome profiles; c. obtaining data relating to the cancer subtypes associated with each reference nucleosome profile of the selected subset of reference nucleosome profiles; and d. determining the subtype of the cancer of the subject based at least in part on the cancer subtypes associated with each of the reference nucleosome profiles.
48. The method of claim 47, further comprising determining a tumor fraction of the cfDNA molecules.
49. The method of claim 48, further comprising associating the tumor fraction with a nucleosome profiling abnormality score.
50. The method of claim 49, wherein the nucleosome profiling abnormality score is associated with the one or more selected biomarkers.
51. The method of any one of claims 47-50, wherein the one or more selected biomarkers comprise a binding site associated with a type of the cancer.
52. The method of any one of claims 47-51, wherein the selected biomarker comprises a transcription factor binding site associated with a type of the cancer.
53. The method of claim 51 or 52, wherein the selected biomarker comprises one or more of: androgen receptor binding sites (ARBS), Achaete-Scute Family BHLH Transcription Factor 1 (ASCL-1) binding sites, estrogen receptor (ER) binding sites, or ErbB receptor binding sites, or any combination thereof.
54. The method of any one of claims 47-53, wherein at least a portion of the reference nucleosome profiles comprise non-disease profiles (normal).
55. The method of any one of claims 47-53, wherein at least a portion of the reference nucleosome profiles comprise a cancer condition having a cancer type matching the type of the cancer of the subject.
56. A method for estimating a response of a subject having cancer to one or more treatments, comprising:
a. generating a nucleosome profile relating to cfDNA molecules derived from a biological sample obtained or derived from the subject; b. determining one or more characteristics of the cancer of the subject based at least in part on the nucleosome profile of the subject; and c. generating a treatment response determination for the subject based at least in part on the one or more characteristics of the cancer of the subject.
57. The method of claim 56, wherein the treatment response determination comprises one or more of: a treatment plan, a value representing likelihood of one or more treatment responses, a binary treatment response indicator, a probability value for each treatment response, or any combination thereof.
58. The method of claim 56 or 57, wherein the one or more characteristics of the cancer of the subject comprises a cancer type, a cancer subtype, an estimate of cancer progression, a prognosis of the subject without treatment intervention, a prognosis of the subject with treatment intervention, or any combination thereof.
59. A system for determining a disease forecast of a subject having cancer, the system comprising: a. a memory; and b. one or more processors configured to execute machine-readable instructions which, when executed, cause the one or more processors to perform a method comprising: i. generating a nucleosome profile of a biological sample obtained or derived from the subject, ii. mapping the nucleosome profile to one or more reference nucleosome profiles, iii. determining one or more disease forecast characteristics of the biological sample based at least in part on the mapping, and iv. generating the disease forecast based at least in part on the one or more disease forecast characteristics.
60. The system of claim 59, wherein the one or more disease forecast characteristics comprise one or more of: a type of the cancer, a subtype of the cancer, a prognosis of the subject, an estimated survival time of the subject without treatment intervention, an estimated survival time of the subject with treatment intervention, an estimation of the subject’s response to one or more treatments, or any combination thereof.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463624537P | 2024-01-24 | 2024-01-24 | |
| US63/624,537 | 2024-01-24 | | |
| US202463631139P | 2024-04-08 | 2024-04-08 | |
| US63/631,139 | 2024-04-08 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025160265A1 (en) | 2025-07-31 |
Family
ID=96545721
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2025/012735 Pending WO2025160265A1 (en) | 2024-01-24 | 2025-01-23 | Systems and methods of fragmentomics analysis in cancer |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025160265A1 (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190287645A1 (en) * | 2016-07-06 | 2019-09-19 | Guardant Health, Inc. | Methods for fragmentome profiling of cell-free nucleic acids |
| US20200131571A1 (en) * | 2018-05-18 | 2020-04-30 | The Johns Hopkins University | Cell-free dna for assessing and/or treating cancer |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25745699; Country of ref document: EP; Kind code of ref document: A1 |