WO2025217055A1 - Systems and methods for detecting a disease using methylation profiling and tissue identification

Systems and methods for detecting a disease using methylation profiling and tissue identification

Info

Publication number
WO2025217055A1
Authority
WO
WIPO (PCT)
Prior art keywords
methylation
cancer
disease state
sample
individual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/023477
Other languages
English (en)
Inventor
Pan DU
Giancarlo BONORA
Shidong JIA
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Predicine Inc
Original Assignee
Predicine Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Predicine Inc
Publication of WO2025217055A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00: ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20: Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20: Supervised data analysis
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof

Definitions

  • Genetic differences of an individual can indicate the presence or absence of a disease in that individual, and can help diagnose a disease. There can be many types or subtypes of disease, such as many types or subtypes of cancer. Differences in genetic information can be utilized in identifying a type or subtype of a disease, such as cancer. These differences in genetic information can include epigenetic information, such as methylation of genetic samples. Identifying and determining these genetic differences can be important in determining a type or subtype of a disease that may be easily misdiagnosed.
  • a method for determining a disease state of an individual, the method comprising: (a) receiving sequencing data for one or more cell-free nucleic acid fragments, wherein the one or more cell-free nucleic acid fragments are obtained or derived from a biological sample of the individual; (b) determining a methylation profile for the individual comprising one or more methylation features of the plurality of cell-free nucleic acid fragments; (c) identifying one or more abnormal patterns of the one or more methylation features of the methylation profile of each cell-free nucleic acid fragment of the plurality of cell-free nucleic acid fragments (i) as compared to one or more reference methylation profiles or (ii) using a trained machine learning model; and (d) generating an indication of the disease state of the individual based at least in part on the identified one or more abnormal patterns, wherein the indication is generated by the trained machine learning model, and wherein the disease state comprises a malignancy and a tissue of origin.
  • the method further comprises identifying in (c) one or more abnormal DNA fragments of the plurality of cell-free nucleic acid fragments based at least in part on the identified one or more abnormal patterns of the one or more methylation features of the methylation profile.
  • the method further comprises identifying the abnormal DNA fragments by comparing the methylation profile of the one or more abnormal DNA fragments with the one or more reference methylation profiles.
  • the method further comprises quantifying an amount of identified abnormal DNA fragments. In some embodiments, the method further comprises quantifying one or more cancer signals based at least in part on the quantified amount of identified abnormal DNA fragments. In some cases, the one or more cancer signals comprise tumor fractions.
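The quantification step above can be illustrated with a minimal sketch. It assumes each fragment has already been flagged as abnormal or not by some upstream comparison, and it reports the share of abnormal fragments as a crude signal. The function name and the example flags are hypothetical, and the source does not state how a tumor fraction is actually derived from the count.

```python
from typing import Iterable


def abnormal_fragment_fraction(is_abnormal: Iterable[bool]) -> float:
    """Return the share of fragments flagged as abnormally methylated.

    Assumption: flagging was done upstream; this only counts the flags.
    """
    flags = list(is_abnormal)
    if not flags:
        raise ValueError("no fragments supplied")
    return sum(flags) / len(flags)


# Hypothetical flags for eight fragments (illustrative only, not source data).
flags = [True, False, False, True, False, False, False, False]
fraction = abnormal_fragment_fraction(flags)
print(f"abnormal fragments: {sum(flags)} of {len(flags)} ({fraction:.2%})")
```

A real pipeline would likely normalize by sequencing depth or panel size as well; that choice is left open here.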
  • the method further comprises determining tissue of origin information of the one or more cancer signals. In some embodiments, the method further comprises predicting the disease state of the individual based at least in part on the tissue of origin information and the quantified one or more cancer signals.
  • the biological sample of the individual is selected from the group consisting of: a DNA sample, an RNA sample, a plasma sample, a serum sample, a buffy coat sample, a peripheral blood mononuclear cell (PBMC) sample, a red blood cell sample, a urine sample, a urine cell pellet sample, a saliva sample, a tissue biopsy, a pleural fluid sample, a peritoneal fluid sample, an amniotic fluid sample, a cerebrospinal fluid sample, a lymphatic fluid sample, a sweat sample, a tear sample, a semen sample, or any derivative thereof, and any combination thereof.
  • the biological sample comprises a urine sample.
  • the urine sample comprises a cell-free urine sample.
  • the DNA sample comprises cell-free DNA (cfDNA).
  • the cfDNA comprises urinary cfDNA (ucfDNA).
  • the method further comprises obtaining or deriving the biological sample of the individual prior to the individual undergoing one or more transurethral resection procedures. In some cases, (a) further comprises performing DNA extraction on the biological sample of the subject. In some cases, (a) further comprises constructing a library comprising the received sequencing data and epigenetic data for the one or more nucleotide fragments. In some cases, the epigenetic data comprises the one or more methylation features.
  • the one or more methylation features comprise one or more of: methylation pattern data, tissue-of-origin deconvolution data, or fragment-level beta values, or any combination thereof.
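One of the listed features, the fragment-level beta value, is commonly understood as the proportion of methylated CpG sites among the CpG sites covered by a single fragment. The sketch below assumes that definition and a hypothetical per-fragment encoding ('M' for a methylated CpG, 'U' for an unmethylated one); the source does not specify either.

```python
def fragment_beta(cpg_calls: str) -> float:
    """Beta value for one fragment: methylated CpGs / all called CpGs.

    cpg_calls is a hypothetical encoding: 'M' = methylated, 'U' = unmethylated.
    """
    methylated = cpg_calls.count("M")
    total = methylated + cpg_calls.count("U")
    if total == 0:
        raise ValueError("fragment covers no called CpG sites")
    return methylated / total


# Hypothetical fragments (illustrative only).
fragments = {"frag_1": "MMMU", "frag_2": "UUUU", "frag_3": "MM"}
betas = {name: fragment_beta(calls) for name, calls in fragments.items()}
print(betas)
```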
  • the trained machine learning model is trained using training data comprising histopathologic and cytopathologic data.
  • the method further comprises training the trained machine learning model to determine patterns of methylation quantities in one or more nucleotide fragments of control sample training data.
  • the control sample training data comprises methylation feature data and sequencing data of one or more nucleotide fragments of healthy individuals. In some cases, the healthy individuals do not have the disease state.
  • control sample training data comprises methylation feature data and sequencing data of one or more nucleotide fragments of individuals having the disease state.
  • the individuals having the disease state are confirmed to have the disease state.
  • the disease state comprises having a cancer.
  • the cancer comprises one or more of: carcinomas, breast cancer, lung cancer, prostate cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
  • the cancer comprises urethral carcinoma.
  • the cancer comprises bladder cancer.
  • the bladder cancer comprises non-muscle invasive bladder cancer (NMIBC).
  • the method further comprises identifying, using the trained machine learning model, a tissue type of origin of the one or more nucleotide fragments.
  • identifying the tissue type of origin comprises performing tissue-of-origin deconvolution using the trained machine learning model.
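The source performs tissue-of-origin deconvolution with a trained machine learning model and gives no further detail. As a stand-in that shows the general idea, the sketch below treats the sample's regional methylation as a non-negative mixture of reference tissue profiles and solves for the mixing weights with multiplicative updates for non-negative least squares. This is a swapped-in technique, not the method of the source, and the reference values and tissue labels are hypothetical.

```python
def deconvolve(reference, observed, iterations=500):
    """Estimate tissue proportions by non-negative least squares.

    reference[i][j]: methylation level of region i in tissue j (hypothetical atlas).
    observed[i]:     methylation level of region i in the mixed sample.
    Returns non-negative weights normalised to sum to 1.
    """
    n_regions, n_tissues = len(reference), len(reference[0])
    weights = [1.0 / n_tissues] * n_tissues
    # Numerator of the update, A^T b, is constant across iterations.
    atb = [sum(reference[i][j] * observed[i] for i in range(n_regions))
           for j in range(n_tissues)]
    for _ in range(iterations):
        fitted = [sum(reference[i][j] * weights[j] for j in range(n_tissues))
                  for i in range(n_regions)]
        # Denominator of the update, A^T A w.
        ata_w = [sum(reference[i][j] * fitted[i] for i in range(n_regions))
                 for j in range(n_tissues)]
        weights = [w * num / den if den > 0 else 0.0
                   for w, num, den in zip(weights, atb, ata_w)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-tissue reference over three regions: columns = (bladder, blood).
reference = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]
# Sample constructed as 30% bladder + 70% blood, so the answer is known.
observed = [0.34, 0.59, 0.50]
proportions = deconvolve(reference, observed)
print([round(p, 3) for p in proportions])
```

Multiplicative updates keep every weight non-negative without an explicit projection step, which is why they are used here instead of ordinary least squares.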
  • the method further comprises comparing one or more methylation features of one or more nucleotide fragments of the determined tissue type of origin to one or more reference nucleotide fragments of the determined tissue type of origin.
  • the reference methylation profile data of (c) comprises one or more methylation features of one or more nucleotide fragments of (i) healthy control individuals, (ii) individuals having a benign cancer, or (iii) individuals having a malignant cancer, or any combination of (i)-(iii).
  • (c) further comprises comparing one or more methylation features of one or more localized regions of the one or more nucleotide fragments of the individual to one or more corresponding methylation features of one or more corresponding localized regions of the reference methylation profile data.
  • (d) further comprises determining whether the disease state is a cancer disease state or a non-cancer disease state. In some cases, (d) further comprises determining whether the disease state is a benign disease state or a malignant disease state. In some cases, (d) further comprises mapping, using the trained machine learning model, a pattern of quantified methylation amounts at one or more localized areas of the one or more nucleotide fragments.
  • the method further comprises determining the disease state based at least in part on the mapped pattern of methylation. In some embodiments, the method further comprises generating the indication of the disease state based at least in part on comparing one or more quantified methylation values of the mapped pattern of methylation to a threshold methylation value.
  • the threshold methylation value is a dynamic value. In some cases, the dynamic threshold methylation value is dynamically generated, using the trained machine learning model, based at least in part on training data, or feedback data, or both.
  • the method further comprises dynamically generating, using the trained machine learning model, the dynamic threshold methylation value for each of the one or more localized areas.
  • the method further comprises determining one or more disease states comprising (i) a non-cancer disease state, (ii) a cancer disease state, (iii) a benign cancer disease state, or (iv) a malignant cancer disease state based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
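The comparison of mapped methylation values against per-area thresholds can be sketched as follows. In the source the thresholds are generated dynamically by the trained model; here they are fixed hypothetical numbers, and flagging values strictly above the threshold (hypermethylation) is an assumption made only for illustration.

```python
def flag_regions(region_methylation, region_thresholds):
    """Flag each localized area whose methylation value exceeds its own threshold.

    Both arguments map a region name to a value in [0, 1]. Thresholds are
    hypothetical stand-ins for the model-generated dynamic thresholds.
    """
    missing = set(region_methylation) - set(region_thresholds)
    if missing:
        raise KeyError(f"no threshold for regions: {sorted(missing)}")
    return {region: value > region_thresholds[region]
            for region, value in region_methylation.items()}


# Hypothetical quantified methylation per localized area and per-area thresholds.
sample = {"region_A": 0.82, "region_B": 0.10, "region_C": 0.55}
thresholds = {"region_A": 0.60, "region_B": 0.30, "region_C": 0.55}
flags = flag_regions(sample, thresholds)
n_flagged = sum(flags.values())
print(flags, n_flagged)
```

How the flagged areas are then combined into one of the four disease states is not specified in the source and is not modelled here.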
  • the sequencing data comprises next-generation sequencing (NGS) data.
  • the indication of the disease state of the individual is generated with an accuracy of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual is generated with a sensitivity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, or more than about 90%.
  • the indication of the disease state of the individual is generated with a specificity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual is generated with an accuracy of 89%. In some cases, the indication of the disease state of the individual is generated with a sensitivity of 92%. In some cases, the indication of the disease state of the individual is generated with a specificity of 86%.
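For reference, accuracy, sensitivity, and specificity follow from a binary confusion matrix as shown in the sketch below. The counts are hypothetical and were chosen only so that the arithmetic reproduces the 89%, 92%, and 86% figures quoted above; they are not taken from the source.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    return {
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "accuracy": (tp + tn) / (tp + fn + fp + tn),  # overall agreement
    }


# Hypothetical cohort: 50 diseased and 50 non-diseased individuals.
metrics = diagnostic_metrics(tp=46, fn=4, fp=7, tn=43)
print({name: round(value, 2) for name, value in metrics.items()})
```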
  • a system comprising one or more computer processors and computer memory coupled thereto, the computer memory comprising machine executable code that, upon execution by the one or more computer processors, implements a method for determining a disease state of an individual, said method comprising: (a) receiving sequencing data for one or more nucleotide fragments, wherein the plurality of nucleotide fragments are obtained or derived from a biological sample of the individual; (b) determining a methylation profile for the individual comprising one or more methylation features of the one or more nucleotide fragments; (c) identifying one or more abnormal patterns of the one or more methylation features of the methylation profile as compared to reference methylation profile data; and (d) generating, using a trained machine learning model, an indication of the disease state of the individual based at least in part on the identified one or more abnormal patterns.
  • a system for determining a disease state of an individual comprising one or more computer processors, the one or more computer processors comprising: (a) a first machine learning model configured to determine a tissue type of origin for one or more nucleotide fragments obtained or derived from a biological sample of an individual, wherein the first machine learning model is configured to determine the tissue type of origin based on performing tissue-of-origin deconvolution on sequencing data of the one or more nucleotide fragments; (b) a second machine learning model configured to generate the determination of the disease state of the individual based at least in part on (i) the sequencing data of the one or more nucleotide fragments and (ii) methylation profile pattern data of the one or more nucleotide fragments.
  • a system for determining a disease state of an individual comprising one or more computer processors, the one or more computer processors comprising: (a) an interface configured to receive sequencing data for one or more cell-free nucleic acid fragments, wherein the one or more cell-free nucleic acid fragments are obtained from a biological sample of the individual or derived from a biological sample of the individual; (b) a processor configured to determine a methylation profile for the biological sample of the individual, which methylation profile comprises one or more methylation features of the plurality of cell-free nucleic acid fragments; (c) a machine learning model trained to: (i) determine one or more abnormal patterns of the one or more methylation features of the methylation profile as compared to one or more reference methylation profiles, and (ii) generate an indication of the disease state of the individual based at least in part on the one or more abnormal patterns identified in (c), wherein the disease state comprises a malignancy of a tissue of the
  • Another aspect of the present disclosure provides a non-transitory computer readable medium comprising machine executable code that, upon execution by one or more computer processors, implements any of the methods above or elsewhere herein.
  • Another aspect of the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto.
  • the computer memory comprises machine executable code that, upon execution by the one or more computer processors, implements any of the methods above or elsewhere herein.
  • FIG. 1 illustrates an exemplary workflow for deoxyribonucleic acid (DNA) extraction, preparation, assaying, and methylation analysis.
  • FIG. 2A shows exemplary data relating to quantification of differentially methylated DNA fragments.
  • FIG. 2B illustrates exemplary data analysis relating to DNA methylation abnormality compared to estimated tumor fraction.
  • FIG. 3A shows exemplary data relating to bladder tissue of origin proportion in cancer vs non-cancer samples.
  • FIG. 3B illustrates exemplary data relating to fragment counts of differentially methylated DNA compared to bladder tissue of origin proportion.
  • FIG. 4 shows an exemplary workflow for a liquid biopsy-based methylation assay method.
  • FIG. 5A illustrates exemplary hierarchical clustering data of abnormally methylated fragments on a variety of genome locations.
  • FIG. 5B shows exemplary data of two different profiles for tissue of origin (TOO) deconvolution percentage values for normalized abnormally methylated fragments.
  • FIG. 6A illustrates exemplary data of the sensitivity, specificity, and AUC of a model disclosed herein.
  • FIG. 6B shows exemplary data relating to feature importance evaluations of various weights for abnormally methylated fragments and normal values.
  • FIG. 7 illustrates an example of a computing device; in this case, a device with one or more processors, memory, storage, and a network interface.
  • each of the expressions “at least one of A, B and C,” “at least one of A, B, or C,” “one or more of A, B, and C”, “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone,
  • the phrase “at most three” can mean less than one, one, two, or three.
  • the terms “subject,” “individual,” and “patient” may be used interchangeably and refer to humans, as well as non-human mammals (e.g., non-human primates, canines, equines, felines, porcines, bovines, ungulates, lagomorphs, rodents, and the like).
  • the subject can be a human (e.g., adult male, adult female, adolescent male, adolescent female, male child, female child) under the care of a physician or other health worker in a hospital, as an outpatient, or other clinical context.
  • the subject may not be under the care or prescription of a physician or other health worker.
  • the subject may be under the care of a dental professional.
  • treatment refers to an approach for obtaining beneficial or desired results with respect to a disease, disorder, or medical condition including, but not limited to, a therapeutic benefit and/or a prophylactic benefit.
  • treatment or treating involves administering a therapeutic to a subject.
  • a therapeutic benefit may include the eradication or amelioration of the underlying disorder being treated.
  • a therapeutic benefit may be achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder, such as observing an improvement in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder.
  • the present disclosure provides a method for determining a disease state of an individual.
  • the method can comprise receiving sequencing data for one or more cell-free nucleic acid fragments.
  • the one or more cell-free nucleic acid fragments can be obtained or derived from a biological sample of the individual.
  • the biological sample of the individual can be selected from the group consisting of: a deoxyribonucleic acid (DNA) sample, a ribonucleic acid (RNA) sample, a plasma sample, a serum sample, a buffy coat sample, a peripheral blood mononuclear cell (PBMC) sample, a red blood cell sample, a urine sample, a urine cell pellet sample, a saliva sample, a tissue biopsy, a pleural fluid sample, a peritoneal fluid sample, an amniotic fluid sample, a cerebrospinal fluid sample, a lymphatic fluid sample, a sweat sample, a tear sample, a semen sample, or any derivative thereof, and any combination thereof.
  • the biological sample can comprise a urine sample.
  • the urine sample can comprise a cell-free urine sample.
  • the DNA sample can comprise cell-free DNA (cfDNA).
  • the cfDNA can comprise urinary cfDNA (ucfDNA).
  • the biological sample may comprise one or more nucleic acids.
  • the biological sample may be a cell-free deoxyribonucleic acid (cfDNA) sample or a cell-free ribonucleic acid (cfRNA) sample.
  • the biological sample may comprise genomic DNA or germline DNA (gDNA).
  • the nucleic acid may be a DNA (e.g. double-stranded DNA, single-stranded DNA, single-stranded DNA hairpins, copy DNA (cDNA), genomic DNA, germline DNA, circulating tumor DNA (ctDNA), cell-free DNA (cfDNA), an RNA (e.g.
  • the biological sample may be derived from or contain a biological fluid.
  • the biological sample may be a plasma sample, a serum sample, a buffy coat sample, a peripheral blood mononuclear cell (PBMC) sample, a red blood cell sample, a urine sample, a saliva sample, or other body fluid sample.
  • the biological sample may comprise or be a pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any combination of biological fluid.
  • the samples may comprise RNA and DNA.
  • a sample may comprise cfDNA and cfRNA.
  • the method can further comprise obtaining or deriving the biological sample of the individual prior to the individual undergoing one or more transurethral resection procedures.
  • the biological sample may be collected, obtained, or derived from said subject using a collection tube.
  • the collection tube may be an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, a cell-free deoxyribonucleic acid (DNA) collection tube, a circulating tumor cell (CTC) collection tube, or other blood collection tube.
  • the collection tube may comprise additional reagents for stabilizing the nucleic acid molecules or blood cells.
  • the collection tube may keep the nucleic acid or blood cells stable so as to minimize degradation of the biological sample prior to assaying.
  • the additional reagents may comprise buffer salts or chelators.
  • the biological sample may be obtained or derived from a subject at various times.
  • the biological sample may be obtained or derived from a subject prior to the subject receiving a therapy for cancer.
  • the biological sample may be obtained or derived from a subject during receiving a therapy for cancer.
  • the biological sample may be obtained or derived from a subject after receiving a therapy for cancer.
  • the biological sample may be collected over 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more time points.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more hour period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more day period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more week period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more month period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more year period.
  • receiving sequencing data can further comprise performing DNA extraction on the biological sample of the subject.
  • the biological samples may be subjected to additional reactions or conditions prior to assaying.
  • the biological sample may be subjected to conditions that are sufficient to isolate, enrich, or extract nucleic acids, such as cfDNA molecules or cfRNA molecules.
  • the methods disclosed herein may comprise conducting one or more enrichment reactions on one or more nucleic acid molecules in a sample.
  • the enrichment reactions may comprise contacting a sample with one or more beads or bead sets.
  • the enrichment reactions may comprise one or more hybridization reactions.
  • the enrichment reactions may comprise contacting a sample with one or more probes (e.g., capture probes) or bait molecules that hybridize to a nucleic acid molecule of the biological sample.
  • the enrichment reaction may comprise differential amplification of a set of nucleic acid molecules.
  • the enrichment reaction may enrich for a plurality of genetic loci or sequences corresponding to genetic loci.
  • the enrichment reactions may comprise the use of primers or probes that may have complementarity to sequences (or sequences upstream or downstream) of a sequence that is to be enriched.
  • a capture probe may comprise sequence complementarity to a set of genomic loci and allow the enrichment of the genomic loci.
  • the enrichment reactions may comprise a plurality of probes or primers.
  • a plurality of probes may comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, or 180 different probes.
  • the probes can be biotinylated probes.
  • the probes can be attached to a bead or other solid support.
  • the probes can be attached to a bead or other solid support via a non-covalent (e.g., biotin-streptavidin interaction) or a covalent interaction.
  • the solid support can be a magnetic solid support.
  • the methods disclosed herein may comprise conducting one or more isolation or purification reactions on one or more nucleic acid molecules in a sample.
  • the isolation or purification reactions may comprise contacting a sample with one or more beads or bead sets.
  • the isolation or purification reaction may comprise one or more hybridization reactions, enrichment reactions, amplification reactions, sequencing reactions, or a combination thereof.
  • the isolation or purification reaction may comprise the use of one or more separators.
  • the one or more separators may comprise a magnetic separator.
  • the isolation or purification reaction may comprise separating bead bound nucleic acid molecules from bead free nucleic acid molecules.
  • the isolation or purification reaction may comprise separating capture probe hybridized nucleic acid molecules from capture probe free nucleic acid molecules.
  • the isolation reactions may comprise removing or separating a group of nucleic acid molecules from another group of nucleic acids.
  • the methods disclosed herein may comprise conducting extraction reactions on one or more nucleic acids in a biological sample.
  • the extraction reactions may lyse cells or disrupt nucleic acid interactions with the cell such that the nucleic acids may be isolated, purified, enriched or subjected to other reactions.
  • the methods disclosed herein may comprise amplification or extension reactions.
  • the amplification reactions may comprise polymerase chain reaction.
  • the amplification reaction may comprise polymerase chain reaction (PCR)-based amplifications, non-PCR based amplifications, or a combination thereof.
  • the one or more PCR-based amplifications may comprise PCR, quantitative PCR (qPCR), nested PCR, linear amplification, or a combination thereof.
  • the one or more non-PCR based amplifications may comprise multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, circle-to-circle amplification or a combination thereof.
  • the amplification reactions may comprise an isothermal amplification.
  • the sequencing data received may originate from one or more sequencing reactions of the genomic data of the biological sample.
  • the sequencing reactions may comprise whole genome sequencing, whole exome sequencing, low-pass whole genome sequencing, targeted sequencing, methylation-aware sequencing, enzymatic methylation sequencing, or bisulfite methylation sequencing.
  • the sequencing reaction may be a transcriptome sequencing, messenger RNA sequencing (mRNA-seq), total RNA sequencing (totalRNA-seq), small RNA sequencing (smallRNA-seq), exosome sequencing, or combinations thereof. Combinations of sequencing reactions may be used in the methods described elsewhere herein.
  • a biological sample may be subjected to whole genome sequencing and whole transcriptome sequencing.
  • where the biological samples comprise multiple types of nucleic acids (e.g. RNA and DNA), sequencing reactions specific to DNA or RNA may be used so as to obtain sequence reads relating to each nucleic acid type.
  • the sequencing data can comprise next-generation sequencing (NGS) data.
  • the sequencing reactions can be performed at various sequencing depths.
  • the sequencing depths of a sequencing reaction may be selected or modulated.
  • the sequencing reactions may comprise sequencing a region at a depth of at least 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, 20x, 25x, 30x, 35x, 40x, 45x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or more.
  • the sequencing reactions may comprise sequencing a region at a depth of no more than 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, 20x, 25x, 30x, 35x, 40x, 45x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or less.
  • a low pass whole genome sequencing can be used to sequence nucleic acids.
  • the low pass whole genome sequencing may be performed at an average sequencing depth of at least 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, or more.
  • the low pass whole genome sequencing may be performed at an average sequencing depth of no more than 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, or less.
  • the low pass whole genome sequencing may be performed at an average depth of between 1x and 2x.
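Average sequencing depth is conventionally estimated as total sequenced bases divided by the size of the targeted region. The sketch below applies that arithmetic to check whether a run falls in the 1x to 2x low-pass range mentioned above. The read count, read length, and the round 3,000,000,000 bp genome size are hypothetical inputs, not values from the source.

```python
def mean_depth(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome (or target) size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return num_reads * read_length_bp / genome_size_bp


# Hypothetical run: 30 million reads of 150 bp against a ~3 Gbp genome.
depth = mean_depth(num_reads=30_000_000, read_length_bp=150,
                   genome_size_bp=3_000_000_000)
is_low_pass = 1.0 <= depth <= 2.0
print(f"mean depth: {depth:.1f}x, within 1x-2x low-pass range: {is_low_pass}")
```

The same function applies to targeted or exome sequencing if the target size is passed in place of the genome size.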
  • a sequencing reaction may be performed using a set of personalized or customized probes.
  • the sequencing reaction using a set of personalized or customized probes may be a deep sequencing reaction or ultra-deep sequencing reaction.
  • the sequencing reaction using a set of personalized or customized probes may be performed at a sequencing depth of 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or more.
  • whole exome sequencing can be used to sequence nucleic acids of a subject.
  • the whole exome sequencing may be performed at a non-uniform depth. For example, certain areas of the exome may be boosted or otherwise sequenced at a greater depth than other regions, or at a greater depth than the average depth of the whole exome sequencing.
  • genes or regions that are of more interest may be analyzed with higher sensitivity, accuracy, and/or precision.
  • Genes or regions associated with or related to cancer can be sequenced at a greater depth. For example, at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or more genes can be sequenced at a higher depth than the rest of the exome (e.g. average depth of the whole exome sequencing).
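The depth figures above can be made concrete with a small calculation. The following is a minimal sketch, assuming that average depth over a region is the total number of sequenced bases covering it divided by the region length; the function names, read lengths, and region size are hypothetical and are not taken from the disclosure.

```python
def mean_depth(read_lengths, region_length):
    """Average depth = total sequenced bases over the region / region length."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return sum(read_lengths) / region_length


def depth_in_range(depth, at_least=None, no_more_than=None):
    """Check a depth against optional lower and upper bounds (fold coverage)."""
    if at_least is not None and depth < at_least:
        return False
    if no_more_than is not None and depth > no_more_than:
        return False
    return True


# Hypothetical example: 20 reads of 150 bases covering a 2,000-base region.
depth = mean_depth([150] * 20, 2000)

# Low-pass range of between 1x and 2x, as mentioned in the text.
is_low_pass = depth_in_range(depth, at_least=1, no_more_than=2)
```

The same bounds check could be applied per region to confirm that boosted regions of an exome reach a greater depth than the exome-wide average.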
  • the sequencing of nucleic acids may generate sequencing read data.
  • the sequencing reads may be processed to generate data of improved quality.
  • the sequencing reads may be generated with a quality score.
  • the quality score may indicate an accuracy of a sequence read or a level of signal above a noise threshold for a given base call.
  • the quality scores may be used for filtering sequencing reads. For example, sequencing reads may be removed that do not meet a particular quality score threshold.
  • the sequencing reads may be processed to generate a consensus sequence or consensus base call.
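Quality-score filtering of the kind described above can be sketched as follows. This is an illustration only: it assumes per-base Phred scores stored as ASCII characters with an offset of 33 (a common FASTQ convention, not stated in the disclosure) and a hypothetical mean-quality cutoff of 20.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into integer Phred scores (assumed Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def filter_reads(reads, min_mean_quality=20):
    """Keep (sequence, quality_string) pairs whose mean Phred score meets the cutoff."""
    kept = []
    for sequence, quality_string in reads:
        scores = phred_scores(quality_string)
        if scores and sum(scores) / len(scores) >= min_mean_quality:
            kept.append((sequence, quality_string))
    return kept


# Hypothetical reads: 'I' decodes to 40, '!' to 0, and '5' to 20.
reads = [("ACGT", "IIII"), ("ACGT", "!!!!"), ("ACGT", "5555")]
kept = filter_reads(reads)
```

Here the second read is removed because its mean quality falls below the cutoff; the threshold itself would be chosen per assay.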
  • receiving sequencing data can further comprise constructing a library comprising the received sequencing data and epigenetic data for the one or more nucleic acid fragments.
  • the epigenetic data can comprise the one or more methylation features.
  • the method can further comprise determining a methylation profile for the individual.
  • the methylation profile can comprise one or more methylation features of the plurality of cell-free nucleic acid fragments.
  • the one or more methylation features can comprise one or more of: methylation pattern data, tissue-of-origin deconvolution data, or fragment-level beta values, or any combination thereof.
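The disclosure does not define how a fragment-level beta value is computed. As a minimal sketch, assuming the common convention that beta is the fraction of methylated CpG sites observed on a single fragment, it could look like the following; the function name and the input layout are hypothetical.

```python
def fragment_beta(cpg_calls):
    """Fraction of methylated CpGs on one fragment; None if the fragment has no CpGs.

    cpg_calls: iterable of booleans, True meaning the CpG was called methylated.
    """
    calls = list(cpg_calls)
    if not calls:
        return None
    return sum(calls) / len(calls)


# Hypothetical methylation calls for three fragments.
fragments = {
    "frag_1": [True, True, False, True],
    "frag_2": [False, False],
    "frag_3": [],
}
betas = {name: fragment_beta(calls) for name, calls in fragments.items()}
```

Returning None for fragments without CpG sites keeps them distinguishable from fully unmethylated fragments, which have a beta of 0.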
  • the method can further comprise identifying one or more abnormal patterns of the one or more methylation features of the methylation profile as compared to one or more reference methylation profiles.
  • the reference methylation profiles can be control methylation profiles.
  • the control methylation profiles can comprise methylation profiles of individuals not having the disease.
  • the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of healthy control individuals.
  • the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of individuals having a benign cancer.
  • the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of individuals having a malignant cancer. In some cases, the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of individuals not having cancer, individuals having a benign cancer, and individuals having a malignant cancer.
  • identifying one or more abnormal patterns of the one or more methylation features can further comprise comparing one or more methylation features of one or more localized regions of the one or more nucleic acid fragments of the individual to one or more corresponding methylation features of one or more corresponding localized regions of the reference methylation profile data.
  • the method can further comprise comparing one or more methylation features of one or more nucleic acid fragments of the determined tissue type of origin to one or more reference nucleic acid fragments of the determined tissue type of origin.
  • the method can further comprise identifying one or more abnormal patterns of the one or more methylation features of the methylation profile using a trained machine learning model.
  • the machine learning model can be trained using training data comprising histopathologic and cytopathologic data.
  • the method can further comprise training the trained machine learning model to determine patterns of methylation quantities in one or more nucleic acid fragments of control sample training data.
  • the control sample training data can comprise methylation feature data and sequencing data of one or more nucleic acid fragments of healthy individuals. In some cases, the healthy individuals do not have the disease state.
  • the method can further comprise generating an indication of the disease state of the individual.
  • the indication can be generated based at least in part on the identified one or more abnormal patterns.
  • the indication can be generated by the trained machine learning model.
  • control sample training data can comprise methylation feature data and sequencing data of one or more nucleic acid fragments of individuals having the disease state.
  • the individuals having the disease state can be confirmed to have the disease state.
  • the disease state can comprise a malignancy.
  • generating the indication of the disease state can further comprise determining whether the disease state is a cancer disease state or a non-cancer disease state. In some cases, generating the indication of the disease state can further comprise determining whether the disease state is a benign disease state or a malignant disease state.
  • the disease state can comprise having a cancer.
  • the cancer can comprise one or more of: carcinomas, breast cancer, lung cancer, prostate cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
  • the cancer can comprise urethral carcinoma.
  • the cancer can comprise bladder cancer.
  • the bladder cancer can comprise non-muscle invasive bladder cancer (NMIBC).
  • determining the disease state can comprise identifying a tissue of origin for the disease.
  • the method can further comprise identifying, using the trained machine learning model, a tissue type of origin of the one or more nucleic acid fragments.
  • identifying the tissue type of origin can comprise performing tissue-of-origin deconvolution using the trained machine learning model.
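The disclosure attributes tissue-of-origin deconvolution to a trained machine learning model without specifying the algorithm. The sketch below swaps in a simple reference-based approach, non-negative least squares solved with a multiplicative update, purely to illustrate the idea of estimating tissue fractions from methylation levels. The reference matrix, the tissue columns, and the mixture are all hypothetical.

```python
def deconvolve(reference, observed, iterations=1000):
    """Estimate non-negative tissue fractions x so that reference @ x ~ observed.

    reference: one row per region, one methylation level per tissue (all >= 0).
    observed:  methylation level of the sample at each region.
    Uses the multiplicative update x <- x * (A^T b) / (A^T A x), then normalizes.
    """
    n_tissues = len(reference[0])
    # Precompute A^T b and A^T A once.
    atb = [sum(row[j] * b for row, b in zip(reference, observed))
           for j in range(n_tissues)]
    ata = [[sum(row[j] * row[k] for row in reference) for k in range(n_tissues)]
           for j in range(n_tissues)]
    x = [1.0 / n_tissues] * n_tissues
    for _ in range(iterations):
        denom = [sum(ata[j][k] * x[k] for k in range(n_tissues))
                 for j in range(n_tissues)]
        x = [x[j] * atb[j] / denom[j] if denom[j] > 0 else 0.0
             for j in range(n_tissues)]
    total = sum(x)
    return [value / total for value in x] if total > 0 else x


# Hypothetical reference methylation levels: 4 regions x 3 tissues.
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.5],
]
true_fractions = [0.6, 0.3, 0.1]
# Simulate a sample as a noise-free mixture of the reference tissues.
observed = [sum(level * f for level, f in zip(row, true_fractions))
            for row in reference]
estimated = deconvolve(reference, observed)
```

On this noise-free mixture the estimate converges to the simulated fractions; real cell-free DNA data would be noisy, and a trained model as described in the text may behave quite differently.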
  • generating the indication of the disease state can further comprise mapping, using the trained machine learning model, a pattern of quantified methylation amounts at one or more localized areas of the one or more nucleic acid fragments.
  • the method can further comprise determining the disease state based at least in part on the mapped pattern of methylation.
  • the method can further comprise generating the indication of the disease state based at least in part on comparing one or more quantified methylation values of the mapped pattern of methylation to a threshold methylation value.
  • the threshold methylation value can be a dynamic value.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on training data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on feedback data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on both training data and feedback data.
  • the method can further comprise dynamically generating, using the trained machine learning model, the dynamic threshold methylation value for each of the one or more localized areas.
  • the method can further comprise determining a non-cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
  • the method can further comprise determining a cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
  • the method can further comprise determining a benign cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas. In some embodiments, the method can further comprise determining a malignant cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
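The comparison of mapped methylation values to a per-area threshold can be illustrated with a simple stand-in. The disclosure says the threshold is generated dynamically by the trained machine learning model; the sketch below instead assumes a statistical rule (control mean plus k standard deviations per region) only to show the shape of the comparison. Region names, values, and k are hypothetical, and only elevated methylation is flagged.

```python
from statistics import mean, stdev


def region_thresholds(control_values, k=2.0):
    """Per-region threshold from control samples: mean + k * sample std deviation."""
    return {region: mean(values) + k * stdev(values)
            for region, values in control_values.items()}


def abnormal_regions(sample_values, thresholds):
    """Regions whose quantified methylation exceeds the region's threshold."""
    return sorted(region for region, value in sample_values.items()
                  if value > thresholds[region])


# Hypothetical control methylation values per localized area.
controls = {
    "region_a": [0.10, 0.12, 0.08, 0.10],
    "region_b": [0.50, 0.55, 0.45, 0.50],
}
# Hypothetical quantified methylation values for one individual.
sample = {"region_a": 0.40, "region_b": 0.52}

thresholds = region_thresholds(controls)
flagged = abnormal_regions(sample, thresholds)
```

Recomputing the thresholds as new control or feedback data arrive is one way such a value could be made dynamic; how the disclosed model does so is not specified here.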
  • the indication of the disease state of the individual can be generated with an accuracy of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual can be generated with an accuracy of 89%.
  • the indication of the disease state of the individual can be generated with a sensitivity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, or more than about 90%. In some cases, the indication of the disease state of the individual can be generated with a sensitivity of 92%.
  • the indication of the disease state of the individual can be generated with a specificity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual can be generated with a specificity of 86%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at an accuracy of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a sensitivity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a specificity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a positive predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a negative predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
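The performance measures listed above follow from the standard confusion-matrix definitions. The following sketch computes them from counts of true and false positives and negatives; the counts are hypothetical and are not results from the disclosure.

```python
def performance(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for 100 individuals, 50 with and 50 without the disease state.
metrics = performance(tp=45, fp=5, tn=45, fn=5)
```

Positive and negative predictive values depend on disease prevalence in the tested population, so they can differ substantially between cohorts even when sensitivity and specificity are unchanged.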
  • the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto, the computer memory can comprise machine executable code that, upon execution by the one or more computer processors, implements a method for determining a disease state of an individual.
  • the method can comprise receiving sequencing data for one or more nucleic acid fragments.
  • the one or more nucleic acid fragments can be obtained or derived from a biological sample of the individual.
  • the system-implemented method can further comprise determining a methylation profile for the individual, wherein the methylation profile can comprise one or more methylation features of the one or more nucleic acid fragments.
  • the one or more methylation features can comprise one or more of: methylation pattern data, tissue-of-origin deconvolution data, or fragment-level beta values, or any combination thereof.
  • the system-implemented method can further comprise identifying one or more abnormal patterns of the one or more methylation features of the methylation profile as compared to reference methylation profile data.
  • the reference methylation profiles can be control methylation profiles.
  • the control methylation profiles can comprise methylation profiles of individuals not having the disease.
  • the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of (i) healthy control individuals, (ii) individuals having a benign cancer, or (iii) individuals having a malignant cancer, or any combination of (i)-(iii).
  • identifying one or more abnormal patterns of the one or more methylation features can further comprise comparing one or more methylation features of one or more localized regions of the one or more nucleic acid fragments of the individual to one or more corresponding methylation features of one or more corresponding localized regions of the reference methylation profile data.
  • the system-implemented method can further comprise generating, using a trained machine learning model, an indication of the disease state of the individual based at least in part on the identified one or more abnormal patterns.
  • the method can further comprise dynamically generating, using the trained machine learning model, the dynamic threshold methylation value for each of the one or more localized areas.
  • the method can further comprise determining a non-cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
  • the method can further comprise determining a cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
  • the method can further comprise determining a benign cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas. In some embodiments, the method can further comprise determining a malignant cancer disease state, based at least in part on the comparison of the mapped pattern of methylation to the dynamically generated threshold methylation value at the one or more localized areas.
  • generating the indication of the disease state can further comprise determining whether the disease state is a cancer disease state or a non-cancer disease state. In some cases, generating the indication of the disease state can further comprise determining whether the disease state is a benign disease state or a malignant disease state.
  • the disease state can comprise having a cancer.
  • the cancer can comprise one or more of: carcinomas, breast cancer, lung cancer, prostate cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
  • the cancer can comprise urethral carcinoma.
  • the cancer can comprise bladder cancer.
  • the bladder cancer can comprise non-muscle invasive bladder cancer (NMIBC).
  • determining the disease state can comprise identifying a tissue of origin for the disease.
  • the method can further comprise identifying, using the trained machine learning model, a tissue type of origin of the one or more nucleic acid fragments.
  • identifying the tissue type of origin can comprise performing tissue-of-origin deconvolution using the trained machine learning model.
  • generating the indication of the disease state can further comprise mapping, using the trained machine learning model, a pattern of quantified methylation amounts at one or more localized areas of the one or more nucleic acid fragments.
  • the method can further comprise determining the disease state based at least in part on the mapped pattern of methylation.
  • the method can further comprise generating the indication of the disease state based at least in part on comparing one or more quantified methylation values of the mapped pattern of methylation to a threshold methylation value.
  • the threshold methylation value can be a dynamic value.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on training data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on feedback data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on both training data and feedback data.
  • the indication of the disease state of the individual can be generated with an accuracy of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual can be generated with an accuracy of 89%.
  • the indication of the disease state of the individual can be generated with a sensitivity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, or more than about 90%. In some cases, the indication of the disease state of the individual can be generated with a sensitivity of 92%.
  • the indication of the disease state of the individual can be generated with a specificity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual can be generated with a specificity of 86%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at an accuracy of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a sensitivity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a specificity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a positive predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the methods or systems may comprise determining the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a negative predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the present disclosure provides a system for determining a disease state of an individual, the system comprising one or more computer processors, the one or more computer processors comprising a first machine learning model configured to determine a tissue type of origin for one or more nucleic acid fragments obtained or derived from a biological sample of an individual.
  • the first machine learning model can be configured to determine the tissue type of origin based on performing tissue-of-origin deconvolution on sequencing data of the one or more nucleic acid fragments.
  • the one or more computer processors can further comprise a second machine learning model configured to generate the determination of the disease state of the individual.
  • the trained first machine learning model or the second machine learning model may utilize one or more algorithms.
  • the one or more algorithms may comprise an unsupervised machine learning algorithm.
  • the unsupervised machine learning algorithm may utilize cluster analysis to identify attributes of interest.
  • the one or more algorithms may comprise a supervised machine learning algorithm.
  • the algorithm may be provided with training data so as to generate an expected or desired output.
  • the supervised learning algorithm may comprise a deep learning algorithm, a support vector machine (SVM), a neural network, or a Random Forest.
  • the determination of the disease state of the individual can be generated based at least in part on the sequencing data of the one or more nucleic acid fragments. In some cases, the determination of the disease state of the individual can be generated based at least in part on methylation profile pattern data of the one or more nucleic acid fragments. In some cases, the determination of the disease state of the individual can be generated based at least in part on both the sequencing data of the one or more nucleic acid fragments and methylation profile pattern data of the one or more nucleic acid fragments.
  • the present disclosure provides a system for determining a disease state of an individual comprising one or more computer processors.
  • the one or more computer processors can comprise an interface configured to receive sequencing data for one or more cell-free nucleic acid fragments.
  • the sequencing data received may originate from one or more sequencing reactions of the genomic data of the biological sample.
  • the sequencing reactions may comprise whole genome sequencing, whole exome sequencing, low-pass whole genome sequencing, targeted sequencing, methylation-aware sequencing, enzymatic methylation sequencing, or bisulfite methylation sequencing.
  • the sequencing reaction may be transcriptome sequencing, messenger RNA sequencing (mRNA-seq), total RNA sequencing (totalRNA-seq), small RNA sequencing (smallRNA-seq), exosome sequencing, or combinations thereof. Combinations of sequencing reactions may be used in the methods described elsewhere herein.
  • a biological sample may be subjected to whole genome sequencing and whole transcriptome sequencing.
  • where the biological samples comprise multiple types of nucleic acids (e.g. RNA and DNA), sequencing reactions specific to DNA or RNA may be used to obtain sequence reads relating to each nucleic acid type.
  • the sequencing data can comprise next-generation sequencing (NGS) data.
  • the sequencing reactions can be performed at various sequencing depths.
  • the sequencing depths of a sequencing reaction may be selected or modulated.
  • the sequencing reactions may comprise sequencing a region at a depth of at least 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, 20x, 25x, 30x, 35x, 40x, 45x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or more.
  • the sequencing reactions may comprise sequencing a region at a depth of no more than 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, 20x, 25x, 30x, 35x, 40x, 45x, 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or less.
  • low-pass whole genome sequencing can be used to sequence nucleic acids.
  • the low-pass whole genome sequencing may be performed at an average sequencing depth of at least 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, or more.
  • the low-pass whole genome sequencing may be performed at an average sequencing depth of no more than 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, or less.
  • the low-pass whole genome sequencing may be performed at an average depth of between 1x and 2x.
  • a sequencing reaction may be performed using a set of personalized or customized probes.
  • the sequencing reaction using a set of personalized or customized probes may be a deep sequencing reaction or ultra-deep sequencing reaction.
  • the sequencing reaction using a set of personalized or customized probes may be performed at a sequencing depth of 50x, 60x, 70x, 80x, 90x, 100x, 200x, 300x, 400x, 500x, 600x, 700x, 800x, 900x, 1000x, 2000x, 3000x, 4000x, 5000x, 6000x, 7000x, 8000x, 9000x, 10,000x, 20,000x, 30,000x, 40,000x, 50,000x, 60,000x, 70,000x, 80,000x, 90,000x, 100,000x, or more.
  • whole exome sequencing can be used to sequence nucleic acids of a subject.
  • the whole exome sequencing may be performed at a non-uniform depth. For example, certain areas of the exome may be boosted or otherwise sequenced at a greater depth than other regions, or at a greater depth than the average depth of the whole exome sequencing.
  • genes or regions that are of more interest may be analyzed with higher sensitivity, accuracy, and/or precision.
  • Genes or regions associated with or related to cancer can be sequenced at a greater depth. For example, at least 100, 200, 300, 400, 500, 600, 700, 800, 900, or more genes can be sequenced at a higher depth than the rest of the exome (e.g. average depth of the whole exome sequencing).
  • the sequencing of nucleic acids may generate sequencing read data.
  • the sequencing reads may be processed to generate data of improved quality.
  • the sequencing reads may be generated with a quality score.
  • the quality score may indicate an accuracy of a sequence read or a level of signal above a noise threshold for a given base call.
  • the quality scores may be used for filtering sequencing reads. For example, sequencing reads may be removed that do not meet a particular quality score threshold.
  • the sequencing reads may be processed to generate a consensus sequence or consensus base call.
  • the one or more cell-free nucleic acid fragments can be obtained from a biological sample of the individual or derived from a biological sample of the individual.
  • the biological sample of the individual can be selected from the group consisting of: a deoxyribonucleic acid (DNA) sample, a ribonucleic acid (RNA) sample, a plasma sample, a serum sample, a buffy coat sample, a peripheral blood mononuclear cell (PBMC) sample, a red blood cell sample, a urine sample, a urine cell pellet sample, a saliva sample, a tissue biopsy, a pleural fluid sample, a peritoneal fluid sample, an amniotic fluid sample, a cerebrospinal fluid sample, a lymphatic fluid sample, a sweat sample, a tear sample, a semen sample, or any derivative thereof, and any combination thereof.
  • the biological sample can comprise a urine sample.
  • the urine sample can comprise a cell-free urine sample.
  • the DNA sample can comprise cell-free DNA (cfDNA).
  • the cfDNA can comprise urinary cfDNA (ucfDNA).
  • the biological sample may comprise one or more nucleic acids.
  • the biological sample may be a cell-free deoxyribonucleic acid (cfDNA) sample or a cell-free ribonucleic acid (cfRNA) sample.
  • the biological sample may comprise genomic DNA or germline DNA (gDNA).
  • the nucleic acid may be a DNA (e.g. double-stranded DNA, single-stranded DNA, single-stranded DNA hairpins, copy DNA (cDNA), genomic DNA, germline DNA, circulating tumor DNA (ctDNA), cell-free DNA (cfDNA), an RNA (e.g.
  • the biological sample may be derived from or contain a biological fluid.
  • the biological sample may be a plasma sample, a serum sample, a buffy coat sample, a peripheral blood mononuclear cell (PBMC) sample, a red blood cell sample, a urine sample, a saliva sample, or other body fluid sample.
  • the biological sample may comprise or be a pleural fluid sample, peritoneal fluid sample, amniotic fluid sample, cerebrospinal fluid sample, lymphatic fluid sample, sweat sample, tear sample, semen sample, or any combination of biological fluids.
  • the samples may comprise RNA and DNA.
  • a sample may comprise cfDNA and cfRNA.
  • the biological sample may be obtained or derived from a subject at various times.
  • the biological sample may be obtained or derived from a subject prior to the subject receiving a therapy for cancer.
  • the biological sample may be obtained or derived from a subject while the subject is receiving a therapy for cancer.
  • the biological sample may be obtained or derived from a subject after the subject has received a therapy for cancer.
  • the biological sample may be collected over 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or more time points.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more hour period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more day period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more week period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more month period.
  • the time points may occur over a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60 or more year period.
  • the system can further comprise a processor configured to determine a methylation profile for the biological sample of the individual.
  • the methylation profile can comprise one or more methylation features of the plurality of cell-free nucleic acid fragments.
  • the one or more methylation features can comprise one or more of: methylation pattern data, tissue-of-origin deconvolution data, or fragment-level beta values, or any combination thereof.
  • the system further comprises a machine learning model.
  • the machine learning model can utilize one or more algorithms.
  • the one or more algorithms may comprise an unsupervised machine learning algorithm.
  • the unsupervised machine learning algorithm may utilize cluster analysis to identify attributes of interest.
  • the one or more algorithms may comprise a supervised machine learning algorithm.
  • the algorithm may be provided with training data so as to generate an expected or desired output.
  • the supervised learning algorithm may comprise a deep learning algorithm, a support vector machine (SVM), a neural network, or a Random Forest.
  • the machine learning model can be trained to determine one or more abnormal patterns of the one or more methylation features of the methylation profile.
  • the one or more methylation features of the methylation profile can be compared to one or more reference methylation profiles to determine the one or more abnormal patterns of the one or more methylation features.
  • the reference methylation profiles can be control methylation profiles.
  • the control methylation profiles can comprise methylation profiles of individuals not having the disease.
  • the reference methylation profile data of the one or more abnormal patterns can comprise one or more methylation features of one or more nucleic acid fragments of (i) healthy control individuals, (ii) individuals having a benign cancer, or (iii) individuals having a malignant cancer, or any combination of (i)-(iii).
  • identifying one or more abnormal patterns of the one or more methylation features can further comprise comparing one or more methylation features of one or more localized regions of the one or more nucleic acid fragments of the individual to one or more corresponding methylation features of one or more corresponding localized regions of the reference methylation profile data.
  • the machine learning model can be trained to generate an indication of the disease state of the individual based at least in part on the one or more abnormal patterns identified.
  • the disease state can comprise a malignancy of a tissue of the individual.
  • the disease state can comprise having a cancer.
  • the cancer can comprise one or more of: carcinomas, breast cancer, lung cancer, prostate cancer, colorectal cancer, melanoma, bladder cancer, non-Hodgkin lymphoma, kidney cancer, endometrial cancer, leukemia, pancreatic cancer, thyroid cancer, or liver cancer, or any combination thereof.
  • the cancer can comprise urethral carcinoma.
  • the cancer can comprise bladder cancer.
  • the bladder cancer can comprise non-muscle invasive bladder cancer (NMIBC).
  • the disease state can comprise identifying a tissue of origin for the disease.
  • the method can further comprise identifying, using the trained machine learning model, a tissue type of origin of the one or more nucleic acid fragments.
  • identifying the tissue type of origin can comprise performing tissue-of-origin deconvolution using the trained machine learning model.
  • generating the indication of the disease state can further comprise mapping, using the trained machine learning model, a pattern of quantified methylation amounts at one or more localized areas of the one or more nucleic acid fragments.
  • the method can further comprise determining the disease state based at least in part on the mapped pattern of methylation.
  • the disease state can comprise a tissue of origin of the one or more cell-free nucleic acid fragments. In some cases, the disease state can comprise both a malignancy of a tissue of the individual and a tissue of origin of the one or more cell-free nucleic acid fragments.
  • the machine learning model can generate the indication of the disease state based at least in part on comparing one or more quantified methylation values of the mapped pattern of methylation to a threshold methylation value.
  • the threshold methylation value can be a dynamic value.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on training data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on feedback data.
  • the dynamic threshold methylation value can be dynamically generated, using the trained machine learning model, based at least in part on both training data and feedback data.
  • the machine learning model can generate the indication of the disease state of the individual with an accuracy of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%. In some cases, the indication of the disease state of the individual can be generated with an accuracy of 89%.
  • the machine learning model can generate the indication of the disease state of the individual with a sensitivity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, more than about 85%, or more than about 90%.
  • the indication of the disease state of the individual can be generated with a sensitivity of 92%.
  • the machine learning model can generate the indication of the disease state of the individual with a specificity of more than about 50%, more than about 55%, more than about 60%, more than about 65%, more than about 70%, more than about 75%, more than about 80%, or more than about 85%.
  • the indication of the disease state of the individual can be generated with a specificity of 86%.
  • the machine learning model can determine the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at an accuracy of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the machine learning model can determine the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a sensitivity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the machine learning model can determine the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a specificity of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the machine learning model can determine the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a positive predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • the machine learning model can determine the presence or the absence of cancer, a tissue of origin of the cancer, or a malignancy of the cancer in the subject at a negative predictive value of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%.
  • NMIBC Non-Muscle Invasive Bladder Cancer
  • MRD minimal residual disease
  • TURBT trans-urethral resection of bladder tumor
  • DMFs Differentially methylated deoxyribonucleic acid fragments
  • NGS Whole genome methylation next-generation sequencing
  • NGS PredicineEPICTM
  • DNA fragment methylation analysis was performed using PredicineBEACON™.
  • the assay had low DNA loss and low guanine and cytosine (GC) bias.
  • cfDNA input cell-free deoxyribonucleic acid
  • in the method (100), for example, blood, urine, or tissue (101) can be used for extraction of nucleic acids.
  • DNA was extracted and libraries were constructed (102) from the extracted nucleic acids.
  • the extraction (102) was performed with methylation treatment.
  • Next generation sequencing (NGS) was performed (103) on the extracted nucleic acids using the library.
  • DNA methylation data from the NGS was analyzed (104) to determine the methylation abnormalities.
  • DNA methylation abnormality was quantified as differentially methylated DNA fragments (DMFs) using PredicineEPIC™. Samples having a Non-Muscle Invasive Bladder Cancer (NMIBC) positive designation were compared with samples having a negative designation. Positive samples were also compared to healthy donor samples. Disease status concordance of each group was evaluated.
  • DMFs were measured using PredicineEPIC™.
  • the DMFs were analyzed and charted with mutation-based tumor fractions.
  • the mutation-based tumor fractions were quantified and analyzed using PredicineBEACON™.
  • the samples were graded, and the correlation between the DMFs and the mutation-based tumor fractions was analyzed.
  • samples were classified as NMIBC-positive or NMIBC-negative by a clinical pathologist. Samples were grouped as NMIBC-positive (19 samples), NMIBC-negative (11 samples), or healthy control samples. Samples were plotted by the proportion of fragments having bladder tissue as the tissue of origin. Bonferroni correction was applied for multiple testing.
  • DMFs were quantified using PredicineEPIC™. The proportion of bladder tissue of origin was plotted against the quantified DMFs for the NMIBC-positive (19 samples), NMIBC-negative (11 samples), and healthy control sample groups.
  • in the method (400), for example, prospectively enrolled patients were diagnosed with urethral carcinoma (401). Samples were taken from 36 individuals diagnosed with malignant urethral carcinoma and 25 individuals who had benign or non-tumor lesions. The samples were analyzed to produce histopathologic and cytopathologic results, which were reviewed by a pathologist. Urine samples (402) were collected before surgical intervention to treat the urethral carcinoma. Next-generation sequencing (NGS) (403) was performed on the samples. Deoxyribonucleic acid (DNA) methylation assays (404) were performed using PredicineEPIC™. High-throughput sequencing (405) was performed.
  • Methylation analysis (406) was performed to generate a report (407) of cancer signal detection and diagnosis.
  • a machine learning model was generated using methylation feature data.
  • the methylation features included abnormally methylated fragment data, tissue of origin deconvolution data, and fragment-level beta value data.
  • AMFRs Abnormally methylated fragment regions
  • FIG. 5A hierarchical clustering results were generated depicting arrangement of samples based on similarities of the sample DNA methylation profiles.
  • a heatmap was generated to depict the quantity of abnormally methylated fragments (AMFs) across genomic regions. Warmer colors indicate higher quantities of abnormal methylation.
  • a tissue of origin deconvolution algorithm was used to analyze the relative quantity of bladder tissue-originating genomic fragments within each of the samples. Samples were grouped into a benign sample group and a malignant sample group.
  • a scatter plot was generated of normalized AMF values and tissue of origin deconvolution feature values. Samples were grouped into a benign sample group and a malignant sample group.
  • ROC Receiver Operating Characteristic
  • weights comprising SHapley Additive exPlanations (SHAP) values for the model were plotted.
  • the SHAP values were plotted against AMF score for each feature and a heatmap of feature SHAP values was generated for a benign sample group and a malignant sample group.
  • FIG. 7 a block diagram is shown depicting an exemplary machine that includes a computer system 700 (e.g., a processing or computing system) within which a set of instructions can execute for causing a device to perform or execute any one or more of the aspects and/or methodologies for static code scheduling of the present disclosure.
  • the components in FIG. 7 are examples only and do not limit the scope of use or functionality of any hardware, software, embedded logic component, or a combination of two or more such components implementing particular embodiments.
  • Computer system 700 may include one or more processors 701, a memory 703, and a storage 708 that communicate with each other, and with other components, via a bus 740.
  • the bus 740 may also link a display 732, one or more input devices 733 (which may, for example, include a keypad, a keyboard, a mouse, a stylus, etc.), one or more output devices 734, one or more storage devices 735, and various tangible storage media 736. All of these elements may interface directly or via one or more interfaces or adaptors to the bus 740.
  • the various tangible storage media 736 can interface with the bus 740 via storage medium interface 726.
  • Computer system 700 may have any suitable physical form, including but not limited to one or more integrated circuits (ICs), printed circuit boards (PCBs), mobile handheld devices (such as mobile telephones or PDAs), laptop or notebook computers, distributed computer systems, computing grids, or servers.
  • Computer system 700 includes one or more processor(s) 701 (e.g., central processing units (CPUs) or general-purpose graphics processing units (GPGPUs)) that carry out functions.
  • processor(s) 701 optionally contains a cache memory unit 702 for temporary local storage of instructions, data, or computer addresses.
  • Processor(s) 701 are configured to assist in execution of computer readable instructions.
  • Computer system 700 may provide functionality for the components depicted in FIG. 7 as a result of the processor(s) 701 executing non-transitory, processor-executable instructions embodied in one or more tangible computer-readable storage media, such as memory 703, storage 708, storage devices 735, and/or storage medium 736.
  • the computer-readable media may store software that implements particular embodiments, and processor(s) 701 may execute the software.
  • Memory 703 may read the software from one or more other computer-readable media (such as mass storage device(s) 735, 736) or from one or more other sources through a suitable interface, such as network interface 720.
  • the software may cause processor(s) 701 to carry out one or more processes or one or more operations of one or more processes described or illustrated herein. Carrying out such processes or operations may include defining data structures stored in memory 703 and modifying the data structures as directed by the software.
  • the memory 703 may include various components (e.g., machine readable media) including, but not limited to, a random access memory component (e.g., RAM 704) (e.g., static RAM (SRAM), dynamic RAM (DRAM), ferroelectric random access memory (FRAM), phase-change random access memory (PRAM), etc.), a read-only memory component (e.g., ROM 705), and any combinations thereof.
  • ROM 705 may act to communicate data and instructions unidirectionally to processor(s) 701
  • RAM 704 may act to communicate data and instructions bidirectionally with processor(s) 701.
  • ROM 705 and RAM 704 may include any suitable tangible computer-readable media described below.
  • a basic input/output system 706 (BIOS), including basic routines that help to transfer information between elements within computer system 700, such as during start-up, may be stored in the memory 703.
  • Fixed storage 708 is connected bidirectionally to processor(s) 701, optionally through storage control unit 707.
  • Fixed storage 708 provides additional data storage capacity and may also include any suitable tangible computer-readable media described herein.
  • Storage 708 may be used to store operating system 709, executable(s) 710, data 711, applications 712 (application programs), and the like.
  • Storage 708 can also include an optical disk drive, a solid-state memory device (e.g., flash-based systems), or a combination of any of the above.
  • Information in storage 708 may, in appropriate cases, be incorporated as virtual memory in memory 703.
  • storage device(s) 735 may be removably interfaced with computer system 700 (e.g., via an external port connector (not shown)) via a storage device interface 725.
  • storage device(s) 735 and an associated machine-readable medium may provide non-volatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for the computer system 700.
  • software may reside, completely or partially, within a machine-readable medium on storage device(s) 735.
  • software may reside, completely or partially, within processor(s) 701.
  • Bus 740 connects a wide variety of subsystems.
  • reference to a bus may encompass one or more digital signal lines serving a common function, where appropriate.
  • Bus 740 may be any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of bus architectures.
  • such architectures include an Industry Standard Architecture (ISA) bus, an Enhanced ISA (EISA) bus, a Micro Channel Architecture (MCA) bus, a Video Electronics Standards Association local bus (VLB), a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, an Accelerated Graphics Port (AGP) bus, HyperTransport (HTX) bus, serial advanced technology attachment (SATA) bus, and any combinations thereof.
  • Computer system 700 may also include an input device 733.
  • a user of computer system 700 may enter commands and/or other information into computer system 700 via input device(s) 733.
  • Examples of an input device(s) 733 include, but are not limited to, an alpha-numeric input device (e.g., a keyboard), a pointing device (e.g., a mouse or touchpad), a touchpad, a touch screen, a multi-touch screen, a joystick, a stylus, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), an optical scanner, a video or still image capture device (e.g., a camera), and any combinations thereof.
  • the input device is a Kinect, Leap Motion, or the like.
  • Input device(s) 733 may be interfaced to bus 740 via any of a variety of input interfaces 723 (e.g., input interface 723) including, but not limited to, serial, parallel, game port, USB, FIREWIRE, THUNDERBOLT, or any combination of the above.
  • computer system 700 when computer system 700 is connected to network 730, computer system 700 may communicate with other devices, specifically mobile devices and enterprise systems, distributed computing systems, cloud storage systems, cloud computing systems, and the like, connected to network 730. Communications to and from computer system 700 may be sent through network interface 720.
  • network interface 720 may receive incoming communications (such as requests or responses from other devices) in the form of one or more packets (such as Internet Protocol (IP) packets) from network 730, and computer system 700 may store the incoming communications in memory 703 for processing.
  • Computer system 700 may similarly store outgoing communications (such as requests or responses to other devices) in the form of one or more packets in memory 703 and communicate them to network 730 from network interface 720.
  • Processor(s) 701 may access these communication packets stored in memory 703 for processing.
  • Examples of the network interface 720 include, but are not limited to, a network interface card, a modem, and any combination thereof.
  • Examples of a network 730 or network segment 730 include, but are not limited to, a distributed computing system, a cloud computing system, a wide area network (WAN) (e.g., the Internet, an enterprise network), a local area network (LAN) (e.g., a network associated with an office, a building, a campus or other relatively small geographic space), a telephone network, a direct connection between two computing devices, a peer-to-peer network, and any combinations thereof.
  • a network, such as network 730 may employ a wired and/or a wireless mode of communication. In general, any network topology may be used.
  • Information and data can be displayed through a display 732.
  • Examples of a display 732 include, but are not limited to, a cathode ray tube (CRT), a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light-emitting diode (OLED) display such as a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display, a plasma display, and any combinations thereof.
  • the display 732 can interface to the processor(s) 701, memory 703, and fixed storage 708, as well as other devices, such as input device(s) 733, via the bus 740.
  • the display 732 is linked to the bus 740 via a video interface 722, and transport of data between the display 732 and the bus 740 can be controlled via the graphics control 721.
  • the display is a video projector.
  • the display is a head-mounted display (HMD) such as a VR headset.
  • suitable VR headsets include, by way of non-limiting examples, HTC Vive, Oculus Rift, Samsung Gear VR, Microsoft HoloLens, Razer OSVR, FOVE VR, Zeiss VR One, Avegant Glyph, Freefly VR headset, and the like.
  • the display is a combination of devices such as those disclosed herein.
  • computer system 700 may include one or more other peripheral output devices 734 including, but not limited to, an audio speaker, a printer, a storage device, and any combinations thereof.
  • peripheral output devices may be connected to the bus 740 via an output interface 724.
  • Examples of an output interface 724 include, but are not limited to, a serial port, a parallel connection, a USB port, a FIREWIRE port, a THUNDERBOLT port, and any combinations thereof.
  • computer system 700 may provide functionality as a result of logic hardwired or otherwise embodied in a circuit, which may operate in place of or together with software to execute one or more processes or one or more operations of one or more processes described or illustrated herein.
  • Reference to software in this disclosure may encompass logic, and reference to logic may encompass software.
  • reference to a computer-readable medium may encompass a circuit (such as an IC) storing software for execution, a circuit embodying logic for execution, or both, where appropriate.
  • the present disclosure encompasses any suitable combination of hardware, software, or both.
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • a general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • a software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may be integral to the processor.
  • the processor and the storage medium may reside in an ASIC.
  • the ASIC may reside in a user terminal.
  • the processor and the storage medium may reside as discrete components in a user terminal.
  • suitable computing devices include, by way of non-limiting examples, cloud computing platforms, distributed computing platforms, server clusters, server computers, desktop computers, laptop computers, notebook computers, subnotebook computers, netbook computers, and netpad computers.
  • the computing device includes an operating system configured to perform executable instructions.
  • the operating system is, for example, software, including programs and data, which manages the device’s hardware and provides services for execution of applications.
  • server operating systems include, by way of nonlimiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®.
  • personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®.
  • the operating system is provided by cloud computing.
  • Suitable mobile smartphone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research in Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®.
  • the platforms, systems, media, and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked computing device.
  • a computer readable storage medium is a tangible component of a computing device.
  • a computer readable storage medium is optionally removable from a computing device.
  • a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, distributed computing systems including cloud computing systems and services, and the like.
  • the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
  • the platforms, systems, media, and methods disclosed herein include at least one computer program, or use of the same.
  • a computer program includes a sequence of instructions, executable by one or more processor(s) of the computing device’s CPU, written to perform a specified task.
  • Computer readable instructions may be implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), computing data structures, and the like, which perform particular tasks or implement particular abstract data types.
  • a computer program may be written in various versions of various languages.
  • a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
  • the platforms, systems, media, and methods disclosed herein include software, server, and/or database modules, or use of the same.
  • software modules are created by various techniques using various machines, software, and languages.
  • the software modules disclosed herein are implemented in a multitude of ways.
  • a software module comprises a file, a section of code, a programming object, a programming structure, a distributed computing resource, a cloud computing resource, or combinations thereof.
  • a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, a plurality of distributed computing resources, a plurality of cloud computing resources, or combinations thereof.
  • the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, a standalone application, and a distributed or cloud computing application.
  • software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on a distributed computing platform such as a cloud computing platform. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
  • the platforms, systems, media, and methods disclosed herein include one or more databases, or use of the same.
  • databases are suitable for storage and retrieval of information, for example customer incident data.
  • suitable databases include, by way of non-limiting examples, relational databases, nonrelational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, XML databases, document oriented databases, and graph databases. Further non-limiting examples include SQL, PostgreSQL, MySQL, Oracle, DB2, Sybase, and MongoDB.
  • a database is Internet-based.
  • a database is web-based.
  • a database is cloud computing-based.
  • a database is a distributed database.
  • a database is based at least in part on one or more local computer storage devices.
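
By way of non-limiting illustration, the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value recited above for the machine learning model can each be computed from the counts of true and false calls in a labeled evaluation set. The following is a minimal sketch only; the names `ConfusionCounts` and `diagnostic_metrics` and the counts in the usage note are hypothetical and are not taken from the disclosure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of model calls against a reference diagnosis (hypothetical helper)."""
    tp: int  # diseased individuals called positive
    fp: int  # non-diseased individuals called positive
    tn: int  # non-diseased individuals called negative
    fn: int  # diseased individuals called negative


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute the five performance measures from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }
```

For example, calling `diagnostic_metrics(ConfusionCounts(tp=8, fp=1, tn=9, fn=2))` on these made-up counts gives a sensitivity of 8/10 and a specificity of 9/10. Predictive values depend on disease prevalence in the evaluated group, so they are reported separately from sensitivity and specificity.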
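
As a further non-limiting illustration, the comparison of quantified methylation values to a threshold methylation value described above can be sketched for fragment-level beta values. The sketch assumes that a fragment-level beta value is the fraction of CpG sites on a fragment that are called methylated, which is a common convention rather than a definition given in the disclosure. The function names and the fixed threshold are hypothetical, and in the disclosure the threshold may instead be generated dynamically by the trained model.

```python
from typing import Sequence


def fragment_beta(cpg_calls: Sequence[bool]) -> float:
    """Fraction of methylated CpG calls on one fragment (assumed definition).

    cpg_calls holds one boolean per CpG site, True meaning methylated.
    """
    if not cpg_calls:
        raise ValueError("fragment has no CpG calls")
    return sum(cpg_calls) / len(cpg_calls)


def fraction_above_threshold(
    fragments: Sequence[Sequence[bool]], threshold: float
) -> float:
    """Share of fragments whose beta value exceeds the threshold."""
    if not fragments:
        raise ValueError("no fragments supplied")
    betas = [fragment_beta(calls) for calls in fragments]
    return sum(beta > threshold for beta in betas) / len(betas)
```

A sample-level score of this kind could serve as one input feature to a classifier, alongside tissue-of-origin deconvolution data. How the disclosed model combines such features is not specified here.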

Abstract

Disclosed herein are systems and methods for determining a disease state of an individual. The systems can be configured to implement the methods disclosed herein, and the methods can comprise receiving sequencing data for one or more cell-free nucleic acid fragments obtained or derived from a biological sample of the individual. The methods can also comprise determining a methylation profile for the individual based on identifying abnormal patterns of methylation elements. A trained machine learning model can be used to generate an indication of the disease state.
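
The flow summarized in the abstract (sequencing data, then a methylation profile built from abnormal patterns, then a model-generated indication) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the method claimed in the application: the input format (per-fragment counts of methylated and total CpG sites per region), the reference profile, the deviation threshold, and the logistic scoring with hand-supplied weights are all hypothetical stand-ins for whatever trained model is actually used.

```python
import math


def methylation_profile(fragments):
    """Aggregate per-fragment CpG counts into a per-region methylation fraction.

    fragments: iterable of (region, methylated_cpgs, total_cpgs) tuples
    (a hypothetical input format).
    """
    meth, total = {}, {}
    for region, m, t in fragments:
        meth[region] = meth.get(region, 0) + m
        total[region] = total.get(region, 0) + t
    return {r: meth[r] / total[r] for r in total if total[r] > 0}


def abnormal_regions(profile, reference, threshold=0.2):
    """Return signed deviations for regions that differ from the reference
    fraction by more than the (assumed) threshold."""
    return {
        r: profile[r] - reference[r]
        for r in profile
        if r in reference and abs(profile[r] - reference[r]) > threshold
    }


def disease_indication(deviations, weights, bias=0.0):
    """Logistic score in (0, 1); a stand-in for a trained model's output."""
    z = bias + sum(weights.get(r, 0.0) * d for r, d in deviations.items())
    return 1.0 / (1.0 + math.exp(-z))
```

In use, the fragments from one sample would be passed through `methylation_profile`, compared with a reference via `abnormal_regions`, and scored with `disease_indication`; in practice the weights would come from model training rather than being supplied by hand.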
PCT/US2025/023477 2024-04-08 2025-04-07 Systems and methods for detecting disease using methylation profiling and tissue identification Pending WO2025217055A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202463631143P 2024-04-08 2024-04-08
US63/631,143 2024-04-08
US202463631761P 2024-04-09 2024-04-09
US63/631,761 2024-04-09

Publications (1)

Publication Number Publication Date
WO2025217055A1 (fr) 2025-10-16

Family

ID=97350753

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/023477 Pending WO2025217055A1 (fr) 2024-04-08 2025-04-07 Systems and methods for detecting disease using methylation profiling and tissue identification

Country Status (1)

Country Link
WO (1) WO2025217055A1 (fr)

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 25786695
Country of ref document: EP
Kind code of ref document: A1