
WO2024264010A1 - Methods for enrichment and analysis of methylated DNA - Google Patents

Methods for enrichment and analysis of methylated DNA

Info

Publication number
WO2024264010A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
methylation
methylated
cpg
implementations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/035148
Other languages
English (en)
Inventor
Abhijit Ajit PATEL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Publication of WO2024264010A1

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 - Nucleic acid amplification reactions
    • C12Q1/6853 - Nucleic acid amplification reactions using modified primers or templates
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 - Oligonucleotides characterized by their use
    • C12Q2600/154 - Methylation markers

Definitions

  • the present disclosure relates to methods and compositions for characterization and analysis of epigenetic features of DNA.
  • DNA methylation is an epigenetic modification that plays important roles in many biological processes, including gene regulation, maintenance of genome stability, embryo development, X-chromosome inactivation, genomic imprinting, and cellular differentiation.
  • DNA methylation involves the addition of a methyl group to the DNA molecule, primarily at cytosine bases in a CpG dinucleotide context, producing a 5-methylcytosine base.
  • This modification is catalyzed by a group of enzymes known as DNA methyltransferases, and methylation patterns in a cell’s genome are tightly regulated. The patterns can be maintained during cell division, providing a mechanism for stable gene silencing or activation.
  • DNA methylation patterns have been associated with several human diseases, including cancer, cardiovascular diseases, metabolic diseases, neurological disorders, autoimmune disorders, and with aging. Therefore, DNA methylation has emerged as a promising biomarker of disease.
  • the aberrant methylation patterns can occur at specific genomic regions, such as gene promoters or CpG islands, leading to the dysregulation of gene expression and contributing to disease development.
  • methylation signatures that are cell-lineage-specific, tissue-specific, or cancer-specific can be used to identify the origin of cell-free DNA fragments in biological fluids. Measurement of such methylation signatures can be used to infer the rate of death of the corresponding cell types, providing a means to assess cell death from specific tissues, transplanted organs, a fetus, or cancer.
  • DNA methylation patterns can be analyzed in various readily accessible biological samples, including blood, urine, saliva, sputum, and stool, making it a convenient and accessible biomarker. Methylation patterns can also be analyzed in biospecimens obtained from medical procedures, such as tissue, cerebrospinal fluid, pleural fluid, ascites fluid, uterine lavage fluid, and Pap smear fluid. The measurement of DNA methylation patterns can aid in disease diagnosis, prognosis, prediction or monitoring of treatment response, and detection of residual or recurrent disease.
  • cfDNA cell-free DNA
  • Dying cancer cells release DNA fragments into the bloodstream, and this cfDNA contains characteristic alterations in DNA methylation patterns.
  • Enzymatic processes that determine methylation levels in various genomic regions are tightly regulated in healthy cells, resulting in highly consistent genome-wide methylation patterns in a given tissue or cell type. These processes become dysregulated in cancer cells, yielding aberrant patterns of methylation that are rarely found in healthy tissues.
  • the transcriptional silencing of tumor suppressor genes, developmental regulators, and many other genes by hypermethylation of promoters is a fundamental mechanism of carcinogenesis.
  • In humans, approximately 70% of promoters located near a gene’s transcription start site contain a CpG island. CpG islands are stretches of genomic DNA containing a relatively high density of CpG sites, which are targets of methylation (a CpG site is the sequence 5’-C-phosphate-G-3’). There are typically several hundred to a few thousand gene promoters at CpG islands that become aberrantly hypermethylated in a tumor, with substantial heterogeneity in methylation patterns across different patients and different tumors. Some promoter hypermethylation patterns can also be organ-specific, enabling tissue-of-origin prediction from analysis of hypermethylated cfDNA fragments.
  • DNA methylation is being used to measure biological aging. As individuals age, patterns of DNA methylation are known to change predictably across various genomic sites. These methylation changes can be used as epigenetic clocks to estimate the biological age of an individual or of a particular organ (e.g. the liver, kidney, heart, etc.). This approach is believed to provide a more accurate reflection of an individual’s physiological state than simple chronological age, potentially providing insights into disease risk and a means to measure the effectiveness of anti-aging therapies.
  • These clocks, such as the Horvath clock, assess biological age by examining methylation levels at particular CpG sites. Many of the CpG sites that are strongly predictive of biological age are found at or in the vicinity of CpG islands. Thus, genome-wide assessment of methylation levels at CpG islands could provide a more accurate measurement of biological age than current methods, which are focused on a limited number of pre-defined CpG sites.
  • hypermethylated cfDNA fragments mapping to CpG islands typically have 10 or more methylated CpG sites in a fragment of ~160-180 base pairs, whereas the enriched pool of methylated cfDNA captured by cell-free MeDIP contains an average of 4 methylated CpG sites.
  • Achieving more selective enrichment of fragments with higher methylation density has been challenging.
  • One approach to selectively capture more densely methylated fragments is to include methylated competitor DNA fragments (which cannot participate in downstream sequencing) during antibody-based or methyl-binding-domain-based affinity purification.
  • the methylation density distribution of captured DNA fragments can be tuned by adjusting the amount, methylation density, methylation content, and/or fragment size of the competitor DNA added.
  • increasing selectivity for densely methylated DNA fragments also leads to greater loss of such fragments because of increased competition for binding. Such losses can degrade the relevant signal when input molecules are limited, as with cfDNA.
  • Such a method could permit comprehensive capture of densely methylated DNA fragments mapping to CpG islands throughout the genome. Importantly, the method would not require pre-specification of genomic target sequences. Instead, the method would be able to dynamically capture hypermethylated CpG island and/or promoter sequences wherever they occur in a genome. The method would be able to detect aberrant hypermethylation patterns of CpG islands from any type of cancer as well as from other disease states.
  • the method could be used to measure aberrant cancer-derived hypermethylation signals in biofluids or biospecimens to enable early detection of cancer, to assess prognosis, to predict therapy efficacy, to monitor treatment response and disease progression, to identify changes in hypermethylation patterns that could be indicative of treatment response or resistance, and to detect residual or recurrent cancer. Additionally, because the method would enable identification of patient-specific methylation patterns in a biospecimen obtained from a patient at a certain point in time, knowledge of that pattern could be used to improve the sensitivity of detection of similar patterns in biospecimens obtained from the same patient at a different point in time.
  • this approach could enable the development of tumor-informed or plasma-informed personalized assays requiring only modifications to computational algorithms, and not requiring physical or experimental changes to the assay methodology. Because the method would enrich DNA fragments based on methylation density rather than sequence, it would be able to enrich densely methylated DNA fragments regardless of their genomic origin, including from essentially any vertebrate species as well as from viruses and microbes.
  • the current disclosure is directed to methods and compositions that enable efficient measurement and analysis of biologically or medically informative DNA methylation patterns.
  • Some disclosed methods and compositions permit characterization of aberrant hypermethylation patterns at CpG Islands throughout a genome without needing to target prespecified sequences of interest.
  • Methods and compositions are described for enriching DNA molecules based on density of methylated CpG sites while minimizing loss of unique sequences derived from densely methylated DNA molecules.
  • Methods and compositions are also described to enable selective sequencing of densely methylated DNA molecules from a population of DNA molecules, while minimizing the loss of unique sequences derived from said densely methylated DNA molecules.
  • Some methods and compositions described herein enable conversion and amplification of DNA with restoration of CpG methylation patterns in the DNA copies.
  • disclosed methods can be applied to detection or monitoring of cancer. In some implementations, disclosed methods can be applied to assessment of evolving methylation patterns during or after a therapy as well as identification of potential mechanisms of therapy resistance. In some implementations, methods described herein can be applied to sensitive detection of residual, recurrent, or progressing cancer by detecting aberrantly hypermethylated DNA fragments in post-treatment biospecimens that match patient-specific patterns of aberrant hypermethylation identified in pre-treatment biospecimens. In some implementations, disclosed methods can be applied to diagnosis and monitoring of non-cancer disease conditions that produce altered methylation patterns in patient biospecimens.
  • disclosed methods can be applied to assessment of organ-specific cell-death in various disease states based on measurement of organ-specific methylation patterns in cell-free DNA. In some implementations, disclosed methods can be applied to measurement of biological aging. In some implementations, disclosed methods can be applied to assessment of CpG island hypermethylation in vertebrate species for veterinary, agricultural, and/or animal model research applications. In some implementations, disclosed methods can be applied to assessment of densely methylated viral and/or microbial DNA fragments. In some implementations, disclosed methods can be developed into a kit. In some implementations, disclosed methods can be combined with spatial labeling techniques to permit assessment of spatial patterns of CpG island hypermethylation in tissues.
  • a method of DNA conversion and amplification with restoration of CpG methylation patterns comprising the following steps: a. converting unmodified cytosine bases in template DNA molecules to uracil bases by deamination, resulting in converted template DNA molecules; b. performing a polymerase chain reaction (PCR) to generate DNA copies of the converted template DNA molecules; c. methylating cytosine bases at unconverted CpG sites in the DNA copies using an enzyme, thereby providing converted and amplified copies of DNA with CpG methylation patterns restored.
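The three steps listed above (deamination, PCR, enzymatic re-methylation) can be illustrated with a toy string model. This is a minimal sketch under stated assumptions, not the disclosed protocol: it tracks only one strand, treats conversion as perfectly efficient, and the function names `convert`, `pcr_copy`, and `remethylate` are hypothetical.

```python
def convert(seq, methylated):
    """Step a: deaminate unmodified C to U; C at indices in `methylated` stays C."""
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def pcr_copy(converted):
    """Step b: PCR reads U as T, and the copies carry no methyl marks."""
    return converted.replace("U", "T")


def remethylate(copy):
    """Step c: a CpG methyltransferase marks every C that is followed by G.

    Returns the 0-based positions of the cytosines that become methylated.
    """
    return {i for i in range(len(copy) - 1) if copy[i:i + 2] == "CG"}


# Toy template with three CpG sites (C at positions 1, 5, 9) and one non-CpG C (4).
template = "ACGTCCGATCGA"
original_marks = {1, 5}  # assumed methylated cytosines, both in CpG context

copy = pcr_copy(convert(template, original_marks))
restored = remethylate(copy)

print(copy)      # ACGTTCGATTGA
print(restored)  # same positions as original_marks
```

In this model the unmethylated CpG at position 9 becomes TpG after conversion, so the methyltransferase no longer recognizes it, which is why only the originally methylated CpG sites are re-marked on the copies.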
  • PCR polymerase chain reaction
  • a method of enriching DNA molecules based on density of methylated CpG sites while minimizing loss of unique sequences derived from densely methylated DNA molecules comprising: a. converting unmodified cytosine bases in template DNA molecules to uracil bases by deamination, resulting in converted template DNA molecules; b. performing a polymerase chain reaction (PCR) to generate DNA copies of the converted template DNA molecules; c. methylating cytosine bases at unconverted CpG sites in the DNA copies using an enzyme, resulting in DNA copies with restored methylation; d. enriching densely methylated members of the population of DNA copies with restored methylation via selective capture based on methylation density.
  • a method of selectively sequencing densely methylated DNA molecules from a population of DNA molecules, while minimizing the loss of unique sequences derived from said densely methylated DNA molecules comprising the following steps: a. converting unmodified cytosine bases in template DNA molecules to uracil bases by deamination, resulting in converted template DNA molecules; b. performing a polymerase chain reaction (PCR) to generate DNA copies of the converted template DNA molecules; c. methylating cytosine bases at unconverted CpG sites in the DNA copies using an enzyme, resulting in converted DNA copies with restored methylation; d.
  • a method for detecting residual, recurrent, or progressing cancer comprising: a. sequencing densely methylated DNA fragments from tumor tissue, blood, plasma, serum, or urine of a patient diagnosed with cancer to identify a plurality of aberrantly hypermethylated CpG island regions that are specific to that patient’s cancer; b. obtaining one or more longitudinal samples of blood, plasma, serum, or urine from the patient after the patient has received a cancer treatment; c. sequencing densely methylated DNA fragments from the post-treatment longitudinal sample; d.
  • Figure 1 provides a schematic illustration of an example method of selectively sequencing densely methylated DNA molecules from a population of DNA molecules with varying methylation density, while minimizing the loss of unique sequences derived from said densely methylated DNA molecules.
  • the schematic representation shows 3 examples of biologically derived input DNA molecules at the top of the figure: one with relatively dense methylation, a second with relatively sparse methylation, and a third with no methylation.
  • the densely methylated DNA is shown with 4 symmetrically methylated CpG sites.
  • densely methylated DNA fragments can have 8 or more methylated CpG sites (symmetric or asymmetric) in fragments of ~100 to 250 base pairs in length.
  • the example shows that adapters are ligated to input DNA fragments, and then the DNA undergoes enzymatic methyl conversion (or bisulfite conversion) followed by PCR amplification.
  • the resulting converted and amplified DNA copies have sequences in which unmodified C bases were converted to T bases, whereas methylated or hydroxymethylated C bases were retained as C bases.
  • some amplified copies shown in the schematic are derived from the converted Watson strand of the input DNA and some are derived from the converted Crick strand.
  • a CpG methyltransferase enzyme is used to restore methylation at unconverted CpG sites in the converted, amplified DNA copies.
  • CpG methyltransferase enables restoration of original methylation patterns.
  • the amplified DNA copies with restored methylation patterns are then shown undergoing selective enrichment of densely methylated DNA copies by competitive binding to methyl binding domain protein (or antibody to 5-mC) and capture on magnetic beads. Densely methylated competitor DNA fragments are added to the capture mix to competitively inhibit the capture of fragments with lower methylation density.
  • the ability to generate multiple redundant copies of each DNA input molecule with methylation patterns restored enables use of stringent capture and enrichment conditions (including an option of more than one round of capture) while preserving representation of unique sequences of densely methylated DNA fragments.
  • the schematic illustrates the purification of a next-generation sequencing (NGS) library of densely methylated, converted sequences. Resulting sequences can be mapped to a reference genome and the original methylation status of cytosine bases can be inferred based on C to T conversion.
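The inference of original methylation status from C to T conversion can be sketched as a per-site comparison between a read and the reference. The sketch below assumes a read already aligned to the top strand of the reference with no indels or sequencing errors; the function name `call_cpg_methylation` is hypothetical, and a retained C is reported as "modified" because the text notes that both methylated and hydroxymethylated cytosines are retained as C.

```python
def call_cpg_methylation(reference, read):
    """Return {position: True/False} for each reference CpG covered by the read.

    True  -> read still shows C (cytosine was modified, so not converted)
    False -> read shows T (cytosine was unmodified and converted)
    Positions where the read shows any other base are skipped.
    """
    calls = {}
    for i in range(min(len(reference), len(read)) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = True
            elif read[i] == "T":
                calls[i] = False
    return calls


reference = "ACGTCCGATCGA"
read = "ACGTTCGATTGA"  # converted, amplified copy of the same locus

calls = call_cpg_methylation(reference, read)
print(calls)  # {1: True, 5: True, 9: False}
```

A real pipeline would use a conversion-aware aligner and handle reads derived from either converted strand; this example only shows the C versus T logic at CpG positions.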
  • NGS next-generation sequencing
  • Figure 2 provides a more detailed schematic illustration of an example method in which double-stranded biologically derived input DNA is converted and amplified with restoration of CpG methylation patterns.
  • In the double-stranded biologically derived input DNA shown at the top of the figure, a symmetrically methylated CpG site is shown (with methylated cytosines on both strands), and several unmethylated cytosines are also shown within and outside of a CpG context.
  • Bisulfite conversion or Enzymatic Methyl conversion results in unmodified cytosines being converted to uracils by deamination. 5-methylcytosines and 5-hydroxymethylcytosines are rarely converted to uracils.
  • Figure 3 provides a detailed schematic illustration of an example method in which single-stranded biologically derived input DNA is converted and amplified with restoration of CpG methylation patterns. The process is largely analogous to that shown in Figure 2, but the resulting converted, amplified, and re-methylated sequences are derived from conversion of a single-strand sequence.
  • Figure 4 shows a schematic overview of an example method which enables enrichment of DNA molecules based on density of methylated CpG sites while minimizing loss of unique sequences derived from densely methylated DNA molecules.
  • the figure highlights the creation of a converted, amplified NGS library with methylation patterns restored, which can then be subjected to stringent selection conditions to enrich densely methylated DNA while taking advantage of the sequence redundancy to preserve unique sequence representation of densely methylated DNA fragments.
  • Figure 5 shows a schematic of a two-step ligation scheme that enables ligation of adapter sequences to double-stranded DNA fragments of interest while minimizing formation of adapter dimers.
  • the illustrated two-step ligation scheme was used in Example 1 in the Detailed Description section to attach adapter sequences to blunted and 5’-phosphorylated cell-free DNA fragments and genomic DNA fragments.
  • the first step involves ligation of stem-loop adapters to the insert DNA by forming a phosphodiester linkage between a 5’-phosphorylated end of the insert DNA and the 3’-hydroxyl end of the stem-loop adapter.
  • the 5’-end of the stem-loop adapter lacks a phosphate, and therefore cannot be ligated to the insert or to another stem-loop adapter molecule (thereby avoiding adapter dimer formation).
  • USER enzyme is used to cleave at deoxyUridine positions in the stem of the stem-loop adapter to destabilize base-pairing. DNA is then cleaned up to remove unligated adapters, ligase, and USER enzyme.
  • a displacer oligonucleotide is added to displace one strand of the stem-loop adapter by hybridization to the opposite (ligated) strand, as shown in the figure.
  • a nick-sealing ligase (HiFi Taq DNA ligase) is used to ligate the 5’-phosphorylated displacer oligonucleotide to the 3’-end of the insert DNA.
  • the stem-loop adapter and displacer oligonucleotides were designed to attach sample barcodes and Illumina adapter sequences to the input DNA fragments.
  • adapter sequences for other sequencing platforms could be readily substituted.
  • Figure 6 shows histograms comparing the CpG dinucleotide content of sequenced cfDNA fragments before vs. after two rounds of selective capture and elution of densely methylated DNA fragments (which is referred to here as high density methyl-capture).
  • the CpG dinucleotide count refers to the number of CpG sites (methylated, hydroxymethylated, or unmethylated) in a biologically derived input DNA fragment, not the remaining (unconverted) CpG sites after conversion and amplification.
  • Red boxes are included to highlight the robust enrichment of fragments harboring 8 or more CpG sites (in fragments averaging ~170-180 bp in length), which is the methylation density range typically found in CpG islands and promoters.
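The CpG dinucleotide count used for the Figure 6 histograms can be computed directly from the unconverted (reference) sequence of each fragment. The following is a minimal sketch; the helper names and the example fragments are hypothetical, and the threshold of 8 is taken from the text above.

```python
from collections import Counter

MIN_CPG_SITES = 8  # fragments at or above this count are highlighted in the text


def cpg_count(fragment):
    """Count CpG dinucleotides (methylated or not) in the original fragment sequence."""
    return fragment.upper().count("CG")


def cpg_histogram(fragments):
    """Map CpG count -> number of fragments with that count."""
    return Counter(cpg_count(f) for f in fragments)


# Hypothetical short fragments, for illustration only.
fragments = ["CG" * 8 + "A" * 20, "ACGT" * 3, "ATATATAT"]

hist = cpg_histogram(fragments)
dense = [f for f in fragments if cpg_count(f) >= MIN_CPG_SITES]

print(dict(hist))
print(len(dense))
```

Comparing `cpg_histogram` for the native and enriched libraries would give the before-versus-after view that the figure describes.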
  • Figure 7 presents a genomic map showing a change in alignment and coverage of sequenced cell-free DNA fragments in the region of the PAX8 gene on Chromosome 2 before vs. after two rounds of selective capture and elution of densely methylated DNA fragments.
  • Preparation of the native library comprised steps of conversion, amplification, and restoration of methylation patterns using methods disclosed herein.
  • the enriched library was further subjected to two rounds of methyl binding domain-based affinity capture and elution with competitive binding of a 226 base-pair competitor DNA containing 10 methylated CpG sites.
  • sequences mapped in a largely random pattern throughout the genome.
  • Figure 8 shows a heat map displaying genomic regions at which aberrantly hypermethylated sequences from cell-free DNA fragments were observed to map in plasma of 11 patients with various types of cancer (advanced stage) and 8 non-cancer control subjects who were heavy smokers participating in a lung cancer screening program. Results are displayed for chromosome 2 (chosen arbitrarily), which is representative of genome-wide patterns. Dark bars represent genomic regions at which mapping is observed of one or more cfDNA fragments that are categorized as aberrantly hypermethylated.
  • Such fragments are densely methylated but map to genomic regions that are expected to have a methylation level of less than 40% (averaged across all CpG sites) in multiple types of healthy cells and tissues based on publicly available whole genome bisulfite sequencing data (from Roadmap and Blueprint studies).
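The categorization described above can be sketched as a two-part test: the fragment must be densely methylated, and the region it maps to must show low methylation in healthy reference tissues. This is a sketch under assumptions: the 40% cutoff comes from the text, the 8-site density cutoff is borrowed from the description of Figure 1, requiring every supplied tissue to fall below the cutoff is one possible reading of "multiple types of healthy cells and tissues", and the function and tissue names are hypothetical.

```python
DENSE_MIN_METHYLATED_CPGS = 8   # assumed density cutoff (see Figure 1 description)
MAX_HEALTHY_METHYLATION = 0.40  # region-averaged methylation expected in healthy tissue


def is_aberrantly_hypermethylated(methylated_cpgs, healthy_levels):
    """Return True if a fragment is dense AND its region is low in healthy tissue.

    methylated_cpgs: number of methylated CpG sites observed on the fragment
    healthy_levels:  {tissue name: mean methylation fraction of the region},
                     e.g. derived from whole genome bisulfite sequencing data
    """
    dense = methylated_cpgs >= DENSE_MIN_METHYLATED_CPGS
    low_in_healthy = bool(healthy_levels) and all(
        level < MAX_HEALTHY_METHYLATION for level in healthy_levels.values()
    )
    return dense and low_in_healthy


# Hypothetical reference values for one region.
print(is_aberrantly_hypermethylated(10, {"liver": 0.05, "blood": 0.10}))  # True
print(is_aberrantly_hypermethylated(10, {"liver": 0.80, "blood": 0.10}))  # False
```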
  • Figure 9 shows the evolution of plasma cell-free DNA hypermethylation patterns over time in a patient with metastatic non-small cell lung cancer being treated with olaparib and cediranib.
  • the patient’s cancer initially showed a modest response to therapy (considered stable disease by RECIST criteria), and subsequently showed progression.
  • initial shrinkage followed by enlargement of a liver metastasis is shown in computed tomography (CT) scan images taken at baseline (prior to therapy) and at cycles 4 and 8 of therapy (each cycle is 28 days).
  • CT computed tomography
  • a graph is also provided showing changes over time in tumor burden (defined as sum of diameters of target lesions according to RECIST guidelines) and in tumor-derived cfDNA level (measured as the variant allele fraction [VAF] of a tumor-specific KRAS mutation in plasma cell-free DNA).
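Both quantities plotted in that graph are simple to compute. The sketch below assumes the variant allele fraction is the ratio of mutant-supporting reads to total reads at the mutated position and that tumor burden is the plain sum of target lesion diameters; the function names and the example numbers are hypothetical and are not data from the patient.

```python
def variant_allele_fraction(mutant_reads, total_reads):
    """Fraction of reads at a locus that carry the tumor-specific variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


def tumor_burden(target_lesion_diameters_mm):
    """Sum of diameters of target lesions, in millimetres."""
    return sum(target_lesion_diameters_mm)


# Hypothetical values for illustration only.
vaf = variant_allele_fraction(mutant_reads=25, total_reads=1000)
burden = tumor_burden([32.0, 18.5])
print(vaf, burden)
```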
  • the tumor burden initially decreases with the drug therapy but then increases, likely because of growth of treatment-resistant tumor clones.
  • the mutant tumor-derived cfDNA level shows a transient spike (possibly due to initial cell kill) followed by a decline, indicative of tumor response. After ~4 months, the tumor-derived cfDNA level began to increase, indicative of tumor progression.
  • aberrantly hypermethylated cfDNA fragment counts mapping to chromosome 10 are shown at 4 time points: at baseline, shortly after beginning treatment, at the nadir of mutant tumor-derived cfDNA level, and when the cancer has clearly progressed.
  • Each circle indicates the observation of one or more aberrantly hypermethylated cfDNA fragments mapping to a CpG island at that genomic location.
  • Circle size is proportional to the number of fragments mapping to a given CpG island. Blue circles indicate CpG islands at which aberrantly hypermethylated cfDNA fragments mapped at baseline and during therapy. Green circles indicate CpG islands at which aberrantly hypermethylated fragments were observed at either of the first two time points but not thereafter.
  • Red circles indicate CpG islands at which aberrantly hypermethylated fragments were not observed at either of the first two time points but emerged thereafter. Analysis of such evolving aberrant hypermethylation patterns at CpG islands can provide biological and clinical insights pertaining to epigenetic resistance mechanisms, tumor heterogeneity, prognosis, and response or lack of response to therapy.
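The color scheme of Figure 9 amounts to a rule over the four time points. The sketch below encodes one reading of it: the first two time points are treated as "early" and the last two as "thereafter", and blue is assumed to mean the island was observed in both windows. The function name and island labels are hypothetical.

```python
def classify_island(counts):
    """Classify a CpG island from fragment counts at 4 ordered time points.

    counts: [baseline, early on treatment, cfDNA nadir, progression]
    Returns "blue", "green", "red", or None if never observed.
    """
    if len(counts) != 4:
        raise ValueError("expected counts for exactly 4 time points")
    early = any(c > 0 for c in counts[:2])
    late = any(c > 0 for c in counts[2:])
    if early and late:
        return "blue"   # present early and persisting during therapy
    if early:
        return "green"  # seen at either of the first two time points, not after
    if late:
        return "red"    # absent at the first two time points, emerged later
    return None


# Hypothetical islands, for illustration only.
islands = {
    "island_A": [3, 2, 1, 4],
    "island_B": [2, 0, 0, 0],
    "island_C": [0, 0, 0, 5],
}
colors = {name: classify_island(c) for name, c in islands.items()}
print(colors)
```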
  • Figure 10 compares cell-free DNA hypermethylation patterns at CpG islands in plasma samples obtained at two time points (pre- and post-treatment) in a 76-year-old male patient with metastatic non-small cell lung cancer who received immune checkpoint inhibitor therapy with the drug pembrolizumab. Plasma samples were obtained from the patient prior to initiating therapy (on cycle 1 day 1) and again after completing one cycle of treatment (on cycle 2 day 1, prior to receiving the second cycle). Cell-free DNA was extracted from plasma and was tested according to the methods described in Example 1. The Figure shows a graph in which cell-free DNA fragment counts mapping to various CpG islands are displayed for two time points (pre-treatment on the X-axis, and after 1 cycle on the Y-axis).
  • Each data point on the graph shows cell-free DNA fragment counts mapping to an individual CpG island at the two time points.
  • the graph shows that at many CpG islands, the relative fragment counts mapping to those CpG islands remain fairly stable over time, suggesting that these CpG islands are unlikely to be cancer-associated (considered background signal).
  • the graph also shows that at several other CpG islands, the relative fragment counts mapping to those CpG islands decrease substantially from the pre-treatment sample to the post-treatment sample, suggesting that these CpG islands are likely to be cancer-associated.
  • Such analysis can facilitate identification of CpG islands that show cancer-associated hypermethylation in a given patient (i.e., a personalized cancer-associated hypermethylation pattern).
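One way to turn the Figure 10 comparison into a rule is to normalize counts within each sample and flag islands whose relative count falls sharply after treatment. This is only a sketch: normalizing by the sum over the listed islands and the `max_ratio=0.5` cutoff are illustrative assumptions, not values from the disclosure, and the function names and counts are hypothetical.

```python
def relative_counts(counts):
    """Scale per-island fragment counts so they sum to 1 within a sample."""
    total = sum(counts.values()) or 1  # avoid division by zero for empty samples
    return {island: n / total for island, n in counts.items()}


def candidate_cancer_islands(pre, post, max_ratio=0.5):
    """Islands whose relative count post-treatment is <= max_ratio of pre-treatment."""
    pre_rel = relative_counts(pre)
    post_rel = relative_counts(post)
    return sorted(
        island
        for island, p in pre_rel.items()
        if p > 0 and post_rel.get(island, 0.0) / p <= max_ratio
    )


# Hypothetical fragment counts per CpG island.
pre = {"island_A": 50, "island_B": 50, "island_C": 100}
post = {"island_A": 48, "island_B": 52, "island_C": 10}

flagged = candidate_cancer_islands(pre, post)
print(flagged)  # ['island_C']
```

Islands A and B stay roughly stable and would be treated as background, while island C drops substantially and would be a candidate for the patient-specific cancer-associated pattern.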
  • Figure 11 shows that densely methylated viral DNA can be captured and sequenced using methods disclosed herein.
  • the data for the figure were generated from a 1 mL plasma sample that was obtained from a male patient with HIV who developed diffuse large B-cell lymphoma (DLBCL). Densely methylated viral DNA fragments were captured from this patient’s plasma in parallel with densely methylated cell-free DNA fragments derived from the patient’s genome. It is known that DLBCL in the setting of HIV is often associated with latent Epstein-Barr Virus (EBV) infection of B-cells. The plasma sample was obtained prior to initiation of any therapy.
  • EBV Epstein-Barr Virus
  • Cell-free DNA (including viral DNA) was extracted from the plasma sample and was tested according to the methods described in Example 1, with a modification in the bioinformatic analysis to include alignment of DNA fragment sequences to viral genomes including Epstein-Barr Virus, HIV-1, Human Papilloma Virus, Kaposi’s Sarcoma Herpesvirus, Hepatitis B Virus, and Hepatitis C virus, in addition to the human genome reference (hg38).
  • the Figure shows densely methylated cell-free DNA fragments mapping to the EBV genome in the plasma of this patient. Note the periodicity of sequence coverage suggests phased nucleosomal protection of cfDNA fragments. Red bars in magnified views indicate methylated CpG sites; blue bars indicate unmethylated sites.
  • the current disclosure is directed to methods and compositions relating to medical diagnostics and biomedical research. Some methods enable enrichment of densely methylated DNA fragments from a mixture of DNA fragments with varying methylation density. Some methods enable enrichment of densely methylated DNA fragments that map to CpG island regions or promoter regions (or both) in a genome. Some methods enable enrichment of densely methylated DNA fragments while reducing loss of unique sequence representation of such fragments. Some methods enable enrichment of densely methylated DNA fragments using affinity capture techniques in which an antibody or protein preferentially binds to 5-methylcytosine or a methylated CpG site.
  • Some methods enable identification and quantification of DNA molecules that harbor epigenetic modifications including 5-methylcytosine, 5-hydroxymethylcytosine, or both. Some methods include methylated competitor DNA during affinity capture to modify the methylation density profile of the captured DNA. Some methods use next-generation sequencing to obtain the sequences of the enriched, densely methylated DNA fragments. Some methods include library preparation steps prior to and/or after the enrichment of densely methylated DNA fragments to enable next-generation sequencing of the densely methylated DNA. Some methods reduce loss of unique sequence representation of densely methylated DNA fragments during affinity capture by producing a plurality of copies of the template DNA fragments (with CpG methylation patterns of the template DNA molecules restored on the DNA copies) prior to performing selective affinity capture based on methylation density. Some methods enable conversion and amplification of DNA with restoration of CpG methylation patterns in the DNA copies. Some methods are well-suited to analysis of DNA from biological specimens in which the DNA quantity is limited, such as cell-free DNA from blood.
  • methods described herein address the challenge of selectively capturing and sequencing densely methylated DNA from CpG islands and/or promoter regions of vertebrate genomes without pre-defining target sequences and with minimal loss of desired sequence information during selective capture.
  • Enrichment of densely methylated DNA from CpG islands for sequencing is desirable because these genomic regions can be especially rich in biologically informative methylation signals (signals of interest, e.g., for biomarker development).
  • differentially methylated regions (DMRs) frequently occur at CpG islands, even though CpG islands have been estimated to constitute only approximately 1-2% of most mammalian genomes.
  • some methods described herein enable generation of many redundant copies of each original (biologically derived) DNA fragment prior to performing selective capture of densely methylated DNA. In this way, representation of unique sequences is preserved even if a large proportion of desired molecules are lost under high-stringency capture conditions. For example, if an original densely methylated DNA molecule is amplified by PCR to 200 copies and only 5% of these copies are recovered after capture, about 10 copies of that unique original molecule would be expected to remain available for sequencing. However, standard PCR amplification does not copy methylation marks, making it impossible to subsequently perform affinity capture based on DNA methylation density. To enable amplification of DNA with restoration of methylation patterns, some methods are described herein that permit conversion and amplification of DNA with restoration of methylation at CpG sites.
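The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the assumption that each copy is recovered independently with the same probability is ours, not a statement from the disclosure.

```python
def expected_copies_recovered(n_copies: int, recovery_rate: float) -> float:
    """Mean number of copies of one original molecule recovered after capture."""
    return n_copies * recovery_rate


def prob_sequence_lost(n_copies: int, recovery_rate: float) -> float:
    """Probability that no copy at all is recovered.

    Assumption (not from the source): each copy is recovered independently
    with probability `recovery_rate`.
    """
    return (1.0 - recovery_rate) ** n_copies


if __name__ == "__main__":
    # 200 PCR copies, 5% recovery, as in the example in the text
    print(expected_copies_recovered(200, 0.05))
    # Chance of losing the unique sequence with and without pre-amplification
    print(prob_sequence_lost(1, 0.05))
    print(prob_sequence_lost(200, 0.05))
```

Under this simple model, a single unamplified molecule is lost 95% of the time at 5% recovery, whereas with 200 copies the chance of losing every copy falls to the order of 10^-5.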
  • some methods described herein comprise conversion of original (biologically derived) DNA molecules resulting in deamination of unmodified cytosine bases to uracil bases.
  • conversion can be performed by treatment of DNA with bisulfite or with enzymatic treatment (TET2 then APOBEC) as in Enzymatic Methylation sequencing (EM-Seq).
  • cytosine bases which are methylated (5-mC) or hydroxymethylated (5-hmC) are protected from deamination.
  • the converted DNA can then undergo PCR amplification in which the uracil bases in the original converted DNA molecules are replaced by thymine bases in the DNA copies.
  • some methods described herein use a CpG methyltransferase (such as M.SssI) to restore methylation at CpG sites in the amplified DNA copies which correspond to CpG sites that were either methylated or hydroxymethylated in the original DNA molecules.
  • the CpG methyltransferase enzyme M.SssI catalyzes methylation at the C5 position of all cytosine residues within the double-stranded dinucleotide recognition sequence 5’...CG...3’ (CG dinucleotides, also known as CpG sites).
  • CpG methyltransferase can be used to restore the original methylation patterns on the DNA copies.
  • with some conversion methods, such as bisulfite conversion and EM-Seq conversion, the DNA copies will sometimes (rarely) contain methylated CpG sites where the original DNA was not methylated or hydroxymethylated, or conversely, will lack methylation at a CpG site that was methylated or hydroxymethylated on the original DNA.
  • the end result of this conversion, amplification, and re-methylation process is the production of amplified DNA copies in which unmodified C bases in the original DNA fragments are converted to T bases in the DNA copies, and methylated or hydroxymethylated CpG sites in the original DNA fragments are restored as methylated CpG sites in the DNA copies.
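The end result described above can be illustrated with a small in silico sketch. It models only the top strand of one fragment, assumes error-free conversion, and treats the methyltransferase step as methylating every CG that remains in the copy; the function name and the position-set representation are hypothetical choices made for illustration.

```python
def convert_and_remethylate(seq: str, modified_positions: set) -> tuple:
    """Simulate conversion, PCR, and CpG re-methylation for one strand.

    seq: original top-strand sequence (A/C/G/T).
    modified_positions: 0-based indices of cytosines that were 5-mC or 5-hmC
        in the original molecule (these are protected from deamination).
    Returns (copy_sequence, restored_methylated_cpg_positions).
    """
    copy = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in modified_positions:
            copy.append("T")  # unmodified C -> U -> read as T after PCR
        else:
            copy.append(base)  # protected C (or any other base) is retained
    copy_seq = "".join(copy)

    # Idealized M.SssI-like step: every CG left in the copy becomes methylated.
    restored = {i for i in range(len(copy_seq) - 1) if copy_seq[i:i + 2] == "CG"}
    return copy_seq, restored


if __name__ == "__main__":
    # The C at index 1 is methylated; the cytosines at indices 4 and 5 are not.
    print(convert_and_remethylate("ACGTCCGA", {1}))
```

In this toy case the unmethylated CG at index 5 becomes TG in the copy, so only the originally methylated CpG remains available for re-methylation.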
  • Taking advantage of the sequence redundancy of the converted DNA copies with methylation patterns restored, some methods described herein are able to enrich DNA fragments based on their methylation density while minimizing loss of unique sequence representation of densely methylated DNA fragments.
  • the sequence redundancy permits use of enrichment or capture conditions that are highly selective for densely methylated DNA sequences. Because of the sequence redundancy, loss of some copies of an original (biologically derived) densely methylated DNA fragment under stringent enrichment conditions would be unlikely to result in complete loss of sequence representation of that fragment.
  • enrichment of densely methylated DNA copies could be performed using affinity purification methods based on antibodies or proteins that bind to 5-methylcytosine or to symmetrically methylated CpG sites in double-stranded DNA.
  • densely methylated competitor DNA molecules can be added to the affinity purification mixture to preferentially occupy binding sites to reduce the probability of capture of DNA copies with low methylation density.
  • the enriched, densely methylated, converted DNA copies can undergo next-generation sequencing to enable characterization of the sequences, genomic mapping locations, and methylation status of the DNA copies.
  • the term “enrichment of densely methylated DNA” refers to at least a 2-fold increase in the fraction of densely methylated DNA molecules divided by the total number of DNA molecules in a population. In a preferred implementation, this term refers to at least a 10-fold increase in the fraction of densely methylated DNA molecules divided by the total number of DNA molecules in a population. In a more preferred implementation, this term refers to at least a 100-fold increase in the fraction of densely methylated DNA molecules divided by the total number of DNA molecules in a population. In a most preferred implementation, this term refers to at least a 500-fold increase in the fraction of densely methylated DNA molecules divided by the total number of DNA molecules in a population.
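The definition above reduces to a ratio of fractions before and after enrichment. The following sketch computes that ratio and maps it onto the thresholds named in the text; the function names and tier labels are hypothetical, and the molecule counts in the example are invented for illustration only.

```python
def fold_enrichment(dense_before: int, total_before: int,
                    dense_after: int, total_after: int) -> float:
    """Fold change in (densely methylated molecules / total molecules)."""
    # Algebraically equal to (dense_after/total_after) / (dense_before/total_before)
    return (dense_after * total_before) / (total_after * dense_before)


def enrichment_tier(fold: float) -> str:
    """Map a fold change onto the thresholds stated in the definition."""
    if fold >= 500:
        return "most preferred (>=500-fold)"
    if fold >= 100:
        return "more preferred (>=100-fold)"
    if fold >= 10:
        return "preferred (>=10-fold)"
    if fold >= 2:
        return "enriched (>=2-fold)"
    return "not enriched (<2-fold)"


if __name__ == "__main__":
    # Invented counts: 10 of 10,000 molecules dense before; 500 of 1,000 after
    fold = fold_enrichment(10, 10_000, 500, 1_000)
    print(fold, enrichment_tier(fold))
```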
  • methylation status of the DNA fragments can be directly assessed because of the sequence conversion.
  • Other existing methods such as MeDIP and MBD Capture that enrich methylated DNA have been developed to directly capture methylated DNA fragments derived from the biological source (sometimes preceded by or followed by steps to incorporate adapters and/or indices for next-generation sequencing), without conversion or amplification of the DNA prior to capture. With these methods, because the DNA did not undergo conversion, the captured fragments are assumed to be methylated at CpG sites based on the fact that they were selectively captured, but there is no direct sequence-based evidence of their methylation state.
  • DNA used as input for the assays and/or methods described herein can be derived from biological sources (such DNA is referred to herein as input DNA, original DNA, original input DNA, or biologically derived DNA).
  • input DNA can be cell-free DNA (cfDNA) derived from biofluids or biospecimens including but not limited to blood, plasma, serum, saliva, sputum, stool, cerebrospinal fluid, Papanicolaou smear fluid, uterine lavage fluid, peritoneal fluid, pleural fluid, or urine.
  • input DNA can be cell-derived DNA or exosome-derived DNA obtained from biofluids or biospecimens including but not limited to tissue, blood, plasma, serum, saliva, sputum, stool, cerebrospinal fluid, Papanicolaou smear fluid, uterine lavage fluid, peritoneal fluid, pleural fluid, or urine.
  • input DNA can be double-stranded, single-stranded, or a combination of both.
  • DNA can be obtained from patients with cancer.
  • input DNA can be obtained from individuals being screened for cancer.
  • input DNA can be obtained from individuals with inflammatory, autoimmune, or infectious disease processes.
  • DNA can be obtained from healthy individuals with no known disease. In some implementations, DNA can be obtained from forensic specimens including but not limited to hair, blood, semen, vaginal fluid, and skin. In some implementations, DNA can be obtained from sources that combine the DNA of multiple individuals or organisms including but not limited to human wastewater, agricultural wastewater, agricultural food stocks, and animal-derived food products.
  • adapter oligonucleotides that are compatible with a particular sequencing platform can be ligated to the DNA (to produce a next-generation sequencing library).
  • adapter sequences can be compatible with one or more of the following sequencing platforms, including but not limited to Illumina, Ion Torrent, Pacific Biosciences, BGI, Complete Genomics, and Oxford Nanopore.
  • the ends of the DNA inserts may be prepared for ligation to adapter oligonucleotides by enzymatic treatment to phosphorylate the 5’-ends, to produce blunt ends, or to produce ends with appropriate overhangs that are compatible with the adapters that are to be ligated.
  • a tagmentation approach can be used to attach adapters.
  • DNA adapter molecules are ligated to both paired DNA strands of a double-stranded DNA fragment (on one end or both ends of the DNA fragment).
  • adapter molecules can be attached in a similar manner by a transposase enzyme or by primer extension.
  • an adapter molecule can be ligated to the 5’-end of one strand of DNA, and a polymerase can be used to extend the 3’-end of the opposite strand to make a reverse-complement copy of the ligated adapter molecule, thereby attaching adapter sequences to both strands of the DNA.
  • the adapter molecule can comprise a DNA sequence tag that is substantially unique to the adapter (e.g., a Unique Molecular Identifier).
  • the adapter molecule can comprise a Molecular Lineage Tag (which may have diverse sequences but not necessarily sufficient diversity to be unique).
  • the adapter can be fully double-stranded, or can be partially double-stranded and partially single-stranded.
  • the adapter can be fully single-stranded.
  • the adapter can comprise the 4 unmodified DNA bases (A, C, T, and G).
  • the adapter can comprise modified DNA bases, including but not limited to 5-methylcytosine and/or 5-hydroxymethylcytosine.
  • partially-double-stranded adapters can be ligated to both strands of the double-stranded DNA fragments.
  • adapters can be ligated to the biologically derived input DNA molecules prior to conversion.
  • adapters can be ligated to converted DNA molecules prior to amplification.
  • adapters can be ligated to converted and amplified DNA molecules prior to restoration of methylation patterns.
  • adapters can be ligated to converted, amplified DNA copies with methylation patterns restored prior to enrichment of densely methylated copies.
  • adapters can be ligated to DNA that has been converted, amplified, restored in its methylation patterns, and enriched for dense methylation, prior to next-generation sequencing.
  • adapters can comprise 5-methylcytosine (or 5-hydroxymethylcytosine or 5-carboxycytosine or 5-formylcytosine) bases to prevent conversion in adapter sequences.
  • adapter sequences can be designed to avoid incorporation of CpG sequences which can subsequently become methylated by CpG methyltransferase.
  • adapters can be designed to minimize the formation of adapter dimers during ligation.
  • the process of adapter ligation can be optimized to reduce or prevent the formation of adapter dimers.
  • a two-step ligation approach is used to minimize formation of adapter dimers.
  • a two-step ligation approach (as schematized in Figure 5) comprises: (1) ligation of a 3’-end of a stem-loop adapter oligonucleotide to a 5’-end of a double-stranded insert DNA, without ligation of the opposite strand; and (2) displacement of the unligated strand of the stem-loop adapter by a displacer oligonucleotide, followed by ligation of the 5’-end of the displacer oligonucleotide to a 3’-end of the insert DNA.
  • a variety of adapter ligation methods are known in the art, including single stranded and double stranded ligation methods; in some implementations, any of these ligation methods can be utilized to produce next generation sequencing libraries of the densely methylated DNA.
  • the biologically derived DNA can undergo chemical or enzymatic (or both) conversion processes to enable methylated cytosines to be distinguished from unmethylated cytosines in the DNA.
  • the conversion process comprises bisulfite conversion.
  • the conversion process comprises the conversion methods used in Enzymatic Methylation Sequencing (EM-seq).
  • unmodified cytosine bases in the biologically derived DNA are converted to uracils by deamination, and can be subsequently replaced by thymine bases in PCR-amplified DNA copies.
  • 5-methylcytosine bases and 5-hydroxymethylcytosine bases are protected from conversion, and are represented as cytosine bases in PCR-amplified DNA copies.
  • alternative conversion methods could be used to produce the conversion patterns shown in Table 1.
  • OxBS-seq Oxidative Bisulfite sequencing
  • TAB-seq TET- Assisted Bisulfite sequencing
  • ACE-seq APOBEC-Coupled Epigenetic sequencing
  • TAPS-seq TET-Assisted Pyridine Borane sequencing
  • Sequenced base* is sequencing output after conversion of input DNA and PCR.
  • Table 1 is not comprehensive; additional conversion methods exist and could be used with our approach.
  • the conversion is performed using chemical reagents, including but not limited to sodium bisulfite, potassium perruthenate, and/or pyridine borane.
  • the conversion is performed using enzymatic methods including, but not limited to APOBEC3A, TET2, and/or T4-betaGal.
  • the conversion is performed using a combination of enzymatic and chemical methods.
  • adapters may contain modified bases which would be resistant to conversion.
  • it is desirable that methylated CpG sites in the original biologically derived DNA be retained as CpG sites in the converted, PCR-amplified DNA copies, and that unmethylated CpG sites in the original biologically derived DNA be converted to a non-CG sequence in the converted, PCR-amplified DNA copies.
  • converted DNA is amplified via a polymerase chain reaction (PCR).
  • PCR amplification results in replacement of uracil bases in the converted DNA with thymine bases in the amplified DNA copies.
  • dUTP nucleotides can be included in the PCR buffer to retain uracil bases in the converted DNA as uracil bases in the amplified DNA copies.
  • the PCR amplification can be catalyzed by an enzyme that has the ability to read and amplify DNA templates containing uracil bases, including but not limited to Q5U polymerase (New England Biolabs), Phusion U polymerase (Thermo Fisher), and ZymoTaq Polymerase (Zymo Research).
  • the polymerase chain reaction can be facilitated by thermocycling.
  • an isothermal amplification reaction can be used to amplify the DNA, such as loop-mediated isothermal amplification (LAMP) or rolling circle amplification.
  • thermocycling can be stopped before the PCR amplification reaches plateau phase to ensure that most amplified products remain double-stranded.
  • fluorescence signal can be monitored via real time quantitative PCR to determine when thermocycling should be stopped prior to plateau phase. When a PCR amplification approaches plateau phase (saturation), the amplified products can become denatured with a low probability of becoming double-stranded by primer-extension in the next cycle (since PCR reagents have been exhausted).
  • when the amplified products have high sequence diversity (such as when genomic DNA is amplified), there is very low probability of re-annealing of top and bottom strand copies of a given template DNA molecule. If PCR reaches plateau phase, many amplified copies will be single-stranded, and will therefore be unable to undergo subsequent methylation via a CpG methyltransferase enzyme. In some implementations, single-stranded DNA copies from a PCR amplification that was allowed to reach plateau phase could be restored to double-stranded DNA by one or more rounds of primer extension in a separate enzymatic reaction.
  • PCR-amplified DNA products can be run on an electrophoretic gel (e.g., agarose) to selectively purify double-stranded DNA fragments of the desired size range.
  • PCR-amplified DNA products can be size-selected based on binding to solid-phase reversible immobilization (SPRI) paramagnetic beads.
  • a CpG methyltransferase can be used to methylate cytosine bases at CpG sites in the converted, amplified double-stranded DNA copies.
  • the CpG methyltransferase can be M.SssI.
  • the CpG methyltransferase can be a member of the family of DNMT3 enzymes.
  • the CpG methyltransferase can be a member of the DNMT1 enzymes.
  • the CpG methyltransferase can be any methyltransferase with specificity for methylation of CpG sites.
  • the converted, amplified DNA copies with methylation patterns restored can be subjected to enrichment based on methylation density of the DNA copies.
  • densely methylated DNA copies are enriched.
  • enrichment of densely methylated DNA copies is enabled by affinity capture using one or more antibodies that specifically bind to 5-methylcytosine.
  • enrichment of densely methylated DNA copies is enabled by affinity capture using any member of the family of methyl binding domain proteins (MBD), or derivatives thereof, that have binding affinity for methylated double-stranded CpG sites.
  • enrichment of densely methylated DNA copies is enabled by affinity capture using MeCP2.
  • 5-methylcytosine bases in the methylated DNA copies can be converted to 5-hydroxymethylcytosine or 5-formylcytosine or 5-carboxycytosine, and enrichment of DNA copies can be enabled by affinity capture based on binding to the correspondingly modified cytosine base.
  • affinity capture of densely methylated DNA can be mediated by any of (but not limited to) the following: antibodies, aptamers, Affibodies, proteins, or peptides.
  • the selectivity of enrichment of densely methylated DNA can be increased by including methylated competitor DNA molecules in the affinity purification mixture.
  • the methylated competitor DNA can comprise DNA molecules with a high methylation density to competitively inhibit capture of DNA copies with a lower methylation density, and to promote capture of DNA copies with a high methylation density.
  • the methylated competitor DNA can be synthesized via chemical means on an oligonucleotide synthesizer.
  • the methylated competitor DNA can be produced via PCR amplification of a template that contains multiple CG dinucleotides, followed by methylation of CpG sites in the amplified competitor DNA copies using a CpG methyltransferase.
  • the methylated competitor DNA can be derived from a natural source.
  • the methylated competitor DNA can comprise many copies of a single defined sequence with a defined number of methylated CpG sites.
  • the methylated competitor DNA can comprise many different sequences with a defined number of methylated CpG sites.
  • the methylated competitor DNA can comprise many different sequences with a range of CpG density or CpG content.
  • the methylated competitor DNA can be derived from a biological source including but not limited to animals, plants, microbes, or viruses. In some implementations, the methylated competitor DNA can be derived from chemical synthesis. In some implementations, the methylated competitor DNA can be derived from in vitro enzymatic reactions. In some implementations, the methylated competitor DNA can have a length between 200 base pairs and 400 base pairs. In some implementations, the methylated competitor DNA can have a length between 20 base pairs and 1000 base pairs. In some implementations, the competitor DNA can have a broad range of lengths without any specified limits. In some implementations, the competitor DNA can have an average CpG methylation density of between 3 and 20 methylated CpG sites per 100 base pairs.
  • the competitor DNA can have an average CpG methylation density of between 5 and 15 methylated CpG sites per 100 base pairs. In some implementations, the competitor DNA can have an average CpG methylation density of between 6 and 10 methylated CpG sites per 100 base pairs. In some implementations, the methylation density of the competitor DNA can be adjusted to a level that yields a desired methylation density profile in the captured DNA of interest.
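The density ranges above are expressed as methylated CpG sites per 100 base pairs. A minimal sketch of that calculation for a candidate competitor sequence follows. It assumes the competitor is fully methylated at every CG (for example after treatment with a CpG methyltransferase), so CG count stands in for methylated CpG count; the function names and the test sequence are hypothetical.

```python
def cpg_sites_per_100bp(seq: str) -> float:
    """CG dinucleotides per 100 bp of a single (top-strand) sequence.

    Assumption: every CG is methylated, so this equals the methylated
    CpG density of the competitor.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n_cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return 100.0 * n_cpg / len(seq)


def within_density_range(seq: str, low: float, high: float) -> bool:
    """True if the density lies in [low, high] (inclusive bounds assumed)."""
    return low <= cpg_sites_per_100bp(seq) <= high


if __name__ == "__main__":
    competitor = "ACGTTTTTTT" * 30  # hypothetical 300 bp sequence
    print(cpg_sites_per_100bp(competitor))
    print(within_density_range(competitor, 6, 10))
```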
  • various parameters of the affinity purification can be adjusted to achieve a desired methylation density profile in the captured DNA of interest; the parameters include but are not limited to: amount of competitor DNA, methylation density of competitor DNA, amount of binding protein (or antibody), amount of affinity capture beads, density of capture sites on the beads or surface, temperature of the capture, buffer conditions of the capture, washing conditions, and conditions of elution.
  • the selectivity of the enrichment method can be adjusted to capture mostly DNA fragments that map to CpG islands.
  • a single round of capture can be performed.
  • two or more rounds of capture can be performed to further remove fragments with low-methylation density. Sequence redundancy of the methylated DNA copies enables highly selective enrichment of densely methylated DNA, optionally including two or more rounds of enrichment, with minimal loss of unique sequence representation.
  • PCR amplification is necessary to produce sufficient DNA for next-generation sequencing of the densely methylated DNA copies that were converted, amplified, re-methylated, and enriched via selective capture.
  • a post-enrichment PCR amplification can be performed.
  • primers used in a post-enrichment PCR amplification can incorporate additional sequences in the amplified DNA copies, including but not limited to indices or barcodes to enable sample multiplexing and sequences needed for compatibility with a sequencing platform (e.g., Illumina P5 and P7 sequences).
  • the amplified next-generation sequencing library can be purified and/or undergo size-selection to enrich for DNA products of the appropriate length.
  • Figure 1 provides a schematic illustration of an example method of selectively sequencing densely methylated DNA molecules from a population of DNA molecules with varying methylation density, while minimizing the loss of unique sequences derived from said densely methylated DNA molecules.
  • the schematic representation shows 3 examples of biologically-derived input DNA molecules at the top of the figure: one with relatively dense methylation, a second with relatively sparse methylation, and a third with no methylation.
  • the densely methylated DNA is shown with 4 symmetrically methylated CpG sites.
  • densely methylated DNA fragments can have 8 or more methylated CpG sites (symmetric or asymmetric) in fragments of approximately 100 to 250 base pairs in length.
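As a rough illustration of the figure quoted above (8 or more methylated CpG sites in fragments of roughly 100 to 250 base pairs), the check below flags a fragment as densely methylated. Treating these numbers as hard cutoffs is an assumption made for this sketch; the text presents them as an example, not a fixed definition, and the function name is hypothetical.

```python
def is_densely_methylated(n_methylated_cpgs: int, fragment_length: int,
                          min_sites: int = 8,
                          min_length: int = 100,
                          max_length: int = 250) -> bool:
    """Flag a fragment using example thresholds taken from the text.

    The thresholds are adjustable defaults, not values fixed by the source.
    """
    in_length_range = min_length <= fragment_length <= max_length
    return in_length_range and n_methylated_cpgs >= min_sites


if __name__ == "__main__":
    print(is_densely_methylated(9, 167))   # typical cfDNA-sized fragment
    print(is_densely_methylated(3, 167))
```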
  • the example shows that adapters are ligated to input DNA fragments, and then the DNA undergoes enzymatic methyl conversion (or bisulfite conversion) followed by PCR amplification.
  • the resulting converted and amplified DNA copies have sequences in which unmodified C bases were converted to T bases, whereas methylated or hydroxymethylated C bases were retained as C bases.
  • some amplified copies shown in the schematic are derived from the converted Watson strand of the input DNA and some are derived from the converted Crick strand.
  • a CpG methyltransferase enzyme is used to restore methylation at unconverted CpG sites in the converted, amplified DNA copies.
  • CpG methyltransferase enables restoration of original methylation patterns.
  • the amplified DNA copies with restored methylation patterns are then shown undergoing selective enrichment of densely methylated DNA copies by competitive binding to methyl binding domain protein (or antibody to 5-mC) and capture on magnetic beads. Densely methylated competitor DNA fragments are added to the capture mix to competitively inhibit the capture of fragments with lower methylation density.
  • the ability to generate multiple redundant copies of each DNA input molecule with methylation patterns restored enables use of stringent capture and enrichment conditions (including an option of more than one round of capture) while preserving representation of unique sequences of densely methylated DNA fragments.
  • the schematic illustrates the purification of a next-generation sequencing (NGS) library of densely methylated, converted sequences. Resulting sequences can be mapped to a reference genome and the original methylation status of cytosine bases can be inferred based on C to T conversion.
  • Figure 2 provides a more detailed schematic illustration of an example method in which double-stranded biologically-derived input DNA is converted and amplified with restoration of CpG methylation patterns.
  • in the double-stranded biologically-derived input DNA shown at the top of the figure, a symmetrically methylated CpG site is shown (with methylated cytosines on both strands), and several unmethylated cytosines are also shown within and outside of a CpG context.
  • Bisulfite conversion or Enzymatic Methyl conversion results in unmodified cytosines being converted to uracils by deamination. 5-methylcytosines and 5-hydroxymethylcytosines are rarely converted to uracils.
  • Figure 3 provides a detailed schematic illustration of an example method in which single-stranded biologically-derived input DNA is converted and amplified with restoration of CpG methylation patterns. The process is largely analogous to that shown in Figure 2, but the resulting converted, amplified, and re-methylated sequences are derived from conversion of a single-strand sequence.
  • Figure 4 shows a schematic overview of an example method which enables enrichment of DNA molecules based on density of methylated CpG sites while minimizing loss of unique sequences derived from densely methylated DNA molecules.
  • the figure highlights the creation of a converted, amplified NGS library with methylation patterns restored, which can then be subjected to stringent selection conditions to enrich densely methylated DNA while taking advantage of the sequence redundancy to preserve unique sequence representation of densely methylated DNA fragments.
  • sequencing includes but is not limited to next-generation sequencing (NGS) or massively parallel sequencing.
  • an NGS platform used for analysis can be a sequencer made by Illumina.
  • next-generation sequencing can be performed on an instrument manufactured by companies including but not limited to Illumina, Ion Torrent, Pacific Biosciences, Qiagen, Thermo Fisher, Roche, BGI, Complete Genomics, and Oxford Nanopore.
  • sequencing can be performed in paired-end mode or in single-end mode.
  • sequencing read lengths can be between 30 and 1000 bases.
  • long-read sequencing can be used, in which read lengths are not defined.
  • sequencing is performed with 150- or 100-base read-lengths, in paired-end mode.
  • the sequencing output yields a plurality of converted sequences.
  • the converted, densely methylated DNA copies produced using methods described herein can be analyzed via other analytical means including but not limited to microarrays, pyrosequencing, primer extension assays, hybridization with complementary oligonucleotides, and/or analysis via fluorescence in microfluidic devices.
  • next-generation sequencing of DNA libraries produced using methods described herein yields a plurality of converted DNA sequences.
  • converted sequences comprise sequences in which unmodified cytosine bases in the original DNA molecules are read as thymine bases in the converted sequences.
  • converted sequences comprise sequences in which 5-methylcytosine bases in the original DNA molecules are read as cytosine bases in the converted sequences.
  • converted sequences comprise sequences in which 5-hydroxymethylcytosine bases in the original DNA molecules are read as cytosine bases in the converted sequences.
  • converted sequences can be aligned to reference genome sequences that have been converted in silico.
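One common way to convert a reference in silico is to build two three-letter versions of it: one with every C replaced by T (matching reads derived from the converted top strand) and one with every G replaced by A (matching reads derived from the converted bottom strand). The sketch below shows only that string transformation; whether the disclosed methods use this particular scheme is an assumption, and the function name is hypothetical.

```python
def in_silico_convert(reference: str) -> tuple:
    """Return (C-to-T converted, G-to-A converted) versions of a reference."""
    ref = reference.upper()
    c_to_t = ref.replace("C", "T")  # for reads from the converted top strand
    g_to_a = ref.replace("G", "A")  # for reads from the converted bottom strand
    return c_to_t, g_to_a


if __name__ == "__main__":
    print(in_silico_convert("ACGTCG"))
```

Aligning against fully converted references avoids penalizing reads for C-to-T differences that reflect methylation state instead of sequence variation.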
  • the plurality of converted sequences can be grouped into sets, wherein each set of sequences is determined to be derived from an individual DNA fragment.
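A minimal sketch of such grouping is shown below. It assumes each aligned read carries a mapping position and a molecular tag (such as a Unique Molecular Identifier), and that reads sharing chromosome, start, end, and tag derive from the same original fragment. The grouping key and the dictionary field names are illustrative assumptions, not criteria stated in the text.

```python
from collections import defaultdict


def group_reads_by_fragment(reads: list) -> dict:
    """Group read sequences into sets presumed to share an original fragment.

    Each read is a dict with hypothetical keys:
    'chrom', 'start', 'end', 'tag' (molecular tag), and 'seq'.
    """
    families = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["start"], read["end"], read["tag"])
        families[key].append(read["seq"])
    return dict(families)


if __name__ == "__main__":
    reads = [
        {"chrom": "chr1", "start": 100, "end": 260, "tag": "AACG", "seq": "TTCG"},
        {"chrom": "chr1", "start": 100, "end": 260, "tag": "AACG", "seq": "TTCG"},
        {"chrom": "chr1", "start": 100, "end": 260, "tag": "GGTA", "seq": "TTTG"},
    ]
    for key, seqs in group_reads_by_fragment(reads).items():
        print(key, len(seqs))
```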
  • converted sequences can be compared to reference genome sequences that have not been converted to infer methylation states of cytosine bases in the original, unconverted DNA molecules.
  • methylation states of multiple CpG sites in a DNA fragment can be used to evaluate a methylation level of the fragment.
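The comparison against an unconverted reference, and the fragment-level summary that follows from it, can be sketched as below. The example assumes a top-strand read aligned to the reference without gaps and of equal length, and it defines the fragment methylation level as the fraction of called CpG sites that read as C. That definition and the function names are illustrative assumptions.

```python
def call_cpg_methylation(read: str, reference: str) -> dict:
    """Infer methylation at each reference CpG from a converted read.

    Returns {position: True/False}; True means the read retains C
    (originally 5-mC or 5-hmC), False means it reads T (originally unmodified).
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must be the same length")
    read, reference = read.upper(), reference.upper()
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = True
            elif read[i] == "T":
                calls[i] = False
            # any other base is left uncalled (mismatch or sequencing error)
    return calls


def fragment_methylation_level(calls: dict):
    """Fraction of called CpG sites that are methylated, or None if no calls."""
    if not calls:
        return None
    return sum(calls.values()) / len(calls)


if __name__ == "__main__":
    calls = call_cpg_methylation("ACGTTTGACG", "ACGTCCGACG")
    print(calls, fragment_methylation_level(calls))
```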
  • most converted sequences map to genomic regions with a high density of CpG sites, including but not limited to CpG islands.
  • the converted sequence data can be used to evaluate fragment-level methylation patterns at CpG islands across a genome.
  • fragment-level methylation patterns can be compared to methylation patterns obtained from independent evaluations of DNA derived from any of (but not limited to) the following: healthy tissues, diseased tissues, healthy cells, diseased cells, biospecimens from healthy individuals, biospecimens from individuals with disease, cancer cells, cancer tissues, or biospecimens from individuals with cancer.
  • comparisons of fragment-level methylation patterns with independently obtained methylation data can enable identification of fragments that match expected methylation patterns for a cell type, a tissue, or a disease state.
  • comparisons of fragment-level methylation patterns with reference methylation data can enable identification of fragments that do not match expected methylation patterns (aberrantly methylated fragments).
  • identification of fragments that match expected methylation patterns for a disease state can be used to aid in diagnosis of said disease state.
  • identification of fragments that match expected methylation patterns of a tissue or cell type can be used to infer the presence of DNA or measure the amount of DNA from that tissue or cell type in a biospecimen.
  • identification of fragments that match expected methylation patterns for a particular cancer type can be used to aid in diagnosis of that cancer type.
  • identification of fragments that do not match expected methylation patterns of healthy (non-cancerous) cells or tissues can be used to identify the presence of aberrant methylation patterns in a biospecimen that could be an indication of cancer-derived DNA.
  • the number of fragments that have aberrant methylation patterns in a biospecimen can be used to infer the amount of cancer cell death contributing to tumor-derived cell-free DNA in the biospecimen.
  • the number of fragments that have disease-associated methylation patterns in a biospecimen can be used to infer the amount of disease-associated cell-free DNA in the biospecimen.
  • measurement of the number of fragments with cancer-associated or disease-associated methylation patterns can aid in evaluating the extent or degree of the cancer or disease.
  • disclosed methods can be used for clinical purposes. In some implementations, disclosed methods can be used for research purposes. In some implementations, disclosed methods can be used to determine if a person has a disease state. In some implementations, disclosed methods can be used to determine if a person has cancer. In some implementations, disclosed methods can be used to aid in early detection of cancer. For example, the detection of cancer-specific hypermethylated DNA fragment patterns in a clinical biospecimen such as plasma or urine can be used to identify patients who are likely to have cancer. In some implementations, disclosed methods can be used to estimate probabilities that a cancer originated from a particular type of tissue. For example, different cancer types are known to have cancer-type-specific methylation patterns.
  • hypermethylated DNA patterns can be compared to expected patterns for various types of cancer to find similarities in patterns which can suggest that the hypermethylated DNA fragments were derived from a particular type of cancer.
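  • The pattern comparison described above can be sketched as a set-similarity ranking. This is a minimal illustration, assuming that hypermethylated fragments have already been mapped to CpG island identifiers; the Jaccard index, the function names, and the region identifiers are hypothetical choices for illustration and are not taken from the disclosure.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: items shared by both sets / items seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_cancer_types(observed: set, references: dict) -> list:
    """Return (cancer_type, similarity) pairs, most similar reference first."""
    scores = [(name, jaccard(observed, regions)) for name, regions in references.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference sets of hypermethylated CpG islands per cancer type.
references = {
    "type_A": {"cgi_1", "cgi_2", "cgi_3", "cgi_4"},
    "type_B": {"cgi_3", "cgi_5", "cgi_6"},
}
# Hypothetical CpG islands found hypermethylated in the biospecimen.
observed = {"cgi_1", "cgi_2", "cgi_3"}

ranking = rank_cancer_types(observed, references)
for cancer_type, similarity in ranking:
    print(f"{cancer_type}: {similarity:.2f}")
```

  • A real classifier would likely weight regions by informativeness and account for sequencing depth; the ranking here only illustrates the idea of finding the closest expected pattern.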
  • disclosed methods can be used to assess the stage of a cancer, the extent of a cancer, or the burden of tumor. For example, increased amounts of DNA fragments bearing tumor-specific methylation patterns in a biospecimen such as plasma may indicate a greater amount of tumor-DNA shedding which may be associated with a greater tumor burden (or cancer stage).
  • disclosed methods can be used to assess prognosis of a disease based on evaluation of either the amount or the pattern of disease-specific hypermethylated DNA fragments, or both.
  • disclosed methods can be used to assess the regression or progression of cancer. For example, changes over time in levels of DNA fragments bearing tumor-specific methylation patterns in a biofluid may indicate a corresponding change in the tumor burden of the patient.
  • disclosed methods can be used to assess treatment response to cancer therapy. For example, it has been shown in many studies that a patient whose cancer is responding to therapy will often have a decrease over time in tumor-derived cell-free DNA (cfDNA) fragments measurable in his or her plasma. In some patients, tumor-derived cell-free DNA is shed at a higher rate initially as cancer cells are killed by the therapy and spill their DNA into the blood (a transient spike).
  • tumor-derived cfDNA would be expected to decrease.
  • changes in tumor-derived cell-free DNA levels can be measured by quantifying the amount of DNA fragments bearing tumor-specific methylation patterns.
  • disclosed methods can be used to assess the presence of residual cancer after a patient receives a curative-intent therapy.
  • a patient’s plasma can be tested following curative-intent therapy to detect the presence of cfDNA fragments containing cancer-associated methylation patterns.
  • detection of small amounts of residual cancer after a curative-intent therapy can be challenging due to the very small amount of tumor-derived cfDNA fragments that may be shed into the blood.
  • a patient-specific pattern of aberrant hypermethylation can be identified by initial testing of a biospecimen from that patient (for example, testing of tumor tissue or pre-treatment plasma).
  • such a patient-specific pattern can be used to personalize the signal detection algorithm to improve detection sensitivity.
  • a tumor-specific set of aberrantly hypermethylated CpG islands could be identified for a particular patient by testing tumor tissue or pre-treatment plasma of said patient (in which cancer-specific hypermethylated fragments are likely to be more abundant, providing a stronger signal).
  • by identifying such aberrantly hypermethylated genomic regions that are specific to a particular patient’s tumor(s) one could look in post-treatment plasma for residual hypermethylation signal mapping to those same genomic regions (which would suggest the presence of persistent cancer after therapy).
  • when applied to measurement of cancer signals in a subsequent biospecimen (e.g., post-treatment plasma), a personalized algorithm could assign greater weight to signals from hypermethylated DNA fragments that match aberrant methylation patterns already identified to be present in that patient’s tumor tissue or pre-treatment plasma.
  • the disclosed methods do not require any physical or experimental alterations to the assay, as personalization could be achieved simply by bioinformatic modifications.
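  • Because personalization is described as a purely bioinformatic change, it can be illustrated with a short weighting sketch. The weights, function name, and region identifiers below are hypothetical, and the simple weighted sum is an assumption made for illustration rather than the scoring used in the disclosure.

```python
def personalized_score(fragment_regions, patient_regions,
                       matched_weight=5.0, other_weight=1.0):
    """Sum per-fragment weights.

    Fragments mapping to regions already found hypermethylated in this
    patient's tumor tissue or pre-treatment plasma count more than others.
    Both weights are arbitrary illustrative values.
    """
    return sum(
        matched_weight if region in patient_regions else other_weight
        for region in fragment_regions
    )


# Hypothetical regions identified from the patient's pre-treatment sample.
patient_regions = {"cgi_12", "cgi_40"}
# Hypothetical region assignment of each hypermethylated fragment post-treatment.
post_treatment_fragments = ["cgi_12", "cgi_7", "cgi_40", "cgi_12"]

score = personalized_score(post_treatment_fragments, patient_regions)
print(score)
```

  • The same sequencing data can be rescored with different region sets or weights, which is what allows personalization without changing the assay itself.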
  • disclosed methods can be used to assess recurrence of disease after a patient receives cancer therapy. Early detection of recurrent cancer can also require very high detection sensitivity.
  • a personalized signal detection approach could also be employed for this purpose.
  • disclosed methods can be used to monitor changes in tumor-specific methylation patterns over time in a patient to assess the epigenetic evolution of a tumor. Because the disclosed methods are able to assess hypermethylated CpG island and/or promoter sequences from anywhere in a genome, in some implementations, the methods can dynamically capture the evolution of hypermethylation patterns in a tumor over time. This can be done without pre-defining genomic target regions based on sequence-specific targeting. In some implementations, disclosed methods can be used to identify epigenetic mechanisms of resistance to drug therapy. Because changes in methylation patterns can be monitored dynamically over time in an untargeted manner, in some implementations, the disclosed methods can enable identification of methylation changes that give rise to drug resistance without requiring pre-defined hypotheses for resistance mechanisms.
  • monitoring of dynamic changes in CpG island and/or promoter hypermethylation patterns could enable assessment of evolving cell states (e.g., epithelial to mesenchymal transition, transformation from adenocarcinoma to small cell carcinoma, etc.).
  • disclosed methods can be used to assess a variety of pathologies by identifying tissue-specific patterns of hypermethylation in blood. For example, a patient with liver cirrhosis may shed liver-derived DNA into the blood stream, allowing DNA fragments with liver-specific methylation patterns to be detected at higher levels than in the general population. In some cases, the amount of such a signal could be correlated with the severity or extent of disease. In some implementations, changes in tissue-specific hypermethylation signals over time could be used to monitor for exacerbations or improvements in a disease process.
  • methylation patterns derived from non-diseased cells
  • significant changes in disease-related methylation signals can be more readily identified by comparing signal in the same patient at different time points rather than comparing a patient’s signal against measurements in a population.
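  • A within-patient comparison across time points can be sketched as a fold change relative to that patient's own baseline. The normalization by total fragment count, the function names, and the example numbers are assumptions made for illustration only.

```python
def signal_fraction(signal_fragments: int, total_fragments: int) -> float:
    """Fraction of analyzed fragments that carry the disease-related pattern."""
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    return signal_fragments / total_fragments


def fold_changes(timepoints):
    """Fold change of the signal fraction at each time point versus the first.

    `timepoints` is a list of (label, signal_fragments, total_fragments),
    with the patient's baseline sample first.
    """
    _, base_signal, base_total = timepoints[0]
    baseline = signal_fraction(base_signal, base_total)
    if baseline == 0:
        raise ValueError("baseline signal is zero; fold change is undefined")
    return {
        label: signal_fraction(signal, total) / baseline
        for label, signal, total in timepoints
    }


# Hypothetical counts for one patient at three time points.
series = [
    ("baseline", 200, 100_000),
    ("week_4", 50, 100_000),
    ("week_12", 10, 50_000),
]
changes = fold_changes(series)
for label, change in changes.items():
    print(f"{label}: {change:.2f}x baseline")
```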
  • Similar analysis could be applied to methylation patterns that are specific to other organs including but not limited to: kidney, heart, lung, brain, muscles, bones, intestines, and pancreas.
  • disclosed methods can be used to assess transplanted organ rejection based on measurement of organ-specific methylation patterns.
  • disclosed methods can be used to assess methylation patterns of fetal or placental DNA from the maternal circulation in pregnancy.
  • disclosed methods can be used to assess changes in cells of a person’s immune system as an indication of health or disease.
  • disclosed methods can be used to assess aging. For example, changes in DNA methylation are known to occur as individuals age. Such changes could be measured to evaluate health status via an assessment of epigenetic age.
  • Biological aging can be measured via epigenetic clocks which are based on evaluation of DNA methylation changes. These clocks, such as the Horvath and Hannum clocks, analyze the methylation status of specific age-associated CpG sites across the genome.
  • disclosed methods which enrich densely methylated DNA fragments mapping to CpG islands can be used to evaluate methylation levels and/or patterns that can provide an estimation of biological age.
  • organ-specific methylation changes could be evaluated to assess pathology or stress in an organ. For example, an individual who has a long history of excessive alcohol consumption may have a disproportionately high epigenetically measured age of his or her liver.
  • disclosed methods can be used in biomedical research to characterize hypermethylation patterns that can provide an assessment of gene expression states.
  • disclosed methods can be used in biomedical research to identify hypermethylation patterns to provide an understanding of fundamental cellular or developmental epigenetic processes.
  • disclosed methods can be used in biomedical research or clinical applications to evaluate methylation patterns in single cells or small clusters of cells because some methods are compatible with analysis of very small amounts of input DNA.
  • disclosed methods can be used to characterize methylation patterns in an ovum, sperm, or embryo to guide clinical decisions pertaining to in vitro fertilization.
  • disclosed methods can be used to evaluate methylation patterns in DNA derived from vertebrate organisms.
  • CpG islands, which are genomic regions that have a high density of CpG sites, are found in the genomes of nearly all vertebrate organisms. Because disclosed methods can enrich densely methylated DNA fragments regardless of the genomic origin of said fragments, disclosed methods can be applied to analysis of methylation patterns in human and/or non-human vertebrate species.
  • disclosed methods can be used in veterinary medical applications in a manner that is analogous to human medical applications.
  • disclosed methods can be used to detect, diagnose, and/or monitor cancer in vertebrate animals, including but not limited to household pets, farm animals, and horses.
  • disclosed methods can be used to detect, diagnose, and/or monitor various diseases in vertebrate animals.
  • disclosed methods can be used for agricultural applications to detect, diagnose, and/or monitor disease in livestock.
  • disclosed methods can be used in biomedical research applications to study methylation patterns in model organisms including but not limited to mice, rats, frogs, and fish.
  • disclosed methods can be used in laboratory animals having human xenografted tumors.
  • disclosed methods can be used to distinguish methylation patterns arising from xenografted tumor cells versus from the host animal’s cells and/or tissues.
  • disclosed methods can be used to enrich densely methylated DNA fragments arising from a virus.
  • methylation state of DNA in a DNA virus can change depending on whether the DNA is in the virion or in a host cell, and also depending on the state of the host cell.
  • the Epstein-Barr Virus genome is known to be unmethylated in virions but becomes highly methylated during latent infection and in transformed B cells.
  • the proliferation and turnover of EBV-infected B cells can lead to increased shedding of hypermethylated EBV DNA into plasma, which can be exploited as a biomarker signal for lymphoma detection.
  • the DNA of several viruses has been observed to become hypermethylated in virus-associated malignancies (e.g.
  • because hypermethylated viral DNA has a methylation density similar to that of CpG islands in vertebrate genomes, it can be enriched from complex DNA mixtures (for example, cell-free DNA) using methods disclosed herein.
  • disclosed methods can be used to enrich densely methylated viral DNA in parallel with densely methylated human and/or vertebrate animal DNA.
  • disclosed methods can be used to measure densely methylated viral DNA as a biomarker of cancer.
  • disclosed methods can be used to enrich densely methylated DNA fragments arising from Epstein-Barr Virus as a biomarker for detection of lymphoma in patients with HIV. In some implementations, disclosed methods can be used to enrich densely methylated DNA fragments arising from Epstein-Barr Virus as a biomarker for detection of post-transplant lymphoproliferative disorder (PTLD) in patients receiving immunosuppressive therapy after organ transplantation. In some implementations, disclosed methods can be used to measure densely methylated viral DNA as a biomarker to evaluate latent or lytic viral state. In some implementations, disclosed methods can be used to measure densely methylated viral DNA as a biomarker of disease involving shedding of said viral DNA from infected cells. For example, shedding of hepatitis B or hepatitis C viral DNA into blood could serve as a biomarker of liver cell death.
  • disclosed methods can be combined with spatially encoded DNA barcoding techniques to permit genome-wide analysis of methylation patterns at CpG islands and/or promoters in tissues in a spatially resolved manner.
  • spatially encoded DNA barcodes can be incorporated in or added to sequencing adapters.
  • spatial DNA barcodes can be attached by ligation.
  • spatial DNA barcodes can be attached by primer extension.
  • spatial DNA barcode attachment can be facilitated by a transposase.
  • disclosed methods can be used to evaluate additional features of enriched densely methylated DNA fragments including but not limited to mutations, DNA fragment size, fragment location within the genome, and/or nucleosome protection pattern.
  • information gained from analysis of such additional DNA features could enable improved biomarker performance compared to analysis of DNA methylation patterns alone.
  • disclosed methods for enriching densely methylated DNA fragments can be preceded by chromatin immunoprecipitation (ChIP) to selectively enrich DNA fragments associated with histones having particular modifications.
  • immunoprecipitation using antibodies that specifically bind to, for example, Histone H3K27me3, Histone H3K9me3, Histone H3K4me3, and/or Histone H3K27ac could be used to enrich DNA fragments based on chromatin features prior to enrichment based on methylation density.
  • such sequential, orthogonal enrichment steps could yield more nuanced biomarker signals and/or improve the sampling of cancer-specific signals.
  • the ability to directly determine methylation status of CpG sites from converted sequence data results in greater accuracy in measurement of densely methylated DNA fragments.
  • methods such as MeDIP or MBD Capture enrich methylated DNA directly from biological sources without conversion, and captured fragments are presumed to be methylated because they were captured based on methylation-specific binding.
  • some fragments that have zero or few methylated CpG sites can also be non-specifically captured. Without the ability to assess methylation status, such fragments may be incorrectly presumed to have high CpG methylation content, thereby contributing to inaccurate background noise of an assay.
  • the methods described herein can improve the accuracy of measuring densely methylated DNA fragments because enriched fragments are converted, and can be verified to have a high CpG methylation density by comparison to aligned reference genomic sequences.
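  • The verification step described above can be sketched as a comparison of a converted read against its aligned reference. This is a simplified illustration that assumes an ungapped, forward-strand alignment of equal length and idealized conversion; the function names and the density thresholds are hypothetical and are not taken from the disclosure.

```python
def cpg_methylation(reference: str, read: str):
    """Return (methylated_sites, informative_sites) for CpG sites in `reference`.

    After conversion and PCR, a methylated cytosine reads as C and an
    unmethylated cytosine reads as T. Any other base at a CpG cytosine is
    treated as uninformative (for example a sequencing error or variant).
    """
    if len(reference) != len(read):
        raise ValueError("sketch assumes an ungapped alignment of equal length")
    reference, read = reference.upper(), read.upper()
    methylated = informative = 0
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                methylated += 1
                informative += 1
            elif read[i] == "T":
                informative += 1
    return methylated, informative


def is_densely_methylated(reference, read, min_sites=3, min_fraction=0.8):
    """Apply hypothetical thresholds on CpG count and methylated fraction."""
    methylated, informative = cpg_methylation(reference, read)
    return informative >= min_sites and methylated / informative >= min_fraction


reference = "ACGTCATCGACGGACGT"      # four CpG sites, one non-CpG cytosine
fully_methylated = "ACGTTATCGACGGACGT"   # every CpG cytosine retained as C
partly_methylated = "ACGTTATTGACGGACGT"  # second CpG cytosine converted to T
unmethylated = "ATGTTATTGATGGATGT"       # every cytosine converted to T

print(cpg_methylation(reference, fully_methylated))
print(cpg_methylation(reference, partly_methylated))
print(cpg_methylation(reference, unmethylated))
```

  • A fragment captured non-specifically would show few retained cytosines at CpG positions and could be excluded at this step, which is not possible when captured fragments are sequenced without conversion.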
  • kits for performing the methods disclosed herein can comprise the reagents and materials necessary for the conversion of DNA, for PCR amplification, for CpG methylation, and for enrichment of densely methylated DNA.
  • a kit for performing the methods disclosed herein can additionally comprise reagents and materials necessary for production of next-generation sequencing libraries.
  • a kit for performing the methods disclosed herein can additionally comprise instructions and quality control materials to ensure accurate and reproducible results.
  • a kit for performing the methods disclosed herein can additionally comprise software and/or access to computational resources to enable analysis of next-generation sequencing data.
  • a method of attaching adapter oligonucleotides to double-stranded DNA fragments can be employed which utilizes two sequential enzymatic ligation steps to minimize formation of adapter dimers.
  • the adapters can be ligated to double-stranded DNA fragments of interest for the purpose of facilitating analysis by next-generation sequencing.
  • Adapter dimers can be formed via ligation of one adapter molecule to another adapter molecule.
  • Adapter dimers can be problematic for next-generation sequencing (NGS) libraries. These dimers can dominate the sequencing output, overwhelming the sequence output from the desired DNA fragments of interest.
  • Adapter dimers are more likely to form when the DNA fragments of interest are very low in abundance, as reaction stoichiometry in such situations favors ligation of adapters to other adapters over ligation of adapters to the DNA fragments of interest.
  • Adapter dimers can also be more efficiently amplified during PCR than the desired product of adapters ligated to the DNA fragments of interest because of the dimers’ short length (generally, shorter targets amplify more efficiently in PCR). Therefore, it is important to minimize formation of adapter dimers, especially when DNA input quantities for NGS library preparation are low.
  • a two-step ligation method disclosed herein is able to greatly reduce adapter dimer formation.
  • the method uses stem-loop adapters that lack a 5’-phosphate which would be required for adapter self-ligation.
  • a stem-loop adapter is able to ligate via its 3’-end to a 5’-phosphorylated strand of a double-stranded DNA fragment of interest (insert DNA), but not to the opposite strand.
  • a stem-loop adapter molecule is unable to ligate to another stem-loop adapter molecule because of the lack of 5’-phosphate ends on adapter molecules.
  • a second oligonucleotide having a 5’-phosphate can be hybridized to the ligated stem-loop adapter (by displacing one strand of DNA at the stem) and then ligated to the target DNA in a second enzymatic step.
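  • The reasoning behind dimer suppression can be made concrete with a toy model of ligation chemistry: a ligase joins a 3’-hydroxyl to a 5’-phosphate, so a junction between two duplex ends offers up to two joinable strands. The class and function names below are hypothetical, and the model ignores end compatibility, hybridization, and kinetics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terminus:
    """One end of a double-stranded molecule, as seen by a DNA ligase."""
    name: str
    five_prime_phosphate: bool
    three_prime_hydroxyl: bool = True


def ligatable_strands(a: Terminus, b: Terminus) -> int:
    """Count strands that can be joined when termini `a` and `b` meet.

    Each strand junction needs a 3'-hydroxyl on one side and a
    5'-phosphate on the other.
    """
    first = a.three_prime_hydroxyl and b.five_prime_phosphate
    second = b.three_prime_hydroxyl and a.five_prime_phosphate
    return int(first) + int(second)


adapter = Terminus("stem-loop adapter", five_prime_phosphate=False)
insert = Terminus("insert DNA", five_prime_phosphate=True)

print(ligatable_strands(adapter, adapter))  # adapter to adapter
print(ligatable_strands(adapter, insert))   # adapter to insert
```

  • In this model the adapter-to-adapter junction has no joinable strand, while the adapter-to-insert junction has exactly one, which corresponds to the single-strand ligation of the first step; the remaining strand is joined in the second step by the 5’-phosphorylated displacer oligonucleotide.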
  • the adapter used in the first ligation step is a stem-loop adapter.
  • the adapter used in the first ligation step can comprise two strands which are partially complementary and hybridized (known in the art as a Y-shaped adapter).
  • the adapter used in the first step can comprise two strands of DNA: a first strand having a 3’-end that is available for ligation to a 5’-phosphorylated DNA fragment of interest, and a second strand that is either partially or fully hybridized to the first strand in a manner that would enable a DNA ligase to catalyze ligation of the first strand to the DNA of interest but wherein the second strand lacks a 5’-phosphate and therefore cannot be ligated.
  • the target DNA fragments can be blunt-ended.
  • the target DNA fragments can have overhangs at their ends, such as a 3’-dA tail.
  • the first ligation step can utilize a DNA ligase that has optimal efficiency for ligation of double-stranded DNA (such as T4 DNA ligase or NEBNext Ultra II DNA ligase).
  • the DNA ligase and the excess unligated adapter molecules can be removed prior to the second ligation step by performing a DNA clean-up step.
  • cleavable positions (such as dU) can be incorporated into adapter oligonucleotides to facilitate hybridization of the displacer oligonucleotides in the second ligation step.
  • a 5’-phosphorylated displacer molecule can be ligated to the double-stranded DNA of interest in the second ligation step using a nick-sealing ligase such as HiFi Taq DNA ligase (New England Biolabs).
  • the two-step ligation method disclosed herein can enable ligation of adapters and displacer oligonucleotides to very low amounts of input DNA (double-stranded DNA of interest).
  • the two-step ligation method disclosed herein can enable next-generation sequencing from DNA derived from a small number of cells (less than 10 cells or less than 100 cells). In some implementations, the two-step ligation method disclosed herein can enable next-generation sequencing from DNA derived from a single cell.
  • Figure 5 shows an example schematic of a two-step ligation scheme that enables ligation of adapter sequences to double-stranded DNA fragments of interest while minimizing formation of adapter dimers.
  • the illustrated two-step ligation scheme was used in Example 1 in the Detailed Description section to attach adapter sequences to blunted and 5’-phosphorylated cell-free DNA fragments and genomic DNA fragments.
  • the first step involves ligation of stem-loop adapters to the insert DNA by forming a phosphodiester linkage between a 5’-phosphorylated end of the insert DNA and the 3’-hydroxyl end of the stem-loop adapter.
  • the 5’-end of the stem-loop adapter lacks a phosphate, and therefore cannot be ligated to the insert or to another stem-loop adapter molecule (thereby avoiding adapter dimer formation).
  • USER enzyme is used to cleave at deoxyUridine positions in the stem of the stem-loop adapter to destabilize base-pairing. DNA is then cleaned up to remove unligated adapters, ligase, and USER enzyme.
  • a displacer oligonucleotide is added to displace one strand of the stem-loop adapter by hybridization to the opposite (ligated) strand, as shown in the figure.
  • a nick-sealing ligase (HiFi Taq DNA ligase) is used to ligate the 5’-phosphorylated displacer oligonucleotide to the 3’-end of the insert DNA.
  • the stem-loop adapter and displacer oligonucleotides were designed to attach sample barcodes and Illumina adapter sequences to the input DNA fragments.
  • adapter sequences for other sequencing platforms could be readily substituted.
  • Blood was collected by venipuncture into a blood collection tube containing potassium-EDTA or containing a proprietary anticoagulation and stabilization cocktail designed to limit cellular degradation and to stabilize cell-free DNA (Cell-free DNA BCT from Streck). Tubes had 10 mL capacity, and at least 8 mL blood volume was required to be collected in each tube. Blood was inverted in the tube several times at the time of collection to ensure even mixing with the anticoagulant and/or stabilizer. Samples were kept at room temperature (20-25°C) during temporary storage and transportation prior to separation of plasma. Plasma was separated and frozen as soon as possible after blood collection, preferably within four hours if collection was in an EDTA tube or within 2 weeks if collection was in a Streck tube.
  • the collection tubes were centrifuged at 1000 x g for 10 minutes in a clinical centrifuge with a swinging bucket rotor with slow acceleration and deceleration (brake off). Plasma was removed from the red blood cells and buffy coat using a 1 mL pipette, being careful not to disturb the cells in the tube. The plasma was dispensed into 1.5 mL cryovials in 0.5 to 1 mL aliquots. The plasma was then frozen at -80°C until needed for further processing.
  • Blood was obtained from patients with various types of cancer at various stages. For some patients, blood was obtained at multiple time points before and during therapy. Blood was also obtained from individuals who did not have a cancer diagnosis (control subjects). Some of these control subjects had a history of heavy smoking and were participating in a lung cancer screening program based on eligibility according to the guidelines of the United States Preventive Services Task Force. All subjects provided informed written consent for participation in the study, which was approved by the Human Investigation Committee of Yale University.
  • Plasma was removed from the -80°C freezer and was thawed at room temperature for 15 to 30 minutes before proceeding with DNA extraction. Thawed plasma was then centrifuged at 6800 x g for 3 minutes to remove any cryoprecipitate. The supernatant was transferred to a fresh tube for further processing.
  • a QiaAmp® MinElute® Virus Vacuum Kit (Qiagen) was used for extraction of DNA from plasma volumes up to 1 mL (elution volume as low as 20 µL). For larger volumes of plasma up to 5 mL, the QiaAmp® Circulating Nucleic Acid Kit was used for DNA purification (elution volume as low as 20 µL).
  • kits were used according to the manufacturer's instructions, generally eluting the DNA into the lowest recommended volume (preferably 20 µL).
  • cRNA carrier RNA
  • Qiagen carrier RNA
  • Genomic DNA was extracted from frozen tumor tissue samples or cancer cell lines using a DNeasy Blood & Tissue Kit (Qiagen), according to the manufacturer’s instructions.
  • Before further processing tissue- or cell-derived gDNA for next-generation sequencing library preparation, the DNA was sheared into fragments with an average length of 180-200 bp using focused ultrasonication (Covaris).
  • the cell-free DNA and fragmented gDNA samples were quantified by real-time quantitative PCR using a KAPA Human Genomic DNA Quantification and QC Kit for Illumina platforms (Roche) with the 129 bp Primer Premix, suitable for the expected fragment size distribution of the samples.
  • Varying amounts of cell-free DNA or fragmented gDNA were obtained and quantified.
  • minimum and maximum input DNA limits were set at 1 ng and 30 ng, respectively.
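  • To give a sense of scale for these input limits, DNA mass can be converted to approximate haploid human genome equivalents. The sketch below assumes roughly 3.3 pg per haploid human genome, a commonly used approximation that is not stated in this document; the handling of samples below the minimum (returning `None`) and the function names are likewise assumptions.

```python
HAPLOID_GENOME_PG = 3.3   # assumed approximate mass of one haploid human genome (pg)
MIN_INPUT_NG = 1.0        # minimum input limit stated in the text
MAX_INPUT_NG = 30.0       # maximum input limit stated in the text


def usable_input_ng(available_ng: float):
    """Return the mass to carry forward, or None if below the minimum.

    Treating sub-minimum samples as unusable is an assumption of this
    sketch; the text only states that limits were set.
    """
    if available_ng < MIN_INPUT_NG:
        return None
    return min(available_ng, MAX_INPUT_NG)


def genome_equivalents(mass_ng: float) -> float:
    """Approximate number of haploid genome copies in `mass_ng` of human DNA."""
    return mass_ng * 1000.0 / HAPLOID_GENOME_PG


for available in (0.4, 1.0, 12.5, 45.0):
    used = usable_input_ng(available)
    if used is None:
        print(f"{available} ng: below minimum input")
    else:
        print(f"{available} ng -> use {used} ng, about {genome_equivalents(used):.0f} genome equivalents")
```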
  • a quantitative spike-in control DNA mixture was added to each sample to enable comparison of library preparation efficiency across samples.
  • the spike-in control mixture consisted of PhiX 174 RF I DNA (New England Biolabs) that was fragmented to an average size of 180-200 base pairs by ultrasonication (Covaris). Approximately 50% of the fragments in the mixture were unmethylated, and 50% of the fragments had undergone CpG methylation using a CpG Methyltransferase (M.SssI; New England Biolabs) according to the manufacturer’s instructions. A total of approximately 2 picograms of the spike-in control mixture was added to each DNA sample.
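  • One way a fixed-mass spike-in could support cross-sample comparison is sketched below. Scaling each sample's spike-in read count to the best-performing sample, and reporting the methylated fraction of spike-in reads, are assumptions of this sketch rather than the analysis used in the disclosure; read counts are hypothetical, and differences in sequencing depth between samples would need to be accounted for separately.

```python
def relative_efficiency(spike_in_reads: dict) -> dict:
    """Scale each sample's spike-in read count to the highest count in the batch.

    Because the same spike-in mass is added to every sample, a lower
    value suggests lower recovery through library preparation.
    """
    if not spike_in_reads:
        raise ValueError("no samples given")
    best = max(spike_in_reads.values())
    if best <= 0:
        raise ValueError("no spike-in reads recovered in any sample")
    return {sample: count / best for sample, count in spike_in_reads.items()}


def methylated_fraction(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of spike-in reads that derive from the methylated half of the mixture."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no spike-in reads recovered")
    return methylated_reads / total


# Hypothetical spike-in read counts per sample.
counts = {"S1": 900, "S2": 450, "S3": 300}
efficiency = relative_efficiency(counts)
print(efficiency)
print(methylated_fraction(950, 50))
```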
  • the DNA samples (each in 20 microliters of 10 mM Tris-HCl pH 7.8 buffer) were treated with an enzyme mix comprising T4 DNA Polymerase and T4 Polynucleotide Kinase as provided in the Quick Blunting Kit (New England Biolabs; following manufacturer’s protocol), to produce 5’-phosphorylated, blunt-ended DNA. Enzymes were then heat-inactivated by incubation at 70°C for 10 minutes.
  • the blunted, 5’-phosphorylated DNA was then ligated to custom stem-loop oligonucleotide adapters using the NEBNext Ultra II Ligation Module (New England Biolabs) according to the manufacturer’s protocol.
  • the custom stem-loop adapters and accompanying displacer oligonucleotides were designed to greatly reduce the formation of adapter dimers, thereby enabling preparation of sequencing libraries from very small amounts of input DNA.
  • the adapter oligonucleotide sequences are as follows: EMSv2m- 1 AGTXYAAGAYAXAXTXTTTXXXTAXAXGAXGXTXTTXTGATXTTAGAXT
  • A two-step scheme of one-strand ligation followed by displacement and ligation of the second strand is shown in Figure 5. Because the 5’-end of the stem-loop adapter was not phosphorylated, an adapter molecule is unable to become ligated to another adapter molecule, thereby minimizing formation of adapter dimers. The 3’-end of the adapter is able to become ligated to the 5’-ends of the double-stranded insert DNA fragment, which is phosphorylated at its 5’-ends.
  • the stem-loop adapters also contained deoxyUridine (dU) within the stem sequence to permit site-specific cleavage by USER enzyme (New
  • sample-specific barcode sequences were included in the adapter sequence to enable multiplexed sequencing of a plurality of samples on the same lane of a next-generation sequencing instrument (because sequences can be sorted into sample-specific datasets based on their barcode sequence).
  • a single uniquely barcoded adapter sequence was used for ligation to each sample, such that 24 individual samples would be ligated to adapters labeled with 24 distinct barcodes (1 to 1 mapping).
  • the bioinformatic demultiplexing algorithm would require both ends of the sequence to be labeled with the same barcode.
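  • The both-ends barcode requirement can be sketched as follows. This is a minimal illustration that assumes barcodes have already been extracted from each end of a sequenced fragment and that exact matching is used; the barcode sequences, sample names, and function names are hypothetical.

```python
def assign_sample(barcode_end1: str, barcode_end2: str, barcode_to_sample: dict):
    """Return the sample name only if both ends carry the same known barcode."""
    if barcode_end1 != barcode_end2:
        return None  # mismatched ends, e.g. a chimeric or mis-assigned fragment
    return barcode_to_sample.get(barcode_end1)  # None if the barcode is unknown


def demultiplex(reads, barcode_to_sample: dict) -> dict:
    """Group read identifiers by sample; discard reads that fail the check.

    `reads` is an iterable of (read_id, barcode_end1, barcode_end2).
    """
    by_sample = {}
    for read_id, end1, end2 in reads:
        sample = assign_sample(end1, end2, barcode_to_sample)
        if sample is not None:
            by_sample.setdefault(sample, []).append(read_id)
    return by_sample


# Hypothetical one-to-one barcode-to-sample mapping.
barcodes = {"ACGTAC": "sample_01", "TGCATG": "sample_02"}
reads = [
    ("r1", "ACGTAC", "ACGTAC"),  # both ends match sample_01
    ("r2", "ACGTAC", "TGCATG"),  # ends disagree: discarded
    ("r3", "TGCATG", "TGCATG"),  # both ends match sample_02
    ("r4", "GGGGGG", "GGGGGG"),  # unknown barcode: discarded
]
result = demultiplex(reads, barcodes)
print(result)
```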
  • concentration of stem-loop oligonucleotide adapter used in the ligation reaction was 1 micromolar in a final reaction volume of 45 microliters.
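  • The stoichiometry that makes adapter dimers a concern at low input can be estimated from these reaction conditions. The sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded DNA and a mean fragment length of 190 bp; both are approximations introduced here, not values from the protocol.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def adapter_pmol(concentration_uM: float, volume_uL: float) -> float:
    """Micromolar times microliters gives picomoles."""
    return concentration_uM * volume_uL


def insert_end_pmol(mass_ng: float, mean_length_bp: float) -> float:
    """Picomoles of ligatable fragment ends (two per double-stranded fragment)."""
    fragment_pmol = mass_ng * 1000.0 / (mean_length_bp * AVG_BP_MASS_G_PER_MOL)
    return 2.0 * fragment_pmol


def adapter_excess(mass_ng: float, mean_length_bp: float = 190.0,
                   concentration_uM: float = 1.0, volume_uL: float = 45.0) -> float:
    """Molar ratio of adapter molecules to insert ends in the ligation reaction."""
    return adapter_pmol(concentration_uM, volume_uL) / insert_end_pmol(mass_ng, mean_length_bp)


for mass in (1.0, 30.0):
    print(f"{mass} ng input: about {adapter_excess(mass):.0f} adapters per insert end")
```

  • Under these assumptions, adapters outnumber insert ends by roughly three orders of magnitude at the 1 ng input limit, which illustrates why preventing adapter self-ligation matters most for low-input samples.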
  • USER enzyme (New England Biolabs) was added at the manufacturer’s recommended concentration and the sample was incubated at 37°C for 30 minutes to cleave dU sites in the adapters.
  • DNA was then cleaned up to remove enzymes, buffers, and unligated adapters using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified).
  • the clean-up process included wash steps followed by elution in 12 microliters of 10 mM Tris-HCl pH 7.8 for each sample.
  • displacer oligonucleotides which have a phosphorylated 5’-end were hybridized to the complementary sequence of the ligated adapter oligonucleotide and were ligated using HiFi Taq Ligase (New England Biolabs) according to the manufacturer’s protocol.
  • HiFi Taq Ligase efficiently seals nicks in DNA with very high fidelity, exhibiting greatly reduced ligation efficiency if there are mismatched base pairs at either side of the ligation junction.
  • the two-step ligation method which greatly reduces adapter dimer formation is schematized in Figure 5.
  • the sequence of the displacer oligonucleotide (with 24 distinct barcode sequences that match the barcode sequences of the stem-loop adapter oligonucleotides) is as follows:
  • the displacer oligonucleotide used in the second ligation step had the same barcode as the stem-loop adapter used in the first ligation step, to ensure perfect base pairing between the displacer and adapter sequences in the vicinity of the ligation junction.
  • the concentration of the displacer oligonucleotide in the reaction was 0.5 micromolar, and the final reaction volume was 25 microliters (for each sample).
  • the reaction was incubated at 60°C for 30 minutes.
  • Ligated DNA was then cleaned up using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified).
  • Enzymatic conversion of ligated DNA was performed using the NEBNext® Enzymatic Methyl-seq (EM-seq) Conversion Module kit (New England Biolabs), according to the manufacturer’s instructions.
  • This method is an alternative to bisulfite conversion, causing less damage, fragmentation, GC bias, and degradation of DNA.
  • the method enables identification of 5-methylcytosine (5-mC) and 5-hydroxymethylcytosine (5-hmC) bases in DNA by efficiently converting unmodified cytosine bases (not 5-mC or 5-hmC) to uracil bases.
  • the EM-seq method comprises two steps: (1) The enzyme TET2 is used to oxidize 5-mC and 5-hmC to 5-carboxycytosine (5-caC), providing protection from deamination by APOBEC enzyme; (2) The enzyme APOBEC is used to deaminate unmodified cytosines to uracils, while the 5-mC and 5-hmC bases which were oxidized to 5-caC in the first step are protected from deamination. Between the two steps, TET2-converted DNA was cleaned up according to the manufacturer’s protocol.
  • Because APOBEC-mediated deamination of cytosine is more efficient with single-stranded DNA, formamide was used to denature the DNA prior to the APOBEC enzymatic reaction, according to the manufacturer’s protocol.
  • the ligated adapter and displacer oligonucleotides contained several 5-mC positions which were protected from deamination and conversion to uracils.
  • Converted DNA was then cleaned up using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified). Cleaned-up DNA was eluted in 20 microliters of 10 mM Tris-HCl pH 7.8 for each batch of 12 samples.
  • PCR polymerase chain reaction
  • PCR-amplification was carried out using the NEBNext® Q5U Master Mix (New England Biolabs), according to the manufacturer’s protocol.
  • the Q5U high fidelity DNA polymerase harbors a mutation which enables amplification of templates containing uracil bases.
  • PCR primers were designed to hybridize to the adapter and displacer sequences, and the primers comprised the following sequences (5-carboxy-cytosine bases were included to prevent methylation at those bases in subsequent steps):
  • Primers were added to the reaction at a final concentration of 200 nanomolar for each primer.
  • SYBR Green I dye (Thermo Fisher Scientific) was added to the PCR reaction at the concentration recommended by the manufacturer to permit fluorescence-based measurement of double-stranded DNA amplification during real-time quantitative PCR.
  • Quantitative PCR was carried out on a CFX96™ System (Bio-Rad) thermocycler, and the change in fluorescence signal during the reactions was monitored in real time. Samples were removed from the thermocycler as the amplification neared saturation (plateau of fluorescence signal), approximately 2-3 cycles prior to reaching saturation.
  • Thermocycling parameters were as follows: (1) 98°C for 30 seconds, (2) 98°C for 10 seconds, (3) 62°C for 30 seconds, (4) 65°C for 60 seconds, (5) repeat thermocycling steps #2-4 until the real-time fluorescence signal begins to plateau. Samples were removed from the thermocycler after the 65°C extension step, approximately 2-3 cycles prior to reaching plateau of fluorescence.
  • uracil bases in the template DNA were replaced with thymine bases in the DNA copies, whereas 5-carboxycytosine (oxidation product of 5-mC and/or 5-hmC) bases in the template DNA were replaced with cytosine bases in the DNA copies.
  • methylated cytosine bases (5-mC or 5-hmC) in the original template DNA were retained as cytosine bases in the converted, PCR-amplified copies.
  • Any unmodified cytosine bases in the original template DNA were converted to thymine bases in the converted, PCR-amplified copies. Notably, the conversion process did not achieve completely accurate discrimination of methylated vs.
  • PCR-amplified DNA was then cleaned up using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified). Cleaned-up DNA was eluted in 12 microliters of 10 mM Tris-HCl pH 7.8 for each batch of 12 samples.
  • The double-stranded, converted, amplified DNA copies underwent CpG methylation using M.SssI according to the manufacturer’s protocol (including a buffer supplemented with S-adenosylmethionine).
  • CpG sites CG dinucleotides
  • TG dinucleotides or CA if an unmethylated C on the opposite strand was converted
  • cytosines that were either methylated or unmethylated outside of a CpG context were not methylated by M.SssI.
  • the methylation pattern at CpG sites in an original template DNA molecule could be reconstituted on the DNA copies after conversion and PCR-amplification using a CpG methyltransferase.
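  • As a hedged illustration of the conversion and restoration logic described above, the following minimal Python sketch models only the top strand of a fragment and assumes perfect conversion. The function names, the example sequence, and the methylated positions are hypothetical and are not taken from the disclosure. It shows that the CG dinucleotides remaining after conversion and amplification coincide with the originally methylated CpG sites, which are the sites that a CpG methyltransferase such as M.SssI could then re-methylate.

```python
def convert_and_amplify(seq: str, methylated_positions: set[int]) -> str:
    """Top-strand read-out after enzymatic conversion and PCR (idealized).

    Unmodified C is deaminated to U and copied as T.
    Methylated C (5-mC or 5-hmC) is protected and copied as C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def cpg_sites(seq: str) -> list[int]:
    """Index of the C in every CG dinucleotide of seq."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


# Hypothetical fragment: CpG sites at indices 1, 5 and 9; a non-CpG C at index 8.
original = "ACGTTCGACCGA"
methylated = {1, 9}  # assumed methylation state, for illustration only

converted = convert_and_amplify(original, methylated)
# Only the originally methylated CpG sites survive as CG in the copies,
# so these are the positions available for re-methylation.
restorable = cpg_sites(converted)
```

  • In this sketch the unmethylated CpG at index 5 becomes TG and is therefore no longer a substrate for a CpG methyltransferase, which is the behavior the text describes.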
  • Converted, amplified DNA copies with methylation patterns restored were then cleaned up using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified).
  • Cleaned-up DNA was eluted in 12 microliters of 10 mM Tris-HCl pH 7.8 for each batch of 12 samples.
  • MBD Methyl-CpG-binding domain
  • the MethylCap Kit first bound MethylCap proteins to methylated DNA fragments in solution, and then the complexes were captured with glutathione-coated magnetic beads. A magnetic field was used to isolate the beads, and after two wash steps, the methylated DNA was eluted from the beads into a high salt buffer provided in the kit. A densely methylated competitor DNA (1 microgram) was mixed with the methylated library DNA copies prior to binding with MethylCap protein to reduce the probability of capture of less densely methylated DNA fragments. The competitive capture and elution process was repeated a second time to yield a library with even lower representation of fragments with low or moderate methylation density (less than 8 methylated CpG sites per DNA fragment).
  • the competitor DNA consisted of a double-stranded, 226 base-pair long PCR product (amplicon) containing 10 CpG sites, derived from amplification of a segment of PhiX174 RF I phage DNA (New England Biolabs). Importantly, the competitor DNA did not contain sequences that would be required for next-generation sequencing (e.g., Illumina Read 1 and Read 2 sequences), so although the competitor DNA was captured and eluted along with the densely methylated library DNA fragments, the competitor DNA was not able to participate in downstream PCR or sequencing reactions.
  • The following primers were used to generate the PCR-amplified competitor DNA: 10CpGFWD:
  • PCR was performed using EmeraldAMP® Max HS PCR Mastermix (Takara) to produce a high yield of amplified DNA according to the manufacturer’s instructions, using 0.5 ng of PhiX174 RF I phage DNA (New England Biolabs) as a template and 0.5 micromolar concentration of each primer. Thermocycling parameters were set according to the manufacturer’s recommendations, with an annealing temperature of 60°C. PCR was carried out to saturation (plateau phase) to maximize product yield. The PCR product was purified using QIAquick PCR Purification kit (Qiagen) according to the manufacturer’s protocol. The purified DNA underwent CpG methylation using M.SssI (New England Biolabs) according to the manufacturer’s protocol.
  • Methylated competitor DNA was again purified using a QIAquick PCR Purification kit (Qiagen) according to the manufacturer’s protocol. For each round of competitive capture performed on a batch of 12 samples, 1 microgram of methylated competitor DNA was used.
  • the selectively captured densely methylated DNA copies were then further amplified by PCR to produce enough DNA library copies for loading onto a flow cell of an Illumina NovaSeq next-generation sequencing instrument.
  • the PCR amplification was carried out using NEBNext® Dual Index Primers for Illumina® (with 8 base-pair indices) according to the manufacturer’s protocol.
  • a distinct Illumina index pair was used for each batch of 12 samples that were intended to be sequenced on the same lane of the sequencing flow cell (allowing multiple batches of 12 samples to be sequenced in a multiplexed fashion on a single flow cell lane). As many as 8 batches (96 samples total) have been successfully multiplexed on a single flow cell lane.
  • NEBNext® Q5U Master Mix (New England Biolabs) was used for the PCR amplification.
  • Primers were added to the reaction at a final concentration of 200 nanomolar for each primer.
  • SYBR Green I dye (Thermo Fisher Scientific) was added to the PCR reaction at the concentration recommended by the manufacturer to permit fluorescence-based measurement of double-stranded DNA amplification during real-time quantitative PCR.
  • Quantitative PCR was carried out on a CFX96™ System (Bio-Rad) thermocycler, and the change in fluorescence signal during the reactions was monitored in real time.
  • thermocycling parameters were used: (1) 98°C for 30 seconds, (2) 98°C for 10 seconds, (3) 62°C for 30 seconds, (4) 65°C for 60 seconds, (5) repeat thermocycling steps #2-4 until the real-time fluorescence signal begins to plateau. Samples were removed from the thermocycler after the 65°C extension step, approximately 1-2 cycles prior to reaching plateau of fluorescence signal. Amplified, indexed sequencing libraries were then cleaned up using AMPure XP beads (Beckman Coulter) according to the manufacturer’s protocol (adding 1.3 x the volume of bead slurry relative to the reaction volume to be purified). Cleaned-up DNA was eluted in 12 microliters of 10 mM Tris-HCl pH 7.8 for each batch of 12 samples.
  • The amplified, indexed libraries were further purified on a precast E-Gel™ SizeSelect™ II Agarose Gel, 2% (Thermo Fisher) using an E-Gel Power Snap Electrophoresis System (Thermo Fisher), according to the manufacturer’s protocol.
  • Using a DNA ladder run in a separate gel lane as a size reference, a band in the size range of approximately 320-360 base pairs (representing the expected size of the library) was recovered from the gel for libraries produced from cfDNA with a mononucleosomal size distribution.
  • When a library was produced from sheared genomic DNA, a broader size distribution was expected due to random fragmentation, and accordingly a broader band was recovered in a size range of approximately 300-380 base pairs.
  • The DNA was recovered in deionized water and could be used without further purification as a library input for next-generation sequencing on an Illumina flow cell (after appropriate adjustment of concentration).
  • Next-generation sequencing. To prepare the library for loading onto an Illumina NovaSeq flow cell, the concentration of DNA was measured using a KAPA Library Quantification Kit (Roche) according to the manufacturer’s protocol. The size profile and concentration of the libraries were also evaluated on a Bioanalyzer (Agilent). Libraries were diluted to the concentration recommended for the flow cell to be used (both S1 and S4 flow cells were used in different experiments). Cluster formation was carried out on the flow cell according to Illumina’s protocol. Sequencing was performed on a NovaSeq 6000 instrument in multiplexed paired-end mode, with a read length of 150 base pairs in each direction (2 x 150 bp mode). Two index reads were also performed, with read lengths of 8 bases each. Data was output to a server from which files could be downloaded for further processing.
  • The sequence output from the Illumina sequencer was analyzed according to the following general scheme. First, read pairs were demultiplexed based on Illumina indexes to sort read pairs arising from different sample batches. Then, read pairs were further sorted based on sample barcodes to yield sample-specific sets of read pairs. Read pairs were discarded if their sample barcode sequences did not exactly match one of the used barcodes or if the barcodes of a pair of reads did not match each other. Low-quality reads were also filtered out according to quality filtering parameters recommended by Illumina. Next, any adapter sequences identified at the ends of reads were trimmed.
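  • The barcode-based sorting rule above can be sketched as follows. This is a minimal illustration only: the barcode sequences, sample names, and the function name are hypothetical, since the actual barcodes are not listed here. A read pair is kept only when both reads carry the same barcode and that barcode exactly matches one in use.

```python
from typing import Optional

# Hypothetical sample barcodes; the real sequences are not given in the text.
KNOWN_BARCODES = {"ACGTAC": "sample_01", "TGCATG": "sample_02"}


def assign_sample(barcode_r1: str, barcode_r2: str,
                  known: dict[str, str] = KNOWN_BARCODES) -> Optional[str]:
    """Return the sample name for a read pair, or None if the pair is discarded."""
    # Discard if the two reads of the pair disagree with each other.
    if barcode_r1 != barcode_r2:
        return None
    # Discard if the barcode is not an exact match to a barcode in use.
    return known.get(barcode_r1)
```

  • Exact matching is the strictest possible policy: a single sequencing error in a barcode causes the pair to be dropped rather than risk misassignment to another sample.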
  • Each read-pair from a given cluster was then joined by overlapping the 3’-regions to re-create a full sequence of a DNA insert fragment (merged read pairs). Any read-pairs that had <95% sequence agreement in their overlapping 3’-regions (imperfect complementarity) were discarded because such discrepancies would be indicative of sequencer errors.
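  • A simplified sketch of read-pair merging is given below. It is not the pipeline actually used: it assumes ungapped overlaps, takes the longest overlap that reaches the identity threshold as the true overlap, and keeps the read 1 bases in the overlapping region. The function names, the minimum overlap of 10 bases, and the example fragment are assumptions made for illustration; the 95% identity threshold follows the text.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str,
               min_overlap: int = 10,
               min_identity: float = 0.95) -> Optional[str]:
    """Merge a read pair into one insert sequence, or return None to discard it."""
    r2 = reverse_complement(read2)  # bring read 2 onto the read 1 strand
    # Try the longest candidate overlap first, then progressively shorter ones.
    for overlap in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        tail, head = read1[-overlap:], r2[:overlap]
        matches = sum(a == b for a, b in zip(tail, head))
        if matches / overlap >= min_identity:
            return read1 + r2[overlap:]
    return None  # no overlap reached the required agreement


# Hypothetical 32-base fragment sequenced from both ends with 24-base reads.
fragment = "ACGTACGTTAGCCGATAGGCTTAACGGATCCA"
read1 = fragment[:24]
read2 = reverse_complement(fragment[8:])
merged = merge_pair(read1, read2)
```

  • Production tools additionally weigh base qualities when the two reads disagree; this sketch ignores qualities entirely.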
  • an initial de-duplication was performed to remove any replicate sequences that had exactly identical sequences. Such deduplicated sequences were annotated to record the number of replicate sequences that were collapsed into a single sequence.
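  • The initial exact-match de-duplication with replicate counts can be sketched in a few lines. The function name and input sequences are hypothetical; the point is only that each distinct sequence is kept once together with the number of replicates collapsed into it.

```python
from collections import Counter


def collapse_exact_duplicates(sequences: list[str]) -> dict[str, int]:
    """Map each distinct merged sequence to the number of identical replicates."""
    return dict(Counter(sequences))


collapsed = collapse_exact_duplicates(["ACGT", "ACGT", "TTGA"])
```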
  • Resulting sequences were then further processed using Bismark software (Babraham Bioinformatics Institute) to map sequences to the human genome (using an in silico C to T converted reference genome) and to perform methylation status calling (using an unconverted reference genome).
  • Build version hg38 of the human reference genome was used.
  • Bismark used the short read aligner Bowtie 2 to map sequences to the human genome.
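  • The idea behind mapping against an in silico converted reference and then calling methylation against the unconverted reference can be illustrated with a toy example. This is not Bismark's implementation: it handles only the top strand and an ungapped, already-located alignment, and the function names and sequences are hypothetical.

```python
def convert_c_to_t(seq: str) -> str:
    """In silico C-to-T conversion, applied to both the read and the reference."""
    return seq.upper().replace("C", "T")


def call_methylation(read: str, ref_segment: str) -> list[tuple[int, str, bool]]:
    """Compare a read with the unconverted reference at each reference cytosine.

    Returns (position, context, methylated) tuples for the top strand only.
    """
    calls = []
    for i, (r, g) in enumerate(zip(read, ref_segment)):
        if g != "C":
            continue
        context = "CpG" if ref_segment[i + 1:i + 2] == "G" else "non-CpG"
        if r == "C":      # cytosine retained: read as methylated
            calls.append((i, context, True))
        elif r == "T":    # cytosine converted: read as unmethylated
            calls.append((i, context, False))
    return calls


ref = "ACGTTCGACCGA"   # hypothetical unconverted reference segment
read = "ACGTTTGATCGA"  # hypothetical converted read from the same locus

# After conversion the read and reference are identical, so the read can be
# placed regardless of its methylation state.
same_after_conversion = convert_c_to_t(read) == convert_c_to_t(ref)
calls = call_methylation(read, ref)
```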
  • a further de-duplication step was performed by Bismark to remove alignments mapping to the same position (including start and end positions) in the genome, unless the sequences aligned to the same genomic position but on different strands.
  • sequences that were considered to be truly densely methylated sequences were required to meet all of the following filter criteria: (1) must contain no more than 20% cytosines that were read as being methylated outside of a CpG context, (2) must contain a minimum of 10 CpG sites, and (3) must contain a minimum of 80% methylated cytosines at CpG sites.
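  • The three filter criteria above can be expressed as a small predicate. The class and function names are hypothetical, and criterion (1) is interpreted here as the fraction of non-CpG cytosines that were read as methylated, which is an assumption about the intended denominator.

```python
from dataclasses import dataclass


@dataclass
class MethylationCalls:
    """Per-fragment cytosine call counts (hypothetical container)."""
    cpg_methylated: int
    cpg_unmethylated: int
    non_cpg_methylated: int
    non_cpg_unmethylated: int


def is_densely_methylated(c: MethylationCalls) -> bool:
    cpg_total = c.cpg_methylated + c.cpg_unmethylated
    non_cpg_total = c.non_cpg_methylated + c.non_cpg_unmethylated
    # (2) at least 10 CpG sites in the fragment
    if cpg_total < 10:
        return False
    # (3) at least 80% of CpG cytosines read as methylated
    if c.cpg_methylated / cpg_total < 0.80:
        return False
    # (1) no more than 20% of non-CpG cytosines read as methylated
    if non_cpg_total and c.non_cpg_methylated / non_cpg_total > 0.20:
        return False
    return True
```

  • Criterion (1) acts as a per-fragment check on conversion: a fragment with many apparently methylated non-CpG cytosines is more likely to reflect incomplete conversion than true dense methylation.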
  • WGBS whole genome bisulfite sequencing
  • WGBS data from the following cell types were used: alternatively activated macrophage, band form neutrophil, CD14-positive CD16-negative classical monocyte, CD3-negative CD4-positive CD8-positive double positive thymocyte, CD3-positive CD4-positive CD8-positive double positive thymocyte, CD34-negative CD41-positive CD42-positive megakaryocyte cell, CD38-negative naive B cell, CD4-positive alpha-beta T cell, CD4-positive alpha-beta thymocyte, CD8-positive alpha-beta T cell, CD8-positive alpha-beta thymocyte, central memory CD4-positive alpha-beta T cell, central memory CD8-positive alpha-beta T cell, class switched memory B cell, conventional dendritic cell, cytotoxic CD56-dim natural killer cell, effector memory CD4-positive alpha-beta T cell, effector memory CD8-positive alpha-beta T cell
  • WGBS data from the following cell types were used: aorta, esophagus, left ventricle, liver, lung, macrophage, natural killer cell, pancreas, primary hematopoietic stem cells GCSF-mobilized, psoas muscle, sigmoid colon, small intestine, spleen, stomach, T Cell, and thymus.
  • an expected average methylation level was calculated for each healthy tissue or cell type by averaging the beta values at all CpG sites in the genomic region covered by the sequence. For example, if a sequence was mapped to a 170 base-pair region of a CpG island on chromosome 2, and this region contained 13 CpG sites, an average methylation level would be calculated for each healthy tissue or cell type by averaging the 13 beta values at the corresponding genomic region in the WGBS data. Thus, for each DNA sequence, a list of corresponding expected average methylation level values was generated from the healthy tissue/cell public WGBS datasets.
  • a sequence (fragment) was considered to be aberrantly hypermethylated if it passed the filters for being considered a truly densely methylated sequence, and additionally, none of the expected average methylation level values from all healthy samples exceeded 0.4 (or 40%).
  • a truly densely methylated sequence was considered to be aberrantly hypermethylated if it mapped to a genomic region that was known to have a low expected average methylation level in all queried healthy cell types and tissues.
  • an aberrantly hypermethylated sequence also mapped to a genomic region annotated as a CpG island, it was considered an aberrantly hypermethylated CpG island sequence.
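  • The classification rule described above can be sketched as follows. The function names, tissue names, and beta values are made up for illustration; only the 0.4 threshold and the averaging of per-CpG beta values within the region covered by the fragment come from the text.

```python
from statistics import mean


def expected_average_methylation(
        betas_by_tissue: dict[str, list[float]]) -> dict[str, float]:
    """Average the WGBS beta values of the CpG sites covered by one fragment,
    separately for each healthy tissue or cell type."""
    return {tissue: mean(betas) for tissue, betas in betas_by_tissue.items()}


def is_aberrantly_hypermethylated(passes_density_filter: bool,
                                  betas_by_tissue: dict[str, list[float]],
                                  max_expected: float = 0.4) -> bool:
    """True if the fragment is densely methylated and no healthy tissue has an
    expected average methylation level above max_expected."""
    if not passes_density_filter:
        return False
    expected = expected_average_methylation(betas_by_tissue)
    return all(level <= max_expected for level in expected.values())


# Hypothetical beta values for the CpG sites in one fragment's genomic region.
healthy = {"liver": [0.05, 0.10, 0.15], "lung": [0.20, 0.30, 0.10]}
with_methylated_tissue = {**healthy, "spleen": [0.90, 0.80, 0.70]}
```

  • A single healthy tissue with high expected methylation is enough to exclude the fragment, which makes the rule conservative with respect to tissue-specific normal methylation.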
  • Figure 6 shows histograms comparing the CpG dinucleotide content of sequenced cfDNA fragments before vs. after two rounds of selective capture and elution of densely methylated DNA fragments (which is referred to here as high density methyl-capture).
  • the CpG dinucleotide count refers to the number of CpG sites (methylated, hydroxymethylated, or unmethylated) in a biologically derived input DNA fragment, not the remaining (unconverted) CpG sites after conversion and amplification.
  • Red boxes are included to highlight the robust enrichment of fragments harboring 8 or more CpG sites (in fragments averaging ~170-180 bp in length), which is the methylation density range typically found in CpG islands and promoters.
  • Figure 7 presents a genomic map showing a change in alignment and coverage of sequenced cell-free DNA fragments in the region of the PAX8 gene on Chromosome 2 before vs. after two rounds of selective capture and elution of densely methylated DNA fragments.
  • Preparation of the native library comprised steps of conversion, amplification, and restoration of methylation patterns using methods disclosed herein.
  • the enriched library was further subjected to two rounds of methyl binding domain-based affinity capture and elution with competitive binding of a 226 base-pair competitor DNA containing 10 methylated CpG sites.
  • sequences mapped in a largely random pattern throughout the genome.
  • Example 3 Patterns of aberrant hypermethylation of cell-free DNA fragments in plasma from patients with various types of cancers and from non-cancer control subjects.
  • Plasma samples ( ⁇ 1 mL) were tested from 11 patients with various types of advanced-stage cancer and from 8 individuals with no known cancer history who were undergoing lung cancer screening because they had a heavy smoking history (meeting US Preventative Services Task Force eligibility criteria). Samples were tested according to the methods described in Example 1.
  • Figure 8 shows a heat map displaying genomic regions at which aberrantly hypermethylated sequences from cell-free DNA fragments were observed to map in plasma of 11 patients with various types of cancer (advanced stage) and 8 non-cancer control subjects who were heavy smokers participating in a lung cancer screening program. Results are displayed for chromosome 2 (chosen arbitrarily), which is representative of genome-wide patterns. Dark bars represent genomic regions at which mapping is observed of one or more cfDNA fragments that are categorized as aberrantly hypermethylated.
  • Such fragments are densely methylated but map to genomic regions that are expected to have a methylation level of less than 40% (averaged across all CpG sites) in multiple types of healthy cells and tissues based on publicly available whole genome bisulfite sequencing data (from Roadmap and Blueprint studies). The difference in signal between cancer cases and non-cancer control subjects is striking. The distinct patterns of hypermethylation between samples underscore the importance of untargeted capture. If a panel of targeted hybrid-capture oligonucleotides had been used instead, such comprehensive capture for all cancer types would not have been possible. These results demonstrate the ability of the assay to capture aberrant promoter hypermethylation signals regardless of genomic location and from multiple types of cancer.
  • Initial shrinkage followed by enlargement of a liver metastasis are shown in computed tomography (CT) scan images taken at baseline (prior to therapy) and at cycles 4 and 8 of therapy (each cycle is 28 days).
  • CT computed tomography
  • A graph is also provided showing changes over time in tumor burden (defined as sum of diameters of target lesions according to RECIST guidelines) and in tumor-derived cfDNA level (measured as the variant allele fraction [VAF] of a tumor-specific KRAS mutation in plasma cell-free DNA).
  • the tumor burden initially decreases with the drug therapy but then increases, likely because of growth of treatment-resistant tumor clones.
  • the mutant tumor-derived cfDNA level shows a transient spike in level (possibly due to initial cell kill) followed by a decline, indicative of tumor response.
  • aberrantly hypermethylated cfDNA fragment counts mapping to chromosome 10 are shown at 4 time points: at baseline, shortly after beginning treatment, at the nadir of mutant tumor-derived cfDNA level, and when the cancer has clearly progressed.
  • Each circle indicates the observation of one or more aberrantly hypermethylated cfDNA fragments mapping to a CpG island at that genomic location.
  • Circle size is proportional to the number of fragments mapping to a given CpG island. Blue circles indicate CpG islands at which aberrantly hypermethylated cfDNA fragments mapped at baseline and during therapy.
  • Green circles indicate CpG islands at which aberrantly hypermethylated fragments were observed at either of the first two time points but not thereafter.
  • Red circles indicate CpG islands at which aberrantly hypermethylated fragments were not observed at either of the first two time points but emerged thereafter. Analysis of such evolving aberrant hypermethylation patterns at CpG islands can provide biological and clinical insights pertaining to epigenetic resistance mechanisms, tumor heterogeneity, prognosis, and response or lack of response to therapy.
  • Analyzing longitudinal changes in methylation patterns over time in the same patient facilitates identification and monitoring of personalized disease-associated methylation signals.
  • various logical approaches can be applied alone or in combination: (1) identify hypermethylated cell-free DNA fragments that map to CpG islands which are rarely hypermethylated in healthy plasma, (2) identify hypermethylated cell-free DNA fragments that map to CpG islands which are known to commonly become hypermethylated in cancer cells based on data from studies of other patients, and/or (3) identify hypermethylated cell-free DNA fragments that map to CpG islands whose fragment counts (relative to other CpG islands in the same biospecimen) change over time in concert with changes in tumor burden (e.g., relative DNA fragment counts mapping to a CpG island can increase over time with disease progression or decrease over time when tumors shrink in response to effective therapy).
  • This information can be used to improve sensitivity and/or specificity for detecting tumor-derived signals in subsequent biospecimens obtained from the same patient.
  • Observation of hypermethylated DNA fragments mapping to those genomic regions in a subsequent biospecimen can be considered to have a greater probability of being tumor-derived.
  • observation of hypermethylated DNA fragments mapping outside of those genomic regions would be less likely to be tumor-derived. Similar approaches can be applied to DNA derived from other biological samples beyond just plasma.
  • longitudinal plasma samples ( ⁇ 1 mL each) were obtained from a 76 year-old male patient with metastatic non-small cell lung cancer who received immune checkpoint inhibitor therapy with the drug Pembrolizumab. Plasma samples were obtained from the patient prior to initiating therapy (on cycle 1 day 1) and again after completing one cycle of treatment (on cycle 2 day 1, prior to receiving the second cycle). Cell-free DNA was extracted from plasma and was tested according to the methods described in Example 1 .
  • Figure 10 shows a graph in which cell-free DNA fragment counts mapping to various CpG islands are displayed for two time points (pre-treatment on the X-axis, and after 1 cycle on the Y-axis). Each data point on the graph shows cell-free DNA fragment counts mapping to an individual CpG island at the two time points.
  • the graph shows that at many CpG islands, the relative fragment counts mapping to those CpG islands remain fairly stable over time, suggesting that these CpG islands are unlikely to be cancer-associated (considered background signal).
  • the graph also shows that at some other CpG islands, the relative fragment counts mapping to those CpG islands decrease substantially from the pre-treatment sample to the post-treatment sample, suggesting that these CpG islands are likely to be cancer-associated.
  • Such analysis can facilitate identification of CpG islands that show cancer-associated hypermethylation in a given patient (i.e., a personalized cancer-associated hypermethylation pattern).
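  • One possible way to quantify the comparison described above is sketched below. It is an assumption-laden illustration rather than the analysis actually performed: counts are normalized to the total across all CpG islands in each sample, a pseudocount of 0.5 avoids division by zero, and an island is flagged when its relative count drops by at least a log2 fold change of 1. The function names, the threshold, the pseudocount, and the example counts are all hypothetical.

```python
import math


def log2_relative_change(pre: dict[str, int], post: dict[str, int],
                         pseudocount: float = 0.5) -> dict[str, float]:
    """log2 ratio of each CpG island's share of fragments, post vs. pre."""
    islands = sorted(set(pre) | set(post))
    pre_adj = {k: pre.get(k, 0) + pseudocount for k in islands}
    post_adj = {k: post.get(k, 0) + pseudocount for k in islands}
    pre_total, post_total = sum(pre_adj.values()), sum(post_adj.values())
    return {
        k: math.log2((post_adj[k] / post_total) / (pre_adj[k] / pre_total))
        for k in islands
    }


def candidate_cancer_islands(pre: dict[str, int], post: dict[str, int],
                             min_log2_drop: float = 1.0) -> list[str]:
    """Islands whose relative fragment count fell substantially after treatment."""
    changes = log2_relative_change(pre, post)
    return [k for k, lfc in changes.items() if lfc <= -min_log2_drop]


# Hypothetical fragment counts per CpG island at the two time points.
pre_treatment = {"island_A": 100, "island_B": 100, "island_C": 200}
after_one_cycle = {"island_A": 100, "island_B": 100, "island_C": 10}
flagged = candidate_cancer_islands(pre_treatment, after_one_cycle)
```

  • Because counts are expressed relative to the other islands in the same sample, a large drop at cancer-associated islands slightly inflates the apparent share of stable islands, so any threshold would need to be chosen with that in mind.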
  • This example shows that densely methylated viral DNA can be captured and sequenced using methods disclosed herein.
  • densely methylated viral DNA fragments were captured from a patient’s plasma in parallel with densely methylated cell-free DNA fragments derived from the patient’s genome.
  • A 1 mL plasma sample was obtained from a male patient with HIV who developed diffuse large B-cell lymphoma (DLBCL). It is known that DLBCL in the setting of HIV is often associated with latent Epstein-Barr Virus (EBV) infection of B-cells. The plasma sample was obtained prior to initiation of any therapy.
  • DLBCL diffuse large B-cell lymphoma
  • EBV Epstein-Barr Virus
  • Cell-free DNA (including viral DNA) was extracted from the plasma sample and was tested according to the methods described in Example 1, with a modification in the bioinformatic analysis to include alignment of DNA fragment sequences to viral genomes including Epstein-Barr Virus, HIV-1, Human Papilloma Virus, Kaposi’s Sarcoma Herpesvirus, Hepatitis B Virus, and Hepatitis C virus, in addition to the human genome reference (hg38).
  • Figure 11 shows densely methylated cell-free DNA fragments mapping to the EBV genome in the plasma of this patient with HIV and DLBCL. Note that the periodicity of sequence coverage suggests phased nucleosomal protection of cfDNA fragments.
  • Red bars in magnified views indicate methylated CpG sites; blue bars indicate unmethylated sites.
  • no fragments were found to map to the genomes of any of the other viral reference genomes that were included in the bioinformatic analysis (besides EBV), suggesting that other viral DNA was not present in the blood, or if present, may not have been methylated with sufficient density to be captured and sequenced.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Methods and compositions are disclosed that enable efficient and accurate characterization of epigenetic modifications of DNA. Some methods enable enrichment of DNA based on the density of methylated CpG sites while preserving the unique sequence representation of the densely methylated DNA. Some methods enable amplification of DNA with restoration of methylation patterns at CpG sites in the DNA copies. Some methods can be applied to the detection of cancer-specific or disease-specific methylation patterns from biological samples.
PCT/US2024/035148 2023-06-21 2024-06-21 Methods for enrichment and analysis of methylated DNA Pending WO2024264010A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363509353P 2023-06-21 2023-06-21
US63/509,353 2023-06-21

Publications (1)

Publication Number Publication Date
WO2024264010A1 true WO2024264010A1 (fr) 2024-12-26

Family

ID=93936411

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/035148 Pending WO2024264010A1 (fr) 2023-06-21 2024-06-21 Methods for enrichment and analysis of methylated DNA

Country Status (1)

Country Link
WO (1) WO2024264010A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050196792A1 (en) * 2004-02-13 2005-09-08 Affymetrix, Inc. Analysis of methylation status using nucleic acid arrays
US20190032148A1 (en) * 2016-01-29 2019-01-31 Epigenomics Ag Methods for detecting cpg methylation of tumor-derived dna in blood samples
CN110643702A (zh) * 2018-06-26 2020-01-03 深圳市圣必智科技开发有限公司 Method for determining the DNA methylation level at specific sites in a biological sample and use thereof
WO2020243609A1 (fr) * 2019-05-31 2020-12-03 Freenome Holdings, Inc. Methods and systems for high-depth sequencing of methylated nucleic acid
WO2022255944A2 (fr) * 2021-06-02 2022-12-08 Lucence Life Sciences Pte. Ltd. Method for detection and quantification of methylated DNA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
THERMOFISHER SCIENTIFIC: "Methyl-Seq Direct workflow: a fast method for DNA methylation analysis", APPLIED BIOSYSTEMS / APPLICATION NOTE: SEQSTUDIO GENETIC ANALYZER, 1 January 2020 (2020-01-01), pages 1 - 12, XP093252907, Retrieved from the Internet <URL:https://assets.thermofisher.com/TFS-Assets/GSD/Application-Notes/methyl-seq-direct-workflow-application-note.pdf> *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24826797

Country of ref document: EP

Kind code of ref document: A1