WO2025207941A1 - Methods for separating CpG-rich DNA by binding of CpG-binding proteins and methyl-sensitive deamination - Google Patents
- Publication number
- WO2025207941A1 (PCT/US2025/021845)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dna
- sample
- cpg
- dntp
- sequencing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6853—Nucleic acid amplification reactions using modified primers or templates
- C12Q1/6855—Ligating adaptors
Definitions
- Biopsies represent a traditional approach for detecting or diagnosing cancer in which cells or tissue are extracted from a possible cancer site and analyzed for relevant phenotypic and/or genotypic features. Biopsies have the drawback of being invasive. Cancer detection based on analysis of body fluids (“liquid biopsies”), such as blood, is an intriguing alternative based on the observation that DNA from cancer cells is released into body fluids.
- a liquid biopsy is noninvasive (sometimes requiring only a blood draw).
- it has been challenging to develop accurate and sensitive methods for analyzing liquid biopsy material in part because the amount of nucleic acids released into body fluids is low and variable, as is recovery of nucleic acids from such fluids in analyzable form.
- the contribution of DNA from cells in or around a cancer or neoplasm to a sample may be relatively small relative to the contribution from other cells, and the DNA contributed from other cells may be uninformative as to cancer status. Isolating and processing cell-free DNA useful for further analysis in liquid biopsy procedures can be a useful part of these methods.
- the present disclosure provides methods for separating CpG-dense DNA in a sample comprising contacting the DNA in the sample with a CpG-binding protein to provide CpG protein-bound DNA, removing the CpG-binding protein from at least a portion of the CpG protein-bound DNA to provide CpG-dense DNA, and contacting the CpG-dense DNA with a methyl-sensitive deaminase to selectively deaminate at least a portion of unmethylated CpGs (thereby providing a converted sample).
- the DNA in the sample comprises methylated cytosines and unmethylated cytosines in CpG dinucleotides.
- the disclosed methods can improve separation of CpG-dense DNA and analysis of methylated DNA in a sample by contacting DNA in the sample with a CpG-binding protein and selectively deaminating DNA in the sample with a methyl-sensitive deaminase.
- Embodiment 6 is the method of any one of the preceding embodiments, wherein the CpG-binding protein is immobilized on a solid support.
- Embodiment 7 is the method of any one of the preceding embodiments, wherein the CpG-binding protein is removed from the CpG-dense DNA after step (b).
- Embodiment 8.1 is the method of any one of the preceding embodiments, wherein the CpG-binding protein is an mCpG-binding protein.
- Embodiment 8.2 is the method of the immediately preceding embodiment, wherein the mCpG-binding protein preferentially binds to methylated CpG dinucleotides relative to unmethylated CpG dinucleotides.
- Embodiment 8.3 is the method of embodiment 8.1 or 8.2, wherein the CpG-binding protein comprises mCpG-binding domain 4 (MBD4), mCpG-binding domain 2 (MBD2), mCpG-binding domain 1 (MBD1), or methyl CpG binding protein 2 (MeCP2).
- Embodiment 8.4 is the method of any one of embodiments 1-7, wherein the CpG-binding protein binds to methylated CpG dinucleotides and unmethylated CpG dinucleotides with about equal affinity.
- Embodiment 8.5 is the method of any one of embodiments 1-7 or 8.4, wherein the CpG-binding protein comprises mCpG-binding domain 3 (MBD3).
- Embodiment 9 is the method of any one of the preceding embodiments, wherein the CpG protein-bound DNA is separated from the unbound DNA using the CpG-binding protein.
- Embodiment 10 is the method of any one of the preceding embodiments, wherein the separating comprises partitioning the DNA in the sample into a plurality of partitioned subsamples, the plurality comprising a first partitioned subsample and a second partitioned subsample, wherein the first partitioned subsample comprises CpG-dense DNA in a greater proportion than the second partitioned subsample.
- Embodiment 11 is the method of the immediately preceding embodiment, wherein the first partitioned subsample is differentially tagged relative to the second partitioned subsample.
- Embodiment 15 is the method of any one of the preceding embodiments, further comprising eluting the CpG protein-bound DNA, thereby providing eluted DNA.
- Embodiment 16 is the method of the immediately preceding embodiment, wherein the eluted DNA is single-stranded DNA.
- Embodiment 17 is the method of embodiment 15 or 16, wherein the eluted DNA is CpG-dense DNA.
- Embodiment 18 is the method of any one of the preceding embodiments, wherein the methyl-sensitive deaminase is thermally inactivated after contacting the CpG-dense DNA with the methyl-sensitive deaminase.
- Embodiment 19 is the method of any one of the preceding embodiments, wherein the methyl-sensitive deaminase is a dsDNA methyl-sensitive deaminase.
- Embodiment 20 is the method of the immediately preceding embodiment, wherein the dsDNA methyl-sensitive deaminase is modification-sensitive DNA deaminase A (MsddA) or a modification-sensitive DNA deaminase A (MsddA)-like deaminase.
- Embodiment 21 is the method of any one of the preceding embodiments, further comprising, prior to step (a): subjecting the DNA in the sample to end repair to generate end-repaired DNA molecules, wherein the end repair is performed using deoxynucleotide triphosphates (dNTPs).
- Embodiment 22 is the method of the immediately preceding embodiment, wherein at least one type of dNTP comprises a modified base, and the at least one dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one or more locations.
- Embodiment 23 is the method of embodiment 21 or 22, wherein the end repair is performed using a DNA polymerase that does not have 5’-3’ exonuclease activity and/or is not a strand displacing DNA polymerase.
- Embodiment 24 is the method of embodiment 21 or 22, wherein the end repair is performed using a DNA polymerase that has 5’-3’ exonuclease activity and/or is a strand displacing DNA polymerase.
- Embodiment 26 is the method of any one of the preceding embodiments, further comprising performing an A-tailing reaction, optionally after a step of subjecting the DNA in the sample to end repair.
- Embodiment 27 is the method of the immediately preceding embodiment, wherein the end-repair and the A-tailing reaction are performed in the same reaction mixture, optionally wherein the end-repair and the A-tailing reaction are performed in a single tube and/or optionally wherein the end-repair and the A-tailing reaction are performed without an intervening clean-up step.
- Embodiment 34 is the method of any one of embodiments 30-32, wherein the methylation-preserving amplification comprises thermocycled amplification.
- Embodiment 35 is the method of any one of embodiments 30-32, wherein the methylation-preserving amplification comprises isothermal amplification.
- Embodiment 36 is the method of any one of the preceding embodiments, wherein the step of contacting the CpG-dense DNA with a methyl-sensitive deaminase, thereby providing a converted sample in which at least a portion of unmethylated CpGs in the DNA are converted to UpGs, comprises a single-enzyme 5-methylcytosine sequencing (SEM-seq) method.
- Embodiment 37 is the method of any one of the preceding embodiments, further comprising sequencing at least a portion of the DNA of the converted sample.
- Embodiment 38 is the method of any one of the preceding embodiments, further comprising quantifying a level of methylation at one or more differentially methylated regions of the DNA.
- Embodiment 39 is the method of the immediately preceding embodiment, wherein quantifying the level of methylation at one or more differentially methylated regions of the DNA comprises sequencing at least a portion of the amplified DNA or quantitative PCR.
- Embodiment 40 is the method of any one of embodiments 37-39, wherein the sequencing comprises next-generation sequencing (NGS).
- Embodiment 41 is the method of the immediately preceding embodiment, wherein the NGS comprises pyrosequencing, sequencing-by-synthesis, semiconductor sequencing, sequencing-by-ligation, or sequencing-by-hybridization.
- Embodiment 42 is the method of any one of embodiments 37-39, wherein the sequencing comprises single-molecule real time (SMRT) sequencing.
- Embodiment 43 is the method of any one of embodiments 37-39, wherein the sequencing comprises long-read sequencing.
- Embodiment 44 is the method of any one of embodiments 37-39, wherein the sequencing comprises nanopore-based sequencing.
- Embodiment 45 is the method of any one of embodiments 37-39, wherein the sequencing comprises 5-letter or 6-letter sequencing.
- Embodiment 46 is the method of embodiments 37-39 or 44, wherein the sequencing comprises nanopore-based sequencing and the method further comprises subjecting the DNA in the sample to end repair to generate end-repaired DNA molecules, wherein the end repair is performed using at least one type of dNTP which comprises a modified base including a dNTP comprising 4mC, a dNTP comprising 5mC, a dNTP comprising 5hmC, a dNTP comprising 6mA, a dNTP comprising BrdU, dUTP, a dNTP comprising fluorodeoxyuridine (FldU), a dNTP comprising 5-iododeoxyuridine (IdU), and/or a dNTP comprising 5-ethynyldeoxyuridine (EdU), and the at least one type of dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one
- Embodiment 47 is the method of embodiments 37-39 or 42, wherein the sequencing comprises single-molecule real time (SMRT) sequencing and the method further comprises subjecting the DNA in the sample to end repair to generate end-repaired DNA molecules, wherein the end repair is performed using at least one type of dNTP which comprises a modified base including a dNTP comprising 4mC, a dNTP comprising 5mC, a dNTP comprising 5hmC, a dNTP comprising 6mA, and/or a dNTP comprising 8oxoG, and the at least one type of dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one or more locations.
- Embodiment 48 is the method of any one of embodiments 37-47, further comprising analyzing at least some of the sequence data corresponding to regions that are not identified as being synthesized during the end repair to detect the presence or absence of base modifications or mutations present in the DNA sample.
- Embodiment 49 is the method of any one of embodiments 37-48, wherein the method further comprises detecting the methylation status of cytosines in the DNA of the sample, and wherein the analyzing the sequence data further comprises filtering out the one or more repaired regions of the end-repaired DNA molecules such that the one or more repaired regions are not used to determine the methylation status of cytosines in the DNA sample.
- Embodiment 50 is the method of any one of embodiments 37-48, wherein the method is for detecting the single nucleotide variants (SNVs) in the DNA sample, and wherein the analyzing the sequence data further comprises classifying all base calls within the one or more repaired regions as not having double stranded support.
- Embodiment 51 is the method of any one of embodiments 37-50, wherein the DNA sample comprises cell-free DNA (cfDNA).
- Embodiment 52 is the method of the immediately preceding embodiment, further comprising analyzing the sequence data to determine a level of measured artifacts in the cfDNA.
- Embodiment 53 is the method of any one of embodiments 21-45, wherein the end repair is performed using at least one type of dNTP which comprises a modified base, wherein the modified base is other than 5mC or 5hmC, and the at least one type of dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one or more locations.
- Embodiment 54 is the method of any one of embodiments 21-45, wherein the end repair is performed using at least one type of dNTP which comprises a modified base, wherein the modified base is a methylated cytosine, optionally wherein the methylated base is 5mC or 5hmC, and the at least one type of dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one or more locations.
- Embodiment 55 is the method of any one of embodiments 21-45, wherein the end repair is performed using at least one type of dNTP which comprises a modified base, wherein the modified base is a methylated cytosine, optionally wherein the methylated base is 5mC or 5hmC, wherein the at least one type of dNTP comprising a modified base is incorporated into a repaired region of the end-repaired DNA molecules at one or more locations, and the repaired region is defined as: (i) the sequence between two non-methylated cytosines which span one or more methylated CpH cytosines; and/or (ii) the sequence between a methylated CpH cytosine and an end of a sequence read, wherein the methylated CpH cytosine is the CpH cytosine most distant from the end of the sequence read, or a subsequence thereof comprising one or more methylated CpH cytosines.
- Embodiment 56 is the method of any one of the preceding embodiments, wherein one or more adapters are ligated to the end-repaired DNA molecules or one or more adapters are ligated to the DNA in the sample.
- Embodiment 57 is the method of the immediately preceding embodiment, wherein the one or more adapters comprise molecular barcodes.
- Embodiment 58 is the method of any one of embodiments 56-57, wherein at least one cytosine in the one or more adapters is a modification resistant cytosine, optionally wherein each cytosine in the one or more adapters is a modification resistant cytosine.
- Embodiment 59 is the method of the immediately preceding embodiment, wherein the modification resistant cytosine is a deaminase resistant cytosine.
- Embodiment 60 is the method of the immediately preceding embodiment, wherein the deaminase resistant cytosine is 5-propynylC (5pyC), 5-pyrrolo-dC (5pyrC), 5-hydroxymethylcytosine (5hmC), glucosylated 5-hydroxymethylcytosine (5ghmC), cytosine 5-methylenesulfonate (CMS), or N4-modified cytosine.
- Embodiment 61 is the method of any one of embodiments 56-60, wherein the one or more adapters are Y-shaped adapters.
- Embodiment 62 is the method of any one of the preceding embodiments, wherein the DNA in the sample comprises barcodes.
- Embodiment 63 is the method of any one of the preceding embodiments, further comprising enriching the DNA for a plurality of target regions.
- Embodiment 64 is the method of the immediately preceding embodiment, wherein the enriching the DNA occurs prior to a step of amplifying DNA in the converted sample, prior to a step of sequencing the DNA, after contacting the CpG-dense DNA with the methyl-sensitive deaminase, and/or after partitioning the DNA in the sample into a plurality of subsamples.
- Embodiment 65 is the method of embodiment 63 or 64, wherein the plurality of target regions comprises epigenetic target regions.
- Embodiment 66 is the method of embodiment 65, wherein the epigenetic target regions comprise hypermethylation variable target regions.
- Embodiment 75 is the method of any one of embodiments 73-74, wherein the subject is an animal.
- Embodiment 76 is the method of the immediately preceding embodiment, wherein the subject is a human.
- Embodiment 77 is the method of any one of embodiments 73-76, wherein the subject has or is at risk of having a cancer.
- Embodiment 78 is the method of any one of embodiments 73-77, further comprising determining the presence or status of a cancer in the subject.
- the results of the methods disclosed herein are used as an input to generate a report.
- the report may be in a paper or electronic format.
- the true methylation status of cytosines or variants, as obtained by the methods disclosed herein, or information derived therefrom, can be displayed directly in such a report.
- diagnostic information or therapeutic recommendations which are at least in part based on the methods disclosed herein can be included in the report.
- the various steps of the methods disclosed herein may be carried out at the same or different times, in the same or different geographical locations, e.g. countries, and/or by the same or different people.
- Additional advantages will be set forth in part in the description which follows or may be learned by practice. The advantages will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims.
- FIG. 1 is a schematic showing an exemplary separation method for CpG-dense DNA wherein an immobilizable CpG-binding protein (such as MBD3 or a methyl-binding domain (MBD) protein (e.g., MeCP2, MBD4, MBD2, or MBD1), e.g., MBD2-biotin) is used to separate CpG-dense DNA followed by deamination of unmethylated CpGs in the DNA using a methyl-sensitive deaminase. End-repair and A-tailing reactions are performed on a DNA sample, followed by ligation of NGS adapters to the DNA.
- The CpG-binding protein is then bound to at least a portion of the CpG dinucleotides in the DNA to form a protein-DNA complex, i.e., CpG protein-bound DNA.
- CpG protein-bound DNA is then separated from unbound DNA.
- the CpG-binding protein may be immobilized on a surface (e.g., a bead or a plate, such as by binding surface-linked streptavidin to biotin on the CpG-binding protein) before or after the CpG-binding protein is bound to at least a portion of the DNA in the DNA sample, to facilitate the separation.
- Unbound and unmethylated CpGs in the CpG protein-bound DNA are deaminated using a methyl-sensitive deaminase, and the DNA is amplified using a uracil-tolerant amplification.
- the sequences in the figure are: mCGmCGmCGmCGmCG (SEQ ID NO: 90), mCGmCGmCGCGCG (SEQ ID NO: 91), CGCGmCGmCGmCG (SEQ ID NO: 92), CGCGCGCGCG (SEQ ID NO: 93), mCGmCGmCGTGTG (SEQ ID NO: 94), and TGTGTGTGTG (SEQ ID NO: 95), wherein “mC” denotes a methylated cytosine.
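The relationship between the unconverted and converted sequences listed for the figure can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the disclosed method: it assumes every unmethylated CpG cytosine is converted, every methylated cytosine is protected, and the resulting uracil is read as T after uracil-tolerant amplification. The function names `parse_marked` and `convert_unmethylated_cpg` are hypothetical.

```python
def parse_marked(seq):
    """Parse figure-style notation, where 'mC' marks a methylated cytosine.

    Returns a list of (base, is_methylated) tuples.
    """
    bases = []
    i = 0
    while i < len(seq):
        if seq.startswith("mC", i):
            bases.append(("C", True))
            i += 2
        else:
            bases.append((seq[i], False))
            i += 1
    return bases


def convert_unmethylated_cpg(seq):
    """Idealized methyl-sensitive deamination followed by amplification.

    Assumption: 100% conversion of unmethylated C in a CpG context
    (C -> U, read as T) and 100% protection of methylated C.
    """
    bases = parse_marked(seq)
    out = []
    for idx, (base, methylated) in enumerate(bases):
        next_is_g = idx + 1 < len(bases) and bases[idx + 1][0] == "G"
        if methylated:
            out.append("mC")   # protected, still read as C
        elif base == "C" and next_is_g:
            out.append("T")    # deaminated to U, read as T
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    for s in ("mCGmCGmCGCGCG", "CGCGCGCGCG", "mCGmCGmCGmCGmCG"):
        print(s, "->", convert_unmethylated_cpg(s))
```

Under these assumptions, `mCGmCGmCGCGCG` becomes `mCGmCGmCGTGTG` and `CGCGCGCGCG` becomes `TGTGTGTGTG`, which match the strings listed as SEQ ID NOs: 94 and 95, while a fully methylated input is unchanged.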
- FIG.2 is a schematic diagram of an example of a system suitable for use with some embodiments of the disclosure.
DETAILED DESCRIPTION
- Reference will now be made in detail to certain embodiments of the disclosure. While the disclosure will be described in conjunction with such embodiments, it will be understood that they are not intended to limit the disclosure to those embodiments. On the contrary, the disclosure is intended to cover all alternatives, modifications, and equivalents, which may be included within the disclosure as defined by the appended claims.
- Before describing the present teachings in detail, it is to be understood that the disclosure is not limited to specific compositions or process steps, as such may vary.
- nucleic acid includes a plurality of nucleic acids.
- Numeric ranges are inclusive of the numbers defining the range. Measured and measurable values are understood to be approximate, taking into account significant digits and the error associated with the measurement.
- embodiments in the specification that recite “comprising” various components are also contemplated as “consisting of” or “consisting essentially of” the recited components.
- “Buffy coat” refers to the portion of a blood (such as whole blood) or bone marrow sample that contains all or most of the white blood cells and platelets of the sample.
- the buffy coat fraction of a sample can be prepared from the sample using centrifugation, which separates sample components by density.
- Leukapheresis may be performed, e.g., to obtain cells for research, diagnostic, prognostic, or monitoring purposes, such as those described herein.
- a “leukapheresis sample” refers to a sample comprising leukocytes collected from a subject using leukapheresis.
- “PBMCs” refers to peripheral blood mononuclear cells.
- Such cells include, e.g., lymphocytes (T cells, B cells, and NK cells) as well as monocytes, and are isolated from blood samples (such as from a whole blood sample collected from a subject) using density gradient centrifugation.
- amplify refers to a process by which extra or multiple copies of a particular polynucleotide are formed. Amplification methods can include any suitable methods known in the art.
- a nucleic acid molecule amplified using “methylation-preserving amplification” substantially maintains its methylation status post-amplification.
- the polypeptide is the human polypeptide unless indicated otherwise.
- the polypeptide comprising the X1nnnX2 mutation may, but does not necessarily, comprise additional differences from the wild-type sequence, including but not limited to truncations and deletions as well as other substitutions.
- a “T1372S mutation” in TET2 refers to a substitution in a TET2 enzyme of the threonine present at position 1372 of the full-length wild-type human TET2 enzyme with a serine.
- Position 1372 of wild-type human TET2 aligns to position 258 and 248, respectively, of the truncated TET2 sequences disclosed as SEQ ID NOs: 23 and 24 of US Patent 10,961,525.
- a “V1900X2 mutation” where X2 is A, C, G, I, or P in TET2 refers to a substitution in a TET2 enzyme of the valine present at position 1900 of the full-length wild-type human TET2 enzyme with an alanine, cysteine, glycine, isoleucine, or proline.
- “Or” is used in the inclusive sense, i.e., equivalent to “and/or,” unless the context requires otherwise.
II. Exemplary methods
A. Overview
- Many commercialized methods and methods undergoing development target specific methylation changes, e.g., cancer-related changes that occur in early stage cancers and pre-cancers.
- Cancer formation and progression may arise from both genetic modification and epigenetic features of deoxyribonucleic acid (DNA).
- DNA such as cell-free DNA (cfDNA).
- the present disclosure provides methods of separating CpG-dense DNA in a sample, e.g., to facilitate characterization of CpG-dense DNA, including with respect to changes in methylation.
- cells in or around a cancer or neoplasm may shed more DNA than cells of the same tissue type in a healthy subject.
- the distribution of tissue of origin of certain DNA samples, such as cfDNA, may change upon carcinogenesis.
- an increase in the level of hypermethylation variable target regions that show lower methylation in healthy cfDNA than in at least one other tissue type can be an indicator of the presence (or recurrence, depending on the history of the subject) of cancer.
- an increase in the level of hypomethylation variable target regions in the sample can be an indicator of the presence (or recurrence, depending on the history of the subject) of cancer.
- DNA methylation profiling can be used to detect aberrant methylation in DNA of a sample.
- the DNA can correspond to certain genomic regions (“differentially methylated regions” or “DMRs”) that are normally hypermethylated or hypomethylated in a given sample type (e.g., cfDNA from the bloodstream) but which may show an abnormal degree of methylation that correlates to a neoplasm or cancer, e.g., because of unusually increased contributions of tissues to the type of sample (e.g., due to increased shedding of DNA in or around the neoplasm or cancer) and/or from extents of methylation of the genome that are altered during development or that are perturbed by disease, for example, cancer or any cancer-associated disease.
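One simple way to express a methylation level for a region from converted reads is the fraction of CpG cytosines that still read as C. The sketch below is illustrative only and rests on several assumptions not stated in the disclosure: reads are already aligned to the reference without indels, start at reference position 0, are on the same strand as the reference, and a protected (methylated) CpG cytosine reads as C while a converted (unmethylated) one reads as T. The function name `region_methylation_level` is hypothetical.

```python
from typing import List, Optional


def region_methylation_level(reference: str, reads: List[str]) -> Optional[float]:
    """Fraction of CpG cytosine observations that remain 'C' after conversion.

    Assumes gapless reads aligned at reference position 0 on the same strand.
    Returns None when no informative CpG observation is available.
    """
    reference = reference.upper()
    # Positions of the cytosine in each reference CpG dinucleotide.
    cpg_positions = [
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    ]
    protected = 0   # read as C: treated as methylated
    converted = 0   # read as T: treated as unmethylated
    for read in reads:
        read = read.upper()
        for pos in cpg_positions:
            if pos >= len(read):
                continue
            if read[pos] == "C":
                protected += 1
            elif read[pos] == "T":
                converted += 1
            # Any other base is ignored as uninformative.
    total = protected + converted
    return protected / total if total else None


if __name__ == "__main__":
    ref = "ACGTCGA"
    print(region_methylation_level(ref, ["ACGTCGA", "ATGTTGA"]))
```

A real pipeline would also need alignment, strand handling, and quality filtering, none of which are modeled here.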
- DNA methylation comprises addition of a methyl group to a cytosine residue at a CpG site (a cytosine-phosphate-guanine site, i.e., a cytosine followed by a guanine in the 5’ -> 3’ direction of the nucleic acid sequence).
- DNA methylation comprises addition of a methyl group to an adenine residue, such as in N6-methyladenine.
- DNA methylation is 5-methylation (modification of the carbon in the 5th position of the cytosine ring).
- 5-methylation comprises addition of a methyl group to the 5C position of the cytosine residue to create 5-methylcytosine (m5c or 5-mC or 5mC).
- methylation comprises a derivative of m5c.
- Derivatives of m5c include, but are not limited to, 5-hydroxymethylcytosine (5-hmC or 5hmC), 5-formylcytosine (5-fC), and 5-carboxylcytosine (5-caC).
- DNA methylation is 3C methylation (modification of the carbon in the 3rd position of the cytosine ring).
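Because a CpG site is defined above as a cytosine followed by a guanine in the 5’ -> 3’ direction, the CpG content of a fragment can be counted directly from its sequence. The following sketch is illustrative: the disclosure does not specify a numerical definition of "CpG-dense", so the per-base density used here is an assumed convenience metric, and the function names are hypothetical.

```python
def cpg_count(seq: str) -> int:
    """Count CpG dinucleotides (C immediately followed by G, 5'->3')."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cpg_density(seq: str) -> float:
    """CpG dinucleotides per base; an assumed metric, not one from the source."""
    return cpg_count(seq) / len(seq) if seq else 0.0


if __name__ == "__main__":
    for fragment in ("CGCGCGCGCG", "ACGTTCGA", "ATATATAT"):
        print(fragment, cpg_count(fragment), round(cpg_density(fragment), 3))
```

Only one strand is examined; since CG is its own reverse complement, the count is the same on the opposite strand.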
- the CpG-binding protein domain binds to methylated CpG dinucleotides and unmethylated CpG dinucleotides with about equal affinity, e.g., mCpG-binding domain 3 (MBD3).
- the mCpG-binding protein has an affinity (e.g., Kd) for a methylated CpG dinucleotide within a factor of about 10, about 5, about 2, or about 1.5 of its affinity for an unmethylated CpG dinucleotide.
- the CpG-binding protein domain does not preferentially bind to methylated CpG dinucleotides relative to unmethylated CpG dinucleotides, e.g., mCpG-binding domain 3 (MBD3). In some embodiments, the CpG-binding protein domain does not preferentially bind to unmethylated CpG dinucleotides relative to methylated CpG dinucleotides, e.g., mCpG-binding domain 3 (MBD3).
- the mCpG-binding protein has an affinity (e.g., Kd) for a methylated CpG dinucleotide that is stronger than the affinity for an unmethylated CpG dinucleotide by a factor of about 10, about 5, about 2, or about 1.5.
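The affinity comparisons above reduce to a ratio of dissociation constants, where a lower Kd means tighter binding. The sketch below shows that arithmetic; the function names and the Kd values in the usage example are hypothetical, and the hard cutoff stands in for the source's "about" a given factor.

```python
def fold_preference(kd_methylated: float, kd_unmethylated: float) -> float:
    """How many times stronger binding to methylated CpG is than to unmethylated CpG.

    A lower Kd means higher affinity, so the ratio is Kd(unmethylated) / Kd(methylated).
    Both values must be in the same units.
    """
    if kd_methylated <= 0 or kd_unmethylated <= 0:
        raise ValueError("Kd values must be positive")
    return kd_unmethylated / kd_methylated


def within_factor(kd_a: float, kd_b: float, factor: float) -> bool:
    """True if the two Kd values differ by no more than the given factor."""
    if kd_a <= 0 or kd_b <= 0 or factor < 1:
        raise ValueError("Kd values must be positive and factor must be >= 1")
    return max(kd_a, kd_b) / min(kd_a, kd_b) <= factor


if __name__ == "__main__":
    # Hypothetical Kd values in nM, chosen only to illustrate the arithmetic.
    print(fold_preference(5.0, 50.0))      # methyl-preferring binder
    print(within_factor(10.0, 15.0, 1.5))  # roughly equal affinity
```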
- the mCpG-binding protein comprises mCpG-binding domain 4 (MBD4).
- the mCpG-binding protein comprises mCpG-binding domain 2 (MBD2).
- the mCpG-binding protein comprises methyl CpG binding protein 2 (MeCP2).
- the mCpG-binding protein comprises mCpG-binding domain 1 (MBD1).
- MBD1 contains three internal CXXC zinc finger domains: CXXC-1, CXXC-2, and CXXC-3.
- MBD1 contains two internal CXXC zinc finger domains: CXXC-1 and CXXC-2
- MBD1 contains internal CXXC zinc finger domains CXXC-1 and CXXC-2 and does not contain CXXC-3.
- the method of separating CpG-dense DNA in a sample further comprises amplifying DNA in the converted sample using a DNA polymerase.
- the DNA polymerase is a uracil-tolerant polymerase.
- the uracil-tolerant polymerase may be Q5U® Hot Start High-Fidelity DNA Polymerase, OneTaq® DNA Polymerase, Taq DNA Polymerase, Bst DNA Polymerase Full Length, Bst DNA Polymerase, Large Fragment, Bst 2.0 DNA Polymerase, Bst 3.0 DNA Polymerase, Bsu DNA Polymerase, Large Fragment, phi29 DNA Polymerase, phi29-XT DNA Polymerase
- the CpG-binding protein comprises a capture moiety.
- the capture moiety comprises biotin, avidin, streptavidin, neutravidin, an oligonucleotide, digoxigenin, a histidine tag, an affinity tag, an immunoglobulin constant domain, a hapten, a magnetic particle, or any combination thereof.
- the CpG-binding protein is immobilized on a solid support.
- the CpG-dense DNA is unbound (i.e., not bound) to a CpG-binding protein.
- the method of separating CpG-dense DNA in a sample further comprises separating the CpG protein-bound DNA from unbound DNA using the CpG-binding protein.
- the separating can provide separated CpG-dense DNA.
- the separating is performed before contacting the CpG-dense DNA with a methyl-sensitive deaminase.
- the separating is performed after contacting the DNA in the sample with a CpG-binding protein.
- the separating comprises partitioning DNA in the sample into a plurality of partitioned subsamples.
- the plurality of partitioned subsamples comprises a first partitioned subsample and a second partitioned subsample.
- the first partitioned subsample comprises CpG-dense DNA in a greater proportion than the second partitioned subsample.
- the method comprises differentially tagging the first partitioned subsample and the second partitioned subsample.
- the first partitioned subsample is differentially tagged relative to the second partitioned subsample.
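Differential tagging lets molecules from different partitioned subsamples be pooled and later traced back to their partition of origin. The following sketch models that bookkeeping only; it does not model ligation chemistry. The tag sequences, the `TaggedMolecule` class, and the function names are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TaggedMolecule:
    sequence: str
    partition_tag: str  # identifies the partitioned subsample of origin


def tag_and_pool(first: List[str], second: List[str],
                 first_tag: str = "AAAC", second_tag: str = "TTTG") -> List[TaggedMolecule]:
    """Attach a distinct (hypothetical) tag to each subsample, then pool them."""
    if first_tag == second_tag:
        raise ValueError("partition tags must differ to be distinguishable")
    pooled = [TaggedMolecule(seq, first_tag) for seq in first]
    pooled += [TaggedMolecule(seq, second_tag) for seq in second]
    return pooled


def split_by_tag(pooled: List[TaggedMolecule]) -> Dict[str, List[str]]:
    """Recover the subsample of origin for each molecule from its tag."""
    by_tag: Dict[str, List[str]] = {}
    for molecule in pooled:
        by_tag.setdefault(molecule.partition_tag, []).append(molecule.sequence)
    return by_tag


if __name__ == "__main__":
    pooled = tag_and_pool(["CGCGCG"], ["ATATAT", "ACGT"])
    print(split_by_tag(pooled))
```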
- partitioning uses the CpG-binding protein to separate CpG protein-bound DNA from DNA unbound to the CpG-binding protein.
- partitioning uses the CpG-binding protein to separate CpG protein-bound DNA from DNA unbound to the CpG-binding protein to yield CpG-dense DNA separated from DNA unbound to the CpG-binding protein.
- the partitioning comprises precipitating the CpG protein-bound DNA.
- the precipitation of the CpG protein-bound DNA separates it from the unbound DNA.
- the method further comprises eluting the CpG protein-bound DNA.
- the eluting provides eluted DNA.
- the eluted DNA is single-stranded DNA.
- the disclosed methods are combined with one or more methods, such as but not limited to, methods for assessing DNA methylation patterns, DNA mutations (such as somatic mutations), nucleic acid fragmentation patterns, non-coding RNA (such as micro RNAs (miRNAs), ribosomal RNAs, transfer RNAs, small nucleolar RNAs (snoRNAs), and/or small nuclear RNAs (snRNAs)) levels, and/or cell type proportions/levels, cellular locations, and/or structural modifications of one or more proteins (such as in a sample from a subject).
- the disclosure relates to methods of separating CpG-dense DNA in a sample, comprising contacting the DNA of the sample with a CpG-binding protein (e.g., a methyl-binding domain (MBD) protein, such as mCpG-binding domain 4 (MBD4), mCpG-binding domain 2 (MBD2), mCpG-binding domain 1 (MBD1), or methyl CpG binding protein 2 (MeCP2), or a CpG-binding protein (e.g., MBD3)) to provide CpG protein-bound DNA, separating the CpG protein-bound DNA from unbound DNA, thereby providing CpG-dense DNA, and contacting the CpG-dense DNA with a methyl-sensitive deaminase to provide a converted sample.
- the subject may be a human, a mammal, an animal, a primate, rodent (including mice and rats), or other common laboratory, domestic, companion, service or agricultural animal, for example a rabbit, dog, cat, horse, cow, sheep, goat or pig.
- the DNA sample is from a human.
- the subject may in some cases have or be suspected of having a cancer, tumor, or neoplasm. In other cases, the subject may not have cancer or a detectable cancer symptom.
- the subject may have been treated with one or more cancer therapy, e.g., any one or more of chemotherapies, antibodies, vaccines or biologics.
- the subject may be in remission, e.g.
- the pre-cancer, cancer, tumor, or neoplasia or suspected pre-cancer, cancer, tumor, or neoplasia may be of the bladder, head and neck, lung, colon, rectum, kidney, breast, prostate, skin, or liver.
- the pre-cancer, cancer, tumor, or neoplasia or suspected pre-cancer, cancer, tumor, or neoplasia is of the lung.
- a sample can be in the form originally isolated from a subject or can have been subjected to further processing to remove or add components, such as cells, or enrich for one component relative to another.
- a population of nucleic acids is obtained from a serum, plasma or blood sample from a subject having or suspected of having neoplasia, a tumor, precancer, or cancer or previously diagnosed with neoplasia, a tumor, precancer, or cancer.
- the DNA sample may be or comprise cell free nucleic acids or cfDNA.
- the cfDNA may be obtained from a test subject, for example as described above.
- the sample for analysis may be plasma or serum containing cell-free nucleic acids.
- Cell-free DNA, “cfDNA molecules,” or “cfDNA” includes, for example, DNA molecules that naturally occur in a subject in extracellular form (e.g., in blood, serum, plasma, or other bodily fluids such as lymph, cerebrospinal fluid, urine, or sputum).
- the amount can be up to about 600 ng, up to about 500 ng, up to about 400 ng, up to about 300 ng, up to about 200 ng, up to about 100 ng, up to about 50 ng, or up to about 20 ng of cell-free nucleic acid molecules.
- the amount can be at least 1 fg, at least 10 fg, at least 100 fg, at least 1 pg, at least 10 pg, at least 100 pg, at least 1 ng, at least 10 ng, at least 100 ng, at least 150 ng, or at least 200 ng of cell-free nucleic acid molecules.
- the amount can be up to 1 femtogram (fg), 10 fg, 100 fg, 1 picogram (pg), 10 pg, 100 pg, 1 ng, 10 ng, 100 ng, 150 ng, or 200 ng of cell-free nucleic acid molecules.
- the method can comprise obtaining 1 femtogram (fg) to 200 ng cell-free nucleic acid molecules from samples.
- nucleic acids can be precipitated with an alcohol. Further clean up steps may be used such as silica-based columns to remove contaminants or salts. Non-specific bulk carrier nucleic acids, DNA or protein for sequencing, hybridization, and/or ligation, may be added throughout the reaction to optimize certain aspects of the procedure such as yield.
- samples can include various forms of nucleic acid including double stranded DNA, single stranded DNA and single stranded RNA.
- single stranded DNA and RNA can be converted to double stranded forms so they are included in subsequent processing and analysis steps.
- At least one type of dNTP comprises a modified base, and the at least one dNTP comprising a modified base is incorporated into repaired regions of the end-repaired DNA molecules at one or more locations.
- End repair refers to methods for repairing DNA by the conversion of non-blunt ended DNA into blunt ended DNA. Sequencing workflows typically use end repair to make ends of DNA molecules compatible with adapters, which are subsequently ligated onto the DNA. Fragmented and/or damaged DNA (e.g. cfDNA or DNA from FFPE samples) often contains non-blunt ends, which contain 3’ overhangs and/or 5’ overhangs.
- end repair is conducted in the presence of dATP, dCTP, dGTP and dTTP.
- End repair can also include a second step, which involves the addition of a phosphate group to the 5' ends of DNA, by an enzyme such as polynucleotide kinase.
- A-tailing refers to the addition of a single deoxyadenosine residue to the end of a blunt-ended double-stranded DNA fragment to form a 3' deoxyadenosine single-base overhang.
- A-tailing reactions are conducted with polymerases which have the ability to add a non-templated A to the 3' end of a blunt, double-stranded DNA molecule.
- Polymerases capable of A-tailing typically do not possess 3’-5’ exonuclease activity.
- When A-tailing is performed as a separate reaction from end repair, it is typically conducted in the presence of dATP, but the absence of dCTP, dTTP and dGTP.
- A-tailed fragments are not compatible for self-ligation (i.e., self-circularization and concatenation of the DNA), but they are compatible with 3' T-overhangs, which can be used on adapters.
- Methods comprising end repair, A-tailing and ligation to adapters with 3' T-overhangs can result in higher efficiency ligation, compared to blunt ended ligation, as blunt ligation can lead to self-ligation of the adapters and/or DNA molecules.
- the methods disclosed herein comprise end repair of the DNA molecules followed by blunt end ligation of adapters. In other embodiments, the methods disclosed herein comprise end repair of the DNA molecules followed by A-tailing and sticky-end ligation of T-tailed adapters.
- where the methods disclosed herein comprise an A-tailing step, it may be performed separately from the end repair with an intervening reaction clean-up step, or it may be performed in the same reaction as the end repair (e.g. using NEBNext® Ultra™ II End Repair/dA-Tailing Module (E7546)).
- the reaction clean-up step removes unincorporated dNTPs.
- the gaps can be filled in with DNA polymerases used in the end repair reaction, regardless of whether they possess 5’ to 3’ exonuclease activity or strand displacement activity. After this gap filling, a nick will still exist between the synthesized region and the region of the original DNA molecule 3’ of the gap.
- the A-tailing enzymes may then introduce further synthesized regions through nick translation, as described for the nicked DNA. This synthesized region may extend to the 3’end of the DNA molecule.
- the end-repair and the A-tailing reactions are performed in a single tube. In such cases, the A tailing reaction can be performed at a higher temperature than the end repair.
- end repair is performed at ambient temperature (e.g. 15-35°C) and A-tailing is performed at a temperature over 60°C, including e.g., about 60°C-75°C.
- the A tailing reaction can be performed using a thermostable polymerase (e.g. Taq DNA polymerase, Tfl DNA polymerase, Bst DNA Polymerase, Large Fragment or Tth DNA polymerase) and the method further comprises increasing temperature of the sample after the end repair to inactivate the polymerase used in end repair (e.g. T4 DNA polymerase or Klenow fragment).
- the A-tailing is performed using a DNA polymerase that: (i) does not possess 5’-3’ exonuclease activity; and/or (ii) is not a strand displacing DNA polymerase.
- the A-tailing is performed using a DNA polymerase that cannot extend from a nick in the DNA such as HemoKlen Taq.
- the A-tailing is performed using Taq DNA polymerase.
- the A-tailing is performed using Tfl polymerase, Bst DNA Polymerase, Large Fragment or Tth polymerase.
- the end repair reaction can be performed using DNA polymerases which lack 5’ to 3’ exonuclease activity and/or strand displacement activity (e.g. T4 DNA polymerase or Klenow fragment).
- the separation of the end repair and A-tailing reactions by a reaction clean-up means that only dATP (not dCTP, dTTP or dGTP) is present in the A-tailing reaction. This means that efficient nick translation cannot occur in the A-tailing reaction because three of the four nucleotide components are not present in the reaction mixture.
- the gaps can be filled in with DNA polymerases used in the end repair reaction, regardless of whether they possess 5’ to 3’ exonuclease activity or strand displacement activity. These filled gaps thereby generate synthesized regions.
- the end-repair is performed with a polymerase which lacks 5’to 3’ exonuclease activity and/or strand displacement activity.
- the polymerase used in the end repair reaction may be Q5® High-Fidelity DNA Polymerase, Q5U® Hot Start High-Fidelity DNA Polymerase, Phusion® High-Fidelity DNA Polymerase, Hemo KlenTaq, phi29 DNA Polymerase, T7 DNA Polymerase, DNA Polymerase I (E. coli), DNA Polymerase I, Large (Klenow) Fragment (“Klenow fragment”) or T4 DNA Polymerase.
- the polymerase used in the end repair is T4 DNA Polymerase or Klenow fragment.
- the end repair is performed with a DNA polymerase which has 5’-3’ exonuclease activity and/or is a strand displacing DNA polymerase.
- the methods disclosed herein comprise an A tailing reaction after the end repair and before the ligation reaction, wherein the end repair and A tailing reactions are separated by a reaction cleanup.
- the A tailing reaction is typically performed in the presence of dATP, but in the absence of dCTP, dTTP and dGTP.
- the A tailing reaction is performed using Klenow Fragment lacking 3'-5' exonuclease activity.
- the modified base may be 5-carboxylcytosine (5-caC), 4-methylcytosine (4mC), 5-methylcytosine (5mC), 5-hydroxymethyl-cytosine (5hmC), N6-methyladenosine (6mA), bromodeoxyuridine (BrdU), 5-fluorodeoxyuridine (FldU), 5-iododeoxyuridine (IdU), 5-ethynyldeoxyuridine (EdU) and/or 8-oxoguanine (8oxoG).
- a dNTP comprising a modified base may be used in place of the equivalent dNTP comprising the unmodified base in the end repair reaction.
- dCTP comprising 5mC
- multiple types of dNTP comprising a modified base are used in the end repair.
- dATP comprising 6mA and dCTP comprising 5mC can be used in the end repair reaction in place of dATP comprising unmodified adenine and dCTP comprising unmodified cytosine.
- the use of multiple types of dNTP comprising a modified base is advantageous because it provides increased resolution in defining the regions of the end-repaired DNA molecule which have been synthesized during the end repair reaction. This is because, in this example, the end of a synthesized region can be defined as the first unmodified adenine or unmodified cytosine after a stretch containing 6mAs and/or 5mCs, rather than relying on the detection of solely an unmodified adenine or solely an unmodified cytosine.
- the sequencing method used will depend on the type of modified base used in the end-repair reaction such that the specific modification can be detected. Exemplary conversion-based methods are described above alongside the base modification which they can detect.
- nanopore-based sequencing can be used to detect 5-caC, 4mC, 5mC, 5hmC, 6mA, BrdU, FldU, IdU, and EdU
- single-molecule real time (SMRT) sequencing from Pacific Biosciences can be used to detect 5-caC, 4mC, 5mC, 5hmC, 6mA, and 8oxoG.
- the disclosed methods use at least one type of dNTP which comprises a modified base (e.g. a methylated deoxycytidine triphosphate, such as deoxycytidine triphosphate comprising 5-methylcytosine (5mC) and/or 5-hydroxymethyl-cytosine (5hmC)) in the end repair reaction.
- a repaired region is defined as (i) the sequence between two non-modified bases spanning a modified base, wherein the non-modified bases are of the same identity as the modified base present in the at least one type of dNTP; and/or (ii) the sequence between a non-modified base and the end of a sequence read, wherein there are no additional non-modified bases between the non-modified base and the end of the sequence read, and wherein the non-modified bases are of the same identity as the modified base present in the at least one type of dNTP.
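The repaired-region definition above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function name `repaired_regions`, the input format (a read string plus a set of 0-based positions called as modified), and the choice to treat both read ends as boundaries and to report only runs that contain at least one modified call are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def repaired_regions(read: str, modified: Set[int], base: str = "C") -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of candidate repaired regions.

    Assumptions (hypothetical, for illustration only):
    - `modified` holds 0-based positions of `base` that were called as modified.
    - A region lies strictly between the flanking unmodified `base` positions,
      or between an unmodified `base` and a read end.
    - Only runs containing at least one modified call are reported.
    """
    regions = []
    left = -1  # index of the last unmodified `base` seen; -1 means the read start
    run_has_modified = False
    for i, b in enumerate(read):
        if b != base:
            continue  # only bases of the tracked identity define boundaries
        if i in modified:
            run_has_modified = True
        else:
            if run_has_modified:
                regions.append((left + 1, i))  # case (i): flanked on both sides
            left = i
            run_has_modified = False
    if run_has_modified:
        regions.append((left + 1, len(read)))  # case (ii): runs to the read end
    return regions


read = "ACGTCCGTACGT"
print(repaired_regions(read, {4, 5}))  # region between the unmodified C at 1 and at 9
print(repaired_regions(read, {9}))     # region from after the unmodified C at 5 to the read end
```

Tracking a second base identity (for example adenine when a 6mA dATP is also used) would mean running the same scan per base and intersecting the intervals, which is one way to read the increased resolution described for multiple modified dNTP types.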
- the methods comprise ligating adapters to DNA.
- DNA molecules can be subjected to blunt-end ligation with blunt-ended adapters.
- DNA molecules can be subjected to sticky-end ligation with sticky-ended adapters.
- once the DNA has been end-repaired it can be subjected to blunt-end ligation with blunt-ended adapters, in cases where A-tailing is not performed, or sticky end ligation with T-tailed adapters, when A tailing is performed.
- DNA molecules can be ligated to adapters at either one end or both ends.
- the ligation step occurs before contacting the DNA (e.g., CpG-dense DNA) with a methyl-sensitive deaminase, before removing the CpG-binding protein from the DNA (e.g., CpG protein-bound DNA), and after contacting the DNA (e.g., DNA in the sample) with a CpG-binding protein.
- the ligation step occurs before contacting the DNA (e.g., CpG-dense DNA) with a methyl-sensitive deaminase, and after removing the CpG-binding protein from the DNA (e.g., CpG protein-bound DNA).
- the ligation step occurs after contacting the DNA (e.g., CpG-dense DNA) with a methyl-sensitive deaminase.
- adapters are ligated to end-repaired DNA molecules or the adapters are ligated to the DNA molecule or a plurality of DNA molecules.
- the ligation reaction also seals nicks present in the end-repaired DNA.
- DNA ligase and adapters are added to ligate DNA molecules in the sample with an adapter on one or both ends, i.e. to form adapted DNA.
- adapter refers to short nucleic acids (e.g., less than about 500, less than about 100 or less than about 50 nucleotides in length, or 20-30, 20-40, 30-50, 30-60, 40-60, 40-70, 50-60, 50-70, 20-500, or 30-100 bases from end to end) that are typically at least partially double-stranded and can be ligated to the end of a given sample DNA molecule.
- two adapters can be ligated to a single sample DNA molecule, with one adapter ligated to each end of the sample nucleic acid molecule.
- the ligase used in ligation reactions can act on both single strand DNA nicks and double stranded DNA ends.
- the ligase is T4 DNA ligase or T3 DNA ligase.
- Adapters can include nucleic acid primer binding sites to permit amplification of a sample DNA molecule flanked by adapters at both ends, and/or a sequencing primer binding site, including primer binding sites for sequencing applications, such as various next generation sequencing (NGS) applications.
- Adapters can include a sequence for hybridizing to a solid support, e.g., a flow cell sequence.
- the adapters used in the methods of the present disclosure comprise one or more known modified nucleosides, such as methylated nucleosides.
- the modified nucleosides comprise modification resistant cytosines.
- each cytosine in each adapter is a modification resistant cytosine.
- the modification resistant cytosine is a deamination resistant cytosine.
- the one or more methylated nucleotides in the MSRE digestion-resistant adapters comprise 5-methylcytosine and/or 5-hydroxymethylcytosine.
- the adapters are resistant to digestion by a methylation dependent restriction enzyme (MDRE).
- MDRE digestion-resistant adapters comprise one or more unmethylated nucleotides, comprise one or more nucleotide analogs resistant to methylation dependent restriction enzymes, or do not comprise a nucleotide sequence recognized by the MDRE.
- either or both of the adapters may comprise one or more known modified nucleosides.
- the primer binding site(s), sequencing primer binding site(s), sample index(es) and/or molecular barcode(s), if present do not comprise the known modified nucleosides that change base pairing specificity as a result of the conversion procedure.
- adapters may be added to the DNA or a subsample thereof. Adapters can be ligated to DNA at any point in the methods herein.
- adapters are ligated to the DNA of a sample or subsample thereof before the DNA is contacted with the capture probes.
- the DNA to which the adapters are ligated is in the same sample or subsample as the DNA used as a template to generate capture probes.
- the DNA to which the adapters are ligated is in a different sample or subsample, e.g., a second sample or a second subsample of a first sample, than the DNA used as a template to generate capture probes.
- the adapters are ligated to DNA captured by the capture probes.
- the primers used to generate capture probes are not complementary to adapters, and the resulting capture probes therefore do not comprise adapters.
- Adapter-ligated DNA can therefore be selectively amplified in the presence of capture probes that do not comprise adapters.
- adapter-ligated DNA can be separated from DNA that does not comprise adapters.
- the disclosed methods comprise analyzing DNA in a sample. In such methods, adapters may be added to the DNA.
- first adapters are added to the 3’ ends of the nucleic acids by ligation, which may include ligation to single-stranded DNA.
- first adapters are added to the nucleic acids by ligation, which may include ligation to single-stranded DNA (e.g., to the 3’ ends thereof).
- the capture probes can be isolated after partitioning and ligation.
- the hypomethylated partition can be ligated with adapters and a portion of the ligated hypomethylated partition can then be used to generate the capture probes for rearrangements.
- the adapter can be used as a priming site for second-strand synthesis, e.g., using a universal primer and a DNA polymerase.
- a second adapter can then be ligated to at least the 3’ end of the second strand of the now double-stranded molecule.
- the first adapter comprises an affinity tag, such as biotin, and nucleic acid ligated to the first adapter is bound to a solid support (e.g., bead), which may comprise a binding partner for the affinity tag such as streptavidin.
- the single-stranded DNA library preparation is performed in a one-step combined phosphorylation/ligation reaction, e.g., as described in Troll et al., BMC Genomics, 20:1023 (2019), available at https://doi.org/10.1186/s12864-019-6355-0.
- This method, called Single Reaction Single-stranded LibrarY (“SRSLY”), can be performed without end-polishing.
- SRSLY may be useful for converting short and fragmented DNA molecules, e.g., cfDNA fragments, into sequencing libraries while retaining native lengths and ends.
- the SRSLY method can create sequencing libraries (e.g., Illumina sequencing libraries) from fragmented or degraded template (input) DNA.
- template DNA is first heat denatured and then immediately cold shocked to render the template DNA molecules single-stranded.
- the DNA can be maintained as single-stranded throughout the ligation reaction by the inclusion of a thermostable single-stranded binding protein (SSB).
- the template DNA, which at this point can be single-stranded and coated with SSB, is placed in a phosphorylation/ligation dual reaction with directional dsDNA NGS adapters that contain single-stranded overhangs.
- Both the forward and reverse sequencing adapters can share similar structures but differ in which terminus is unblocked in order to facilitate proper ligations.
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 11, which is the amino acid sequence of SrDa01 (Accession: D6Z8J1; database: UniProt): MSRPSQLRSTIDSATSSAKGQRPAEDAGVAGVRAATRDPFAEASSLLVHLVINRVPPIKH VLDKVTLDEGAAMALAQQWSDEFPAALREQRDALVSTRESISQNWEGEGASKTYQDR QRELEDLADSLAAACEGAGFLTSAVNELMVGVREQIIEWIAQLVSMLMRRLISVVATFW IPIVGEINEAAFIFEGVARVVETIERITSLITRVEGILGEVSSVAQVLGGSTQQVQQLGSAL GEVPQGLGASPLRALGRVGSVRQSGLTGSHGAVIRSVVPRRSKSPGGSAHAQSAGTSTR
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 19, which is the amino acid sequence of PdDa01 (Accession: JH605467.1; database: GenBank): MIDPNSSETEYKYDKLYRLIEQIDGESGKTKYRYDKVGNVRFVTDPRDKVTEYQYDRV YRLQKEIDANDETREYGYDLAGNLTSIQDRRDNTTIFEYDDVNRQTSRINPYGARFKIDY DKVGNVIKETDELGRSTKYDYDELNRLEVVTNAEDGTVVYGYDKVGNVISVQNERGKI TRYEYDEINRQIKITNALGFETIIDYTNDTDSNQLVVLVTEQVDADSSRSSEWRTDSRERL QTFTNADDTTTTYSYDGNNNLLTVVD
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 20, which is the amino acid sequence of MGYPDa25 (Accession: MGYP001677015708; database: MGnify): MVQSVHNLAGGLKTLDARAIEKSTVAILEVELRNGSKTLYAAGSSGWLNPRQRALLEK LGVPKENILSGKKYTLSKDVVKNINHVEQIILRNIPDNTKVTRKGISWGSKQRNAHCSRC KPSVDGAGGVYD (SEQ ID NO: 20).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 22, which is the amino acid sequence of AshDa01 (Accession: A0A7H8HBT9; database: UniProt): MRYHHVRVTDPGVTGGTVPGVALYAKKITYTGQGKIEGRYSITFTRDRDRGEPRRPDV VIDARNGFKQVTADLLRRVDILLDDKPIRAYELNYRTGAFAKTLLQSVTQFGEDGSPFTT HTFDYYDDIRSADGQYQAFGPAAGWSVPGDSLGESVPDGHASALSATTSRSTGGHLYV GFNPAVVSKSGSAGAKVGFNAGTSEGLLSLTDVNGDSLPDKVFRTGAGVFYRPSLSGPG GAPKFGDTPIRLPGLPGISEESTRSTTSGAEAYFGVAAQLDHVSTTTRTDRYLTDVNGDG I
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 23, which is the amino acid sequence of MGYPDa21 (Accession: MGYP001278432191; database: MGnify): DLIEQLAERLGNHANLTDELIAAELQAMGVAATDDIVARVWSGLNASALRQSGALRAV ALEVHNLAGSQIAINQSAIAVVEVTLADGSRQVFASGSGGRLTTEQVRRLIALGVPPENIL
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 24, which is the amino acid sequence of PpDa03 (Accession: A0A3S0YAN6; database: UniProt): MAGTETHYGYDEDGNCVTVRNGEGEIRHFLHDGRGLLIRETAPDDILHYRYDAAGRLT EVTSATSHIQLAYDKRDRVLEEHNSGSVIRRHYQDASHTVTRSLLWEGEEDSAALTSSF CYSATGELRQVQLPDGAELMLAHDAAGRESIRHSNGGFTQQREYDAMGWLTREMSGQ QQDGRLHASQTREYLYDGAGNL
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 25, which is the amino acid sequence of SbDa01 (Accession: A0A7G6KCF9; database: UniProt): MAQKVHDLAGGLHAPDLRAIRNSAVAIVEATVNGEKILFAAGSAGRLNPRQVALLKEY GVLEENIFRNSAVTKGFEQLENHAERIILRNLPEGATVERWGISWAGKQKNIPCPHCEPF VRDAGGFFDKIW (SEQ ID NO: 25).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 26, which is the amino acid sequence of BlDa01 (Accession: A0A6P2WMK2; database: UniProt): MWNGILLVQETYHDRWGEEALTYLYESNSYVPLARIDQGKAAANDANARDAVYYFHN DVSGLPEELTNAEGELIWQARYKVWGNAVQEEWIARAPQQPVPEWGELQVATATSVH MPRPQNLRFQGQYLDRETGLHYNTFRFYDPDIGRFINPDPIGLLGGTNLYRYAANPLVWI DPWGWACGELSGKAQEVHNLAGGGDARSIRNSTVSIVEAKVNGKPQLFAAGSGGRLSP
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 27, which is the amino acid sequence of PpDa04 (Accession: A0A5M9IK01; database: UniProt): MSDALWAARMGDALTHTSMMADILGGVLEVAANVAITALATAAVAAAIGVTVATAGI GGCILGAVVGIVVGMAMSKTGADKGLSKMCESFANALFPPVVEATIAVGSADTFVNSIP AARAAGSIPSHVAPAGTELELPPPEPDTAPQAEPGFLDMAEGFFSQLWRPTVATPAPGVV PALTDMVLCARHPPM
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 28, which is the amino acid sequence of CsDa01 (Accession: A0A3E1NUV0; database: UniProt): MLISNKHFVPVIGLDIHIVILFGFPIPLPHPYLGFVLDPMDYIPFLGASTKVNHVPRGVSDT SGIIIILFHFPMGGPWLLAPLIGHDSVNFYGAKNTLVEGRMLSPTGHMLMTCNDIGLPLSL KPGKKLIPMPGMYLPTSYSIPLSFGKPVIVGGPYVPDWAGVLMNLLMSFGFGALLKGIG KLGKKMMTKFNHALKGKLGSNKLSKFLCKHGFEPVDLVQGIVISEGLDFELSGPIPIVW ERTWNSDSAHKGLLGHGNHLFY
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 29, which is the amino acid sequence of MGYPDa22 (Accession: MGYP001462196871; database: MGnify):
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 30, which is the amino acid sequence of FlDa01 (Accession: A0A0Q4RZG0; database: UniProt): MVYDKPEPIKNLITWVYEGGSFVPSAKIIGEHKFSIINDYIGRPIQVYNEVGDVVWETDY DIYGGLRNLKGDKSFIPFRQLGQYEDVETGLYYNRHRYYNPESGGYISQDPIGLLGGSAS YKYVHDCNNCVDIFGLNPVIFSEELSKIAQEAHNVLLEPGKSPRGFNNSTVSVAKVDVN GVSTLYASGNGASLSPAQRTKLVELGVPEENIFSGKRFKEIIDGDTGTLTKLSNHAERVIE RNIPKDASIKEWGISWASKQKNEMC
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 31, which is the amino acid sequence of MGYPDa24 (Accession: MGYP000620945751; database: MGnify): QETGFVEVSKGDIGVIGNAALARKIEPDPYLCPLRFQGQWEDEETGLYYNRFRYYEPMA GCYLSRDPMGIYGSYRPGAYVPNPALWIDPFGLQRQPASELSGCEELGEMARAVHDIID DPRAKANSTVEIFRGTDGQLYASGSGSRLRPAQREALMRMGIPASNIFSGVAFQGADKL ENHAERTILRNMPEGVKVQSFGVSWGSRQRNVCCTACAASMIPNLAN (SEQ ID NO: 31).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 32, which is the amino acid sequence of AaDa02 (Accession: A0A2S8AG27; database: UniProt): MAREVHNLARDERALRSQTVSIAKVEKDGVSSLYASGSGASLNPRQREKLEELGVPKE NIFSGKKFKVFIDGERGIQTKLANHAERVIERNIPLDSEIKEFGISWSSKQKNEMCPNCKE HFGHSH (SEQ ID NO: 32).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 33, which is the amino acid sequence of MmgDa01 (Accession: JANFCG010307071; database: GenBank): SSTPVGSLRSIAMAAHSKAATPTALEKTAVALAEVRLKNGSVEYWAAGSGESLSGIQRN YLESLGFRVISGKAGFHAEEQIGYLLPEGSDVLRWGISWTTPQKGIPCIRCNPLLDTIGGVI EK (SEQ ID NO: 33).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 34, which is the amino acid sequence of PbDa01 (Accession: MCL1918637.1; database: GenBank): YQNDPNGMPLRLLDENGIIKWEGHYSAFGLVDRVSVEAVGQPLRLQGQYFDDESGLCY NRHRYYDAVVGCFISSDPIGLDGGLNPYRFAPNVLGWIDPWGLSCAVLSARLARIARIV HNLSGNPRAIRQSTVAIARVRINGKYQLFAAGSGGRLNPAQRAELVRLGVPEGNIFHGR GVTNGFSQIENHAERIVLRNIPEGSSVVHWGISWGGLQRNLPCVSCRPYVLGT (SEQ ID NO: 34).
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 35, which is the amino acid sequence of BcDa02 (Accession: KVH32961.1; database: GenBank): MSSDHPGGYVEHNRLKMFEDKRFEYDVYGRLVRKLSGHGPAKELVLEYDDWNQLKA VVRKDRLGIGTTHFEYDAFGRRIRKFNGSYASTDFRWGGMRLVQETYHDRQGEEALTY LYEANSYVPLARIDQGKPAANDADARDAVYYFHNDVSGLPEELTDAGGELVWQARYK VWGNVVQEEWIAPVRHQPALAWGEVRAVIESPDHVPRPQNLRFQGQYLDRETGLHYN TFRFYDPDIGRFISPDPIGLNGGRNLYRYTP
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 36, which is the amino acid sequence of LsfDa01 (Accession: UHQ21442.1; database: GenBank): MTTAAKHFDPQLGIDIHMYVFPPVPLPVPLPTPHIGIVLDPFDYLPFLGGTVHVNGIKRAT AGTGGLNLHIPMGAYHPAFLPKLPTGPQTDDELFMGSMTVSADGDPFSKLAMPVLDCN VVGMVPPFRLRKPKKPKLSLTLPTAVNLAIPTNVNVGGPPTISLMAMAMKGLFKLLGPV FKRGGKAFKKLRQKVFGNMKPGFLKCKVLRAEPVDIRTGSVSVTHEDFVVPGRLPLAW TREYGSNNDHVGACGYGWETPADIRLELDADGSVLFHSGEGVAV
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 38, which is the amino acid sequence of DaDa01 (Accession: tr
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 39, which is the amino acid sequence of EcDa01 (Accession:
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 59, which is the amino acid sequence of VsDa01 (Accession: tr
- the amino acid sequence of the methyl-sensitive deaminase comprises or has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 72, which is the amino acid sequence of MGYPDa624 (Accession: MGYP001011623624; database: MGnify): DPLGLEHGNMPRKGAGVGNGGSKYTLNKDFQGKFTLKRPELPVYDGKTTSGVLVTDD FKQIRFNSGNGDPRYTNYANNGHVEQKASLYMQDNNISKATLYHNNTNGTCGWCNNM TETFLPEGATLKVIPPSNAVANNSKAVAIPKVYTGNSNVPKVRKK (SEQ ID NO: 72).
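The sequence identity thresholds recited above (80% through 99%) can be made concrete with a small Python sketch. The text here does not say how identity is calculated, so the convention used below (identical residues divided by the number of alignment columns, counting gap columns, on two sequences that have already been aligned) is an assumption, as are the names `percent_identity` and `thresholds_met`. The toy peptides are illustrative only and are not the full SEQ ID NO sequences.

```python
THRESHOLDS = (80, 85, 90, 95, 96, 97, 98, 99)  # percentages recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns of two pre-aligned sequences.

    Assumed convention: '-' marks a gap, columns where both sequences have a
    gap are ignored, and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """List every recited threshold that the identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy example: one substitution in ten aligned residues
identity = percent_identity("MAQKVHDLAG", "MAQKVHNLAG")
print(identity, thresholds_met(identity))
```

Other conventions (for example dividing by the length of the shorter sequence, or excluding gap columns) give different values for the same pair, so the denominator should be stated whenever a threshold is applied.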
- the agent is immobilized on a solid support.
- the solid support comprises a bead.
- the partitioning comprises immunoprecipitation, e.g., using an agent, such as an antibody or a CpG-binding protein, immobilized on a solid support.
- the separating comprises precipitating the CpG protein-bound DNA.
- the separating comprises precipitating the CpG protein-bound DNA to separate it from the unbound DNA.
- the partitioning comprises precipitating the CpG protein-bound DNA.
- the partitioning comprises precipitating the CpG protein-bound DNA to separate it from the unbound DNA.
- the precipitating the CpG protein-bound DNA can be performed using any pair of binding partners.
- one of the binding partners may be linked to the CpG protein, and the other binding partner may be linked to a solid support.
- the binding partner comprises biotin and streptavidin.
- the biotin may be linked to the CpG-binding protein, and the streptavidin may be linked to a solid support.
- the CpG-binding protein is linked to a solid support, optionally using any pair of binding partners.
- the separating comprises immunoprecipitating the CpG protein-bound DNA.
- the separating comprises immunoprecipitating the CpG protein-bound DNA separately from the unbound DNA. In some embodiments, the partitioning comprises immunoprecipitating the CpG protein-bound DNA. In some embodiments, the partitioning comprises immunoprecipitating the CpG protein-bound DNA separately from the unbound DNA.
- the modification is methylation, and in some such embodiments, the partitioning comprises partitioning on the basis of methylation level.
- the agent is a methyl binding reagent. In some embodiments, the methyl binding reagent specifically recognizes 5-methylcytosine. In some such embodiments, the agent is a hydroxymethyl binding reagent.
- the methyl binding reagent specifically recognizes 5-hydroxymethylcytosine, biotinylated 5-hydroxymethylcytosine, glucosylated 5-hydroxymethylcytosine, or sulfonylated 5-hydroxymethylcytosine.
- the partitioning comprises partitioning on the basis of binding to a protein comprising contacting the sample comprising the DNA with a binding reagent specific for the protein.
- the binding reagent specifically binds a methylated protein or an acetylated protein, such as a methylated or acetylated histone.
- the binding reagent specifically binds an unmethylated or unacetylated protein epitope.
- the modification is hydroxymethylation.
- the partitioning comprises partitioning on the basis of hydroxymethylation level.
- the agent is a hydroxymethyl binding reagent, such as an antibody.
- the hydroxymethyl binding reagent (e.g., an antibody)
- the DNA may be converted to double-stranded form by complementary strand synthesis before a subsequent step. Such synthesis may use an adapter as a primer binding site, or can use random priming.
- Partitioning nucleic acid molecules in a sample can increase a rare signal, e.g., by enriching rare nucleic acid molecules that are more prevalent in one partition of the sample. For example, a genetic variation present in hypermethylated DNA but less (or not) present in hypomethylated DNA can be more easily detected by partitioning a sample into hypermethylated and hypomethylated nucleic acid molecules.
- Partitioning may include physically partitioning nucleic acid molecules into partitions or subsamples based on the presence or absence of one or more methylated nucleobases.
- a sample may be partitioned into partitions or subsamples based on a characteristic that is indicative of differential gene expression or a disease state.
- a sample may be partitioned based on a characteristic, or combination thereof, that provides a difference in signal between a
- each partition is differentially tagged.
- Tagged partitions can then be pooled together for collective sample prep and/or sequencing.
- the partitioning-tagging-pooling steps can occur more than once, with each round of partitioning occurring based on a different characteristic (examples provided herein), and tagged using differential tags that are distinguished from other partitions and partitioning means.
- the differentially tagged partitions are separately sequenced.
- sequence reads from differentially tagged and pooled DNA are obtained and analyzed in silico. After sequencing, analysis of reads can be performed on a partition-by-partition level, as well as a whole DNA population level. Tags are used to sort reads from different partitions.
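The tag-based sorting of pooled reads described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the method of the disclosure: the tag sequences, the fixed tag length, and the placement of the tag at the start of each read are all hypothetical.

```python
from collections import defaultdict

# Hypothetical partition tags; the source does not give actual tag sequences.
PARTITION_TAGS = {
    "ACGT": "hypermethylated",
    "TGCA": "intermediate",
    "GATC": "hypomethylated",
}


def sort_reads_by_partition(reads, tag_length=4):
    """Group reads by the partition tag assumed to prefix each read.

    Reads whose prefix matches no known tag are placed in "unassigned".
    The tag is trimmed off before the read is stored.
    """
    partitions = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        partition = PARTITION_TAGS.get(tag, "unassigned")
        partitions[partition].append(insert)
    return dict(partitions)


reads = ["ACGTTTGACCA", "GATCCCGGA", "ACGTGGCGCG", "NNNNACGT"]
by_partition = sort_reads_by_partition(reads)
```

Once reads are grouped this way, the same downstream analysis can be run per partition or on the concatenation of all partitions, which corresponds to the partition-by-partition and whole-population levels mentioned above.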
- Analysis to detect genetic variants can be performed on a partition-by-partition level, as well as whole nucleic acid population level.
- analysis can include in silico analysis to determine genetic variants, such as copy number variations (CNVs), single nucleotide variations (SNVs), insertions/deletions (indels), and/or fusions in nucleic acids in each partition.
- analysis of the nucleic acids in each partition can include analysis to determine epigenetic variation (one or more of methylation, chromatin structure, etc.).
- Analysis can include in silico analysis using sequence information, genomic coordinates, length, coverage, and/or copy number. For example, coverage of sequence reads can be used to determine nucleosome positioning in chromatin. Tags are used to sort reads from different partitions.
- the agents used to partition populations of nucleic acids within a sample can be affinity agents, such as antibodies with the desired specificity, natural binding partners or variants thereof (Bock et al., Nat Biotech 28: 1106-1114 (2010); Song et al., Nat Biotech 29: 68-72 (2011)), or artificial peptides selected, e.g., by phage display to have specificity to a given target.
- the agent used in the partitioning is an agent that recognizes a modified nucleobase.
- the modified nucleobase recognized by the agent is a modified cytosine, such as a methylcytosine (e.g., 5-methylcytosine).
- the modified nucleobase recognized by the agent is a product of a procedure that affects the first nucleobase in the DNA differently from the second nucleobase in the DNA of the sample.
- the modified nucleobase may be a “converted nucleobase,” meaning that its base pairing specificity was changed by a procedure.
- partitioning agents include antibodies, such as antibodies that recognize a modified nucleobase, which may be a modified cytosine, such as a methylcytosine (e.g., 5-methylcytosine).
- the partitioning agent is an antibody that recognizes a modified cytosine other than 5-methylcytosine, such as 5-carboxylcytosine (5-caC).
- Alternative partitioning agents include methyl binding domains (MBDs) and methyl binding proteins (MBPs) as described herein, including proteins such as MeCP2, MBD4, MBD2, MBD1, and antibodies preferentially binding to 5-methylcytosine.
- the methylated DNA may be recovered in single-stranded form.
- a second strand can be synthesized.
- Hypermethylated (and optionally intermediately methylated) subsamples may then be contacted with a methylation sensitive nuclease that does not cleave hemi-methylated DNA, such as HpaII, BstUI, or Hin6I.
- hypomethylated (and optionally intermediately methylated) subsamples may then be contacted with a methylation dependent nuclease that cleaves hemi-methylated DNA.
- partitioning agents are histone binding proteins which can separate nucleic acids bound to histones from free or unbound nucleic acids. Examples of histone binding proteins that can be used in the methods disclosed herein include RBBP4, RbAp48 and SANT domain peptides.
- partitioning can comprise both binary partitioning and partitioning based on degree/level of modifications.
- methylated fragments can be partitioned by methylated DNA immunoprecipitation (MeDIP), or all methylated fragments can be partitioned from unmethylated fragments using methyl binding domain proteins (e.g., MethylMiner Methylated DNA Enrichment Kit (ThermoFisher Scientific)). Subsequently, additional partitioning may involve eluting fragments having different levels of methylation by adjusting the salt concentration in a solution with the methyl binding domain and bound fragments. As salt concentration increases, fragments having greater methylation levels are eluted.
- Analyzing DNA may comprise detecting or quantifying DNA of interest.
- Analyzing DNA can comprise detecting genetic variants and/or epigenetic features (e.g., DNA methylation and/or DNA fragmentation).
- the DNA of interest is one or more differentially methylated regions of the DNA.
- the detecting or quantifying the DNA of interest comprises quantifying and/or detecting a level of methylation at one or more differentially methylated regions of the DNA.
- quantifying and/or detecting the level of methylation at one or more differentially methylated regions of the DNA comprises sequencing at least a portion of the amplified DNA or quantitative PCR (qPCR).
- methylation levels can be determined using partitioning, modification-sensitive conversion such as direct detection during sequencing, methylation-sensitive restriction enzyme digestion, methylation-dependent restriction enzyme digestion, or any other suitable approach.
- different forms of DNA (e.g., hypermethylated and hypomethylated DNA)
- a methylated DNA binding protein (e.g., an MBD such as MBD1, MBD2, MBD4, or MeCP2)
- an antibody specific for 5-methylcytosine (as in MeDIP)
- This approach can be used to determine, for example, whether certain sequences are hypermethylated or hypomethylated.
- a DNA fragmentation pattern can be determined based on endpoints and/or centerpoints of DNA molecules, such as cfDNA molecules.
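As a rough illustration of an endpoint- and centerpoint-based fragmentation pattern, the following Python sketch tallies both per genomic position. The coordinate convention (0-based, end-exclusive, single chromosome) and the function name are assumptions made for this example only; the source does not specify them.

```python
from collections import Counter


def fragmentation_profile(fragments):
    """Count fragment endpoints and centerpoints per position.

    `fragments` is an iterable of (start, end) pairs, assumed here to be
    0-based, end-exclusive coordinates on a single chromosome.
    """
    endpoints = Counter()
    centerpoints = Counter()
    for start, end in fragments:
        last_base = end - 1  # last position covered by the fragment
        endpoints[start] += 1
        endpoints[last_base] += 1
        centerpoints[(start + last_base) // 2] += 1
    return endpoints, centerpoints


fragments = [(100, 267), (100, 250), (120, 287)]
endpoints, centerpoints = fragmentation_profile(fragments)
```

Positions where many fragments start or end, or where many centerpoints accumulate, can then be compared between samples or partitions.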
- the final partitions are enriched in nucleic acids having different extents of modifications (overrepresentative or underrepresentative of modifications). Overrepresentation and underrepresentation can be defined by the number of modifications borne by a nucleic acid relative to the median number of modifications per strand in a population. For example, if the median number of 5-methylcytosine residues in nucleic acid in a sample is 2, a nucleic acid including more than two 5-methylcytosine residues is overrepresented in this modification and a nucleic acid with 1 or zero 5-methylcytosine residues is underrepresented.
- the effect of affinity separation is to enrich for nucleic acids overrepresented in a modification in a bound phase and for nucleic acids underrepresented in a modification in an unbound phase (i.e., in solution).
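The median-based definition above can be written out as a short Python sketch. The label for molecules that sit exactly at the median ("at median") is an assumption for this example, since the text only defines the cases above and below the median.

```python
from statistics import median


def classify_by_modification(counts):
    """Label each molecule relative to the median modification count.

    `counts` holds the number of modified residues (e.g., 5-methylcytosine)
    per molecule. Returns the median and one label per molecule.
    """
    med = median(counts)
    labels = []
    for n in counts:
        if n > med:
            labels.append("overrepresented")
        elif n < med:
            labels.append("underrepresented")
        else:
            labels.append("at median")  # case not defined in the text
    return med, labels


counts = [0, 1, 2, 2, 3, 5, 2]
med, labels = classify_by_modification(counts)
```

With these hypothetical counts the median is 2, so molecules with 3 or 5 modified residues are overrepresented and those with 0 or 1 are underrepresented, matching the worked example in the text.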
- the nucleic acids in the bound phase can be eluted before subsequent processing.
- using MeDIP or the MethylMiner® Methylated DNA Enrichment Kit, various levels of methylation can be partitioned using sequential elutions. For example, a hypomethylated partition (no methylation) can be separated from a methylated partition by contacting the nucleic acid population with the MBD from the kit, which is attached to magnetic beads.
- a first set of methylated nucleic acids can be eluted at a salt concentration of 160 mM or higher, e.g., at least 150 mM, at least 200 mM, 300 mM, 400 mM, 500 mM, 600 mM, 700 mM, 800 mM, 900 mM, 1000 mM, or 2000 mM.
- nucleic acids bound to an agent used for affinity separation based partitioning are subjected to a wash step.
- the wash step washes off nucleic acids weakly bound to the affinity agent.
- nucleic acids can be enriched in nucleic acids having the modification to an extent close to the mean or median (i.e., intermediate between nucleic acids remaining bound to the solid phase and nucleic acids not binding to the solid phase on initial contacting of the sample with the agent).
- the affinity separation results in at least two, and sometimes three or more partitions of nucleic acids with different extents of a modification. While the partitions are still separate, the nucleic acids of at least one partition, and usually two or three (or more) partitions are linked to nucleic acid tags, usually provided as components of adapters, with the nucleic acids in different partitions receiving different tags that distinguish members of one partition from another.
- the tags linked to nucleic acid molecules of the same partition can be the same or different from one another. But if different from one another, the tags may have part of their code in common so as to identify the molecules to which they are attached as being of a particular partition.
- the nucleic acid molecules can be partitioned into different partitions based on the nucleic acid molecules that are bound to a specific protein or a fragment thereof and those that are not bound to that specific protein or fragment thereof.
- Nucleic acid molecules can be partitioned based on DNA-protein binding.
- Protein-DNA complexes can be partitioned based on a specific property of a protein. Examples of such properties include various epitopes, modifications (e.g., histone methylation or acetylation) or enzymatic activity. Examples of proteins which may bind to DNA and serve as a basis for fractionation may include, but are not limited to, protein A and protein G. Any suitable method can be used to partition the nucleic acid molecules based on protein bound regions.
- an MBD binds to 5-methylcytosine (5mC), and an MBP comprises an MBD and is referred to interchangeably herein as a methyl binding protein or a methyl binding domain protein.
- MBD is coupled to paramagnetic beads, such as Dynabeads® M-280 Streptavidin, via a biotin linker. Partitioning into fractions with different extents of methylation can be performed by eluting fractions by increasing the NaCl concentration.
- bound DNA is eluted by contacting the antibody or MBD with a protease, such as proteinase K.
- agents that recognize a modified nucleobase contemplated herein include, but are not limited to:
- (a) MeCP2 is a protein that preferentially binds to 5-methyl-cytosine over unmodified cytosine.
- (b) RPL26, PRP8 and the DNA mismatch repair protein MSH6 preferentially bind to 5-hydroxymethyl-cytosine over unmodified cytosine.
- (c) FOXK1, FOXK2, FOXP1, FOXP4 and FOXI3 preferentially bind to 5-formyl-cytosine over unmodified cytosine (Iurlaro et al., Genome Biol. 14: R119 (2013)).
- elution is a function of the number of modifications, such as the number of methylated sites per molecule, with molecules having more methylation eluting under increased salt concentrations.
- elution buffers of increasing NaCl concentration.
- Salt concentration can range from about 100 mM to about 2500 mM NaCl.
- the process results in three (3) partitions.
- Molecules are contacted with a solution at a first salt concentration and comprising a molecule comprising an agent that recognizes a modified nucleobase, which molecule can be attached to a capture moiety, such as streptavidin.
- a population of molecules will bind to the agent and a population will remain unbound. The unbound population can be separated as a “hypomethylated” population.
- a first partition enriched in hypomethylated form of DNA is that which remains unbound at a low salt concentration, e.g., 100 mM or 160 mM.
- a second partition enriched in intermediate methylated DNA is eluted using an intermediate salt concentration, e.g., between 100 mM and 2000 mM concentration. This is also separated from the sample.
- a third partition enriched in hypermethylated form of DNA is eluted using a high salt concentration, e.g., at least about 2000 mM.
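The three-partition scheme above can be summarized as a simple lookup on the salt concentration at which a molecule is recovered. The sketch below is illustrative only: the cut-offs reuse the example values from the text (160 mM and 2000 mM), and the handling of the exact boundary values is an assumption.

```python
LOW_SALT_MM = 160.0    # example low-salt value from the text
HIGH_SALT_MM = 2000.0  # example high-salt value from the text


def partition_for_recovery(salt_mm):
    """Name the partition for a molecule recovered at `salt_mm` NaCl (mM).

    A molecule recovered at or below the low-salt value is treated as
    unbound (hypomethylated); one that elutes only at or above the
    high-salt value is treated as hypermethylated; anything in between
    is treated as intermediately methylated.
    """
    if salt_mm <= LOW_SALT_MM:
        return "hypomethylated"
    if salt_mm < HIGH_SALT_MM:
        return "intermediate"
    return "hypermethylated"


recovery_points = [100, 160, 500, 2000]
assigned = [partition_for_recovery(s) for s in recovery_points]
```

In practice the number of elution steps and their concentrations would be chosen for the binding agent in use; this function only captures the ordering of partitions by salt concentration.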
- the eluted DNA is single-stranded DNA.
- one or more adapters are ligated to the eluted DNA.
- the eluted DNA comprises CpG-dense DNA.
- the CpG-binding protein is removed from the eluted DNA comprising CpG-dense DNA.
- CpG protein-bound DNA is eluted from the CpG-binding protein.
- a monoclonal antibody raised against 5-methylcytidine (5mC) is used to purify methylated DNA.
- DNA is denatured, e.g., at 95°C in order to yield single-stranded DNA fragments.
- Protein G coupled to standard or magnetic beads as well as washes following incubation with the anti-5mC antibody are used to immunoprecipitate DNA bound to the antibody.
- Partitions may comprise unprecipitated DNA and one or more partitions eluted from the beads.
- the partitions of DNA are desalted and concentrated in preparation for enzymatic steps of library preparation.
- Sequences that comprise aberrantly high copy numbers may tend to be hypermethylated.
- the DNA contacted with capture probes specific for members of an epigenetic target region set comprising a plurality of target regions that are both type-specific differentially methylated regions and copy number variants comprises at least a portion of a hypermethylated partition.
- the DNA from or comprising at least a portion of the hypermethylated partition may or may not be combined with DNA from or comprising at least a portion of one or more other partitions, such as an intermediate partition or a hypomethylated partition.
- different procedures are applied to different partitions to determine different characteristics of the initial sample.
- the DNA of at least one partition is subjected to an end repair and sequencing procedure described herein.
- at least one partition is not subjected to the end repair and sequencing procedure according to the methods of the disclosure described herein.
- methylation-preserving amplification comprises linear amplification with thermocycling.
- methylation-preserving amplification comprises amplification performed in the presence of a methyltransferase.
- RCA occurs prior to a step of sequencing the DNA.
- adapted DNA is amplified before sequencing. This may be an additional amplification step subsequent to an earlier amplification step, such as amplification as described elsewhere herein.
- amplification of adapted DNA comprises RCA, e.g., as described above.
- RCA comprises copying the circularized DNA template using a rolling circle polymerase to generate a plurality of circularized DNA templates.
- the capture step is performed prior to a step of amplifying methylation-separated DNA, prior to a step of sequencing the DNA, after separating the CpG protein-bound DNA from unbound DNA, and after partitioning the DNA in the sample into a plurality of subsamples. In some embodiments, the capture step is performed prior to a step of amplifying methylation-separated DNA, after contacting the CpG-dense DNA with the methyl-sensitive deaminase, after separating the CpG protein-bound DNA from unbound DNA, and after partitioning the DNA in the sample into a plurality of subsamples.
- the capture step is performed after contacting the CpG-dense DNA with the methyl-sensitive deaminase and after partitioning the DNA in the sample into a plurality of subsamples. In some embodiments, the capture step is performed after separating the CpG protein-bound DNA from unbound DNA and after partitioning the DNA in the sample into a plurality of subsamples.
- Capture may be performed using any suitable approach known in the art. Target capture can involve use of a bait set comprising oligonucleotide baits (a type of probe useful herein) labeled with a capture moiety, such as biotin or the other examples noted below.
- the probes can have sequences selected to tile across a panel of regions, such as genes.
- Such bait sets are combined with a sample under conditions that allow hybridization of the target molecules with the baits.
- captured molecules are isolated using the capture moiety.
- for example, a biotin capture moiety is captured by bead-based streptavidin.
- Capture moieties include, without limitation, biotin, avidin, streptavidin, a nucleic acid comprising a particular nucleotide sequence, digoxigenin, a histidine tag, an affinity tag, an immunoglobulin constant domain, a hapten recognized by an antibody, and magnetically attractable particles.
- the immunoglobulin constant domain may be bound using protein A, protein G, or a secondary antibody.
- the secondary antibody comprises an anti-mouse secondary antibody.
- the anti-mouse secondary antibody is a goat anti-mouse secondary antibody, rabbit anti-mouse secondary antibody, or a donkey anti-mouse secondary antibody.
- a CpG-binding protein comprises a capture moiety.
- the extraction moiety can be a member of a binding pair, such as biotin/streptavidin or hapten/antibody.
- a capture moiety that is attached to an analyte is captured by its binding pair which is attached to an isolatable moiety, such as a magnetically attractable particle or a large particle that can be sedimented through centrifugation.
- the capture moiety can be any type of molecule that allows affinity separation of nucleic acids bearing the capture moiety from nucleic acids lacking the capture moiety.
- Exemplary capture moieties include biotin, which allows affinity separation by binding to streptavidin
- a panel of regions targeted for enrichment can be selected such that they do not contain regions known to include the base modification used in the end repair reaction.
- a panel of regions targeted for enrichment may be selected such that they do not contain CpH dinucleotides which are known to be naturally methylated in the subject (e.g. humans).
- Such CpH dinucleotides can be identified through the use of publicly available resources (e.g.
- capturing comprises contacting the DNA to be captured with a set of target-specific probes.
- the set of target-specific probes may have any of the features described herein for sets of target-specific probes, including but not limited to in the embodiments set forth above and the sections relating to probes below. Capturing may be performed on one or more subsamples prepared during methods disclosed herein.
- DNA is captured from at least the first subsample or the second subsample, e.g., at least the first subsample and the second subsample.
- the subsamples are differentially tagged (e.g., as described herein) and then pooled before undergoing capture.
- the capturing step may be performed using conditions suitable for specific nucleic acid hybridization, which generally depend to some extent on features of the probes such as length, base composition, etc. Those skilled in the art will be familiar with appropriate conditions given general knowledge in the art regarding nucleic acid hybridization.
- complexes of target-specific probes and DNA are formed.
- suitable target region sets are available from the literature.
- Gale et al., PLoS One 13: e0194630 (2018), which is incorporated herein by reference, describes a panel of 35 cancer-related gene targets that can be used as part or all of a sequence-variable target region set.
- longer sequence lengths will generally provide increased affinity.
- Other nucleotide modifications, such as the substitution of the nucleobase hypoxanthine for guanine, reduce affinity by reducing the amount of hydrogen bonding between the oligonucleotide and its complementary sequence.
- the capture probes specific for the sequence-variable target region set have modifications that increase their affinity for their targets.
- the capture probes may be provided as a plurality of compositions, e.g., comprising a first composition comprising probes specific for the epigenetic target region set and a second composition comprising probes specific for the sequence-variable target region set. These probes may be mixed in appropriate proportions to provide a combined probe composition with any of the foregoing fold differences in concentration and/or capture yield. Alternatively, they may be used in separate capture procedures (e.g., with aliquots of a sample or sequentially with the same sample) to provide first and second compositions comprising captured epigenetic target regions and sequence-variable target regions, respectively.
- the probes for the epigenetic target region set may comprise probes specific for one or more types of target regions likely to differentiate DNA from neoplastic (e.g., tumor or cancer) cells from healthy cells, e.g., non-neoplastic circulating cells. Exemplary types of such regions are discussed in detail herein, e.g., in the sections above concerning captured sets.
- the probes for the epigenetic target region set may also comprise probes for one or more control regions, e.g., as described herein.
- the probes for the epigenetic target region set have a footprint of at least 100 kbp, e.g., at least 200 kbp, at least 300 kbp, or at least 400 kbp.
- the epigenetic target region set has a footprint of at least 20 Mbp.
- the probes for the epigenetic target region set comprise probes specific for one or more hypermethylation variable target regions.
- Hypermethylation variable target regions may also be referred to herein as hypermethylated DMRs (differentially methylated regions).
- the hypermethylation variable target regions may be any of those set forth above.
- the probes specific for hypermethylation variable target regions comprise probes specific for a plurality of loci listed in Table 1, e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of the loci listed in Table 1.
- the probes specific for hypermethylation variable target regions comprise probes specific for a plurality of loci listed in Table 2, e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of the loci listed in Table 2.
- the probes specific for hypermethylation variable target regions comprise probes specific for a plurality of loci listed in Table 1 or Table 2, e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% of the loci listed in Table 1 or Table 2.
- for each locus included as a target region, there may be one or more probes with a hybridization site that binds between the transcription start site and the stop codon (the last stop codon for genes that are alternatively spliced) of the gene.
- the one or more probes bind within 300 bp of the listed position, e.g., within 200 or 100 bp.
- a probe has a hybridization site overlapping the position listed above.
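The distance criteria above (a hybridization site within 300, 200, or 100 bp of a listed position, or overlapping it) can be checked with a few lines of Python. The inclusive 1-based coordinate convention and the function name are assumptions for this sketch and are not taken from the source.

```python
def probe_within_distance(probe_start, probe_end, position, max_distance=300):
    """Return True if a probe's hybridization site lies within
    `max_distance` bp of `position`.

    The site is the closed interval [probe_start, probe_end] in 1-based
    coordinates. A site that overlaps the position has distance 0.
    """
    if probe_start <= position <= probe_end:
        distance = 0
    else:
        distance = min(abs(position - probe_start), abs(position - probe_end))
    return distance <= max_distance


# Hypothetical 120-nt probe spanning positions 1000-1119.
overlapping = probe_within_distance(1000, 1119, 1050)
near_300 = probe_within_distance(1000, 1119, 1400)
near_200 = probe_within_distance(1000, 1119, 1400, max_distance=200)
too_far = probe_within_distance(1000, 1119, 1500)
```

Here the position 1400 lies 281 bp from the nearest probe end, so it satisfies the 300 bp criterion but not the 200 bp one.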
- the probes specific for the hypermethylation target regions include probes specific for one, two, three, four, or five subsets of hypermethylation target regions that collectively show hypermethylation in one, two, three, four, or five of breast, colon, kidney, liver, and lung cancers.
- Hypomethylation variable target regions
- the probes for the epigenetic target region set comprise probes specific for one or more hypomethylation variable target regions. Hypomethylation variable target regions may also be referred to herein as hypomethylated DMRs (differentially methylated regions). The hypomethylation variable target regions may be any of those set forth above.
- the probes specific for one or more hypomethylation variable target regions may include probes for regions such as repeated elements, e.g., LINE1 elements, Alu elements, centromeric tandem repeats, pericentromeric tandem repeats, and satellite DNA, and intergenic regions, which are ordinarily methylated in healthy cells and may show reduced methylation in tumor cells.
- probes specific for hypomethylation variable target regions include probes specific for repeated elements and/or intergenic regions.
- probes specific for repeated elements include probes specific for one, two, three, four, or five of LINE1 elements, Alu elements, centromeric tandem repeats, pericentromeric tandem repeats, and/or satellite DNA.
- Exemplary probes specific for genomic regions that show cancer-associated hypomethylation include probes specific for nucleotides 8403565-8953708 and/or 151104701-151106035 of human chromosome 1.
- the probes specific for hypomethylation variable target regions include probes specific for regions overlapping or comprising nucleotides 8403565-8953708 and/or 151104701-151106035 of human chromosome 1.
- CTCF binding regions
- the probes for the epigenetic target region set include probes specific for CTCF binding regions.
- the probes specific for CTCF binding regions comprise probes specific for at least 10, 20, 50, 100, 200, or 500 CTCF binding regions, or 10-20, 20-50, 50-100, 100-200, 200-500, or 500-1000 CTCF binding regions, e.g., CTCF binding regions described above or in one or more of CTCFBSDB or the Cuddapah et al., Martin et al., or Rhee et al. articles cited above.
- the probes for the epigenetic target region set comprise at least 100 bp, at least 200 bp, at least 300 bp, at least 400 bp, at least 500 bp, at least 750 bp, or at least 1000 bp upstream and downstream regions of the CTCF binding sites.
- d. Transcription start sites
- the probes for the epigenetic target region set include probes specific for transcriptional start sites.
- the probes specific for transcriptional start sites comprise probes specific for at least 10, 20, 50, 100, 200, or 500 transcriptional start sites, or 10-20, 20-50, 50-100, 100-200, 200-500, or 500-1000 transcriptional start sites, e.g., transcriptional start sites listed in DBTSS.
- the probes for the epigenetic target region set comprise probes for sequences at least 100 bp, at least 200 bp, at least 300 bp, at least 400 bp, at least 500 bp, at least 750 bp, or at least 1000 bp upstream and downstream of the transcriptional start sites.
- Although focal amplifications are somatic mutations, they can be detected by sequencing based on read frequency in a manner analogous to approaches for detecting certain epigenetic changes such as changes in methylation. As such, regions that may show focal amplifications in cancer can be included in the epigenetic target region set, as discussed above.
- the probes specific for the epigenetic target region set include probes specific for focal amplifications.
- the probes specific for focal amplifications include probes specific for one or more of AR, BRAF, CCND1, CCND2, CCNE1, CDK4, CDK6, EGFR, ERBB2, FGFR1, FGFR2, KIT, KRAS, MET, MYC, PDGFRA, PIK3CA, and RAF1.
- the probes specific for focal amplifications include probes specific for one or more of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 of the foregoing targets.
- Control regions
- It can be useful to include control regions to facilitate data validation.
- the probes specific for the epigenetic target region set include probes specific for control methylated regions that are expected to be methylated in essentially all samples. In some embodiments, the probes specific for the epigenetic target region set include probes specific for control hypomethylated regions that are expected to be hypomethylated in essentially all samples.
- 2. Probes specific for sequence-variable target regions
- The probes for the sequence-variable target region set may comprise probes specific for a plurality of regions known to undergo somatic mutations in cancer. The probes may be specific for any sequence-variable target region set described herein. Exemplary sequence-variable target region sets are discussed in detail herein, e.g., in the sections above concerning captured sets.
- the present methods can be used to monitor the likelihood of residual disease or the likelihood of recurrence of disease.
- the present methods are used for screening for a cancer, such as a metastasis, or in a method for screening cancer, such as in a method of detecting the presence or absence of a metastasis.
- the sample can be a sample from a subject who has or has not been previously diagnosed with cancer.
- one or more, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more samples are collected from a subject as described herein, such as before and/or after the subject is diagnosed with a cancer.
- the subject is at least 40, 45, 50, 55, 60, 65, 70, 75, or 80 years old.
- the subject has poor nutrition, e.g., high consumption of one or more of red meat and/or processed meat, trans fat, saturated fat, and refined sugars, and/or low consumption of fruits and vegetables, complex carbohydrates, and/or unsaturated fats.
- High and low consumption can be defined, e.g., as exceeding or falling below, respectively, recommendations in Dietary Guidelines for Americans 2020-2025, available at dietaryguidelines.gov/sites/default/files/2021-03/Dietary_Guidelines_for_Americans-2020-2025.pdf.
- Non-limiting examples of such cancers include biliary tract cancer, bladder cancer, transitional cell carcinoma, urothelial carcinoma, brain cancer, gliomas, astrocytomas, breast cancer, metaplastic carcinoma, cervical cancer, cervical squamous cell carcinoma, rectal cancer, colorectal carcinoma, colon cancer, hereditary nonpolyposis colorectal cancer, colorectal adenocarcinomas, gastrointestinal stromal tumors (GISTs), endometrial carcinoma, endometrial stromal sarcomas, esophageal cancer, esophageal
- cancers may remain benign, inactive or dormant.
- the system and methods of this disclosure may be useful in determining disease progression.
- the methods of the disclosure may be used to characterize the heterogeneity of an abnormal condition in a subject. Such methods can include, e.g., generating a genetic and/or epigenetic profile of cfDNA derived from the subject, wherein the genetic and/or epigenetic profile comprises a plurality of data resulting from copy number variation and rare mutation analyses.
- an abnormal condition is cancer, e.g. as described herein.
- the abnormal condition may be one resulting in a heterogeneous genomic population.
- some tumors are known to comprise tumor cells in different stages of the cancer.
- the nucleic acid sequencer performs pyrosequencing, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-synthesis, 5-letter sequencing, 6-letter sequencing, sequencing-by-ligation or sequencing-by-hybridization on the nucleic acids to generate sequencing reads.
- the method further comprises grouping the sequence reads into families of sequence reads, each family comprising sequence reads generated from a nucleic acid in the sample.
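The grouping step above can be illustrated with a minimal sketch. It assumes, purely for illustration, that reads from the same original molecule share mapping coordinates and a molecular barcode; the `Read` fields and the `group_into_families` name are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal read record (illustrative fields only)."""
    chrom: str
    start: int
    end: int
    barcode: str   # assumed molecular tag attached before amplification
    sequence: str


def group_into_families(reads):
    """Group reads so that each family holds reads assumed to derive from one molecule.

    Assumption: reads sharing chromosome, start, end, and barcode come from
    the same original nucleic acid in the sample.
    """
    families = defaultdict(list)
    for read in reads:
        key = (read.chrom, read.start, read.end, read.barcode)
        families[key].append(read)
    return dict(families)


reads = [
    Read("chr1", 100, 250, "AAGT", "ACGT"),
    Read("chr1", 100, 250, "AAGT", "ACGT"),
    Read("chr1", 100, 250, "CCTA", "ACGT"),
]
families = group_into_families(reads)
```

Here the first two reads fall into one family and the third forms its own, since its barcode differs.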
- the methods comprise determining the likelihood that the subject from which the sample was obtained has cancer or precancer, or has a metastasis, that is related to changes in proportions of types of immune cells.
- the method may further comprise determining a cancer recurrence score that is indicative of the presence or absence of the DNA originating or derived from the tumor cell for the subject.
- a cancer recurrence score may further be used to determine a cancer recurrence status.
- the cancer recurrence status may be at risk for cancer recurrence, e.g., when the cancer recurrence score is above a predetermined threshold.
- the cancer recurrence status may be at low or lower risk for cancer recurrence, e.g., when the cancer recurrence score is below a predetermined threshold.
- a cancer recurrence score equal to the cancer recurrence threshold may result in classification as either a candidate for a subsequent cancer treatment or not a candidate for therapy.
- the present methods can also be used to quantify levels of different cell types, such as immune cell types, including rare immune cell types, such as activated lymphocytes and myeloid cells at particular stages of differentiation. Such quantification can be based on the numbers of molecules corresponding to a given cell type in a sample. Sequence information obtained in the present methods may comprise sequence reads of the nucleic acids generated by a nucleic acid sequencer.
- Comparisons of immune cell identities and/or immune cell quantities/proportions between two or more samples collected from a subject at two different time points can allow for monitoring of one or more aspects of a condition in the subject over time, such as a response of the subject to a treatment, the severity of the condition (such as a cancer stage) in the subject, a recurrence of the condition (such as a cancer), and/or the subject’s risk of developing the condition (such as a cancer).
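As a rough sketch of the comparison described above, the following assumes that each sample has already been reduced to a count of molecules assigned to each cell type (the disclosure notes that quantification can be based on such counts). The function names and the example counts are hypothetical.

```python
def proportions(counts):
    """Convert per-cell-type molecule counts into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no assigned molecules")
    return {cell_type: n / total for cell_type, n in counts.items()}


def proportion_changes(earlier, later):
    """Return the change in proportion (later minus earlier) for every cell type.

    A cell type missing from one sample is treated as having proportion 0 there.
    """
    p_earlier = proportions(earlier)
    p_later = proportions(later)
    cell_types = sorted(set(p_earlier) | set(p_later))
    return {c: p_later.get(c, 0.0) - p_earlier.get(c, 0.0) for c in cell_types}


# Illustrative counts at two time points (made-up numbers).
time_point_1 = {"T cell": 600, "NK cell": 400}
time_point_2 = {"T cell": 500, "NK cell": 500}
changes = proportion_changes(time_point_1, time_point_2)
```

Whether a given change is meaningful would depend on assay noise and on reference ranges, which this sketch does not model.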
- the methods discussed above may further comprise any compatible feature or features set forth elsewhere herein, including in the section regarding methods of determining a risk of cancer recurrence in a subject and/or classifying a subject as being a candidate for a subsequent cancer treatment.
- a method provided herein is or comprises a method of determining a risk of cancer recurrence in a subject. In some embodiments, a method provided herein is or comprises a method of detecting the presence or absence of a metastasis in a subject. In some embodiments, a method provided herein is or comprises a method of classifying a subject as being a candidate for a subsequent cancer treatment.
- any of such methods may comprise collecting a sample (such as DNA, such as DNA originating or derived from a tumor cell) from the subject diagnosed with the cancer at one or more preselected timepoints following one or more previous cancer treatments to the subject.
- the subject may be any of the subjects described herein.
- the sample may comprise chromatin, cfDNA, or other cell materials.
- the sample, such as the DNA sample, may be a tissue sample.
- the DNA may be DNA, such as cfDNA, from a blood sample (e.g., a whole blood sample, a buffy coat sample, a leukapheresis sample, or a PBMC sample).
- the DNA may comprise DNA obtained from a tissue sample.
- Any of such methods may comprise capturing a plurality of sets of target regions from DNA from the subject, wherein the plurality of target region sets comprises a sequence-variable target region set and an epigenetic target region set, whereby a captured set of DNA molecules is produced.
- the capturing step may be performed according to any of the embodiments described elsewhere herein.
- the previous cancer treatment may comprise surgery, administration of a therapeutic composition, and/or chemotherapy.
- Any of such methods may comprise sequencing the captured DNA molecules, whereby a set of sequence information is produced.
- the captured DNA molecules of the sequence-variable target region set may be sequenced to a greater depth of sequencing than the captured DNA molecules of the epigenetic target region set.
- Any of such methods may comprise detecting a presence or absence of DNA originating or derived from a tumor cell at a preselected timepoint using the set of sequence information.
- the detection of the presence or absence of DNA, such as cfDNA, originating or derived from a tumor cell may be performed according to any of the embodiments thereof described elsewhere herein.
- Methods of determining a risk of cancer recurrence in a subject may comprise determining a cancer recurrence score that is indicative of the presence or absence, or amount, of the DNA, such as genomic regions of interest and target regions, originating or derived from the tumor cell for the subject.
- the cancer recurrence score may further be used to determine a cancer recurrence status.
- the cancer recurrence status may be at risk for cancer recurrence, e.g., when the cancer recurrence score is above a predetermined threshold.
- the cancer recurrence status may be at low or lower risk for cancer recurrence, e.g., when the cancer recurrence score is below a predetermined threshold.
- a cancer recurrence score equal to the predetermined threshold may result in a cancer recurrence status of either at risk for cancer recurrence or at low or lower risk for cancer recurrence.
- Methods of detecting the presence or absence of metastasis in a subject may comprise comparing the presence or level of a tissue-specific cell material to the presence or level of the tissue-specific cell material obtained from the subject at a different time, a reference level of the tissue-specific cell material, or to a comparator cell material. Methods herein may comprise additional steps to determine whether a metastasis is present.
- Methods of classifying a subject as being a candidate for a subsequent cancer treatment may comprise comparing the cancer recurrence score of the subject with a predetermined cancer recurrence threshold, thereby classifying the subject as a candidate for the subsequent cancer treatment when the cancer recurrence score is above the cancer recurrence threshold or not a candidate for therapy when the cancer recurrence score is below the cancer recurrence threshold.
- a cancer recurrence score equal to the cancer recurrence threshold may result in classification as either a candidate for a subsequent cancer treatment or not a candidate for therapy.
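The threshold comparisons described above might be sketched as follows. The function and parameter names are hypothetical. The sketch assumes that a score below the threshold maps to the lower-risk status, consistent with the candidate classification rule, and it exposes the tie case as an explicit option because the text allows either outcome when the score equals the threshold.

```python
from enum import Enum


class RecurrenceStatus(Enum):
    AT_RISK = "at risk for cancer recurrence"
    LOWER_RISK = "at low or lower risk for cancer recurrence"


def recurrence_status(score, threshold, tie_is_at_risk=True):
    """Map a recurrence score to a status by comparison with a predetermined threshold."""
    if score > threshold:
        return RecurrenceStatus.AT_RISK
    if score < threshold:
        return RecurrenceStatus.LOWER_RISK
    # Score equals threshold: either status is permitted, so the caller chooses.
    return RecurrenceStatus.AT_RISK if tie_is_at_risk else RecurrenceStatus.LOWER_RISK


def is_treatment_candidate(score, threshold, tie_is_candidate=True):
    """Classify the subject as a candidate for a subsequent cancer treatment."""
    if score == threshold:
        return tie_is_candidate
    return score > threshold
```

For example, `recurrence_status(0.8, 0.5)` yields `RecurrenceStatus.AT_RISK`, while `is_treatment_candidate(0.2, 0.5)` yields `False`. The scores and threshold here are arbitrary illustrative values.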
- the subsequent cancer treatment comprises chemotherapy or administration of a therapeutic composition.
- a number of mutations in the sequence-variable target regions chosen from 1, 2, 3, 4, or 5 is sufficient for the first subscore to result in a cancer recurrence score classified as positive for cancer recurrence.
- the number of mutations is chosen from 1, 2, or 3.
- epigenetic target region sequences are obtained, and determining the cancer recurrence score comprises determining a second subscore indicative of the amount of molecules (obtained from the epigenetic target region sequences) that represent an epigenetic state different from DNA found in a corresponding sample from a healthy subject (e.g., DNA, such as cfDNA, found in a blood sample (e.g., a whole blood sample, a buffy coat sample, a leukapheresis sample, or a PBMC sample) from a healthy subject, or DNA found in a tissue sample from a healthy subject where the tissue sample is of the same type of tissue as was obtained from the subject).
- abnormal molecules i.e., molecules with an epigenetic state different from DNA found in a corresponding sample from a healthy subject
- epigenetic changes associated with cancer such as with a metastasis
- methylation of hypermethylation variable target regions and/or perturbed fragmentation of fragmentation variable target regions where “perturbed” means different from DNA found in a corresponding sample from a healthy subject.
- a proportion of molecules corresponding to the hypermethylation variable target region set and/or fragmentation variable target region set that indicate hypermethylation in the hypermethylation variable target region set and/or abnormal fragmentation in the fragmentation variable target region set greater than or equal to a value in the range of 0.001%-10% is sufficient for the subscore to be classified as positive for cancer recurrence.
- the range may be 0.001%-1%, 0.005%-1%, 0.01%-5%, 0.01%-2%, or 0.01%-1%.
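The proportion-based subscore described above can be sketched as a simple check. This is an illustrative reading only: the function name and arguments are hypothetical, and the cutoff is assumed to be chosen from the broadest stated range of 0.001%-10%.

```python
def epigenetic_subscore_positive(abnormal_molecules, total_molecules, cutoff_percent):
    """Return True when the proportion of abnormal molecules meets or exceeds the cutoff.

    abnormal_molecules: molecules indicating hypermethylation and/or abnormal
        fragmentation in the relevant target region sets.
    total_molecules: all molecules corresponding to those target region sets.
    cutoff_percent: chosen cutoff, expressed in percent; assumed to lie in 0.001-10.
    """
    if not 0.001 <= cutoff_percent <= 10:
        raise ValueError("cutoff_percent is outside the 0.001%-10% range")
    if total_molecules <= 0:
        raise ValueError("total_molecules must be positive")
    proportion_percent = 100.0 * abnormal_molecules / total_molecules
    return proportion_percent >= cutoff_percent


# 5 abnormal molecules out of 10,000 is 0.05% (made-up counts).
positive_at_low_cutoff = epigenetic_subscore_positive(5, 10_000, cutoff_percent=0.01)
positive_at_high_cutoff = epigenetic_subscore_positive(5, 10_000, cutoff_percent=0.1)
```

With these made-up counts the subscore is positive at a 0.01% cutoff and negative at a 0.1% cutoff, which shows how strongly the chosen cutoff drives the classification.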
- any of such methods may comprise determining a fraction of tumor DNA from the fraction of molecules in the set of sequence information that indicate one or more features indicative of origination from a tumor cell.
- Determination of a cancer recurrence score may be based at least in part on the fraction of tumor DNA, wherein a fraction of tumor DNA greater than a threshold in the range of 10⁻¹¹ to 1 or 10⁻¹⁰ to 1 is sufficient for the cancer recurrence score to be classified as positive for cancer recurrence.
- the present methods can be used to monitor one or more aspects of a condition in a subject over time, such as a subject’s response to receiving a treatment for a condition (such as a response to a chemotherapeutic or immunotherapeutic), the severity of the condition (such as a cancer stage) in the subject, a recurrence of the condition (such as a cancer), and/or the subject’s risk of developing the condition (such as a cancer) and/or to monitor a subject’s health as part of a preventative health monitoring program (such as to determine whether and/or when a subject is in need of further diagnostic screening).
- monitoring comprises analysis of at least two samples collected from a subject at two or more different time points as described herein.
- the methods according to the present disclosure can be useful in predicting a subject’s response to a particular treatment option, such as over a period of time.
- successful treatment options may increase the amount of cancer-associated DNA sequences detected in a subject's blood, because more cancer cells may die and shed DNA.
- certain treatment options may be correlated with genetic profiles of cancers over time. This correlation may be useful in selecting a therapy.
- one or more samples are collected from a subject at least once per year, such as about 1-12 times or about 2-6 times, such as about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 times per year. In other embodiments, one or more samples are collected from the subject less than once per year, such as about once every 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 months. In some embodiments, one or more samples are collected from the subject about once every 1-5 years or about once every 1-2 years, such as about every 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 years.
- the treatment comprises immunotherapies and/or immune checkpoint inhibitors (ICIs).
- Immunotherapies are treatments with one or more agents that act to stimulate the immune system so as to kill or at least to inhibit growth of cancer cells, and preferably to reduce further growth of the cancer, reduce the size of the cancer and/or eliminate the cancer.
- Some such agents bind to a target present on cancer cells; some bind to a target present on immune cells and not on cancer cells; some bind to a target present on both cancer cells and immune cells.
- Such agents include, but are not limited to, checkpoint inhibitors and/or antibodies.
- anti-PD-1 or anti-PD-L1 therapies comprise pembrolizumab (KEYTRUDA®), nivolumab (OPDIVO®), cemiplimab (LIBTAYO®), atezolizumab (TECENTRIQ®), durvalumab (IMFINZI®), and avelumab (BAVENCIO®). These therapies may be used to treat patients identified as having high microsatellite instability (MSI) status or high tumor mutational burden (TMB).
- the inhibitory immune checkpoint molecule is CTLA4 or PD-1.
- the inhibitory immune checkpoint molecule is a ligand for PD-1, such as PD-L1 or PD-L2.
- the inhibitory immune checkpoint molecule is a ligand for CTLA4, such as CD80 or CD86.
- the inhibitory immune checkpoint molecule is lymphocyte activation gene 3 (LAG3), killer cell immunoglobulin like receptor (KIR), T cell membrane protein 3 (TIM3), galectin 9 (GAL9), or adenosine A2a receptor (A2aR).
- the antibody is a monoclonal anti-PD-1 antibody. In some embodiments, the antibody is a monoclonal anti-PD-L1 antibody. In certain embodiments, the monoclonal antibody is a combination of an anti-CTLA4 antibody and an anti-PD-1 antibody, an anti-CTLA4 antibody and an anti-PD-L1 antibody, or an anti-PD-L1 antibody and an anti-PD-1 antibody. In certain embodiments, the anti-PD-1 antibody is one or more of pembrolizumab (Keytruda®) or nivolumab (Opdivo®). In certain embodiments, the anti-CTLA4 antibody is ipilimumab (Yervoy®).
- the anti-PD-L1 antibody is one or more of atezolizumab (Tecentriq®), avelumab (Bavencio®), or durvalumab (Imfinzi®).
- the immunotherapy or immunotherapeutic agent is an antagonist (e.g., antibody) against CD80, CD86, LAG3, KIR, TIM3, GAL9, or A2aR.
- the antagonist is a soluble version of the inhibitory immune checkpoint molecule, such as a soluble fusion protein comprising the extracellular domain of the inhibitory immune checkpoint molecule and an Fc domain of an antibody.
- the soluble fusion protein comprises the extracellular domain of CTLA4, PD-1, PD-L1, or PD-L2. In some embodiments, the soluble fusion protein comprises the extracellular domain of CD80, CD86, LAG3, KIR, TIM3, GAL9, or A2aR. In one embodiment, the soluble fusion protein comprises the extracellular domain of PD-L2 or LAG3.
- the therapies target mutated forms of the EGFR protein. Such therapies can include osimertinib (TAGRISSO®), erlotinib (TARCEVA®), and gefitinib (IRESSA®).
- Therapies can include one or more targeted therapies, including abemaciclib (VERZENIO®), abiraterone acetate (ZYTIGA®), acalabrutinib (CALQUENCE®), adagrasib (KRAZATI®), ado-trastuzumab emtansine (KADCYLA®), afatinib dimaleate (GILOTRIF®), alectinib (ALECENSA®), alemtuzumab (CAMPATH®), alitretinoin (PANRETIN®), alpelisib (PIQRAY®), amivantamab-vmjw (RYBREVANT®), anastrozole (ARIMIDEX®), apalutamide (ERLEADA®), asciminib hydrochloride (SCEMBLIX®), atezolizumab (TECENTRIQ®), avapritinib (AYVAKIT®), aveluma
- endometrial: lenvatinib mesylate (LENVIMA®); targets FGFR1 + FGFR2 + FGFR3 + FGFR4 + PDGFRα + RET + VEGFR1 + VEGFR2 + VEGFR3 + c-Kit
- liver and bile duct: nivolumab (OPDIVO®); target PD-1
- lung: nivolumab (OPDIVO®)
- the immune checkpoint molecule is a co-stimulatory molecule that amplifies a signal involved in a T cell response to an antigen.
- the biomarker may include an epigenetic signature, such as a methylation state, methylation score and/or DNA fragmentation pattern/score.
- the epigenetic signature can be determined for one or more regions that include, but are not limited to, transcription start sites, promoter regions, CTCF binding regions and regulatory protein binding regions.
- the epigenetic signature is determined for one or more regions that include, but are not limited to, transcription start sites, promoter regions, intergenic regions and/or intronic regions that are associated with at least one or more genes listed in Table 6.
- Such treatments may include small-molecule drugs or monoclonal antibodies.
- the methods may also improve biomarker testing in individuals suffering from disease and help determine if the individual is a candidate for a certain drug or combination of drugs based on the presence or absence of the biomarker. Additionally, the methods can improve identification of mutations that contribute to the development of resistance to targeted therapy. Consequently, the analysis techniques may reduce unnecessary or untimely therapeutic interventions, patient suffering, and patient mortality. In certain embodiments, the status of a nucleic acid variant from a sample from a subject as being of somatic or germline origin may be compared with a database of comparator results from a reference population to identify customized or targeted therapies for that subject.
- the reference population includes patients with the same cancer or disease type as the subject and/or patients who are receiving, or who have received, the same therapy as the subject.
- a customized or targeted therapy may be identified when the nucleic acid variant and the comparator results satisfy certain classification criteria (e.g., are a substantial or an approximate match).
- the customized therapies described herein are typically administered parenterally (e.g., intravenously or subcutaneously).
- Pharmaceutical compositions containing an immunotherapeutic agent are typically administered intravenously. Certain therapeutic agents are administered orally.
- customized therapies may also be administered by any method known in the art, for example, buccal, sublingual, rectal, vaginal, intraurethral, topical, intraocular, intranasal, and/or intraauricular, which administration may include tablets, capsules, granules, aqueous suspensions, gels, sprays, suppositories, salves, ointments, or the like.
- therapy is customized based on the status of a nucleic acid variant as being of somatic or germline origin.
- determination of the levels of particular cell types facilitates selection of appropriate treatment.
- the present methods can be used to diagnose the presence of a condition, e.g., cancer or precancer, in a subject, to characterize a condition (such as to determine a cancer stage or heterogeneity of a cancer), to monitor a subject’s response to receiving a treatment for a condition (such as a response to a chemotherapeutic or immunotherapeutic), to assess prognosis of a subject (such as to predict a survival outcome in a subject having a cancer), to determine a subject’s risk of developing a condition, to predict a subsequent course of a condition in a subject, to determine metastasis or recurrence of a cancer in a subject (or a risk of cancer metastasis or recurrence), and/or to monitor a subject’s health as part of a preventative health monitoring program (such as to determine
- the methods according to the present disclosure can also be useful in predicting a subject’s response to a particular treatment option.
- Successful treatment may increase the amount of copy number variation, rare mutations, and/or cancer-related epigenetic signatures (such as hypermethylated regions or hypomethylated regions) detected in a subject's blood, because more cancer cells may die and shed DNA. Such signatures may be detected, e.g., in DNA isolated from a buffy coat sample or any other sample comprising cells, such as a blood sample (e.g., a whole blood sample, a buffy coat sample, a leukapheresis sample, or a PBMC sample) from the subject. Alternatively, a successful treatment may result in an increase or decrease in the quantity of a specific immune cell type in the blood, whereas an unsuccessful treatment results in no change.
- certain treatment options may be correlated with genetic profiles of cancers over time. This correlation may be useful in selecting a therapy for a subject.
- determination of the metastasis site facilitates selection of appropriate treatment.
- quantities of each of one or more of a particular genetic and/or epigenetic signature e.g., quantities of fusions, indels, SNPs, CNVs, and/or rare mutations, and/or cancer-related epigenetic signatures (such as specific (e.g., DMRs) or global hypermethylated or hypomethylated regions, and/or fragmentation variable regions)
- DNA from a subject's blood such as in DNA (e.g., cfDNA) isolated from a blood sample (e.g., a whole blood sample) from the subject)
- DNA e.g., cfDNA
- quantities of each of a plurality of cell types are determined based on sequencing and analysis (such as determination of epigenetic and/or genomic signatures) of DNA isolated from at least one sample comprising cells (such as a blood sample (e.g., a whole blood sample, a buffy coat sample, a leukapheresis sample, or a PBMC sample)) from a subject.
- DNA sample e.g., a whole blood sample, a buffy coat sample, a leukapheresis sample, or a PBMC sample
- the plurality of immune cell types can include, but is not limited to, macrophages (including M1 macrophages and M2 macrophages), activated B cells (including regulatory B cells, memory B cells and plasma cells); T cell subsets, such as central memory T cells, naïve-like T cells, and activated T cells (including cytotoxic T cells, regulatory T cells (Tregs), CD4 effector memory T cells, CD4 central memory T cells, CD8 effector memory T cells, and CD8 central memory T cells); immature myeloid cells (including myeloid-derived suppressor cells (MDSCs), low-density neutrophils, immature neutrophils, and immature granulocytes); and natural killer (NK) cells.
- genetic and/or epigenetic signatures, and/or cell types are compared between samples collected at a number of time points that is at least 2-10, at least 2-5, at least 3-6, or at least 2 (such as at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, or at least 20), after the subject has been diagnosed and/or after the subject has received the treatment.
- Sample collection from a subject can be ongoing during and/or after treatment to monitor the subject’s response to the treatment.
- therapies e.g., immunotherapeutic agents, etc.
- methods such as, for example, buccal, sublingual, rectal, vaginal, intraurethral, topical, intraocular, intranasal, and/or intraauricular, which administration may include tablets, capsules, granules, aqueous suspensions, gels, sprays, suppositories, salves, ointments, or the like.
- Therapeutic options for treating specific genetic-based diseases, disorders, or conditions, other than cancer are generally well-known to those of ordinary skill in the art and will be apparent given the particular disease, disorder, or condition under consideration.
- aspects of the systems and methods provided herein can be embodied in programming.
- Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
- Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
- Storage type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming.
- All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
- another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as those used across physical interfaces between local devices, through wired and optical landline networks, and over various air-links.
- the computer system 201 can include or be in communication with an electronic display that comprises a user interface (UI) for providing, for example, one or more results of sample analysis.
- UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
- Additional details relating to computer systems and networks, databases, and computer program products are also provided in, for example, Peterson, Computer Networks: A Systems Approach, Morgan Kaufmann, 5th Ed. (2011), Kurose, Computer Networking: A Top-Down Approach, Pearson, 7th Ed. (2016), Elmasri, Fundamentals of Database Systems, Addison Wesley, 6th Ed. (2010), Coronel, Database Systems: Design, Implementation, & Management, Cengage Learning, 11th Ed.
- a set of patient samples is analyzed by a blood-based NGS assay at Guardant Health (Redwood City, CA, USA) to detect the presence or absence of cancer.
- DNA is extracted from the blood of these patients.
- the DNA is subjected to end-repair and A-tailing reactions, optionally with a deaminase-resistant modified cytosine (e.g., a methylated cytosine).
- deamination resistant NGS adapters are added to the DNA by ligation to the 3’ ends thereof, the 5’ ends thereof, or both the 3’ and 5’ ends thereof.
- CpG-binding protein-biotin proteins, such as MBD1-biotin, MBD2-biotin, MBD3-biotin, MBD4-biotin, or MeCP2-biotin, are bound to methylated CpG dinucleotides and/or unmethylated CpG dinucleotides in DNA molecules, thereby providing CpG protein-bound DNA.
Abstract
The invention relates to the separation and analysis of CpG-rich DNA by binding a CpG-binding protein to the DNA and selective deamination. More particularly, the invention relates to methods of separating CpG-rich DNA in a sample by contacting the DNA with a CpG-binding protein, thereby providing CpG protein-bound DNA; separating the CpG protein-bound DNA from unbound DNA, thereby providing CpG-rich DNA; and contacting the CpG-rich DNA with a methyl-sensitive deaminase, thereby providing a converted sample in which at least a portion of the unmethylated CpGs in the DNA are converted to UpG.
Applications Claiming Priority (12)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463570986P | 2024-03-28 | 2024-03-28 | |
| US63/570,986 | 2024-03-28 | ||
| US202463667401P | 2024-07-03 | 2024-07-03 | |
| US63/667,401 | 2024-07-03 | ||
| US202463669078P | 2024-07-09 | 2024-07-09 | |
| US202463669082P | 2024-07-09 | 2024-07-09 | |
| US63/669,078 | 2024-07-09 | ||
| US63/669,082 | 2024-07-09 | ||
| US202463670035P | 2024-07-11 | 2024-07-11 | |
| US202463670034P | 2024-07-11 | 2024-07-11 | |
| US63/670,035 | 2024-07-11 | ||
| US63/670,034 | 2024-07-11 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025207941A1 (fr) | 2025-10-02 |
Family
ID=95399281
Family Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2025/021808 Pending WO2025207921A1 (fr) | 2024-03-28 | 2025-03-27 | Methods for methylation enrichment using CpG-specific deamination |
| PCT/US2025/021843 Pending WO2025207939A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of separating methylated DNA by methyl-sensitive deamination and binding of CpG-binding proteins |
| PCT/US2025/021845 Pending WO2025207941A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of separating CpG-rich DNA by binding of CpG-binding proteins and methyl-sensitive deamination |
| PCT/US2025/021813 Pending WO2025207925A1 (fr) | 2024-03-28 | 2025-03-27 | Methods for methylation enrichment using preferential ligation of adapters |
| PCT/US2025/021812 Pending WO2025207924A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of selective deamination using CpG-binding proteins |
| PCT/US2025/021817 Pending WO2025207926A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of selective deamination using methyl-sensitive deaminases |
Family Applications Before (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2025/021808 Pending WO2025207921A1 (fr) | 2024-03-28 | 2025-03-27 | Methods for methylation enrichment using CpG-specific deamination |
| PCT/US2025/021843 Pending WO2025207939A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of separating methylated DNA by methyl-sensitive deamination and binding of CpG-binding proteins |
Family Applications After (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2025/021813 Pending WO2025207925A1 (fr) | 2024-03-28 | 2025-03-27 | Methods for methylation enrichment using preferential ligation of adapters |
| PCT/US2025/021812 Pending WO2025207924A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of selective deamination using CpG-binding proteins |
| PCT/US2025/021817 Pending WO2025207926A1 (fr) | 2024-03-28 | 2025-03-27 | Methods of selective deamination using methyl-sensitive deaminases |
Country Status (1)
| Country | Link |
|---|---|
| WO (6) | WO2025207921A1 (fr) |
Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20010053519A1 (en) | 1990-12-06 | 2001-12-20 | Fodor Stephen P.A. | Oligonucleotides |
| US20030152490A1 (en) | 1994-02-10 | 2003-08-14 | Mark Trulson | Method and apparatus for imaging a sample on a device |
| WO2007010004A1 (fr) * | 2005-07-19 | 2007-01-25 | Epigenomics Ag | Method for investigating cytosine methylations in DNA |
| US7537898B2 (en) | 2001-11-28 | 2009-05-26 | Applied Biosystems, Llc | Compositions and methods of selective nucleic acid isolation |
| US20110160078A1 (en) | 2009-12-15 | 2011-06-30 | Affymetrix, Inc. | Digital Counting of Individual Molecules by Stochastic Attachment of Diverse Labels |
| US20130040343A1 (en) * | 2008-07-24 | 2013-02-14 | Brookhaven Science Associates, Llc | Methods for Detection of Methyl-CpG Dinucleotides |
| US9598731B2 (en) | 2012-09-04 | 2017-03-21 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
| US9738894B2 (en) | 2003-03-21 | 2017-08-22 | Roche Innovation Center Copenhagen A/S | Short interfering RNA (siRNA) analogues |
| US9850523B1 (en) | 2016-09-30 | 2017-12-26 | Guardant Health, Inc. | Methods for multi-resolution analysis of cell-free nucleic acids |
| US9902992B2 (en) | 2012-09-04 | 2018-02-27 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
| WO2018119452A2 (fr) | 2016-12-22 | 2018-06-28 | Guardant Health, Inc. | Methods and systems for analyzing nucleic acid molecules |
| US10260088B2 (en) | 2015-10-30 | 2019-04-16 | New England Biolabs, Inc. | Compositions and methods for analyzing modified nucleotides |
| WO2020160414A1 (fr) | 2019-01-31 | 2020-08-06 | Guardant Health, Inc. | Compositions and methods for isolating cell-free DNA |
| US10961525B2 (en) | 2017-07-05 | 2021-03-30 | The Trustees Of The University Of Pennsylvania | Hyperactive AID/APOBEC and hmC dominant TET enzymes |
| WO2022087309A1 (fr) * | 2020-10-23 | 2022-04-28 | Guardant Health, Inc. | Compositions and methods for analyzing DNA by partitioning and base conversion |
| WO2022147420A1 (fr) * | 2020-12-30 | 2022-07-07 | Guardant Health, Inc. | Detection of epigenetic status using sequence-specific degradation |
| CN116287152A (zh) * | 2023-02-22 | 2023-06-23 | 青岛大学 (Qingdao University) | Gene methylation detection method based on a methylation-sensitive cytidine deaminase |
| WO2024073043A1 (fr) | 2022-09-30 | 2024-04-04 | Illumina, Inc. | Methods of using CpG-binding proteins in mapping modified cytosine nucleotides |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE10331107B3 (de) * | 2003-07-04 | 2004-12-02 | Epigenomics Ag | Method for detecting cytosine methylations in DNA using cytidine deaminases |
| CA2496997A1 (fr) * | 2004-02-13 | 2005-08-13 | Affymetrix, Inc. | Analysis and determination of the degree of methylation using nucleic acid arrays |
| US20150099670A1 (en) * | 2013-10-07 | 2015-04-09 | Weiwei Li | Method of preparing post-bisulfite conversion DNA library |
| US10155939B1 (en) * | 2017-06-15 | 2018-12-18 | New England Biolabs, Inc. | Method for performing multiple enzyme reactions in a single tube |
| IL315876A (en) * | 2022-04-07 | 2024-11-01 | Illumina Inc | Modification of cytidine deaminases and methods of use |
| JP2025523964A (ja) * | 2022-07-21 | 2025-07-25 | Guardant Health, Inc. | Methods for detection and reduction of sample preparation-induced methylation artifacts |
| WO2024137880A2 (fr) * | 2022-12-22 | 2024-06-27 | Guardant Health, Inc. | Methods involving methylation-preserving amplification with error correction |
| EP4646491A1 (fr) * | 2023-01-06 | 2025-11-12 | Illumina, Inc. | Reduction of uracils by polymerase |
2025
- 2025-03-27 WO PCT/US2025/021808 patent/WO2025207921A1/fr active Pending
- 2025-03-27 WO PCT/US2025/021843 patent/WO2025207939A1/fr active Pending
- 2025-03-27 WO PCT/US2025/021845 patent/WO2025207941A1/fr active Pending
- 2025-03-27 WO PCT/US2025/021813 patent/WO2025207925A1/fr active Pending
- 2025-03-27 WO PCT/US2025/021812 patent/WO2025207924A1/fr active Pending
- 2025-03-27 WO PCT/US2025/021817 patent/WO2025207926A1/fr active Pending
Patent Citations (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6582908B2 (en) | 1990-12-06 | 2003-06-24 | Affymetrix, Inc. | Oligonucleotides |
| US20010053519A1 (en) | 1990-12-06 | 2001-12-20 | Fodor Stephen P.A. | Oligonucleotides |
| US20030152490A1 (en) | 1994-02-10 | 2003-08-14 | Mark Trulson | Method and apparatus for imaging a sample on a device |
| US7537898B2 (en) | 2001-11-28 | 2009-05-26 | Applied Biosystems, Llc | Compositions and methods of selective nucleic acid isolation |
| US9738894B2 (en) | 2003-03-21 | 2017-08-22 | Roche Innovation Center Copenhagen A/S | Short interfering RNA (siRNA) analogues |
| WO2007010004A1 (fr) * | 2005-07-19 | 2007-01-25 | Epigenomics Ag | Method for investigating cytosine methylations in DNA |
| US20130040343A1 (en) * | 2008-07-24 | 2013-02-14 | Brookhaven Science Associates, Llc | Methods for Detection of Methyl-CpG Dinucleotides |
| US20110160078A1 (en) | 2009-12-15 | 2011-06-30 | Affymetrix, Inc. | Digital Counting of Individual Molecules by Stochastic Attachment of Diverse Labels |
| US9598731B2 (en) | 2012-09-04 | 2017-03-21 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
| US9902992B2 (en) | 2012-09-04 | 2018-02-27 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
| US10260088B2 (en) | 2015-10-30 | 2019-04-16 | New England Biolabs, Inc. | Compositions and methods for analyzing modified nucleotides |
| US9850523B1 (en) | 2016-09-30 | 2017-12-26 | Guardant Health, Inc. | Methods for multi-resolution analysis of cell-free nucleic acids |
| WO2018119452A2 (fr) | 2016-12-22 | 2018-06-28 | Guardant Health, Inc. | Methods and systems for analyzing nucleic acid molecules |
| US10961525B2 (en) | 2017-07-05 | 2021-03-30 | The Trustees Of The University Of Pennsylvania | Hyperactive AID/APOBEC and hmC dominant TET enzymes |
| WO2020160414A1 (fr) | 2019-01-31 | 2020-08-06 | Guardant Health, Inc. | Compositions and methods for isolating cell-free DNA |
| WO2022087309A1 (fr) * | 2020-10-23 | 2022-04-28 | Guardant Health, Inc. | Compositions and methods for analyzing DNA by partitioning and base conversion |
| WO2022147420A1 (fr) * | 2020-12-30 | 2022-07-07 | Guardant Health, Inc. | Detection of epigenetic status using sequence-specific degradation |
| WO2024073043A1 (fr) | 2022-09-30 | 2024-04-04 | Illumina, Inc. | Methods of using CpG-binding proteins in mapping modified cytosine nucleotides |
| CN116287152A (zh) * | 2023-02-22 | 2023-06-23 | 青岛大学 (Qingdao University) | Gene methylation detection method based on a methylation-sensitive cytidine deaminase |
Non-Patent Citations (41)
| Title |
|---|
| "MethBank3.0: a database of DNA methylomes across a variety of species", NUCLEIC ACIDS RES, 2018 |
| BASHTRYKOV PI ET AL.: "The UHRF1 protein stimulates the activity and specificity of the maintenance DNA methyltransferase DNMT1 by an allosteric mechanism", J BIOL CHEM., 2014 |
| BLANCO ET AL., J. BIOL. CHEM., vol. 264, 1989, pages 8935 - 8940 |
| BOCK ET AL., NAT BIOTECH, vol. 28, 2010, pages 1106 - 1114 |
| CORONEL: "Database Systems: Design, Implementation, & Management, Cengage Learning", 2014 |
| DU ET AL.: "Methyl-CpG-binding domain proteins: readers of the epigenome", EPIGENOMICS, vol. 7, no. 6, 2015, pages 1051 - 73, XP093109738, DOI: 10.2217/epi.15.39 |
| ELMASRI: "Fundamentals of Database Systems", 2010, ADDISON WESLEY |
| FREIER ET AL., NUCLEIC ACIDS RES., vol. 25, 1997, pages 4429 - 4443 |
| FULLGRABE ET AL., BIORXIV, 2022 |
| GALE ET AL., PLOS ONE, vol. 13, 2018, pages e0194630 |
| GANSAUGE ET AL., NATURE PROTOCOLS, vol. 8, 2013, pages 737 - 748 |
| GOUIL, KENIRY, ESSAYS IN BIOCHEMISTRY, vol. 63, 2019, pages 639 - 648 |
| HENNION ET AL., GENOME BIOLOGY, vol. 21, no. 125, 2020 |
| IURLARO ET AL., GENOME BIOL., vol. 14, 2013, pages R119 |
| JANG ET AL., GENES, vol. 8, no. 6, June 2017 (2017-06-01), pages 148 |
| KINDE ET AL., PROC NAT'L ACAD SCI USA, vol. 108, 2011, pages 9530 - 9535 |
| KO ET AL., NATURE, vol. 468, 2010, pages 839 - 843 |
| KOU ET AL., PLOS ONE, vol. 11, 2016, pages e0146638 |
| KUROSE: "Computer Networking: A Top-Down Approach", 2016 |
| KUTYAVIN, BIOCHEMISTRY, vol. 47, no. 51, 2008, pages 13666 - 1367 |
| LIU ET AL., NAT CHEM BIOL, vol. 13, 2017, pages 181 - 187 |
| LIU ET AL., NAT CHEM BIOL., vol. 13, no. 2, February 2017 (2017-02-01), pages 181 - 187 |
| LIZARDI ET AL., NAT. GENETICS, vol. 19, 1998, pages 225 - 232 |
| LOU ET AL., PROC. NATL. ACAD. SCI., vol. 110, no. 49, 2013, pages 19872 - 19877 |
| MULLER ET AL., NATURE METHODS, vol. 16, 2019, pages 429 - 436 |
| NOTOMI ET AL., NUC. ACIDS RES., vol. 28, 2000, pages e63 |
| PARDOLL, NATURE REVIEWS CANCER, vol. 12, 2012, pages 252 - 264 |
| PETERSONMORGAN KAUFMANN: "Cloud Computing Architected: Solution Design Handbook", 2011, RECURSIVE PRESS |
| ROBERTSON ET AL., NUCLEIC ACIDS RES., vol. 39, 2011, pages 8740 - 8751 |
| SCOTT, C.A., DURYEA, J.D., MACKAY, H. ET AL.: "Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data", GENOME BIOL, vol. 21, 2020, pages 156 |
| SONG ET AL., NAT BIOTECH, vol. 29, 2011, pages 68 - 72 |
| SONG ET AL., NAT. BIOTECH., vol. 29, 2010, pages 68 - 72 |
| TROLL ET AL., BMC GENOMICS, vol. 20, 2019, pages 1023, Retrieved from the Internet <URL:https://doi.org/10.1186/12864-019-6355-0> |
| TUCKER: "Programming Languages", 2006, MCGRAW-HILL |
| VAISVILA ET AL.: "Discovery of novel DNA cytosine deaminase activities enables a nondestructive single-enzyme methylation sequencing method for base resolution high-coverage methylome mapping of cell-free and ultra-low input DNA", MOL CELL, vol. 84, no. 5, 7 March 2024 (2024-03-07), pages 854 - 866 |
| VAISVILA ROMUALDAS ET AL: "Discovery of novel DNA cytosine deaminase activities enables a nondestructive single-enzyme methylation sequencing method for base resolution high-coverage methylome mapping of cell-free and ultra-low input DNA", BIORXIV, 19 December 2023 (2023-12-19), XP093237351, Retrieved from the Internet <URL:https://www.biorxiv.org/content/10.1101/2023.06.29.547047v2.full.pdf> DOI: 10.1101/2023.06.29.547047 * |
| WEIRATHER ET AL., F1000RESEARCH, vol. 6, 2017, pages 100 |
| WEIRATHER JL ET AL., F1000RESEARCH, vol. 6, 2017, pages 100 |
| WEIRATHER JL ET AL.: "Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis", F1000RESEARCH, vol. 6, 2017, pages 100 |
| YANG ET AL., BIO-PROTOCOL, vol. 12, no. 17, 2023, pages e4496 |
| YU ET AL., CELL, vol. 149, 2012, pages 1368 - 80 |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025207921A1 (fr) | 2025-10-02 |
| WO2025207925A1 (fr) | 2025-10-02 |
| WO2025207924A1 (fr) | 2025-10-02 |
| WO2025207939A1 (fr) | 2025-10-02 |
| WO2025207926A1 (fr) | 2025-10-02 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US20240191290A1 (en) | Methods for detection and reduction of sample preparation-induced methylation artifacts | |
| US20250236916A1 (en) | Enrichment of aberrantly modified dna | |
| US20250084464A1 (en) | Compositions and methods for synthesis and use of probes targeting nucleic acid rearrangements | |
| US20240263241A1 (en) | Methods and compositions for copy-number informed tissue-of-origin analysis | |
| EP4638781A2 (fr) | Methods involving methylation-preserving amplification with error correction | |
| WO2025090956A1 (fr) | Methods for detecting nucleic acid variants using capture probes | |
| US20250101494A1 (en) | Methods for analyzing cytosine methylation and hydroxymethylation | |
| US20240093292A1 (en) | Quality control method | |
| WO2025029475A1 (fr) | Methods for enriching nucleotide variants by negative selection | |
| WO2024229143A1 (fr) | Quality control method for enzymatic conversion procedures | |
| WO2024159053A1 (fr) | Method for profiling nucleic acid methylation | |
| WO2024264065A1 (fr) | Methods and compositions for quantifying immune cell nucleic acids | |
| WO2025207941A1 (fr) | Methods for separating CpG-rich DNA by CpG-binding protein binding and methyl-sensitive deamination | |
| WO2025090954A1 (fr) | Method for detecting nucleic acid variants | |
| WO2025160433A1 (fr) | Methods for analyzing sequencing reads | |
| WO2025235889A1 (fr) | Methods involving multiplexed pooled PCR | |
| WO2025155895A1 (fr) | Method for profiling nucleic acid modification | |
| EP4659248A1 (fr) | Non-invasive monitoring of genomic alterations induced by gene editing therapies | |
| WO2025137620A1 (fr) | Methods for high-quality, high-accuracy methylation sequencing | |
| WO2025038399A1 (fr) | Methylated enrichment methods for single-molecule genetic and epigenetic sequencing | |
| WO2024138180A2 (fr) | Targeted and integrated workflows for somatic whole-genome sequencing and DNA methylation | |
| WO2024229433A1 (fr) | Methods for analyzing DNA methylation | |
| WO2024073508A2 (fr) | Methods and compositions for quantifying immune cell DNA | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25719560; Country of ref document: EP; Kind code of ref document: A1 |