WO2025141580A1 - Methods for preparing DNA sequencing libraries
- Publication number
- WO2025141580A1 (application PCT/IL2024/051230)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dna
- methylation
- digestion
- sequencing
- restriction
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
- C12Q1/6809—Methods for determination or identification of nucleic acids involving differential detection
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6858—Allele-specific amplification
- C12Q1/686—Polymerase chain reaction [PCR]
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/154—Methylation markers
Definitions
- DNA screening and analysis requires several consecutive steps of activity of various enzymes and compounds. Different enzymes often need completely different conditions and/or presence of additional components for optimal activity. In some cases, a ‘universal’ buffer may be used in which, although not optimal, the different enzymes can be used with sufficient activity. In other cases, additional components are added between the different steps in order to adapt the conditions to the reactions. In some cases, the required conditions are too different and a full purification step is needed between reactions in order to adjust the conditions.
- NGS: next-generation sequencing
- a library of the fragmented molecules is typically prepared, in which DNA molecules are ligated with sequencing adapters suitable to a selected sequencing platform. Sequencing adapters are typically introduced via enzymatic ligation following an end-repair process to obtain DNA molecules with blunt ends suitable for ligation.
- the library is then subjected to sequencing, with or without additional preparation steps, such as PCR amplification and targeted capture of particular regions of interest.
- R-M: restriction-modification
- type II restriction endonucleases show an absolute requirement for divalent metal ions to catalyze in a charge repulsive, polyanionic context the cleavage of the phosphodiester bond, which is one of the most stable bonds in biochemistry.
- although the physiological metal ion for the bacterial enzymes appears to be magnesium, they can utilize a variety of divalent cations for the in vitro DNA cleavage reaction, including Mn²⁺, Ca²⁺, Fe²⁺, Co²⁺, Ni²⁺, Zn²⁺, or Cd²⁺, depending on the enzyme.
- X-ray crystallographic analysis of type II restriction enzymes in different metal-bound states has revealed two DNA cleavage mechanisms in which one or two metal ions are involved.
- compositions and methods for library preparation disclosed herein provide efficient library preparation reactions (end repair, A-tailing, adapter ligation) despite the presence of impurities from the digestion steps, along with preservation of the DNA material by avoiding the purification step.
- the present invention is further directed to an integrated process in which methylation-sensitive enzymatic digestion is followed by quantitative PCR, and advantageously allows performing the two steps in the same reaction mix.
- the present invention thus provides, according to certain aspects, methods for methylation-sensitive enzymatic digestion of DNA followed by quantitative PCR, wherein both steps are carried out in the same reaction mix. These methods significantly reduce the need for operator involvement in the entire process, thereby minimizing the potential for human error and/or contamination events, while enabling automation.
- DNA digestion with the methylation-sensitive restriction endonucleases AciI and HinP1I and subsequent quantitative PCR can be performed in the same tube and in the same reaction mix, without the need to adjust the reaction components, or add or remove any component between the two steps.
- the present invention provides a method for preparing a DNA sample for methylation analysis, the method comprising:
- step ii(a) comprises incubating the mixture for 45 minutes to 3 hours (e.g., for 45-120 minutes, for 45-90 minutes or for 60 minutes) at 15-25°C (e.g., at 20°C) and subsequently for 20-45 minutes at 60-75°C.
- each possibility of time and temperature represents a separate embodiment of the present invention.
- the subsequent incubation is carried out under conditions of time and temperature sufficient for inactivation of the enzymes of the end repair and adapter ligation. In some particular embodiments, the subsequent incubation is carried out for about 30 minutes at about 65°C.
- step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of 0.1pM-0.4pM. In some particular embodiments, step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of about 0.2pM.
- step ii(b) comprises incubating for 1-20 hours (e.g., for 1-18 hours, or for 16 hours) at 2-20°C (e.g., at 4-20°C, at 4-18°C, or at 16°C).
- step ii(a) comprises incubating the mixture for about 60 minutes at about 20°C and subsequently for about 30 minutes at about 65°C.
- step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of about 0.2 pM, and incubating for about 16 hours at about 16°C.
- the DNA is DNA extracted from a tumor sample.
- the method comprises sequencing the library by a high-throughput sequencing method to provide sequencing data.
- the method comprises determining from the sequencing data a methylation value for at least one restriction locus and optionally at least one additional genetic or epigenetic characteristic of the DNA sample, e.g., DNA mutation and/or copy number variation.
- the at least one restriction locus is located within a CG-island.
- the method further comprises identifying the presence or absence of a disease in the subject based on the methylation profile of the DNA sample, by comparing the methylation profile of the DNA sample to one or more reference methylation profile(s).
- the disease is cancer.
- the cancer is lung cancer.
- the DNA sample is from a subject suspected of having the disease and/or a subject at risk of developing the disease, and the method comprises detecting methylation changes comprising determining whether the DNA sample is a healthy or disease DNA sample.
- the disease is cancer.
- the cancer is lung cancer.
- the method further comprises preparing a report in paper or electronic form based on the methylation profile and communicating the report to the subject and/or to a healthcare provider of the subject.
- the high-throughput sequencing is target-specific high-throughput sequencing.
- determining a methylation value for at least one restriction locus comprises: determining read counts and relative copy number between the at least one restriction locus and a control locus.
- the at least one restriction locus is a plurality of restriction loci.
- the at least one methylation-sensitive restriction endonuclease is a plurality of methylation-sensitive restriction endonucleases, and the digestion with the plurality of methylation-sensitive restriction endonucleases is a simultaneous digestion.
- the step of subjecting the cell-free DNA sample to digestion with at least one methylation-sensitive restriction endonuclease further comprises determining digestion efficiency.
- proceeding to preparing a sequencing library is carried out if the digestion efficiency is above a predefined threshold.
- the present invention provides a method for detecting cancer-related genetic and epigenetic changes in a cell-free DNA sample (cfDNA) from a subject, the method comprising: profiling methylation and optionally at least one additional genetic or epigenetic characteristic of the cfDNA sample as disclosed herein, to obtain a genetic and epigenetic profile of the cfDNA sample; and comparing the genetic and epigenetic profile of the cfDNA sample to one or more reference genetic and epigenetic profile selected from a cancer profile and a non-cancer profile, to detect cancer-associated genetic and epigenetic changes in the cfDNA sample.
- cfDNA: cell-free DNA sample
- the first source of DNA is a cancer DNA and the second source of DNA is a non-cancer DNA.
- the first source of DNA is plasma cell-free DNA of a cancer patient and the second source of DNA is plasma cell-free DNA of one or more healthy individuals.
- the first and second sources of DNA are different stages of a cancer.
- the present invention provides a method for detecting methylation changes in a DNA sample, the method comprising: profiling methylation of the DNA sample as disclosed herein, to obtain a methylation profile of the DNA sample; and comparing the methylation profile of the DNA sample to one or more reference methylation profile to detect methylation changes in the DNA sample.
- the present invention provides a method for profiling methylation of a DNA sample from a subject, the method comprising:
- PCR amplifying from the restriction endonuclease-treated DNA at least one restriction locus wherein the PCR amplification is carried out in the same reaction mix as the digestion, without adjusting the reaction mix between the digestion and the PCR amplification steps.
- the digestion is with AciI and/or HinP1I.
- the present invention provides a method for profiling methylation of a DNA sample from a subject, the method comprising:
- the reaction mix comprises reagents required for both the digestion and PCR amplification steps.
- the PCR amplification step is quantitative PCR (qPCR).
- the reaction mix comprises between 2-6 mM divalent cation(s). According to additional embodiments, the reaction mix comprises between 2-4 mM divalent cation(s).
- the divalent cation(s) is selected from the group consisting of Mg²⁺, Mn²⁺, Ca²⁺, Fe²⁺, Co²⁺, Ni²⁺, Zn²⁺, and Cd²⁺.
- the divalent cation is magnesium (Mg²⁺).
- the reaction mix comprises between 2-6 mM Mg²⁺. According to additional embodiments, the reaction mix comprises between 2-4 mM Mg²⁺. According to some particular embodiments, the reaction mix comprises between 2-6 mM MgCl₂. According to additional particular embodiments, the reaction mix comprises between 2-4 mM MgCl₂.
- the DNA digestion step is performed for between 12 and 20 hours. According to some embodiments, the DNA digestion step is performed for between 14 and 18 hours. According to some embodiments, the DNA digestion step is performed for between 12 and 16 hours. According to some embodiments, the DNA digestion step is performed for about 16 hours. According to some embodiments, the DNA sample is a cell-free DNA sample.
- the DNA sample is mitochondrial DNA.
- the DNA is DNA extracted from a tumor sample.
- the method comprises simultaneous amplification of more than one target sequence in the same reaction mix.
- the amplification step comprises co-amplifying at least one restriction locus differentially methylated between normal and disease DNA and at least one control locus.
- the amplification step comprises a step of coamplification of at least one restriction locus and a control locus, thereby generating an amplification product for each locus.
- the control locus is not digested by the at least one methylation-sensitive or methylation-dependent restriction endonuclease.
- the control locus is a locus devoid of a recognition sequence of the methylation-sensitive restriction endonuclease(s).
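A candidate control locus could be screened for recognition sequences along the following lines. This is a minimal sketch, not the method of the present text: the recognition sequences (CCGC for AciI, GCGC for HinP1I) come from general enzyme catalogs rather than from this document, and the function names are hypothetical.

```python
# Recognition sequences assumed from general enzyme catalogs, not from this text.
RECOGNITION_SITES = {"AciI": "CCGC", "HinP1I": "GCGC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> list[int]:
    """Return sorted 0-based start positions of a site on either strand."""
    seq = seq.upper()
    # A non-palindromic site (e.g. CCGC) also occurs as its reverse complement.
    patterns = {site, reverse_complement(site)}
    hits = []
    for pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def is_valid_control_locus(seq: str) -> bool:
    """True if the sequence contains none of the listed recognition sites."""
    return all(not find_sites(seq, site) for site in RECOGNITION_SITES.values())
```

For example, `is_valid_control_locus("ATATATTTAA")` returns `True`, whereas a sequence containing `CCGC`, `GCGG` or `GCGC` is rejected.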
- the method comprises determining a signal intensity for each generated amplification product.
- the method comprises a step of comparing a ratio between the signal intensities of the amplification products of each of said at least one restriction locus and the control locus to at least one reference ratio.
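The ratio comparison described above can be sketched as follows. This is an illustration under stated assumptions: the text does not say how signal intensity is measured, so one helper takes raw signal values and the other derives a ratio from qPCR Cq values assuming perfect doubling per cycle. The direction of the comparison against the reference ratio is also an assumption, and all names are hypothetical.

```python
def signal_ratio(locus_signal: float, control_signal: float) -> float:
    """Ratio of the restriction-locus signal to the control-locus signal."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return locus_signal / control_signal


def ratio_from_cq(cq_locus: float, cq_control: float) -> float:
    """Ratio estimated from Cq values, assuming the product doubles each cycle."""
    return 2.0 ** (cq_control - cq_locus)


def meets_reference(ratio: float, reference_ratio: float) -> bool:
    """Assumed decision rule: flag the locus when its ratio reaches the reference."""
    return ratio >= reference_ratio
```

With a methylation-sensitive enzyme, a locus that survives digestion amplifies more strongly, so under these assumptions a higher ratio points to more methylated template at that locus.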
- the control locus is a locus devoid of a recognition sequence of the methylation-sensitive restriction endonuclease(s).
- the at least one restriction locus is located within a CG-island.
- the at least one restriction locus is a plurality of restriction loci.
- FIG. 3 Sequencing depth of original DNA molecules using a library preparation protocol modified according to embodiments of the present invention vs. a commercial protocol. Results are shown as average coverage per target after collapsing all reads with the same unique molecular identifiers (UMIs).
- UMIs: unique molecular identifiers
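The "average coverage per target after collapsing" metric in the FIG. 3 caption can be illustrated with a short sketch. It assumes each read has already been assigned to a target and carries a UMI string. Real pipelines usually also group by mapping position and tolerate UMI sequencing errors, which is omitted here. The function name and input layout are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def average_collapsed_coverage(reads: Iterable[Tuple[str, str]]) -> float:
    """Average number of distinct UMIs per target.

    `reads` is an iterable of (target_name, umi) pairs; reads sharing a
    target and a UMI are counted once, as one original DNA molecule.
    """
    umis_per_target = defaultdict(set)
    for target, umi in reads:
        umis_per_target[target].add(umi)
    if not umis_per_target:
        return 0.0
    total_molecules = sum(len(umis) for umis in umis_per_target.values())
    return total_molecules / len(umis_per_target)


reads = [("T1", "AAA"), ("T1", "AAA"), ("T1", "CCC"), ("T2", "GGG")]
print(average_collapsed_coverage(reads))  # 1.5
```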
- a “subject” according to the present invention is typically a human subject.
- the subject may be suspected of having a certain disease.
- the subject is diagnosed with a disease of interest.
- the subject is a healthy subject that does not have the disease of interest.
- the subject may also be at risk of developing the disease, for example, based on previous history of the disease, genetic predisposition, and/or family history, and/or a subject who exhibits suspicious clinical signs of the disease and/or a subject that is suspected of having the disease based on other prior assay(s) e.g., based on testing of other biomarker(s).
- the subject is at risk of recurrence of the disease.
- the subject shows at least one symptom or characteristic of the disease.
- the subject is asymptomatic.
- the methods as described herein comprise: profiling methylation of the DNA sample using HinP1I and AciI digestion; and comparing the methylation profile to one or more reference methylation profile.
- the DNA sample is cell-free DNA extracted from a biological fluid.
- Digestion efficiency can be evaluated either internally to the examined sample, or externally. Internal evaluation can be performed by measuring intact cut sites of genomic positions that are known to be ubiquitously unmethylated. An example of such a locus can be any site on the mitochondrial DNA. External evaluation of digestion efficiency can be performed by including an unmethylated sample in the digestion step, digesting both samples in parallel, and then verifying that the unmethylated sample was indeed digested (by measuring numbers of intact cut sites). Such an unmethylated sample could be, for example, PCR amplicons, plasmid DNA, commercial unmethylated DNA species, or cell line DNA that is known to be unmethylated in certain genomic positions.
- alternatively, external evaluation of digestion efficiency can be achieved in a single step, by spiking an unmethylated sample into the interrogated sample, and measuring the digestion of the unmethylated DNA sample in the same step as the interrogated sample.
- the use of small targets is preferred, such as PCR amplicons or plasmid DNA.
- Digestion efficiency may be represented using any measure that is indicative of the amount of DNA that remained undigested (or that was digested) out of the original DNA amount.
- digestion efficiency may be represented as % undigested DNA or as ΔCq in qPCR for a certain locus.
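The two representations are related by simple arithmetic, sketched below as an illustration only. It assumes ΔCq is the Cq of the digested reaction minus the Cq of an undigested reference for the same unmethylated locus, and that the product multiplies by (1 + efficiency) per cycle. The QC threshold and function names are hypothetical and are not values from this text.

```python
def percent_undigested(cq_digested: float, cq_undigested: float,
                       efficiency: float = 1.0) -> float:
    """Estimate % undigested DNA at a locus from two qPCR Cq values.

    Assumes the amplicon multiplies by (1 + efficiency) each cycle, so a
    delta-Cq of 1 at 100% efficiency means half the template remained.
    """
    delta_cq = cq_digested - cq_undigested
    return 100.0 * (1.0 + efficiency) ** (-delta_cq)


def passes_digestion_qc(pct_undigested: float,
                        max_pct_undigested: float = 1.0) -> bool:
    """Hypothetical gate: proceed only if little unmethylated DNA survived."""
    return pct_undigested <= max_pct_undigested


print(percent_undigested(30.0, 25.0))  # 3.125 (delta-Cq of 5)
```

A gate of this kind corresponds to proceeding to library preparation only when digestion efficiency clears a predefined threshold.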
- the methods described herein comprise a step of preparing a sequencing library.
- Preparing a sequencing library according to the present invention comprises ligating sequencing adapters to amplification products.
- the step of preparing a sequencing library according to the present invention is carried out on the digested DNA without a step of DNA amplification.
- library preparation for sequencing according to the present invention is carried out in an end-preserving manner, indicating that the library preparation process is carried out such that the sequence information at the ends of DNA molecules is preserved.
- a library preparation process according to these embodiments does not include PCR to enrich genomic regions of interest and/or introduce sequencing adapters.
- a library preparation process according to these embodiments preferably also performs blunt-ending by filling in gaps rather than digesting overhangs.
- library preparation comprises adding sequencing adapters via ligation (e.g., enzymatic ligation). If enrichment of certain genomic regions is desired, library preparation according to these embodiments comprises enriching the genomic regions of interest using capture agents following the ligation of sequencing adapters.
- the present invention relates to compositions and methods for high resolution DNA methylation profiling.
- the present invention provides the use of methylation-sensitive/methylation-dependent restriction enzymes and high-throughput sequencing in the analysis of DNA methylation.
- the present invention provides the use of methylation-sensitive/methylation-dependent restriction enzymes and high-throughput sequencing for direct calculation of methylated and unmethylated DNA levels.
- Methylation in the human genome occurs in the form of 5-methyl cytosine and is confined to cytosine residues that are part of the sequence CG, also denoted as CpG dinucleotides (cytosine residues that are part of other sequences are not methylated). Some CG dinucleotides in the human genome are methylated, and others are not.
- methylation is cell and tissue specific, such that a specific CG dinucleotide can be methylated in a certain cell and at the same time unmethylated in a different cell, or methylated in a certain tissue and at the same time unmethylated in different tissues. DNA methylation is an important regulator of gene transcription.
- the methylation pattern of cancer DNA differs from that of normal DNA, wherein some loci are hypermethylated while others are hypomethylated.
- the present invention provides methods and compositions for sensitive detection of differentially methylated (e.g., hypermethylated) genomic loci associated with cancer.
- 3.3 pg of DNA corresponds to 1 haploid equivalent.
- the methods disclosed herein are carried out using an initial amount of 10 ng of DNA. In additional embodiments, the methods disclosed herein are carried out using an initial amount of 20 ng of DNA. In additional embodiments, the methods disclosed herein are carried out using an initial amount of DNA ranging from 1-400 ng, for example between 1-200 ng, between 10-200 ng, between 1-150 ng, between 1-100 ng, including each value within the ranges. Each possibility represents a separate embodiment.
- the methods disclosed herein are carried out using an initial amount of 3,000 haploid equivalents. In additional embodiments, the methods disclosed herein are carried out using an initial amount of 6,000 haploid equivalents. In additional embodiments, the methods disclosed herein are carried out using an initial amount of DNA comprising 3,000-60,000 haploid equivalents, for example between 6,000-60,000 haploid equivalents, between 6,000-30,000 haploid equivalents, including each value within the ranges. Each possibility represents a separate embodiment.
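The mass and haploid-equivalent figures above are linked by the stated conversion of 3.3 pg per haploid equivalent. A small sketch of that arithmetic follows; the function name is hypothetical.

```python
PG_PER_HAPLOID_EQUIVALENT = 3.3  # conversion factor stated in the text


def ng_to_haploid_equivalents(mass_ng: float) -> float:
    """Convert a DNA mass in nanograms to haploid genome equivalents."""
    return mass_ng * 1000.0 / PG_PER_HAPLOID_EQUIVALENT


print(round(ng_to_haploid_equivalents(10)))  # 3030
print(round(ng_to_haploid_equivalents(20)))  # 6061
```

So 10 ng and 20 ng correspond to roughly 3,000 and 6,000 haploid equivalents, and 200 ng to roughly 60,000.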
- a method for profiling methylation of a DNA sample from a subject comprising:
- step (v) determining a methylation value for the at least one restriction locus based on the read count determined in step (v) and a reference read count, thereby profiling methylation of the cell-free DNA sample.
- profiling methylation of a DNA sample comprises determining the number of sequence reads covering a predefined genomic region of at least 60 bps in length that contains said restriction locus, for example a predefined genomic region of at least 70 bps, at least 80 bps, at least 90 bps, at least 100 bps, between 50-150 bps, between 50-120 bps, between 50-100 bps that contains the restriction locus.
- the at least one restriction locus is located within a CG-island.
- CG islands are regions of DNA with a high G/C content and a high frequency of CG dinucleotides relative to the whole genome of an organism of interest. CG islands are typically between 200-3,000 bps in length and are typically characterized by a GC content greater than 50% and an observed:expected CG ratio of more than 0.6. Genomic regions of lower CG density are termed "CG oceans" and comprise most of the genome.
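The criteria above can be checked on a candidate sequence with a short sketch. The text does not define the observed:expected ratio, so the commonly used formula (number of CG dinucleotides × length) / (number of C × number of G) is assumed here. The function names are hypothetical, and the length bounds reflect the "typically" stated range rather than a hard rule.

```python
def cg_island_metrics(seq: str) -> dict:
    """Compute length, GC content and observed:expected CG ratio of a sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cg_count = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n if n else 0.0
    # Assumed formula: observed CG count over the count expected from base composition.
    obs_exp = (cg_count * n) / (c_count * g_count) if c_count and g_count else 0.0
    return {"length": n, "gc_content": gc_content, "obs_exp_cg": obs_exp}


def looks_like_cg_island(seq: str, min_len: int = 200, max_len: int = 3000) -> bool:
    """Apply the thresholds from the text: GC content > 50%, obs:exp CG > 0.6."""
    m = cg_island_metrics(seq)
    return (min_len <= m["length"] <= max_len
            and m["gc_content"] > 0.5
            and m["obs_exp_cg"] > 0.6)
```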
- the DNA methylation marker is a marker indicative of the presence or absence of a disease, e.g., a type of cancer.
- the DNA methylation marker is a marker indicative of a stage of a disease, e.g., a cancer stage.
- the DNA methylation marker is a marker indicative of a type of tissue (e.g., lung tissue, breast tissue, colon tissue etc.).
- a method for profiling methylation comprises: selecting at least one restriction locus and determining the number of sequence reads covering a predefined genomic region of at least 50 bps in length that contains said restriction locus; and calculating a methylation value based on the read count of the predefined genomic region and a reference read count, wherein the calculated methylation value reflects the number of molecules that were unmethylated in the DNA sample and therefore remained intact following digestion with methylation-dependent restriction enzyme(s).
- the method comprises: determining from the sequence reads a read count of sequence reads starting or ending at a nucleotide within the restriction locus, the read count representing the number of DNA molecules in the DNA sample in which said restriction locus was methylated and therefore cut by the restriction endonuclease; and calculating a level of methylated DNA at the restriction locus based on the determined read count of sequence reads starting or ending at a nucleotide within the restriction locus.
- the method comprises: determining from the sequence reads a read count of the restriction locus, the read count representing the number of DNA molecules in the DNA sample in which said restriction locus was unmethylated and therefore remained intact; and calculating a level of unmethylated DNA at the restriction locus based on the determined read count of the restriction locus.
- High throughput sequencing includes sequence determination using methods that determine many (typically thousands to billions) of nucleic acid sequences in parallel.
- High throughput sequencing generally involves three basic steps: library preparation, sequencing and data analysis.
- Examples of high throughput sequencing techniques include sequencing-by-synthesis and sequencing-by-ligation (employed, for example, by Illumina Inc., Life Technologies Inc., Roche), nanopore sequencing methods, electronic detection-based methods such as Ion Torrent™ technology (Life Technologies Inc.), and Single Molecule Real-Time (SMRT®) sequencing and sequencing-by-binding employed by PacBio.
- a restriction locus according to the present invention contains a CG dinucleotide which is more methylated in DNA from a cancerous tissue (e.g., a tumor sample) than in DNA from a non-cancerous tissue, meaning that in the cancerous tissue a greater proportion of DNA molecules are methylated at this position compared to the non-cancerous tissue.
- a methylation-sensitive restriction enzyme cleaves its recognition sequence only if it is unmethylated.
- a methylation-dependent restriction enzyme cleaves its recognition sequence only if it is methylated.
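The two definitions above amount to a simple rule, which can be written down as a sketch. The enum and function names are hypothetical. The rule treats cleavage as all-or-nothing, as the definitions do, and ignores incomplete digestion.

```python
from enum import Enum


class EnzymeType(Enum):
    METHYLATION_SENSITIVE = "sensitive"
    METHYLATION_DEPENDENT = "dependent"


def is_cleaved(enzyme_type: EnzymeType, site_methylated: bool) -> bool:
    """Whether a recognition sequence is cut, given its methylation state."""
    if enzyme_type is EnzymeType.METHYLATION_SENSITIVE:
        # Cuts only unmethylated recognition sequences.
        return not site_methylated
    # Methylation-dependent: cuts only methylated recognition sequences.
    return site_methylated
```

This is why an intact molecule is read as methylated after a methylation-sensitive digestion and as unmethylated after a methylation-dependent one.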
- the exact nucleotide within the restriction locus in which the sequence reads start or end depends on the type of restriction endonuclease used in the digestion step and the length of its recognition sequence. For example, for restriction endonucleases that produce non-blunt ends with 5' overhangs, digestion and end repair result in fragments that start at the second nucleotide of the recognition sequence and fragments that end at the penultimate nucleotide of the recognition sequence.
- the number of reads starting or ending at a nucleotide within the restriction locus represents the number of DNA molecules in the DNA sample in which the restriction locus was unmethylated and therefore cut by the restriction endonuclease.
- the method of the present invention comprises: determining a number of sequence reads starting at a nucleotide within the restriction locus; determining a number of sequence reads ending at a nucleotide within the restriction locus; and calculating a level of unmethylated DNA at the restriction locus using the orientation that provides the larger number of sequence reads.
- the method of the present invention comprises: determining a number of sequence reads starting at a nucleotide within the restriction locus; determining a number of sequence reads ending at a nucleotide within the restriction locus; calculating an average between the two values; and using the average to calculate a level of unmethylated DNA at the restriction locus.
- the level of unmethylated DNA is calculated by determining a total fragment number, which is determined from the read count of the restriction locus and read count of sequence reads starting or ending at a nucleotide within the restriction locus.
- the level of unmethylated DNA is expressed as percentage (%) of unmethylation, representing the percentage of DNA molecules that are unmethylated at the restriction locus out of the total number of DNA molecules containing the restriction locus in the sample.
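The read-count arithmetic described above, for digestion with a methylation-sensitive enzyme, can be sketched as follows. The text does not give an explicit formula for the total fragment number, so it is assumed here to be intact reads plus cut reads. The function and parameter names are hypothetical.

```python
def percent_unmethylated(intact_reads: int, reads_starting: int,
                         reads_ending: int, mode: str = "max") -> float:
    """Percentage of molecules unmethylated (cut) at a restriction locus.

    intact_reads:   reads spanning the locus (uncut molecules)
    reads_starting: reads that start at a nucleotide within the locus
    reads_ending:   reads that end at a nucleotide within the locus
    mode:           "max" uses the orientation with more reads,
                    "mean" averages the two orientations
    """
    if mode == "max":
        cut = float(max(reads_starting, reads_ending))
    elif mode == "mean":
        cut = (reads_starting + reads_ending) / 2.0
    else:
        raise ValueError("mode must be 'max' or 'mean'")
    # Assumed definition: total fragments = intact molecules + cut molecules.
    total = intact_reads + cut
    if total == 0:
        raise ValueError("no reads at this locus")
    return 100.0 * cut / total


print(percent_unmethylated(60, 40, 30))  # 40.0
```

Each cut molecule can yield one fragment ending at the locus and one starting at it, so the two orientations are combined by maximum or mean, not summed, to avoid counting the same molecule twice.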
- detecting methylation changes refers to detecting whether a tested DNA sample contains methylation changes compared to one or more reference DNA samples, detecting whether a DNA sample is characterized by a different methylation profile at selected genomic loci compared to a reference methylation profile, and/or determining whether the methylation profile of a DNA sample is normal or contains methylation changes indicative of the presence of a disease.
- Detecting methylation changes also encompasses comparing methylation data obtained as disclosed herein between samples in order to identify genomic regions differentially methylated between the samples, which may be used as DNA methylation markers.
- methylation data obtained as disclosed herein may be analyzed to identify genomic regions differentially methylated between different types of tissues, between cancer and non-cancer DNA, between different types of cancer, or between different stages of a certain type of cancer.
- the methods disclosed herein provide genome-wide methylation analysis.
- the methods disclosed herein provide target-specific methylation analysis.
- Computer software may be used in the analysis of the sequencing and methylation data.
- markers are of a cancer selected from the group consisting of lung cancer, colorectal cancer, liver cancer, breast cancer, pancreatic cancer, uterine cancer, ovarian cancer, head & neck cancer, gastric cancer, esophageal cancer, hematological cancers (e.g. lymphoma) and sarcoma.
- the markers are used as pan-cancer markers.
- the methods may also be applied for identifying differential methylation between different types of cancer, for example, determining methylation profiles characteristic of different types of cancer, that can differentiate between different types of cancer.
- the methods disclosed herein are applicable to any type of cancer, including, but not limited to: lung cancer, bladder cancer, breast cancer, colorectal cancer, prostate cancer, gastric cancer, skin cancer (e.g. melanoma), cancer affecting the nervous system, bone cancer, ovarian cancer, liver cancer (e.g. hepatocellular carcinoma), hematologic malignancies, pancreatic cancer, kidney cancer, cervical cancer.
- Each type of cancer is a separate embodiment of the present invention.
- the methods of the present invention may also be applied to identify tissue-specific methylation markers. For example, to identify methylation markers specific for: lung, bladder, breast, colorectal, prostate, gastric, ovarian, pancreas, kidney, cervical tissue.
- Each tissue is a separate embodiment of the present invention. Such markers may be used, for example, to identify the tissue source of circulating cell-free DNA.
- the methods of the present invention may be applied for identifying a disease (e.g., a cancer) in a subject.
- Identifying a disease encompasses any one or more of screening for the disease, detecting the presence or absence of the disease, detecting recurrence of the disease, detecting susceptibility to the disease, detecting response to treatment, determining efficacy of treatment, determining stage (severity) of the disease, determining prognosis and early diagnosis of the disease in a subject.
- “Assessing cancer” or “assessing the presence of cancer” or “assessing the presence or absence of cancer” as used herein refer to determining the likelihood that a subject has cancer.
- the terms encompass determining whether a subject should be subjected to confirmatory cancer testing to confirm (or rule out) the presence of cancer, such as confirmatory blood tests, urine tests, cytology, imaging, endoscopy and/or biopsy.
- the terms further encompass aiding the diagnosis of cancer in a subject.
- the terms further encompass quantifying cancer-related changes in cell-free DNA samples which are indicative for the presence of cancer.
- Assessing the presence of cancer according to the present invention includes one or more of screening for cancer, assessing recurrence of cancer, assessing susceptibility or risk to cancer, assessing and/or monitoring response to treatment, assessing efficacy of treatment, assessing severity (stage) of cancer and assessing prognosis of cancer in a subject.
- Each possibility represents a separate embodiment of the present invention. It is to be understood that a negative result in the assays disclosed herein is still considered an assessment for the presence of cancer according to the present invention.
- the methods of the present invention may further include a step of determining a tumor fraction, or fractional concentration of tumor DNA.
- Tumor fraction is the proportion of tumor molecules in a cfDNA sample.
- Determining a "methylation profile" refers to determining methylation values at one or more restriction loci, preferably at a plurality of restriction loci. In some embodiments, determining a methylation profile comprises determining levels of methylated and unmethylated DNA at one or more restriction loci, preferably at a plurality of restriction loci.
- a “reference methylation profile” as disclosed herein refers to a methylation profile determined in DNA from a known source.
- a “reference DNA sample” is a DNA sample from a known source.
- a reference methylation profile is a profile determined in a plurality of reference DNA samples.
- the methods of the present invention may be used for analyzing (e.g., measuring) methylation changes between DNA samples taken from a single subject at different time points, for example, taken at different stages of a disease, or taken before and after treatment of a disease.
- the methylation profile of the DNA sample taken at a first time point may be used as a reference for the methylation profile of a DNA sample taken at a second (later) time point.
- a “reference methylation level” for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is the level of methylation measured for the particular restriction locus/genomic region in DNA from a known source.
- a “reference methylation value” for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is a numerical value representing the level of methylation of the particular restriction locus/genomic region in DNA from a known source.
- a "reference level of unmethylated DNA" for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is the level of unmethylated DNA measured for the particular restriction locus/genomic region in DNA from a known source.
- the methods disclosed herein are diagnostic methods. According to some embodiments, diagnostic methods disclosed herein comprise pre-determination of reference methylation and/or unmethylation from disease DNA. In some embodiments, diagnostic methods of the present invention comprise predetermination of reference methylation and/or unmethylation from normal DNA as disclosed herein.
- Tissue-specific methylation profile can also be characterized using the methods disclosed herein, in order to establish normal non-cancer DNA methylation profile of the tissue.
- tissue-specific methylation profile can be characterized in order to identify the tissue source of circulating cell-free DNA.
- detecting methylation changes comprises identifying the presence or absence of a certain disease in a subject, based on the methylation profile of a DNA sample from the subject.
- a method for identifying the cell source or tissue source of a DNA sample is provided (e.g., identifying what is the type of tissue from which the DNA is derived, and/or identifying whether the DNA is derived from normal or diseased cells/tissue).
- analysis of DNA methylation values and/or unmethylation values calculated for a tested sample may be performed in a number of ways, using various statistical means.
- the methods disclosed herein comprise comparing a plurality of values calculated for a plurality of restriction loci to their corresponding healthy and/or disease references values.
- a pattern of values is analyzed using statistical means and computerized algorithm to determine if it represents a pattern of a disease in question or a normal, healthy pattern.
- Exemplary algorithms include, but are not limited to, machine learning and pattern recognition algorithms.
- in addition to DNA methylation/unmethylation values, it is possible to obtain from the same sequencing data disclosed herein information on DNA mutations, copy number changes, and nucleosome positioning for cell-free DNA.
- cell-free DNA circulates in fragments ranging between 120-220 bp. This pattern agrees with the length of DNA wrapped around a single nucleosome, plus a short stretch of ~20 bp (linker DNA) bound to a histone.
- determination of DNA methylation profile and determination of at least one additional genetic or epigenetic characteristic as disclosed herein may be carried out based on the same sequencing data.
- a sequencing-based assay as disclosed herein combines detection of methylation changes with mutation detection and analysis of additional epigenetic characteristics, all in one single assay.
- the assay advantageously allows combined analysis of small amounts of DNA in a single assay.
- the combined analysis of methylation and additional genetic and epigenetic characteristics is useful in enhancing detection of cancer (or any other condition/tissue source).
- cancer-associated mutation e.g., cancer-associated mutation in oncogenes/tumor suppressors
- cancer-associated copy number variation e.g., cancer-associated copy number variation
- cancer-associated nucleosomal positioning
- the non-methylation cancer-associated changes may be combined with methylation information in a dependent or independent manner, depending on whether or not the cancer-associated changes are found on the same DNA fragment, where changes that are found on the same fragment provide a stronger indication for the presence of cancer.
- a method for profiling genetic and epigenetic characteristics of a DNA sample comprising: profiling methylation of the DNA sample as disclosed herein; and determining at least one additional genetic or epigenetic characteristic of the DNA sample, wherein the at least one additional genetic or epigenetic characteristic is selected from DNA mutation, copy number variation and nucleosome positioning, wherein profiling the methylation and determining the at least one additional genetic or epigenetic characteristic are carried out using the same sequencing data, thereby profiling genetic and epigenetic characteristics of the DNA sample.
- a method for detecting the presence or absence of a disease in a subject comprising: profiling methylation of the DNA sample as disclosed herein; and determining at least one additional genetic or epigenetic characteristic of the DNA sample, wherein the at least one additional genetic or epigenetic characteristic is selected from DNA mutation, copy number variation and nucleosome positioning.
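The percentage of unmethylation described in this list (unmethylated molecules at a restriction locus out of all molecules in the sample containing that locus) can be illustrated with a minimal Python sketch. The function name and the molecule counts are hypothetical and are not taken from the disclosure.

```python
def percent_unmethylated(unmethylated_molecules: int, total_molecules: int) -> float:
    """Percentage of DNA molecules that are unmethylated at a restriction locus.

    total_molecules is the number of molecules in the sample that contain the locus.
    """
    if total_molecules <= 0:
        raise ValueError("total_molecules must be positive")
    if not 0 <= unmethylated_molecules <= total_molecules:
        raise ValueError("unmethylated_molecules must lie between 0 and total_molecules")
    return 100.0 * unmethylated_molecules / total_molecules


# Illustrative counts only: 25 of 1000 molecules are unmethylated at the locus.
example_percentage = percent_unmethylated(25, 1000)
print(example_percentage)
```

The percentage of methylation at the same locus would then be 100 minus this value, assuming every molecule is classified as either methylated or unmethylated.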
Abstract
Compositions and methods are provided for preparing DNA libraries for high-throughput sequencing following methylation-sensitive and/or methylation-dependent enzymatic digestion of the DNA. The provided compositions and methods obviate the need to clean up the DNA sample between the digestion step and subsequent library preparation steps. The provided compositions and methods are particularly advantageous for library preparation from small amounts of DNA, such as cell-free DNA from body fluid samples.
Description
METHODS FOR PREPARING DNA SEQUENCING LIBRARIES
FIELD OF THE INVENTION
The present invention relates to compositions and methods for preparing DNA libraries for high-throughput sequencing. In particular, the present invention relates to DNA library preparation following methylation-sensitive and/or methylation-dependent enzymatic digestion of the DNA, wherein the library preparation does not require clean-up of the sample between the digestion step and subsequent library preparation steps, e.g. end-repair and adapter ligation. The methods of the present invention are particularly advantageous for library preparation from small amounts of DNA, such as cell-free DNA from body fluid samples.
BACKGROUND OF THE INVENTION
DNA screening and analysis requires several consecutive steps of activity of various enzymes and compounds. Different enzymes often need completely different conditions and/or presence of additional components for optimal activity. In some cases, a ‘universal’ buffer may be used in which, although not optimal, the different enzymes can be used with sufficient activity. In other cases, additional components are added between the different steps in order to adapt the conditions to the reactions. In some cases, the required conditions are too different and a full purification step is needed between reactions in order to adjust the conditions.
Genetic and epigenetic changes are known to occur in many types of cancer, including mutations, DNA methylation changes (e.g., hypomethylation of isolated CpGs and hypermethylation occurring mostly at CpG islands), copy number variation and more. For example, hypermethylation of CpG islands in the promoter regions of tumor suppressor genes, leading to gene silencing, has been studied extensively and demonstrated in many different types of cancer. Tumors release DNA fragments, or "cell-free DNA", into body fluids and consequently genetic and epigenetic changes of tumor-derived DNA molecules can be detected in "liquid biopsies" obtained from body fluids such as blood plasma and urine. In contrast to traditional biopsies, liquid biopsies are non-invasive and may better represent the full genetic spectrum of tumor sub-clones. Consequently, detection of genetic and epigenetic changes associated with cancer in liquid biopsies holds great promise for early detection, prognosis, and therapeutic surveillance. However, in order to detect tumor-derived DNA in liquid biopsies, ultra-sensitive biochemical methods are required, as the concentration of cell-free DNA in biological fluids may be low, and furthermore because the tumor DNA can be present in extremely low quantities in relation to the large background of normal DNA.
Common methods for identifying genetic and epigenetic changes involve the use of restriction enzymes whose activity varies according to these changes. The DNA is fragmented as a function of the genetic or epigenetic phenotype. Analysis of the fragmented DNA may be carried out using various methods, including high-throughput sequencing, also known as next-generation sequencing (NGS). For NGS, a library of the fragmented molecules is typically prepared, in which DNA molecules are ligated with sequencing adapters suitable to a selected sequencing platform. Sequencing adapters are typically introduced via enzymatic ligation following an end-repair process to obtain DNA molecules with blunt ends suitable for ligation. The library is then subjected to sequencing, with or without additional preparation steps, such as PCR amplification and targeted capture of particular regions of interest.
Methylation-sensitive/-dependent enzymatic digestion and the different library preparation steps require certain conditions and reaction mixtures for optimal results. Thus, clean-up steps have been necessary in between steps to purify the DNA reaction products of reagents from the previous reaction. Without such clean-up steps, the library chemistry is inefficient and results in poor yield of DNA molecules suitable for sequencing. However, multiple clean-up steps result in loss of DNA material, introduce contamination and make the process more complex. The loss of DNA material is particularly significant when analyzing cell-free DNA from body fluid samples, in which the starting amount of DNA is low.
WO 2020/188561, WO 2022/107145 and WO 2023/089613, assigned to the Applicant of the present invention, disclose, inter alia, methods for detecting methylation changes in DNA samples using restriction enzymes and high-throughput sequencing.
It would be highly beneficial to have compositions and methods that improve the efficiency of DNA processing and analysis, to reduce DNA loss during sequencing library preparation and increase the yield of adapter-ligated molecules suitable for sequencing.
SUMMARY OF THE INVENTION
The present invention provides compositions and methods for preparing DNA libraries for high-throughput sequencing. In particular, the present invention relates to DNA library preparation following methylation-sensitive and/or methylation-dependent enzymatic digestion of the DNA, wherein the library preparation does not require clean-up of the DNA sample between the digestion step and subsequent library preparation steps, e.g. end-repair and adapter ligation. In particular, the present invention discloses a combination of incubation conditions for end repair and adapter ligation (time and temperature), and adapter concentrations that provide efficient library chemistry without purifying the DNA sample following methylation-sensitive and/or methylation-dependent enzymatic digestion. By avoiding loss of DNA material, yet maintaining efficient library chemistry, the present invention advantageously enables sequencing of more original DNA molecules than currently available methods, as exemplified hereinbelow.
A large number of restriction-modification (R-M) systems have been discovered and well characterized during the past few decades. Based on the cutting position, recognition sequence, cleavage requirements, and subunit structure, R-M systems are mainly classified into four types: I, II, III, and IV. The type II R-M systems are the most abundant group of enzymes; they produce double-stranded DNA cleavage within or close to the recognition sequence, which consists of 4 to 8 defined nucleotides and can be symmetric, asymmetric, or degenerate. Most type II restriction endonucleases show an absolute requirement for divalent metal ions to catalyze, in a charge-repulsive, polyanionic context, the cleavage of the phosphodiester bond, which is one of the most stable bonds in biochemistry. Although the physiological metal ion for the bacterial enzymes appears to be magnesium, they can utilize a variety of divalent cations for the in vitro DNA cleavage reaction, including Mn2+, Ca2+, Fe2+, Co2+, Ni2+, Zn2+, or Cd2+, depending on the enzyme. X-ray crystallographic analysis of type II restriction enzymes in different metal-bound states has revealed two DNA cleavage mechanisms in which one or two metal ions are involved.
It is now disclosed that modifications to known/commercially available protocols of sequencing library preparation avoid the need to purify the DNA sample following restriction enzyme digestion in order to remove excess magnesium and/or additional components from the digestion reaction that may interfere with subsequent library preparation steps. The library preparation methods according to the present invention thus reduce DNA loss during library preparation and enable higher library yields. The compositions and methods for library preparation disclosed herein provide efficient library preparation reactions (end repair, A-tailing, adapter ligation) despite the presence of impurities from the digestion steps, along with preservation of the DNA material by avoiding the purification step.
The present invention is further directed to an integrated process in which methylation-sensitive enzymatic digestion is followed by quantitative PCR, and advantageously allows performing the two steps in the same reaction mix. The present invention thus provides, according to certain aspects, methods for methylation-sensitive enzymatic digestion of DNA followed by quantitative PCR, wherein both steps are carried out in the same reaction mix. These methods significantly reduce the need for operator involvement in the entire process, thereby minimizing the potential for human error and/or potential contamination events, while enabling automation.
More particularly, it is now disclosed that, unexpectedly, DNA digestion with the methylation-sensitive restriction endonucleases AciI and HinP1I and subsequent quantitative PCR can be performed in the same tube and in the same reaction mix, without the need to adjust the reaction components, or add or remove any component between the two steps.
According to one aspect, the present invention provides a method for preparing a DNA sample for methylation analysis, the method comprising:
(i) subjecting the DNA sample to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease in a digestion reaction mix supporting cleavage of the DNA sample by the at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) preparing a sequencing library from the restriction endonuclease-treated DNA without purifying the restriction endonuclease-treated DNA from the digestion reaction mix, wherein preparing the sequencing library comprises:
(a) performing end-repair by forming a mixture of the endonuclease-treated DNA and an end-repair reaction mix without purifying the endonuclease-treated DNA prior to forming the mixture, and incubating the mixture for 45 minutes to 4 hours at 15-25°C and subsequently for 20-45 minutes at 60-75°C, to obtain end-repaired DNA; and
(b) ligating sequencing adapters to the end-repaired DNA by mixing an adapter ligation reaction mix with the end-repaired DNA, wherein the adapter ligation mix is mixed to obtain an adapter concentration of 0.08pM-0.4pM, and incubating for 45 minutes to 20 hours at 2-20°C, to obtain a library of adapter-ligated DNA for high-throughput sequencing.
In some embodiments, step ii(a) comprises incubating the mixture for 45 minutes to 3 hours (e.g., for 45-120 minutes, for 45-90 minutes or for 60 minutes) at 15-25°C (e.g., at 20°C) and subsequently for 20-45 minutes at 60-75°C. Each combination of time and temperature represents a separate embodiment of the present invention. Typically, the subsequent incubation is carried out under conditions of time and temperature sufficient for inactivation of the enzymes of the end repair and adapter ligation. In some particular embodiments, the subsequent incubation is carried out for about 30 minutes at about 65°C.
In some embodiments, step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of 0.1pM-0.4pM. In some particular embodiments, step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of about 0.2pM.
In some embodiments, step ii(b) comprises incubating for 1-20 hours (e.g., for 1-18 hours, or for 16 hours) at 2-20°C (e.g., at 4-20°C, at 4-18°C, or at 16°C). Each combination of time and temperature represents a separate embodiment of the present invention.
In some exemplary embodiments, step ii(a) comprises incubating the mixture for about 60 minutes at about 20°C and subsequently for about 30 minutes at about 65°C.
In some exemplary embodiments, step ii(b) comprises mixing an adapter ligation reaction mix with the end-repaired DNA to obtain an adapter concentration of about 0.2 pM, and incubating for about 16 hours at about 16°C.
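The incubation windows and adapter concentration range recited above lend themselves to a simple parameter check. The following Python sketch is illustrative only: the class, field and function names are hypothetical, times are converted to minutes, and the adapter concentration is treated as a bare number in the same unit as printed in the text.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple

# Inclusive (min, max) ranges copied from steps ii(a) and ii(b) as stated in the text.
RANGES: Dict[str, Tuple[float, float]] = {
    "end_repair_minutes": (45, 240),       # 45 minutes to 4 hours
    "end_repair_temp_c": (15, 25),
    "inactivation_minutes": (20, 45),      # subsequent incubation
    "inactivation_temp_c": (60, 75),
    "adapter_concentration": (0.08, 0.4),  # unit as printed in the text
    "ligation_minutes": (45, 1200),        # 45 minutes to 20 hours
    "ligation_temp_c": (2, 20),
}


@dataclass
class LibraryPrepConditions:
    """Hypothetical container for one set of library preparation conditions."""
    end_repair_minutes: float
    end_repair_temp_c: float
    inactivation_minutes: float
    inactivation_temp_c: float
    adapter_concentration: float
    ligation_minutes: float
    ligation_temp_c: float


def out_of_range(conditions: LibraryPrepConditions) -> List[str]:
    """Return the names of parameters that fall outside the stated ranges."""
    values = asdict(conditions)
    return [name for name, (low, high) in RANGES.items()
            if not low <= values[name] <= high]


# The exemplary embodiment: about 60 min at 20°C, then about 30 min at 65°C;
# adapter concentration of about 0.2; ligation for about 16 hours at 16°C.
exemplary = LibraryPrepConditions(60, 20, 30, 65, 0.2, 16 * 60, 16)
print(out_of_range(exemplary))
```

An empty list means every parameter lies within the stated ranges; a short end-repair step or a warm ligation would be reported by name.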
According to another aspect, the present invention provides a method for preparing a cell-free DNA sample for methylation analysis, the method comprising:
(i) subjecting cfDNA extracted from 6-10 ml blood to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease in a digestion reaction mix supporting cleavage of the cfDNA by the at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) preparing a sequencing library for target-specific high-throughput sequencing from the restriction endonuclease-treated DNA without purifying the restriction endonuclease-treated DNA from the digestion reaction mix, wherein preparing the sequencing library comprises:
(a) performing end-repair by forming a mixture of the endonuclease-treated DNA and an end-repair reaction mix without purifying the endonuclease-treated DNA prior to forming the mixture, and incubating the mixture for 45 minutes to 4 hours at 15-25°C and subsequently for 20-45 minutes at 60-75°C, to obtain end-repaired DNA;
(b) ligating sequencing adapters to the end-repaired DNA by mixing an adapter ligation reaction mix with the end-repaired DNA, wherein the adapter ligation mix is mixed to obtain an adapter concentration of 0.08pM-0.4pM, and incubating for 45 minutes to 20 hours at 2-20°C, to obtain adapter-ligated DNA; and
(c) subjecting the adapter-ligated DNA to target capture, to enrich target sequences of interest, thereby obtaining a sequencing library for target-specific high-throughput sequencing providing an average coverage per target of at least 1800 reads per target.
In some embodiments, the obtained sequencing library for targeted sequencing provides an average coverage per target of at least 1900 reads per target. In some embodiments, the obtained sequencing library for targeted sequencing provides an average coverage per target of 1800-30,000 reads per target, for example, 1800-20,000 reads per target, 1800-10,000 reads per target, 1900-30,000 reads per target, 1900-20,000 reads per target, 1900-10,000 reads per target. Each possibility represents a separate embodiment of the present invention.
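As an illustration of the coverage criterion, the following Python sketch computes the average number of reads per target and compares it with the 1800-read lower bound stated above. The target names and read counts are invented for the example, and the function names are hypothetical.

```python
from statistics import mean
from typing import Dict

MIN_AVERAGE_READS_PER_TARGET = 1800  # lower bound stated in the text


def average_coverage(reads_per_target: Dict[str, int]) -> float:
    """Average read count across all captured targets."""
    if not reads_per_target:
        raise ValueError("at least one target is required")
    return mean(list(reads_per_target.values()))


def meets_coverage(reads_per_target: Dict[str, int],
                   minimum: float = MIN_AVERAGE_READS_PER_TARGET) -> bool:
    """True when the average coverage per target is at least the minimum."""
    return average_coverage(reads_per_target) >= minimum


# Invented read counts for three hypothetical targets.
reads = {"target_1": 2100, "target_2": 1750, "target_3": 1950}
print(meets_coverage(reads))
```

Note that the criterion is on the average: an individual target (here `target_2`) may fall below 1800 reads while the library as a whole still satisfies it.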
In some embodiments, there is provided herein a method for analyzing methylation of a DNA sample, the method comprising:
(A) preparing a sequencing library from the DNA sample as disclosed herein;
(B) sequencing the sequencing library by a high-throughput sequencing method to provide sequencing data; and
(C) determining from the sequencing data a methylation value for at least one restriction locus.
In some embodiments, the at least one restriction locus is a plurality of restriction loci.
According to a further aspect, the present invention provides a reaction mix for adapter ligation comprising 0.08pM-0.4pM sequencing adapters, a DNA ligase and 1-400ng DNA that was subjected to methylation-sensitive or methylation-dependent enzymatic digestion and end-repair. Such reaction mix according to the present invention further comprises the at least one methylation-sensitive or methylation-dependent restriction endonuclease that was used to digest the DNA and other components from the digestion reaction, and also enzymes and components that were used to perform end-repair following the digestion, as these components were not separated from the DNA between the methylation-sensitive or methylation-dependent enzymatic digestion and subsequent library preparation steps.
According to some embodiments, the DNA sample is a cell-free DNA sample.
According to some embodiments, the DNA is cell-free DNA extracted from a biological fluid sample. In some embodiments, the biological fluid sample is plasma, serum or urine. Each possibility of the biological sample is a separate embodiment of the present invention. According to some embodiments, the sample is a plasma sample.
According to some embodiments, the cell-free DNA is plasma cell-free DNA, and the amount of the cell-free DNA is an amount obtained from 6-10ml of blood. According to some embodiments, the cell-free DNA is plasma cell-free DNA, and the amount of the cell-free DNA is an amount obtained from 8-10ml of blood. According to some embodiments, the amount of cell-free DNA is between 1-400ng. According to some embodiments, the amount of cell-free DNA is between 1-150ng. According to some embodiments, the amount of cell-free DNA is between 1-100ng. According to additional embodiments, the amount of cell-free DNA is between 10-400ng. According to some embodiments, the amount of cell-free DNA is between 10-250ng. According to some embodiments, the amount of cell-free DNA is between 10-200ng. According to additional embodiments, the amount of cell-free DNA is between 10-150ng. According to additional embodiments, the amount of cell-free DNA is between 20-100 ng.
According to some embodiments, the DNA is DNA extracted from a tumor sample.
According to some embodiments, the at least one methylation-sensitive restriction endonuclease is a plurality of methylation-sensitive restriction endonucleases.
According to some embodiments, the methylation-sensitive restriction endonuclease is selected from the group consisting of: AatII, AccII, AciI, AclI, AfeI, AgeI, Aor13HI, Aor51HI, AscI, AsiSI, AspLEI, AvaI, BceAI, BmgBI, BsaAI, BsaHI, BsiEI, BsiWI, BsmBI, BspDI, BspT104I, BssHII, BstBI, BstUI, CfoI, Cfr10I, ClaI, CpoI, DpnII, EagI, Eco52I, FauI, FseI, FspI, HaeII, HapII, HgaI, HhaI, Hin6I, HinP1I, HpaII, Hpy99I, HpyCH4IV, KasI, MluI, NaeI, NarI, NgoMIV, NotI, NruI, NsbI, PaeR7I, PluTI, PmaCI, PmlI, Psp1406I, PvuI, RsrII, SacII, SalI, ScrFI, SfoI, SgrAI, SmaI, SnaBI, SrfI, TspMI and ZraI. Each possibility represents a separate embodiment of the present invention.
According to some embodiments, the at least one methylation-sensitive restriction endonuclease comprises HinP1I. According to additional embodiments, the at least one methylation-sensitive restriction endonuclease comprises AciI. According to some embodiments, step (i) comprises digestion with a combination of restriction enzymes comprising HinP1I and AciI. According to additional embodiments, step (i) comprises digestion with the restriction enzymes HinP1I and AciI. In some embodiments, digestion is carried out using a combination of enzymes consisting of HinP1I and AciI, namely, HinP1I and AciI are the only two enzymes used.
According to some embodiments, at least one methylation-dependent restriction endonuclease is used, selected from the group consisting of: BspEI, BtgZI, FspEI, GlaI, LpnPI, McrBC, MspJI, XhoI, XmaI. Each possibility represents a separate embodiment of the present invention.
According to some embodiments, the method comprises sequencing the library by a high-throughput sequencing method to provide sequencing data. According to certain embodiments, the method comprises determining from the sequencing data a methylation value for at least one restriction locus and optionally at least one additional genetic or epigenetic characteristic of the DNA sample, e.g., DNA mutation and/or copy number variation.
According to some embodiments, the at least one restriction locus is located within a CG-island.
According to some embodiments, the method further comprises identifying the presence or absence of a disease in the subject based on the methylation profile of the DNA sample, by comparing the methylation profile of the DNA sample to one or more reference methylation profile(s). In some embodiments, the disease is cancer. In some particular embodiments, the cancer is lung cancer.
According to some embodiments, the DNA sample is from a subject suspected of having the disease and/or a subject at risk of developing the disease, and the method comprises detecting methylation changes comprising determining whether the DNA sample is a healthy or disease DNA sample. According to some embodiments, the disease is cancer. According to particular embodiments, the cancer is lung cancer.
According to some embodiments, the method further comprises preparing a report in paper or electronic form based on the methylation profile and communicating the report to the subject and/or to a healthcare provider of the subject.
According to some embodiments, the high-throughput sequencing is target-specific high-throughput sequencing.
According to certain embodiments, the method comprises determining from the sequencing data a methylation value for at least one restriction locus, and optionally further comprising determining from the sequencing data at least one additional genetic or epigenetic characteristic of the DNA sample, e.g., DNA mutation and copy number variation.
According to some embodiments, determining a methylation value for at least one restriction locus comprises: determining read counts and relative copy number between the at least one restriction locus and a control locus.
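One way to picture a calculation based on read counts relative to a control locus is sketched below in Python. This is not the formula of the disclosure, which is not given in this passage. The sketch assumes that a methylation-sensitive enzyme cuts unmethylated molecules so that they yield no reads across the restriction locus, that the control locus is not cut, and that an optional `undigested_ratio` (a hypothetical parameter) captures the locus-to-control read ratio expected without digestion.

```python
def relative_signal(locus_reads: int, control_reads: int) -> float:
    """Read count at the restriction locus relative to the control locus."""
    if control_reads <= 0:
        raise ValueError("control_reads must be positive")
    if locus_reads < 0:
        raise ValueError("locus_reads must not be negative")
    return locus_reads / control_reads


def methylation_value(locus_reads: int, control_reads: int,
                      undigested_ratio: float = 1.0) -> float:
    """Assumed estimate of the methylated fraction, clipped to the range 0-1.

    undigested_ratio is the locus/control read ratio expected when no
    molecule is cut (an assumption of this sketch, default 1.0).
    """
    if undigested_ratio <= 0:
        raise ValueError("undigested_ratio must be positive")
    value = relative_signal(locus_reads, control_reads) / undigested_ratio
    return max(0.0, min(1.0, value))


# Invented counts: 150 reads survive at the locus versus 2000 at the control.
print(methylation_value(150, 2000))
```

Clipping keeps the estimate within 0-1 when sampling noise pushes the ratio above the undigested expectation.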
According to some embodiments, the at least one restriction locus is a plurality of restriction loci.
According to some embodiments, the at least one methylation-sensitive restriction endonuclease is a plurality of methylation-sensitive restriction endonucleases, and the digestion with the plurality of methylation-sensitive restriction endonucleases is a simultaneous digestion.
According to some embodiments, the step of subjecting the cell-free DNA sample to digestion with at least one methylation-sensitive restriction endonuclease further comprises determining digestion efficiency. In some embodiments, proceeding to preparing a sequencing library is carried out if the digestion efficiency is above a predefined threshold.
According to another aspect, the present invention provides a method for detecting cancer-related genetic and epigenetic changes in a cell-free DNA sample (cfDNA) from a subject, the method comprising: profiling methylation and optionally at least one additional genetic or epigenetic characteristic of the cfDNA sample as disclosed herein, to obtain a genetic and epigenetic profile of the cfDNA sample; and comparing the genetic and epigenetic profile of the cfDNA sample to one or more reference genetic and epigenetic profile selected from a cancer profile and a non-cancer profile, to detect cancer-associated genetic and epigenetic changes in the cfDNA sample.
According to some embodiments, the cell-free DNA sample is from a subject suspected of having cancer or at risk of having cancer, and the method further comprises administering to the subject active cancer surveillance and follow-up testing when cancer-associated changes are detected, the cancer surveillance and follow-up testing comprising one or more of blood tests, urine tests, cytology, imaging, endoscopy and biopsy.
According to another aspect, the present invention provides a method for genetic and epigenetic profiling of a DNA sample, the method comprising determining a methylation value for at least one restriction locus as disclosed herein, and further determining from the sequencing data at least one additional genetic or epigenetic characteristic of the DNA sample, e.g., DNA mutation and/or copy number variation.
According to a further aspect, the present invention provides a method for identifying genomic regions differentially methylated between a first and second source of DNA, the method comprising: profiling methylation of at least one DNA sample from the first source as disclosed herein, to obtain a first DNA methylation profile; profiling methylation of at least one DNA sample from the second source as disclosed herein, to obtain a second DNA methylation profile; and comparing the first and second DNA methylation profiles to identify genomic regions differentially methylated between the first and second sources of DNA.
According to some embodiments, the first source of DNA is a cancer DNA and the second source of DNA is a non-cancer DNA. According to some embodiments, the first source of DNA is plasma cell-free DNA of a cancer patient and the second source of DNA is plasma cell-free DNA of one or more healthy individuals. In additional embodiments, the first and second sources of DNA are different stages of a cancer.
According to another aspect, the present invention provides a method for detecting methylation changes in a DNA sample, the method comprising: profiling methylation of the DNA sample as disclosed herein, to obtain a methylation profile of the DNA sample; and comparing the methylation profile of the DNA sample to one or more reference methylation profile to detect methylation changes in the DNA sample.
According to another aspect, the present invention provides a method for profiling methylation of a DNA sample from a subject, the method comprising:
(i) subjecting the DNA sample to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) PCR amplifying from the restriction endonuclease-treated DNA at least one restriction locus, wherein the PCR amplification is carried out in the same reaction mix as the digestion, without adjusting the reaction mix between the digestion and the PCR amplification steps.
In some embodiments, the digestion is with AciI and/or HinP1I.
According to another aspect, the present invention provides a method for profiling methylation of a DNA sample from a subject, the method comprising:
(i) subjecting the DNA sample to digestion with methylation-sensitive restriction endonucleases AciI and HinP1I, to obtain restriction endonuclease-treated DNA in which methylated sites are intact and unmethylated sites are cut; and
(ii) PCR amplifying from the restriction endonuclease-treated DNA at least one restriction locus, wherein the PCR amplification is carried out in the same reaction mix as the digestion, without adjusting the reaction mix between the digestion and the PCR amplification steps.
As disclosed herein, the reaction mix comprises reagents required for both the digestion and PCR amplification steps.
According to some embodiments, the PCR amplification step is quantitative PCR (qPCR).
According to some embodiments, the reaction mix comprises between 2-6 mM divalent cation(s). According to additional embodiments, the reaction mix comprises between 2-4 mM divalent cation(s).
According to some embodiments, the divalent cation(s) is selected from the group consisting of Mg2+, Mn2+, Ca2+, Fe2+, Co2+, Ni2+, Zn2+, or Cd2+. According to certain embodiments, the divalent cation is magnesium (Mg2+).
According to some embodiments, the reaction mix comprises between 2-6 mM Mg2+. According to additional embodiments, the reaction mix comprises between 2-4 mM Mg2+. According to some particular embodiments, the reaction mix comprises between 2-6 mM MgCl2. According to additional particular embodiments, the reaction mix comprises between 2-4 mM MgCl2.
According to some embodiments, the DNA digestion step is up to 20 hours. According to some embodiments, the DNA digestion step is up to 16 hours.
According to some embodiments, the DNA digestion step is performed for between 12 and 20 hours. According to some embodiments, the DNA digestion step is performed for between 14 and 18 hours. According to some embodiments, the DNA digestion step is performed for between 12 and 16 hours. According to some embodiments, the DNA digestion step is performed for about 16 hours.
According to some embodiments, the DNA sample is a cell-free DNA sample.
According to certain embodiments, the DNA sample is mitochondrial DNA.
According to some embodiments, the DNA is cell-free DNA extracted from a biological fluid sample. In some embodiments, the biological fluid sample is plasma, serum or urine. Each possibility of the biological sample is a separate embodiment of the present invention. According to some embodiments, the sample is a plasma sample.
According to some embodiments, the DNA is DNA extracted from a tumor sample.
According to some embodiments, the method comprises simultaneous amplification of more than one target sequence in the same reaction mix.
According to some embodiments, the amplification step comprises co-amplifying at least one restriction locus differentially methylated between normal and disease DNA and at least one control locus.
According to some embodiments, the amplification step comprises a step of co-amplification of at least one restriction locus and a control locus, thereby generating an amplification product for each locus.
According to some embodiments, the control locus is not digested by the at least one methylation-sensitive or methylation-dependent restriction endonuclease. According to certain embodiments, the control locus is a locus devoid of a recognition sequence of the methylation-sensitive restriction endonuclease(s).
According to some embodiments, the method comprises determining a signal intensity for each generated amplification product. According to certain embodiments, the method comprises a step of comparing a ratio between the signal intensities of the amplification products of each of said at least one restriction locus and the control locus to at least one reference ratio. According to some embodiments, the control locus is a locus devoid of a recognition sequence of the methylation-sensitive restriction endonuclease(s).
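The comparison of a locus-to-control signal ratio against a reference ratio can be sketched as follows. This is a minimal illustration only: it assumes that the signal intensity of each amplification product is derived from a qPCR Cq value with a fixed per-cycle amplification factor (2.0 by default), which is a common convention but is not specified in this passage. The function names and the returned labels are hypothetical.

```python
def signal_ratio(cq_locus: float, cq_control: float, amplification_factor: float = 2.0) -> float:
    """Ratio of restriction-locus signal to control-locus signal.

    Assumption (not from the source): signal is proportional to
    amplification_factor ** (-Cq), so the ratio depends only on the Cq difference.
    """
    return amplification_factor ** (cq_control - cq_locus)


def compare_to_reference(ratio: float, reference_ratio: float) -> str:
    """Report where a measured ratio falls relative to a reference ratio."""
    if ratio > reference_ratio:
        return "above reference"
    if ratio < reference_ratio:
        return "below reference"
    return "equal to reference"


if __name__ == "__main__":
    # Hypothetical Cq values for one restriction locus and the control locus.
    r = signal_ratio(cq_locus=27.0, cq_control=25.0)
    print(r, compare_to_reference(r, reference_ratio=0.1))
```

Because the control locus is not cut by the enzyme(s), a higher ratio here would indicate that more copies of the restriction locus survived digestion.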
According to some embodiments, the at least one restriction locus is located within a CG-island.
According to some embodiments, the at least one restriction locus is a plurality of restriction loci.
According to some embodiments, the method for profiling methylation further comprises identifying the presence or absence of a disease in the subject based on the methylation profile of the DNA sample, by comparing the methylation profile of the DNA sample to one or more reference methylation profile(s).
According to some embodiments, the DNA sample is from a subject suspected of having the disease and/or a subject at risk of developing the disease, and the method comprises detecting methylation changes comprising determining whether the DNA sample is a healthy or disease DNA sample. According to some embodiments, the disease is cancer. According to particular embodiments, the cancer is lung cancer.
According to some embodiments, the method further comprises preparing a report in paper or electronic form based on the methylation profile and communicating the report to the subject and/or to a healthcare provider of the subject.
According to another aspect, the present invention provides a combined DNA digestion and PCR reaction mix, wherein the reaction mix comprises 2-6 mM divalent cation(s), at least one methylation-sensitive or methylation-dependent restriction endonuclease, and a DNA polymerase. In some embodiments, the divalent cation is magnesium. In some embodiments, the reaction mix comprises 2-6 mM MgCl2, such as 2-4 mM MgCl2. In some embodiments, the reaction mix comprises AciI and/or HinP1I.
According to another aspect, the present invention provides a combined DNA digestion and PCR reaction mix, wherein the reaction mix comprises 2-6 mM Mg2+ (e.g., MgCl2), methylation-sensitive restriction endonucleases AciI and HinP1I and a Taq polymerase. In some embodiments, the DNA digestion and PCR reaction mix comprises 2-4 mM Mg2+ (e.g., MgCl2), methylation-sensitive restriction endonucleases AciI and HinP1I and a Taq polymerase.
According to some embodiments, the PCR reaction mix comprises dNTPs.
According to some embodiments, the PCR reaction mix comprises at least one primer pair. According to some embodiments, the PCR reaction mix comprises a plurality of primer pairs. According to certain embodiments, the reaction mix comprises probes.
According to another aspect, the present invention provides a kit comprising the combined DNA digestion and PCR reaction mix as described herein.
These and further aspects and features of the present invention will become apparent from the detailed description, examples and claims which follow.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1A-1E. Library yields of modified library preparation protocols. (1A) Modified end repair time; (1B) Modified adapter ligation time; (1C) Modified adapter ligation time and temperature; (1D) Modified adapter concentration; (1E) Protocol adjustments combinations.
Figure 2. Library yields of a library preparation protocol modified according to embodiments of the present invention vs. a commercial protocol.
Figure 3. Sequencing depth of original DNA molecules using a library preparation protocol modified according to embodiments of the present invention vs. a commercial protocol. Results are shown as average coverage per target after collapsing all reads with the same unique molecular identifiers (UMIs).
Figure 4A-4B. (4A) A quantitative PCR analysis of restriction loci after DNA digestion with methylation-sensitive endonucleases. The DNA digestion step and the qPCR were performed using the same reaction mix, which contained 2-4 mM MgCl2. The Figure shows a representative qPCR amplification plot screen display. The plot shows an amplification of targeted locus (methylated and uncut) vs. control, unmethylated locus; (4B) Influence of Mg concentration on qPCR efficiency. Results are provided as Cq values measured for each Mg concentration.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to methods and compositions for profiling genetic and epigenetic characteristics of DNA samples, particularly cell-free DNA samples, using digestion of DNA with methylation-sensitive/methylation-dependent restriction enzymes followed by library preparation for sequencing using improved protocols that avoid the need to purify the DNA following the digestion. Advantageously, the methods and compositions of the present invention enable working with very low amounts of DNA and receiving vast amounts of information, including methylation data, mutation data and more.
The term “reaction mix” as used herein refers to aqueous solutions or compositions that are suitable for carrying out the indicated reaction, such as DNA digestion, end repair, adapter ligation, etc.
The term “buffer” as used herein, refers to aqueous solutions or compositions that resist changes in pH when acids or bases are added to the solution and are suitable for carrying out the indicated reaction.
The term “amplification”, as used herein, refers to an increase in the number of copies of one or more particular nucleic acid target(s) of interest. Amplification is typically performed by polymerase chain reaction (PCR) in the presence of a PCR reaction mix which may include a suitable buffer supplemented with the DNA template, polymerase (usually Taq Polymerase), dNTPs, primers and probes (as appropriate), as known in the art.
An "amplification product" collectively refers to nucleic acid molecules of a particular target sequence that are generated and accumulated in an amplification reaction. The term generally refers to nucleic acid molecules generated by PCR using a given set of amplification primers.
As used herein, a "primer" defines an oligonucleotide which is capable of annealing to (hybridizing with) a target sequence, thereby creating a double stranded region which can serve as an initiation point for DNA synthesis under suitable conditions.
As used herein the term "plurality" refers to more than one, namely two or more.
DNA sample
A DNA sample for use according to the present invention may be obtained from any biological sample of a subject from which nucleic acids can be obtained, including biological fluid samples such as blood, plasma, serum, urine, cerebrospinal fluid, semen, stool, sputum and amniotic fluid. Each possibility represents a separate embodiment of the present invention. Biological samples also include tissue and organ samples.
A “subject” according to the present invention is typically a human subject. The subject may be suspected of having a certain disease. In some embodiments, the subject is diagnosed with a disease of interest. In other embodiments, the subject is a healthy subject that does not have the disease of interest. The subject may also be at risk of developing the disease, for example, based on previous history of the disease, genetic predisposition, and/or family history, and/or a subject who exhibits suspicious clinical signs of the disease and/or a subject that is suspected of having the disease based on other prior assay(s) e.g., based on testing of other biomarker(s). In some embodiments, the subject is at risk of recurrence of the disease. In some embodiments, the subject shows at least one symptom or characteristic of the disease. In other embodiments, the subject is asymptomatic.
According to some embodiments, the DNA sample is cell-free DNA extracted from a biological fluid sample. The term “cell-free DNA” (abbreviated “cfDNA”) refers to DNA molecules which are freely circulating in body fluids and are not contained within intact cells. The origin of cfDNA is not fully understood but is believed to be related to apoptosis, necrosis and active release from cells. cfDNA is released by both normal and tumor cells. cfDNA is highly fragmented, with fragments typically ranging between 120-220 bps in length, mostly between 150-180 bps in length. It is to be understood that the term “cell-free DNA” as used herein refers to DNA which is already cell-free in the body of the subject. It is to be understood that for cell-free DNA samples, "restriction endonuclease-treated DNA" comprises fragments generated as a result of the digestion, and also natural cell-free DNA fragments, for example, cell-free DNA fragments that do not contain a recognition sequence of the enzyme(s) used in the assay, cell-free DNA fragments that contain one or more recognition sequence(s) of MSRE(s) used in the assay that are all methylated and therefore not cut by the MSRE, and cell-free DNA fragments that contain one or more recognition sequence(s) of MDRE(s) used in the assay that are all unmethylated and therefore not cut by the MDRE.
Alternatively, the DNA sample may be DNA extracted from cells, for example, DNA extracted from tissue or organ samples or from blood cells. Typically, cell lysis is required in order to extract the DNA. DNA may be obtained from tumor samples or from healthy tissues. A "tumor sample" as used herein encompasses a whole tumor resected by surgery or portions thereof. A "tumor sample" also encompasses a sample taken from a tumor by biopsy, and a sample taken from a lesion or a tissue suspected of being cancerous. Tumor samples for use according to the present invention include fresh tumor samples as well as frozen/preserved tumor samples.
For sequencing DNA extracted from cells, a step of fragmenting the DNA into fragments suitable for high-throughput sequencing may be carried out before, after or during the digestion with the at least one methylation-sensitive or methylation-dependent restriction endonuclease according to the present invention, to simplify downstream processing and preparation of a sequencing library. Such fragmentation can be carried out, for example, using sonication, or using a restriction endonuclease which is insensitive to methylation, namely, cleaves its recognition sequence regardless of methylation status. It can also be carried out using a restriction endonuclease with a recognition sequence that does not include CG dinucleotides.
Preferably, the DNA sample which is subjected to digestion by MSREs/MDREs as disclosed herein (i.e., the restriction endonuclease-treated DNA) is substantially free of single-stranded DNA (ssDNA). As used herein, “substantially free of ssDNA” indicates a DNA sample in which less than 7% of the DNA molecules (by number) are ssDNA, preferably less than 5% of the DNA molecules are ssDNA, more preferably less than 1% of the DNA molecules are ssDNA (namely, at least 99% of the DNA molecules are double-stranded) (by number of molecules). In some embodiments, the DNA sample contains less than 0.1% ssDNA. In some embodiments, the DNA sample contains less than 0.01% ssDNA. In some embodiments, the DNA sample contains no ssDNA (free of ssDNA). Extraction of DNA to obtain a DNA sample substantially free of ssDNA is described, for example, in WO 2020/188561, assigned to the Applicant of the present invention.
An exemplary kit for extracting cell-free DNA which is suitable for use with the method of the present invention is QIAamp® Circulating Nucleic Acid Kit (QIAGEN, Hilden, Germany). An exemplary kit for extracting DNA from cells is QIAamp® Blood Mini Kit.
DNA digestion
According to the present invention, following extraction (and optionally enrichment for regions of interest and/or fragmentation to reduce size) the DNA is subjected to digestion with at least one methylation-sensitive restriction endonuclease and/or at least one methylation-dependent restriction endonuclease, preferably with a plurality of methylation-sensitive restriction endonucleases (or a plurality of methylation-dependent restriction endonucleases) applied simultaneously. As used herein, “restriction endonucleases applied simultaneously” or “simultaneous digestion” means that the enzymes are present together in the reaction mix in an active form, without inactivation of one prior to application of another. For example, one, two, three, four or five methylation-sensitive and/or methylation-dependent restriction endonucleases may be used. Each number of endonucleases used in the assay represents a separate embodiment of the present invention.
Where methods are described herein as involving “digestion”, this term (and also “digesting”, etc.) refers to the mixing of active restriction enzymes with DNA in conditions under which digestion can occur. If there are no recognition sites for the restriction enzyme in question (e.g. because it is a MSRE and all of the recognition sequences are fully methylated) then a step of “digestion” still takes place even though DNA cleavage does not occur.
According to some embodiments, the entire DNA that was extracted is used in the digestion step. In some embodiments, the DNA is not quantified prior to being subjected to digestion. In other embodiments, the DNA is quantified prior to digestion thereof. In some embodiments, the DNA is aliquoted into a first aliquot that is subjected to digestion and a second aliquot that is kept as an undigested control.
A "restriction endonuclease", used herein interchangeably with a "restriction enzyme", refers to an enzyme that cuts DNA at or near specific recognition sequences, also known as restriction sites. Restriction sites are usually 4 to 8 nucleotide long and are typically palindromic (i.e., reading in a certain direction, e.g. 5' to 3', on one strand is identical to the sequence in the same direction (5' to 3') on the complementary strand).
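The notion of a palindromic restriction site described above can be checked programmatically by comparing a sequence with its reverse complement. The sketch below is illustrative only; the function names are hypothetical, and the example sequences GCGC and CCGC are the recognition sequences commonly catalogued for HinP1I and AciI respectively, which are not stated in this passage.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the 5'-to-3' sequence of the complementary strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'-to-3' on both strands."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    for site in ("GCGC", "CCGC"):
        print(site, reverse_complement(site), is_palindromic(site))
```

A site that is not palindromic simply has a different 5'-to-3' sequence on the complementary strand, which is consistent with the statement that restriction sites are typically, but not always, palindromic.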
A "methylation-sensitive" restriction endonuclease (MSRE) is a restriction endonuclease that cleaves its recognition sequence only if it is unmethylated (while methylated sites remain intact). Thus, the extent of digestion of a DNA sample by a methylation-sensitive restriction endonuclease depends on the methylation level, where a higher methylation level protects from cleavage and accordingly results in less digestion. A DNA sample treated with a methylation-sensitive restriction endonuclease is characterized by intact methylated sites and cut unmethylated sites. It is to be understood that there is no need for 100% digestion efficiency and thus some unmethylated sites might remain intact. In some embodiments, the methods of the present invention comprise determining the digestion efficiency, and proceeding to preparing a sequencing library if the digestion efficiency is above a predefined threshold/level.
A "methylation-dependent" restriction endonuclease (MDRE) is a restriction endonuclease that cleaves its recognition sequence only if it is methylated (while unmethylated sites remain intact). Thus, the extent of digestion of a DNA sample by a methylation-dependent restriction endonuclease depends on the methylation level, where a higher methylation level results in more extensive digestion.
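The opposite behaviors of MSREs and MDREs can be summarized in a short sketch. It is a simplified model that assumes complete digestion (the text notes that 100% efficiency is not required in practice) and treats each recognition site as either fully methylated or fully unmethylated. The function names and the "MSRE"/"MDRE" string labels are hypothetical.

```python
from typing import Sequence


def is_cut(enzyme_kind: str, site_methylated: bool) -> bool:
    """Whether an idealized enzyme cleaves a single recognition site.

    MSRE: cuts only unmethylated sites. MDRE: cuts only methylated sites.
    """
    if enzyme_kind == "MSRE":
        return not site_methylated
    if enzyme_kind == "MDRE":
        return site_methylated
    raise ValueError(f"unknown enzyme kind: {enzyme_kind!r}")


def fragment_survives(enzyme_kind: str, site_methylation: Sequence[bool]) -> bool:
    """A fragment stays intact only if none of its recognition sites is cut.

    An empty sequence models a fragment with no recognition site at all.
    """
    return not any(is_cut(enzyme_kind, m) for m in site_methylation)


if __name__ == "__main__":
    print(fragment_survives("MSRE", [True, True]))    # all sites methylated
    print(fragment_survives("MSRE", [True, False]))   # one unmethylated site
    print(fragment_survives("MDRE", [False, False]))  # all sites unmethylated
```

Under this model, the fragments that survive digestion are the same three classes of natural cell-free DNA fragments listed earlier: fragments without a recognition site, fully methylated fragments under an MSRE, and fully unmethylated fragments under an MDRE.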
According to some embodiments, a DNA sample according to the present invention is subjected to digestion with a single methylation-sensitive restriction endonuclease. In additional embodiments, the DNA sample is subjected to digestion with two methylation-sensitive restriction endonucleases.
In some particular embodiments, the methylation-sensitive restriction endonucleases HinP1I and AciI are used.
Some commercial suppliers provide the Hin6I enzyme instead of HinP1I. These two enzymes have essentially the same properties, i.e., they have the same recognition sequence, the same optimum digestion temperature, and they can both be inactivated at 65 °C for 20 minutes. Also, 1 unit of Hin6I is defined in the same way as 1 unit of HinP1I. Thus the terms HinP1I and Hin6I are used interchangeably herein, and any enzyme combination which is disclosed as using HinP1I should be understood as also disclosing that same combination using Hin6I instead, e.g., the invention provides combinations of AciI & Hin6I in the same manner as disclosed herein for AciI & HinP1I.
In some embodiments, there is provided a method for profiling methylation of a DNA sample, the method comprising: subjecting the DNA sample to digestion with the methylation-sensitive restriction endonucleases HinP1I and AciI; and analyzing methylation of at least one restriction locus of HinP1I and/or at least one restriction locus of AciI, thereby profiling methylation of the DNA sample. In some embodiments, the method comprises subjecting the DNA sample to digestion with the methylation-sensitive restriction endonucleases HinP1I and AciI; and determining a level of methylated DNA and optionally a level of unmethylated DNA of at least one restriction locus of HinP1I and/or at least one restriction locus of AciI, thereby profiling methylation of the DNA sample. In some embodiments, the DNA sample is cell-free DNA extracted from a biological fluid.
When a composition includes HinP1I and AciI, then HinP1I is ideally present at an excess (measured in terms of enzymatic units) to AciI, and ideally an excess of at least 1.2:1, e.g., at least 1.5:1, at least 1.75:1, at least 2:1, at least 3:1, at least 4:1, or at least 5:1. A ratio of at least 2:1 is often useful, e.g., when the intention is to analyze human cfDNA, and a ratio of about 4.5:1 has been found to be useful when digesting human cfDNA from plasma.
The digestion is carried out in a reaction buffer that contains several components for optimal activity of the restriction enzymes. Reaction buffers may contain, for example, Tris-HCl, MgCl2, NaCl, and 2-mercaptoethanol, at concentrations suitable to support cleavage of DNA by the restriction enzymes.
According to some embodiments, a DNA digestion reaction mix comprises MgCl2 (typically ranging between 8-12 mM, depending on the restriction enzyme concentration), 10-50 mM buffer (for example Tris-acetate), 10-100 mM salt (for example potassium acetate), and 100 µg/µl albumin.
According to some embodiments, the methods as described herein comprise: profiling methylation of the DNA sample using HinP1I and AciI digestion; and comparing the methylation profile to one or more reference methylation profile. In some embodiments, the DNA sample is cell-free DNA extracted from a biological fluid.
According to some embodiments, the method for profiling methylation of a DNA sample comprises: subjecting the DNA sample to digestion with the methylation-sensitive restriction endonucleases HinP1I and AciI, thereby obtaining restriction endonuclease-treated DNA comprising restriction endonuclease-generated DNA fragments; and subjecting the endonuclease-generated DNA fragments to end repair and/or adapter ligation as disclosed herein.
Digestion efficiency can be evaluated either internally to the examined sample, or externally. Internal evaluation can be performed by measuring intact cut sites of genomic positions that are known to be ubiquitously unmethylated. An example of such a locus can be any site on the mitochondrial DNA. External evaluation of digestion efficiency can be performed by including an unmethylated sample in the digestion step, digesting both samples in parallel, and then verifying that the unmethylated sample was indeed digested (by measuring numbers of intact cut sites). Such an unmethylated sample could be, for example, PCR amplicons, plasmid DNA, commercial unmethylated DNA species, or cell line DNA that is known to be unmethylated in certain genomic positions. Alternatively, external evaluation of digestion efficiency can be achieved in a single step, by spiking an unmethylated sample into the interrogated sample, and measuring the digestion of the unmethylated DNA sample in the same step as the interrogated sample. For this purpose, it is possible to use all types of unmethylated DNA species mentioned above. In some embodiments, the use of small targets is preferred, such as PCR amplicons or plasmid DNA.
Digestion efficiency may be represented using any measure that is indicative of the amount of DNA that remained undigested (or that was digested) out of the original DNA amount. For example, digestion efficiency may be represented as % undigested DNA or as ΔCq in qPCR for a certain locus.
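The two representations of digestion efficiency mentioned above can be related to each other under a simple model. The sketch below assumes that ΔCq is the Cq of an unmethylated locus in the digested sample minus its Cq in an undigested control, and that each PCR cycle multiplies the template by a fixed factor (2.0 by default). Neither assumption is stated in this passage, and the function names and the pass/fail cutoff parameter are hypothetical.

```python
def percent_undigested(cq_digested: float, cq_undigested: float,
                       amplification_factor: float = 2.0) -> float:
    """Estimate the percentage of a locus left intact after digestion.

    Assumption (not from the source): template amount is proportional to
    amplification_factor ** (-Cq), so % undigested = 100 * factor ** (-delta_cq).
    """
    delta_cq = cq_digested - cq_undigested
    return 100.0 * amplification_factor ** (-delta_cq)


def passes_digestion_qc(pct_undigested: float, max_pct_undigested: float) -> bool:
    """Hypothetical gate: proceed only if undigested DNA is at or below a cutoff."""
    return pct_undigested <= max_pct_undigested


if __name__ == "__main__":
    # Hypothetical Cq values for a ubiquitously unmethylated locus.
    pct = percent_undigested(cq_digested=30.0, cq_undigested=20.0)
    print(pct, passes_digestion_qc(pct, max_pct_undigested=1.0))
```

Under these assumptions a larger ΔCq corresponds to less undigested DNA, so a threshold can equivalently be expressed as a minimum ΔCq or as a maximum % undigested DNA.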
According to some embodiments, DNA digestion may be carried out to complete digestion. In some exemplary embodiments, the methylation-sensitive restriction endonuclease is HinP1I and/or AciI, and complete digestion may be achieved following one to two hours incubation with the enzyme(s) at 37°C. According to certain embodiments, the complete digestion is achieved following two hours incubation. According to certain embodiments, the complete digestion may be achieved following 3, 4, 5, 6, 7, 8, 9, or 10 hours. Each possibility represents a separate embodiment of the invention. The incubation time sufficient for complete digestion varies and depends on a number of factors, such as the type of restriction enzyme, sample purity, amount of DNA, and DNA integrity. One hour of incubation may be inadequate under certain circumstances, and routine tests may be applied in order to confirm complete digestion.
Library preparation and sequencing
According to some embodiments, the methods described herein comprise a step of preparing a sequencing library.
Preparing a sequencing library according to the present invention comprises ligating sequencing adapters to amplification products.
The step of preparing a sequencing library according to the present invention is carried out on the digested DNA without a step of DNA amplification.
According to some embodiments, library preparation for sequencing according to the present invention is carried out in an end-preserving manner, indicating that the library preparation process is carried out such that the sequence information at the ends of DNA molecules is preserved. A library preparation process according to these embodiments does not include PCR to enrich genomic regions of interest and/or introduce sequencing adapters. A library preparation process according to these embodiments preferably also performs blunt-ending by filling in gaps rather than digesting overhangs. According to these embodiments, library preparation comprises adding sequencing adapters via ligation (e.g., enzymatic ligation). If enrichment of certain genomic regions is desired, library preparation according to these embodiments comprises enriching the genomic regions of interest using capture agents following the ligation of sequencing adapters.
According to some embodiments, the present invention relates to compositions and methods for high resolution DNA methylation profiling. In some embodiments, the present invention provides the use of methylation-sensitive/methylation-dependent restriction enzymes and high-throughput sequencing in the analysis of DNA methylation. In some particular embodiments, the present invention provides the use of methylation-sensitive/methylation-dependent restriction enzymes and high-throughput sequencing for direct calculation of methylated and unmethylated DNA levels.
Methylation in the human genome occurs in the form of 5-methyl cytosine and is confined to cytosine residues that are part of the sequence CG, also denoted as CpG dinucleotides (cytosine residues that are part of other sequences are not methylated). Some CG dinucleotides in the human genome are methylated, and others are not. In addition, methylation is cell and tissue specific, such that a specific CG dinucleotide can be methylated in a certain cell and at the same time unmethylated in a different cell, or methylated in a certain tissue and at the same time unmethylated in different tissues. DNA methylation is an important regulator of gene transcription.
The methylation pattern of cancer DNA differs from that of normal DNA, wherein some loci are hypermethylated while others are hypomethylated. In some embodiments, the present invention provides methods and compositions for sensitive detection of differentially methylated (e.g., hypermethylated) genomic loci associated with cancer.
As used herein, 3.3 pg of DNA corresponds to 1 haploid equivalent.
According to some embodiments, the methods disclosed herein are carried out using an initial amount of 10 ng of DNA. In additional embodiments, the methods disclosed herein are carried out using an initial amount of 20 ng of DNA. In additional embodiments, the methods disclosed herein are carried out using an initial amount of DNA ranging from 1-400 ng, for example between 1-200 ng, between 10-200 ng, between 1-150 ng, between 1-100 ng, including each value within the ranges. Each possibility represents a separate embodiment.
According to some embodiments, the methods disclosed herein are carried out using an initial amount of 3,000 haploid equivalents. In additional embodiments, the methods disclosed herein are carried out using an initial amount of 6,000 haploid equivalents. In additional embodiments, the methods disclosed herein are carried out using an initial amount of DNA comprising 3,000-60,000 haploid equivalents, for example between 6,000-60,000 haploid equivalents, between 6,000-30,000 haploid equivalents, including each value within the ranges. Each possibility represents a separate embodiment.
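The mass-based and haploid-equivalent-based amounts above are related through the stated conversion of 3.3 pg per haploid equivalent. The following is a minimal illustrative sketch of that arithmetic; the function names are hypothetical and are not part of the disclosed methods.

```python
# Conversion factor taken from the text: 3.3 pg of DNA = 1 haploid equivalent.
PG_PER_HAPLOID_EQUIVALENT = 3.3


def ng_to_haploid_equivalents(mass_ng: float) -> float:
    """Convert a DNA mass in nanograms to haploid equivalents."""
    mass_pg = mass_ng * 1000.0  # 1 ng = 1,000 pg
    return mass_pg / PG_PER_HAPLOID_EQUIVALENT


def haploid_equivalents_to_ng(equivalents: float) -> float:
    """Convert a number of haploid equivalents to a DNA mass in nanograms."""
    return equivalents * PG_PER_HAPLOID_EQUIVALENT / 1000.0


if __name__ == "__main__":
    for ng in (10, 20, 200):
        print(f"{ng} ng ~ {ng_to_haploid_equivalents(ng):,.0f} haploid equivalents")
```

Under this conversion, 10 ng corresponds to roughly 3,000 haploid equivalents and 20 ng to roughly 6,000, which is consistent with the amounts recited above.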
According to some embodiments, there is provided herein a method for profiling methylation of a DNA sample from a subject, the method comprising:
(i) subjecting the DNA sample to digestion with at least one methylation-sensitive restriction endonuclease, to obtain restriction endonuclease-treated DNA in which methylated sites are intact and unmethylated sites are cut;
(ii) preparing a sequencing library from the restriction endonuclease-treated DNA as disclosed herein;
(iii) sequencing the sequencing library by a high-throughput sequencing method to obtain sequence reads;
(iv) selecting at least one restriction locus and determining the number of sequence reads covering a predefined genomic region of at least 50 bps in length that contains said restriction locus; and
(v) determining a methylation value for the at least one restriction locus based on the read count determined in step (iv) and a reference read count,
thereby profiling methylation of the cell-free DNA sample.
According to some embodiments, profiling methylation of a DNA sample comprises determining the number of sequence reads covering a predefined genomic region of at least 60 bps in length that contains said restriction locus, for example a predefined genomic region of at least 70 bps, at least 80 bps, at least 90 bps, at least 100 bps, between 50-150 bps, between 50-120 bps, between 50-100 bps that contains the restriction locus. Each possibility represents a separate embodiment.
According to some embodiments, the at least one restriction locus is located within a CG island. "CG islands" (or CpG islands) are regions of DNA with a high G/C content and a high frequency of CG dinucleotides relative to the whole genome of an organism of interest. CG islands are typically between 200-3,000 bps in length and are typically characterized by a GC content greater than 50% and an observed-to-expected CG ratio of more than 0.6. Genomic regions of lower CG density are termed "CG oceans" and comprise most of the genome.
According to some embodiments, there is provided a method for identifying the presence or absence of a disease in a subject, comprising: profiling methylation of a DNA sample from the subject as disclosed herein; comparing the methylation profile of the DNA sample to one or more reference methylation profiles; and determining the presence or absence of the disease in the subject based on the comparison.
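The CG island criteria above can be illustrated with a short sketch. It assumes the commonly used definition of the observed-to-expected CG ratio, namely the number of CG dinucleotides multiplied by the sequence length and divided by the product of the C count and the G count; the text itself does not spell out this formula, and the function names are hypothetical.

```python
def cg_island_metrics(seq: str) -> dict:
    """Compute length, GC content and observed/expected CG ratio of a sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cg = s.count("CG")  # CG dinucleotides cannot overlap themselves
    gc_content = (c + g) / n if n else 0.0
    # Assumed definition: expected CG count = (#C * #G) / length
    expected_cg = (c * g) / n if n else 0.0
    obs_exp = cg / expected_cg if expected_cg else 0.0
    return {"length": n, "gc_content": gc_content, "obs_exp_cg": obs_exp}


def looks_like_cg_island(seq: str, min_len: int = 200) -> bool:
    """Apply the typical thresholds from the text: GC > 50%, obs/exp CG > 0.6."""
    m = cg_island_metrics(seq)
    return m["length"] >= min_len and m["gc_content"] > 0.5 and m["obs_exp_cg"] > 0.6
```

The thresholds are described in the text as typical values only, so they are exposed here as simple comparisons that can be adjusted.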
According to some embodiments, there is provided a method for identifying a DNA methylation marker indicative of the source of a DNA sample comprising profiling methylation as disclosed herein. In additional embodiments, there is provided herein a method for assessing the quality of a DNA methylation marker comprising profiling methylation as disclosed herein. In some embodiments, the DNA methylation marker is a marker indicative of the presence or absence of a disease, e.g., a type of cancer. In additional embodiments, the DNA methylation marker is a marker indicative of a stage of a disease, e.g., a cancer stage. In additional embodiments, the DNA methylation marker is a marker indicative of a type of tissue (e.g., lung tissue, breast tissue, colon tissue etc.).
In general, embodiments which can be performed with methylation-sensitive restriction enzyme(s) can alternatively be performed with methylation-dependent restriction enzyme(s), and downstream steps will be adjusted accordingly. For example, in some embodiments, following high-throughput sequencing and generation of sequence reads, a method for profiling methylation according to the present invention comprises: selecting at least one restriction locus and determining the number of sequence reads covering a predefined genomic region of at least 50 bps in length that contains said restriction locus; and calculating a methylation value based on the read count of the predefined genomic region and a reference read count, wherein the calculated methylation value reflects the number of molecules that were unmethylated in the DNA sample and therefore remained intact following digestion with the methylation-dependent restriction enzyme(s).
As another example, in some embodiments, for calculating a level of methylated DNA of a restriction locus, following high-throughput sequencing and generation of sequence reads, the method comprises: determining from the sequence reads a read count of sequence reads starting or ending at a nucleotide within the restriction locus, the read count representing the number of DNA molecules in the DNA sample in which said restriction locus was methylated and therefore cut by the restriction endonuclease; and calculating a level of methylated DNA at the restriction locus based on the determined read count of sequence reads starting or ending at a nucleotide within the restriction locus. For calculating a level of unmethylated DNA of a restriction locus, in some embodiments, the method comprises: determining from the sequence reads a read count of the restriction locus, the read count representing the number of DNA molecules in the DNA sample in which said restriction locus was unmethylated and therefore remained intact; and calculating a level of unmethylated DNA at the restriction locus based on the determined read count of the restriction locus.
"High throughput sequencing" (also termed "next generation sequencing") includes sequence determination using methods that determine many (typically thousands to billions) of nucleic acid sequences in parallel. High throughput sequencing generally involves three basic steps: library preparation, sequencing and data analysis. Examples of high throughput sequencing techniques include sequencing-by-synthesis and sequencing-by-ligation (employed, for example, by Illumina Inc., Life Technologies Inc., Roche), nanopore sequencing methods, electronic detection-based methods such as Ion Torrent™ technology (Life Technologies Inc.), and Single Molecule Real-Time (SMRT®) sequencing and sequencing-by-binding employed by PacBio.
High-throughput sequencing according to the present invention may be performed using various high-throughput sequencing instruments and platforms, including but not limited to: Novaseq™, Nextseq™ and MiSeq™ (Illumina), 454 Sequencing (Roche), Ion Chef™ (ThermoFisher), SOLiD® (ThermoFisher), Sequel II™ (Pacific Biosciences) and REVIO™, ONSO™ and SEQUEL™ IIe (PacBio). The appropriate platform-designed sequencing adapters are used for preparing the sequencing library.
Library preparation for the major high-throughput sequencing platforms requires the addition (typically via ligation or PCR) of specific adapter oligonucleotides to fragments of the DNA to be sequenced. As disclosed herein, restriction digestion is carried out before adapter ligation to avoid possible digestion of the adapters by the enzymes.
The digestion of DNA by the methylation-sensitive/dependent restriction endonuclease(s) as disclosed herein typically does not result in homogeneous, blunt-ended fragments. Thus, in some embodiments, end repair is needed to ensure that each DNA molecule is free of overhangs, and contains 5' phosphate and 3' hydroxyl groups.
As disclosed herein, the endonuclease-treated DNA is subjected to end-repair without purifying the endonuclease-treated DNA following digestion with the methylation-sensitive/dependent restriction enzyme(s). As disclosed herein, an end-repair reaction mix is mixed with the endonuclease-treated DNA without purifying the DNA prior to the mixing. An end-repair reaction mix for use with the methods of the present invention comprises a DNA polymerase (for example, T4 DNA polymerase and/or Klenow fragment), a polynucleotide kinase (for example, T4 polynucleotide kinase (PNK)), dNTPs and a buffer.
For Illumina libraries, incorporation of a non-templated deoxyadenosine 5'-monophosphate (dAMP) onto the 3' end of blunted DNA fragments, a process known as dA-tailing, is also required for library preparation. dA-tails prevent concatemer formation during downstream ligation steps, and enable DNA fragments to be ligated to adapter oligonucleotides with complementary dT-overhangs. Thus, in some embodiments (e.g., wherein the library is prepared for sequencing using an Illumina platform), the methods of the present invention comprise performing end-repair combined with dA-tailing. Components needed for dA-tailing comprise a DNA polymerase such as Klenow fragment or Taq DNA polymerase, and dATP. The present invention utilizes compatible end-repair and dA-tailing components, which can be mixed in the same reaction mix and do not require purification of the DNA between these steps. Exemplary commercial kits which provide for combined end-repair and dA-tailing are indicated below.
The end-repair/dA-tailing reaction mixes are further compatible with the adapter ligation reaction mix, as will be described in more detail below.
Following mixing the end-repair reaction mix with the endonuclease-treated DNA, end-repair is facilitated by incubating for 45 minutes-4 hours (e.g., 45 minutes-3 hours, 45-120 minutes, 45-90 minutes, 50-70 minutes, 60-120 minutes, 60-90 minutes, or 60 minutes, each possibility represents a separate embodiment of the present invention) at a temperature in the range of 15-25°C (e.g., 20°C), and subsequently for 20-45 minutes at a temperature of 60-75°C.
In some particular embodiments, end-repair is performed by incubating for 45-90 minutes at 20°C, and subsequently for 30 minutes at 65°C. In further particular embodiments, end-repair is performed by incubating for 60 minutes at 20°C, and subsequently for 30 minutes at 65°C.
Adapter oligonucleotides are also termed herein “sequencing adapters” or "adapter sequences". According to the methods disclosed herein, sequencing adapters are ligated to the DNA fragments using end-preserving methods such as enzymatic ligation in which a ligase enzyme covalently links a sequencing adapter to a DNA fragment, making a complete library molecule. Sequencing adapters are added at the 5' and 3' ends of DNA fragments in the sequencing library. Sequencing adapters typically include platform-specific sequences for fragment recognition by a particular sequencer: for example, sequences that enable library fragments to bind to the flow cells of Illumina platforms. Each sequencing instrument provider typically uses a specific set of sequences for this purpose.
Sequencing adapters may also include sample indices. “Sample indices”, also termed "sample barcodes" are sequences that enable multiple samples to be sequenced together (i.e., multiplexed) on the same instrument flow cell or chip. Each sample index, typically 6-10 bases, is specific to a given sample library and is used for de-multiplexing during data analysis to assign individual sequence reads to the correct sample. Sequencing adapters may contain single or dual sample indexes depending on the number of libraries combined and the level of accuracy desired.
Sequencing adapters may include unique molecular identifiers (UMIs). UMIs are a type of molecular barcodes that provide molecular tracking, error correction and increased accuracy during sequencing. UMIs are short sequences, typically 5 to 20 bases in length, used to uniquely tag each molecule in a sample library. Since each nucleic acid in the starting material is tagged with a unique molecular barcode, bioinformatics software can filter out duplicate reads and PCR errors with a high level of accuracy and report unique reads, removing the identified errors before final data analysis.
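The duplicate-filtering role of UMIs described above can be sketched as follows. This is a simplified illustration only: it assumes reads are already mapped and represented as hypothetical `(chromosome, start, end, umi)` tuples, and it treats two reads as duplicates only when both the mapped coordinates and the UMI match exactly. Practical tools typically also tolerate sequencing errors within the UMI, which this sketch does not attempt.

```python
from typing import Iterable, Tuple

Read = Tuple[str, int, int, str]  # (chromosome, start, end, umi); hypothetical layout


def count_unique_molecules(reads: Iterable[Read]) -> int:
    """Count distinct original molecules by collapsing exact UMI duplicates."""
    seen = set()
    for chrom, start, end, umi in reads:
        # Reads sharing coordinates AND UMI are assumed to be copies of one molecule.
        seen.add((chrom, start, end, umi))
    return len(seen)


if __name__ == "__main__":
    reads = [
        ("chr1", 100, 267, "ACGTAC"),
        ("chr1", 100, 267, "ACGTAC"),  # duplicate of the first read
        ("chr1", 100, 267, "TTGCAA"),  # same coordinates, different molecule
        ("chr2", 500, 660, "ACGTAC"),
    ]
    print(count_unique_molecules(reads))
```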
In some embodiments, both a sample barcode sequence and a UMI are incorporated into a nucleic acid target molecule.
Ligating sequencing adapters according to the present invention is carried out by mixing an adapter ligation reaction mix with the end-repaired DNA. An adapter ligation reaction mix for use with the methods of the present invention comprises a DNA ligase, sequencing adapters and a buffer. As disclosed herein, the adapter ligation mix is mixed with the end-repaired DNA to obtain an adapter concentration of 0.08 µM-0.4 µM, for example 0.1-0.4 µM, or 0.2 µM. Each possibility represents a separate embodiment of the present invention.
The present invention preferably utilizes an adapter ligation mix that is compatible with the end-repair reaction mix such that purification of the DNA between these steps is not required. Exemplary commercial kits which provide for compatible end-repair (including dA-tailing) and adapter ligation include NEBNext® Ultra™ II DNA Library Prep Kit for Illumina, Agilent SureSelect XT HS2™ Library preparation kit for Illumina, Qiagen - QIAseq® Ultralow Input Library, Roche - KAPA HyperPrep Kit, and Illumina library preparation kits.
Following mixing the end-repaired DNA with the adapter-ligation reaction mix, the ligation is facilitated by incubating for 45 minutes-20 hours (e.g., 1-20 hours, 1-18 hours, 1-16 hours, 15-20 hours or 16 hours, each possibility represents a separate embodiment of the present invention) at a temperature in the range of 2-20°C (e.g., 4-16°C, 15-20°C, 15-18°C or 16°C, each possibility represents a separate embodiment of the present invention).
In some particular embodiments, adapter ligation is performed by incubating for 16 hours at 16°C.
In some embodiments, the end-repaired DNA may be frozen or refrigerated prior to the step of adapter ligation.
According to some embodiments, whole genome sequencing is performed on libraries prepared from endonuclease-treated DNA. The libraries are prepared using sequencing adapters suitable for the sequencing platform being used.
According to other embodiments, region(s) of interest in the endonuclease-treated DNA may be captured using, for example, a solution-phase or solid-phase hybridization-based process, followed by the high-throughput sequencing. Enrichment of regions of interest followed by high-throughput sequencing is referred to herein as “target-specific high-throughput sequencing”. Target-specific high-throughput sequencing includes, for example, CpG island sequencing and exome sequencing. Target-specific high-throughput sequencing also includes sequencing of specific informative genomic regions, for example, regions known to be differentially methylated between cancer and non-cancer tissues. Capture of genomic regions for target-specific sequencing is typically carried out after library preparation. In some embodiments, the methods disclosed herein comprise enriching genomic regions of interest.
According to some embodiments, a method for genetic and epigenetic profiling of DNA samples according to the present invention comprises: extracting DNA from a biological sample; subjecting the extracted DNA to digestion with at least one methylation-sensitive restriction endonuclease, thereby obtaining restriction endonuclease-treated DNA; preparing a sequencing library from the restriction endonuclease-treated DNA as disclosed herein; enriching at least one (preferably a plurality of) genomic regions of interest from the sequencing library using capture agents, to obtain a sequencing library enriched with the at least one (preferably a plurality of) genomic regions of interest; subjecting the sequencing library enriched with the at least one (preferably a plurality of) genomic regions of interest to high-throughput sequencing; and determining from the sequencing data a methylation value for at least one restriction locus and optionally at least one additional genetic or epigenetic characteristic of the cell-free DNA sample selected from DNA mutation, copy number variation and nucleosome positioning as disclosed herein.
Analysis of sequence reads
According to some embodiments, the DNA sample for use according to the present invention is a cell-free DNA sample.
According to some embodiments, the methods of the present invention comprise a step of sequencing a library by a high-throughput sequencing method to obtain sequencing data.
According to some embodiments, an amount of cell-free DNA comprising 3,000 haploid equivalents is sufficient to achieve at least one of: a unique mapping rate of at least 85%, a copy number integrity characterized by a Pearson correlation of at least 0.65 compared to an undigested sample, and a nucleosome positioning integrity characterized by a Pearson correlation of at least 0.55 compared to an undigested sample.
According to some embodiments, an amount of cell-free DNA comprising 6,000 haploid equivalents is sufficient for the methods disclosed herein.
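The integrity criteria above compare a digested sample with an undigested sample by Pearson correlation. The sketch below shows one way such a comparison might be computed, assuming (as an illustration only) that each sample has been summarized as a list of per-bin values, such as binned read counts, over the same genomic bins. The binning scheme, the function names and the example data are hypothetical; only the thresholds of 0.65 and 0.55 come from the text.

```python
from math import sqrt
from typing import Sequence

COPY_NUMBER_MIN_R = 0.65   # threshold stated in the text
NUCLEOSOME_MIN_R = 0.55    # threshold stated in the text


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def meets_integrity(digested: Sequence[float], undigested: Sequence[float],
                    min_r: float) -> bool:
    """Check whether the digested profile correlates with the undigested one."""
    return pearson(digested, undigested) >= min_r
```

For example, `meets_integrity(digested_bins, undigested_bins, COPY_NUMBER_MIN_R)` would apply the copy number criterion to two hypothetical binned profiles.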
According to some embodiments, “sequence reads” (or simply, “reads”), namely, nucleotide sequences produced by the sequencing process, are mapped against a reference genome. A “reference genome” as used herein refers to a previously identified genome sequence, whether partial or complete, assembled as a representative example of a species or subject. A reference genome is typically haploid, and typically does not represent the genome of a single individual of the species but rather is a mosaic of the genomes of several individuals. A reference genome for the methods of the present invention is typically a human reference genome. In some embodiments, the reference genome is the complete human genome, such as the human genome assemblies available at the website of the National Center for Biotechnology Information (NCBI) or at the University of California, Santa Cruz (UCSC) Genome Browser. An example of a suitable reference genome for human studies is the ‘hg18’ genome assembly. As an alternative, the more recent GRCh38 major assembly can be used (up to patch p13).
Read mapping is the process of aligning the reads to a reference genome in order to identify the location of the reads within the reference genome. The sequence reads that align are designated as being “mapped”. The alignment process aims to maximize the possibility for obtaining regions of sequence identity across the various sequences in the alignment, allowing mismatches, indels and/or clipping of some short fragments on the two ends of the reads. The number of reads mapped to a certain genomic locus of interest is referred to herein as the “read count” or “copy number” of this genomic locus. Computer software may be used to analyze sequence reads, map sequence reads against a reference genome and quantify the number of reads.
The terms "genomic locus" and “locus” as used herein are interchangeable and refer to a DNA sequence at a specific location within the genome. A “locus” may include a single position (a single nucleotide at a defined position in the genome) or a stretch of nucleotides starting and ending at defined positions in the genome. The specific position(s) may be identified by the molecular location, namely, by the chromosome and the numbers of the starting and ending base pairs on the chromosome. A variant of a DNA sequence at a given genomic position is called an allele. Alleles of a locus are located at identical sites on homologous chromosomes. Genomic loci include gene sequences as well as other genetic elements (e.g., intergenic sequences).
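As an illustration of how a read count for a genomic region might be obtained from mapped reads, the sketch below counts reads that span a predefined region. It assumes mapped reads are available as hypothetical `(chromosome, start, end)` tuples in 0-based, half-open coordinates, and it interprets "covering" as spanning the region in full; both are assumptions made for the example, and a different coverage criterion (e.g., any overlap) could be substituted.

```python
from typing import Iterable, Tuple

MappedRead = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open


def read_count(reads: Iterable[MappedRead], chrom: str,
               region_start: int, region_end: int) -> int:
    """Count mapped reads that fully span [region_start, region_end)."""
    return sum(
        1
        for read_chrom, start, end in reads
        if read_chrom == chrom and start <= region_start and end >= region_end
    )


if __name__ == "__main__":
    reads = [
        ("chr1", 900, 1100),   # spans the region
        ("chr1", 1010, 1200),  # starts inside the region, does not span it
        ("chr2", 900, 1100),   # different chromosome
    ]
    # Region of 50 bps on chr1 containing a hypothetical restriction locus
    print(read_count(reads, "chr1", 1000, 1050))
```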
A "restriction locus" is used herein to describe a genomic locus which is a restriction site of a methylation-sensitive/-dependent restriction endonuclease applied in the digestion step according to the present invention. Restriction loci according to the present invention may be differentially methylated between normal and disease DNA, meaning that for a given disease for which the analysis is carried out, for example, a certain type of cancer, the restriction loci differ in their methylation level between normal DNA and DNA derived from cancer cells. For example, DNA from the cancer cells may have an increased methylation level at the restriction loci compared to normal non-cancerous DNA. More particularly, the restriction loci contain CG dinucleotides that are more methylated in cancer DNA compared to normal non-cancerous DNA. According to the present invention, the differentially methylated CG dinucleotides are located within recognition sites of the at least one restriction enzyme applied in the digestion step.
According to some embodiments, a restriction locus according to the present invention contains a CG dinucleotide which is more methylated in cell-free DNA, e.g., plasma DNA, of subjects with a certain type of cancer than in cell-free DNA of healthy subjects. In some embodiments, plasma samples of the cancer patients contain a greater proportion of DNA molecules that are methylated at the restriction locus compared to plasma samples of healthy subjects.
According to additional embodiments, a restriction locus according to the present invention contains a CG dinucleotide which is more methylated in DNA from a cancerous tissue (e.g., a tumor sample) than in DNA from a non-cancerous tissue, meaning that in the cancerous tissue a greater proportion of DNA molecules are methylated at this position compared to the non-cancerous tissue.
A methylation-sensitive restriction enzyme cleaves its recognition sequence only if it is unmethylated. A methylation-dependent restriction enzyme cleaves its recognition sequence only if it is methylated. Thus, differences in methylation levels between samples result in differences in the degree of digestion, and subsequently different amounts of sequence reads in the following sequencing and quantification steps. Such differences enable distinguishing between DNA from different samples, for example, between DNA samples from subjects with cancer and DNA samples from healthy subjects.
The terms “level of methylated DNA”, “methylation level” and "methylation value" of a restriction locus refer to a numerical value representing the number of DNA molecules that are methylated at this restriction locus (namely, methylated at a CG dinucleotide within the restriction locus) out of the total number of DNA molecules containing the restriction locus in the sample. In some embodiments, the level of methylated DNA of a restriction locus is calculated herein from the read count of the restriction locus following digestion with at least one methylation-sensitive restriction endonuclease. In additional embodiments, the level of methylated DNA of a restriction locus is calculated herein from the read count of a predefined genomic region of at least 50 bps that contains the restriction locus. As methylation-sensitive restriction endonucleases cleave their recognition sequence only if it is unmethylated, the read count of the restriction locus represents the number of DNA molecules in the DNA sample in which the restriction locus was methylated and therefore remained intact.
According to some embodiments, the methylation level of the restriction locus is calculated by dividing the read count of the restriction locus, or the read count of a predefined genomic region of at least 50 bps that contains the restriction locus, by an expected read count of the restriction locus or the predefined genomic region of at least 50 bps that contains the restriction locus. An expected read count of the restriction locus/predefined genomic region may be determined, for example, using: (i) read count of a reference locus/genomic region of the same length as the restriction locus/genomic region, that is not cut by the restriction endonuclease; (ii) average read count of a plurality of reference loci/genomic regions of the same length as the restriction locus/genomic region, that are not cut by the restriction endonuclease; or (iii) read count of the restriction locus/predefined genomic region in an undigested control DNA sample, optionally corrected for sequencing depth differences.
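The division described above, together with options (ii) and (iii) for the expected read count, can be sketched as follows. The function names are hypothetical. The depth correction shown, which scales the control read count by the ratio of total reads in the two libraries, is only one plausible form of the optional correction and is an assumption of this example.

```python
from typing import Sequence


def expected_from_reference_loci(reference_counts: Sequence[float]) -> float:
    """Option (ii): average read count of reference loci not cut by the enzyme."""
    if not reference_counts:
        raise ValueError("at least one reference read count is required")
    return sum(reference_counts) / len(reference_counts)


def expected_from_undigested_control(control_count: float,
                                     digested_total_reads: float,
                                     control_total_reads: float) -> float:
    """Option (iii): control read count, scaled for sequencing depth (assumed form)."""
    return control_count * digested_total_reads / control_total_reads


def methylation_level(observed_count: float, expected_count: float) -> float:
    """Methylation level = observed read count / expected read count."""
    if expected_count <= 0:
        raise ValueError("expected read count must be positive")
    return observed_count / expected_count
```

With an observed read count of 30 and an expected read count of 120, for instance, this yields a methylation level of 0.25. Option (i) corresponds to passing the read count of a single reference locus directly as the expected count.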
The terms “level of unmethylated DNA” and “unmethylation level” of a restriction locus refer to a numerical value representing the number of DNA molecules that are unmethylated at this restriction locus (namely, unmethylated at a CG dinucleotide within the restriction locus) out of the total number of DNA molecules containing the restriction locus in the sample. In some embodiments, the level of unmethylated DNA of a restriction locus is calculated from the number of reads starting or ending at a nucleotide within the restriction locus following digestion with at least one methylation-sensitive restriction endonuclease and any subsequent end repair. The exact nucleotide within the restriction locus in which the sequence reads start or end depends on the type of restriction endonuclease used in the digestion step and the length of its recognition sequence. For example, for restriction endonucleases that produce non-blunt ends with 5' overhangs, digestion and end repair result in fragments that start at the second nucleotide of the recognition sequence and fragments that end at the penultimate nucleotide of the recognition sequence.
As methylation-sensitive restriction endonucleases cleave their recognition sequence only if it is unmethylated, the number of reads starting or ending at a nucleotide within the restriction locus represents the number of DNA molecules in the DNA sample in which the restriction locus was unmethylated and therefore cut by the restriction endonuclease.
Thus, in some embodiments, the method of the present invention comprises: determining a number of sequence reads starting at a nucleotide within the restriction locus; determining a number of sequence reads ending at a nucleotide within the restriction locus; and calculating a level of unmethylated DNA at the restriction locus using the orientation that provides the larger number of sequence reads. In additional embodiments, the method of the present invention comprises: determining a number of sequence reads starting at a nucleotide within the restriction locus; determining a number of sequence reads ending at a nucleotide within the restriction locus; calculating an average between the two values; and using the average to calculate a level of unmethylated DNA at the restriction locus.
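The two alternatives just described, using the larger of the start and end counts or using their average, can be sketched as follows. The example assumes mapped reads given as hypothetical `(chromosome, start, end)` tuples in 0-based, half-open coordinates, so that the last base of a read is at position `end - 1`; the function names are illustrative only.

```python
from typing import Iterable, Tuple

MappedRead = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open


def start_end_counts(reads: Iterable[MappedRead], chrom: str,
                     locus_start: int, locus_end: int) -> Tuple[int, int]:
    """Count reads starting, and reads ending, within [locus_start, locus_end)."""
    starts = 0
    ends = 0
    for read_chrom, start, end in reads:
        if read_chrom != chrom:
            continue
        if locus_start <= start < locus_end:
            starts += 1
        if locus_start <= end - 1 < locus_end:  # end - 1 is the read's last base
            ends += 1
    return starts, ends


def cut_fragment_count(starts: int, ends: int, mode: str = "max") -> float:
    """Combine the two orientations: the larger count ("max") or their average ("mean")."""
    if mode == "max":
        return float(max(starts, ends))
    if mode == "mean":
        return (starts + ends) / 2
    raise ValueError("mode must be 'max' or 'mean'")
```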
According to additional embodiments, the level of unmethylated DNA is calculated by determining a total fragment number, which is determined from the read count of the restriction locus and read count of sequence reads starting or ending at a nucleotide within the restriction locus.
According to some embodiments, the level of unmethylated DNA is expressed as percentage (%) of unmethylation, representing the percentage of DNA molecules that are unmethylated at the restriction locus out of the total number of DNA molecules containing the restriction locus in the sample.
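A percentage of unmethylation can be illustrated with the short sketch below. It assumes that the total fragment number is the sum of the read count of the restriction locus (intact molecules) and the count of reads starting or ending within the locus (cut molecules). The text states only that the total is determined from these two quantities, so the simple sum, like the function name, is an assumption of this example.

```python
def percent_unmethylated(cut_fragments: float, intact_fragments: float) -> float:
    """Percentage of molecules unmethylated (cut) at a restriction locus.

    Assumes total fragment number = cut fragments + intact fragments.
    """
    total = cut_fragments + intact_fragments
    if total <= 0:
        raise ValueError("no fragments observed at this locus")
    return 100.0 * cut_fragments / total
```

For example, 25 cut fragments and 75 intact fragments give 25% unmethylation under this assumption.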
Detecting methylation changes
As used herein, “detecting methylation changes” refers to detecting whether a tested DNA sample contains methylation changes compared to one or more reference DNA samples, detecting whether a DNA sample is characterized by a different methylation profile at selected genomic loci compared to a reference methylation profile, and/or determining whether the methylation profile of a DNA sample is normal or contains methylation changes indicative of the presence of a disease. Each possibility represents a separate embodiment of the present invention. Detecting methylation changes also encompasses comparing methylation data obtained as disclosed herein between samples in order to identify genomic regions differentially methylated between the samples, which may be used as DNA methylation markers. For example, methylation data obtained as disclosed herein may be analyzed to identify genomic regions differentially methylated between different types of tissues, between cancer and non-cancer DNA, between different types of cancer, or between different stages of a certain type of cancer. In some embodiments, the methods disclosed herein provide genome-wide methylation analysis. In other embodiments, the methods disclosed herein provide target-specific methylation analysis. Computer software may be used in the analysis of the sequencing and methylation data.
The methods of the present invention may be applied for identifying and analyzing DNA methylation marker regions which may be used as cancer diagnostic markers. According to some embodiments, markers are of a cancer selected from the group consisting of lung cancer, colorectal cancer, liver cancer, breast cancer, pancreatic cancer, uterine cancer, ovarian cancer, head & neck cancer, gastric cancer, esophageal cancer, hematological cancers (e.g. lymphoma) and sarcoma. In some embodiments, the markers are used as pan-cancer markers. The methods may also be applied for identifying differential methylation between different types of cancer, for example, determining methylation profiles characteristic of different types of cancer, that can differentiate between different types of cancer. The methods disclosed herein are applicable to any type of cancer, including, but not limited to: lung cancer, bladder cancer, breast cancer, colorectal cancer, prostate cancer, gastric cancer, skin cancer (e.g. melanoma), cancer affecting the nervous system, bone cancer, ovarian cancer, liver cancer (e.g. hepatocellular carcinoma), hematologic malignancies, pancreatic cancer, kidney cancer, cervical cancer. Each type of cancer is a separate embodiment of the present invention. The methods of the present invention may also be applied to identify tissue-specific methylation markers. For example, to identify methylation markers specific for: lung, bladder, breast, colorectal, prostate, gastric, ovarian, pancreas, kidney, cervical tissue. Each type of tissue is a separate embodiment of the present invention. Such markers may be used, for example, to identify the tissue source of circulating cell-free DNA.
The methods of the present invention may be applied for identifying a disease (e.g., a cancer) in a subject. "Identifying a disease" as used herein encompasses any one or more of screening for the disease, detecting the presence or absence of the disease, detecting recurrence of the disease, detecting susceptibility to the disease, detecting response to treatment, determining efficacy of treatment, determining stage (severity) of the disease, determining prognosis and early diagnosis of the disease in a subject. Each possibility represents a separate embodiment of the present invention.
"Assessing cancer" or "assessing the presence of cancer" or "assessing the presence or absence of cancer" as used herein refer to determining the likelihood that a subject has cancer. The terms encompass determining whether a subject should be subjected to confirmatory cancer testing to confirm (or rule out) the presence of cancer, such as confirmatory blood tests, urine tests, cytology, imaging, endoscopy and/or biopsy. The terms further encompass aiding the diagnosis of cancer in a subject. The terms further encompass quantifying cancer-related changes in cell-free DNA samples which are indicative of the presence of cancer. Assessing the presence of cancer according to the present invention includes one or more of screening for cancer, assessing recurrence of cancer, assessing susceptibility to or risk of cancer, assessing and/or monitoring response to treatment, assessing efficacy of treatment, assessing severity (stage) of cancer and assessing prognosis of cancer in a subject. Each possibility represents a separate embodiment of the present invention. It is to be understood that a negative result in the assays disclosed herein is still considered an assessment for the presence of cancer according to the present invention.
The methods of the present invention may further include a step of determining a tumor fraction, or fractional concentration of tumor DNA. Tumor fraction is the proportion of tumor molecules in a cfDNA sample.
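As an illustration only, the definition of tumor fraction given above can be written as a ratio of molecule counts. The sketch below assumes that counts of tumor-derived and total cfDNA molecules are already available; the function name and the example counts are hypothetical and are not part of the disclosed methods.

```python
def tumor_fraction(tumor_molecules: int, total_molecules: int) -> float:
    """Proportion of tumor-derived molecules in a cfDNA sample."""
    if total_molecules <= 0:
        raise ValueError("total_molecules must be positive")
    if not 0 <= tumor_molecules <= total_molecules:
        raise ValueError("tumor_molecules must lie between 0 and total_molecules")
    return tumor_molecules / total_molecules


# Hypothetical counts, for illustration only.
print(tumor_fraction(25, 10_000))  # 0.0025
```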
Determining a "methylation profile" (or "DNA methylation profile" or "methylation profile of a DNA sample") as disclosed herein refers to determining methylation values at one or more restriction loci, preferably at a plurality of restriction loci. In some embodiments, determining a methylation profile comprises determining levels of methylated and unmethylated DNA at one or more restriction loci, preferably at a plurality of restriction loci.
A "reference methylation profile" as disclosed herein refers to a methylation profile determined in DNA from a known source. A "reference DNA sample" is a DNA sample from a known source. In some embodiments, a reference methylation profile is a profile determined in a plurality of reference DNA samples. In addition, the methods of the present invention may be used for analyzing (e.g., measuring) methylation changes between DNA samples taken from a single subject at different time points, for example, taken at different stages of a disease, or taken before and after treatment of a disease. The methylation profile of the DNA sample taken at a first time point may be used as a reference for the methylation profile of a DNA sample taken at a second (later) time point.
A "reference methylation level" for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is the level of methylation measured for the particular restriction locus/genomic region in DNA from a known source. A "reference methylation value" for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is a numerical value representing the level of methylation of the particular restriction locus/genomic region in DNA from a known source.
A "reference level of unmethylated DNA" for a particular restriction locus or a particular genomic region spanning a plurality of restriction loci is the level of unmethylated DNA measured for the particular restriction locus/genomic region in DNA from a known source.
According to some embodiments, the methods disclosed herein are diagnostic methods. According to some embodiments, diagnostic methods disclosed herein comprise pre-determination of reference methylation and/or unmethylation from disease DNA. In some embodiments, diagnostic methods of the present invention comprise predetermination of reference methylation and/or unmethylation from normal DNA as disclosed herein.
Tissue-specific methylation profile can also be characterized using the methods disclosed herein, in order to establish normal non-cancer DNA methylation profile of the tissue. Alternatively or additionally, tissue-specific methylation profile can be characterized in order to identify the tissue source of circulating cell-free DNA.
According to some embodiments, detecting methylation changes according to the present invention comprises identifying the presence or absence of a certain disease in a subject, based on the methylation profile of a DNA sample from the subject.
According to some embodiments, a method for identifying the cell source or tissue source of a DNA sample is provided (e.g., identifying what is the type of tissue from which the DNA is derived, and/or identifying whether the DNA is derived from normal or diseased cells/tissue).
A person of skill in the art would appreciate that the comparison of DNA methylation values and/or unmethylation values calculated for a tested sample to one or more corresponding reference values may be performed in a number of ways, using various statistical means.
According to some embodiments, the methods disclosed herein comprise comparing a plurality of values calculated for a plurality of restriction loci to their corresponding healthy and/or disease references values. In some embodiments, a pattern of values is analyzed using statistical means and computerized algorithm to determine if it represents a pattern of a disease in question or a normal, healthy pattern. Exemplary algorithms include, but are not limited to, machine learning and pattern recognition algorithms.
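One simple way such a comparison could be carried out is sketched below, purely as an example of a statistical approach and not as the algorithm of the invention. It assumes that each sample is represented as a mapping from a restriction locus name to a methylation value, and it scores a tested sample against a set of healthy reference samples with a per-locus z-score. The locus names, values, function names and the threshold of 3 are all hypothetical.

```python
from statistics import mean, stdev


def locus_z_scores(sample, healthy_refs):
    """Z-score of each locus value relative to the healthy reference samples."""
    scores = {}
    for locus, value in sample.items():
        ref_values = [ref[locus] for ref in healthy_refs]
        mu, sigma = mean(ref_values), stdev(ref_values)
        # A locus with no spread in the references cannot be scored this way.
        scores[locus] = (value - mu) / sigma if sigma > 0 else 0.0
    return scores


def flag_deviating_loci(sample, healthy_refs, threshold=3.0):
    """Loci whose value deviates from the healthy references by more than threshold."""
    scores = locus_z_scores(sample, healthy_refs)
    return sorted(locus for locus, z in scores.items() if abs(z) > threshold)


# Hypothetical methylation values for two loci in three healthy references.
healthy = [
    {"locus_A": 0.10, "locus_B": 0.50},
    {"locus_A": 0.12, "locus_B": 0.52},
    {"locus_A": 0.11, "locus_B": 0.48},
]
tested = {"locus_A": 0.60, "locus_B": 0.51}
print(flag_deviating_loci(tested, healthy))  # ['locus_A']
```

A machine learning or pattern recognition classifier, as mentioned above, would replace the fixed threshold with a model trained on healthy and disease reference profiles.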
Additional genetic and epigenetic characterization
In addition to DNA methylation/unmethylation values, it is possible to obtain from the same sequencing data disclosed herein information on DNA mutations, copy number changes, and nucleosome positioning for cell-free DNA. Generally, cell-free DNA circulates in fragments ranging between 120-220 bp. This pattern agrees with the length of DNA wrapped around a single nucleosome, plus a short stretch of ~ 20 bp (linker DNA) bound to a histone. As nucleosome positioning varies between different tissues, and in malignant cells, the pattern of fragmentation has been shown to aid in determining the predominant cell-type of origin contributing to the cfDNA pool.
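By way of a minimal example, fragment-size information of the kind described above could be summarized from the same sequencing data as the share of fragments falling within the 120-220 bp range. The sketch assumes fragment lengths have already been derived from paired-end alignments; the function name and the list of lengths are hypothetical.

```python
def fraction_in_range(fragment_lengths, low=120, high=220):
    """Share of fragments whose length (bp) lies within [low, high]."""
    if not fragment_lengths:
        raise ValueError("fragment_lengths must not be empty")
    in_range = sum(1 for n in fragment_lengths if low <= n <= high)
    return in_range / len(fragment_lengths)


# Hypothetical fragment lengths in bp, for illustration only.
lengths = [95, 145, 166, 167, 170, 188, 310, 334]
print(fraction_in_range(lengths))  # 0.625
```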
Advantageously, determination of DNA methylation profile and determination of at least one additional genetic or epigenetic characteristic as disclosed herein may be carried out based on the same sequencing data.
According to some embodiments, a sequencing-based assay as disclosed herein combines detection of methylation changes with mutation detection and analysis of additional epigenetic characteristics, all in one single assay. The assay advantageously allows combined analysis of small amounts of DNA in a single assay.
The combined analysis of methylation and additional genetic and epigenetic characteristics is useful in enhancing detection of cancer (or any other condition/tissue source).
According to some exemplary embodiments, a method for detecting the presence or absence of a cancer in a subject comprises:
(a) profiling methylation of the DNA sample as disclosed herein, to detect the presence or absence of hypermethylation at one or more cancer-associated genomic region; and
(b) one or more of: determining the presence or absence of one or more cancer-associated mutation (e.g., cancer-associated mutation in oncogenes/tumor suppressors); determining the presence or absence of cancer-associated copy number variation; and determining the presence or absence of cancer-associated nucleosomal positioning, wherein (a) and (b) are carried out using the same sequencing data, and wherein determining the presence of hypermethylation at one or more cancer-associated genomic region and at least one of: one or more cancer-associated mutation, cancer-associated copy number variation and cancer-associated nucleosomal positioning is indicative of the presence of cancer in the subject.
The non-methylation cancer-associated changes may be combined with methylation information in a dependent or independent manner, depending on whether or not the cancer-associated changes are found on the same DNA fragment, where changes that are found on the same fragment provide a stronger indication for the presence of cancer.
According to some embodiments, there is provided a method for profiling genetic and epigenetic characteristics of a DNA sample, the method comprising: profiling methylation of the DNA sample as disclosed herein; and determining at least one additional genetic or epigenetic characteristic of the DNA sample, wherein the at least one additional genetic or epigenetic characteristic is selected from DNA mutation, copy number variation and nucleosome positioning, wherein profiling the methylation and determining the at least one additional genetic or epigenetic characteristic are carried out using the same sequencing data, thereby profiling genetic and epigenetic characteristics of the DNA sample.
In some embodiments, there is provided a method for detecting the presence or absence of a disease in a subject, the method comprising: profiling methylation of the DNA sample as disclosed herein; and determining at least one additional genetic or epigenetic characteristic of the DNA sample, wherein the at least one additional genetic or epigenetic characteristic is selected from DNA mutation, copy number variation and nucleosome positioning.
Kits and reaction mixes
According to some embodiments, there are provided herein kits for analyzing a DNA sample. According to some embodiments, there are provided herein kits for detecting methylation changes in a DNA sample. In some embodiments, there are provided herein kits, reaction mixes and methods for detecting genetic and epigenetic changes in a DNA sample. In additional embodiments, there are provided herein kits for detecting genetic and epigenetic changes in a DNA sample.
According to some embodiments, the kits and reaction mixes described herein are for profiling methylation of DNA samples according to the methods disclosed herein. In some embodiments, the kits and reaction mixes are for profiling genetic and epigenetic characteristics of DNA samples according to the methods disclosed herein. In additional embodiments, the kits and reaction mixes are for detecting genetic and epigenetic changes in a DNA sample according to the methods disclosed herein.
According to some embodiments, a kit or a reaction mix according to the present invention comprises components needed for DNA digestion in addition to the restriction enzyme(s), such as one or more buffers.
According to some embodiments, a kit or a reaction mix according to the present invention comprises components needed for DNA amplification.
According to some embodiments, the kit comprises at least one methylation-sensitive restriction endonuclease or at least one methylation-dependent restriction endonuclease as described hereinabove. According to some embodiments, the kit comprises a digestion buffer comprising magnesium.
As used herein, the term "about", when referring to a measurable value, is meant to encompass variations of +/-10%, for example +/-5%, +/-1%, and +/-0.1% from the specified value.
The following examples are presented in order to more fully illustrate certain embodiments of the invention. They should in no way, however, be construed as limiting the broad scope of the invention. One skilled in the art can readily devise many variations and modifications of the principles disclosed herein without departing from the scope of the invention.
EXAMPLES
Example 1 -Library yields
Several samples of cell-free DNA (cfDNA) (30 ng from each sample) were subjected to digestion using HinP1I and AciI in CutSmart® buffer (NEB) for 2 h at 37°C followed by heat inactivation at 65°C for 20 min. The digested DNA samples were pooled together and half of the digested amount was purified using QIAquick™ PCR purification kit. Next, a sequencing library was prepared from 7 ng of purified digested DNA using NEB commercial kit (NEBNext® Ultra™ II DNA Library Prep Kit for Illumina) according to manufacturer instructions. In addition, sequencing libraries were prepared from the same amount of unpurified digested DNA (namely, 7 ng), according to NEB commercial protocol or according to a modified protocol in which one or more parameter is modified, as detailed in Figures 1A-1E. Following library preparation DNA was quantified using Qubit. Library yields are shown in Figures 1A-1E.
The initial amount of digested DNA that was available for library preparation was the same in all protocols (7 ng DNA), meaning that the library yield for each protocol reflects the efficiency of the various library preparation reactions (end-repair, A-tailing, adapter ligation). As can be seen in the figures, preparing a sequencing library from an unpurified DNA sample according to the commercial protocol resulted in a lower library yield compared to a purified DNA sample, reflecting a significant decrease in the efficiency of the library preparation reactions when unpurified DNA is used. As can be seen further in the figures, longer end repair incubation, longer adapter ligation incubation, lower adapter ligation temperature and/or higher adapter concentration improved the library yield of unpurified samples. Remarkably, a combination of these adjustments provided a library yield which is similar to that obtained by processing a purified sample using the commercial protocol (144 ng vs. 162.03 ng, respectively). Thus, a protocol according to the present invention applied to an unpurified sample provides substantially the same efficiency as a purified sample.
Example 2 -Library yields (II)
Samples of cell-free DNA (30ng) were subjected to digestion as described in Example 1 and subsequently to library preparation according to the following protocols:
Protocol 1: digested DNA was purified as described in Example 1 and subjected to library preparation using NEB commercial kit according to manufacturer instructions.
Protocol 2: digested DNA was subjected to library preparation using NEB commercial kit without purifying the DNA prior to library preparation.
Protocol 3: digested DNA was subjected to library preparation according to a modified protocol in which end-repair and adapter ligation are carried out under modified conditions and a modified concentration of adapters (see Table 1) without purifying the digested DNA prior to library preparation.
Following library preparation, the DNA was cleaned and quantified as described in Example 1. Library yields are shown in Figure 2. As can be seen in the figure, processing a DNA sample according to the commercial protocol without purifying the DNA between the digestion and subsequent library preparation steps resulted in a lower yield of library DNA compared to processing a DNA sample according to the commercial kit with purification after the digestion (197.4 ng vs. 325.3 ng, respectively). Even though the initial amount of DNA that was available for library preparation was higher in the unpurified sample than the purified sample (since DNA was lost during the purification), the efficiency of the various library preparation reactions was significantly reduced in the unpurified sample, resulting in the lower yield.
Remarkably, the DNA sample that was processed according to the modified protocol without purification after digestion provided the highest yield of adapter-ligated DNA - 550 ng, indicating an efficient library chemistry despite the presence of impurities from the digestion step, along with preservation of the DNA material by avoiding the purification step.
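The yields reported in this example can be placed on a common scale by expressing each as a multiple of the Protocol 1 yield. The sketch below is illustrative only. It assumes, in line with the statement that the unpurified commercial protocol gave the lower yield, that 325.3 ng corresponds to Protocol 1 and 197.4 ng to Protocol 2; the dictionary keys and the function name are hypothetical.

```python
# Library yields (ng) reported in Example 2. Assigning 325.3 ng to Protocol 1
# and 197.4 ng to Protocol 2 is an assumption based on the surrounding text.
YIELDS_NG = {
    "protocol_1_purified_commercial": 325.3,
    "protocol_2_unpurified_commercial": 197.4,
    "protocol_3_unpurified_modified": 550.0,
}


def relative_yields(yields_ng, baseline):
    """Express every yield as a multiple of the baseline protocol's yield."""
    base = yields_ng[baseline]
    if base <= 0:
        raise ValueError("baseline yield must be positive")
    return {name: value / base for name, value in yields_ng.items()}


ratios = relative_yields(YIELDS_NG, "protocol_1_purified_commercial")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Under this assumption the modified protocol gives roughly 1.7 times the Protocol 1 yield, while the unpurified commercial protocol gives roughly 0.6 times.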
Example 3 - Sequencing depth
Cell-free DNA (30 ng) was subjected to digestion as described in Example 1 and purified. A sequencing library for targeted sequencing was prepared from the purified sample using Agilent™ commercial kit (Agilent SureSelect XT HS2™ Library preparation kit for Illumina) according to manufacturer instructions. In parallel, cfDNA (30 ng) was subjected to digestion without purification, and a sequencing library for targeted sequencing was prepared from the unpurified sample according to the modified protocol described in Example 2 (Table 1).
For both samples, unique molecular identifiers (UMIs), marking the original DNA molecules, were included in the sequencing adapters.
Both samples were subjected to target capture using Agilent SureSelect XT HS2™ Target Enrichment kit and SureSelect™ custom probes panel and subsequently the samples were sequenced using Illumina NovaSeq 6000 system.
The library preparation step included 6 PCR cycles in the modified protocol. In addition, 10 post-hybridization PCR cycles were performed.
Figure 3 shows the average coverage per target after collapsing all reads with the same UMI, thus looking at the depth of original DNA molecules (without PCR amplified ones). The results demonstrate that a library preparation protocol according to the present invention enables sequencing of more original molecules than a commercial protocol: the average coverage per target using the protocol according to the present invention was approximately 2000 hits per target, whereas the average coverage per target using the commercial protocol was less than 1700 hits per target. The protocol according to the present invention does not require purification of the DNA sample following digestion, thus avoiding loss of DNA material, yet maintains efficient library chemistry, resulting in large amounts of library DNA for sequencing and analysis.
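The UMI collapsing step described above can be sketched as follows. This is a simplified illustration and not the pipeline used in the example: it assumes each read has already been assigned to a target and carries an error-free UMI, and it treats reads sharing a target and a UMI as copies of one original molecule. Practical pipelines typically also use mapping coordinates and tolerate sequencing errors in the UMI. The function name and read data are hypothetical.

```python
from collections import defaultdict


def average_collapsed_coverage(reads):
    """Collapse reads by (target, UMI) and average the molecule counts.

    reads: iterable of (target_id, umi) pairs, one pair per sequenced read.
    Returns (average molecules per target, molecules per target).
    Targets without any read are not represented in the average.
    """
    umis_by_target = defaultdict(set)
    for target_id, umi in reads:
        umis_by_target[target_id].add(umi)  # PCR duplicates collapse here
    if not umis_by_target:
        raise ValueError("no reads supplied")
    per_target = {t: len(umis) for t, umis in umis_by_target.items()}
    return sum(per_target.values()) / len(per_target), per_target


# Hypothetical reads: seven reads derived from five original molecules.
reads = [
    ("T1", "AAC"), ("T1", "AAC"), ("T1", "GGT"),
    ("T2", "AAC"), ("T2", "CCA"), ("T2", "CCA"), ("T2", "TTG"),
]
average, per_target = average_collapsed_coverage(reads)
print(per_target)  # {'T1': 2, 'T2': 3}
print(average)     # 2.5
```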
Example 4 - qPCR analysis after DNA digestion performed in the same reaction mix
A. A series of digestion/qPCR buffers comprising different concentrations of MgCl2 were created: Tris-CH3COOH 20 mM, KCl 50 mM, MgCl2 2-6 mM, recombinant albumin 100 µg/ml, and also: AciI+HinP1I digesting enzymes, PCR primers and probe, and Taq polymerase. The functionality of several common MSREs, along with Taq polymerase, was assessed against gBlocks (Integrated DNA Technologies) and human blood cfDNA targets following up to 16 hours of digestion. The impact of the buffers on the digestion and qPCR (QuantStudio 7 Pro, Thermo Fisher) phases was also separately interrogated, as well as contrasted with a conventional two-stage process. qPCR curve behavior (Cq values, shape, and terminal plateaus) served as the primary readouts, with secondary Bioanalyzer/Fragment Analyzer (Agilent) runs used to assess off-targeting.
The results showed that both AciI and HinP1I tolerated a wide range of cation levels, with overall performance ultimately confined by Mg concentrations ranging from 2 to 4 mM. The analysis showed that the targeted locus was methylated and uncut, while the unmethylated locus was digested without any amplification of the targeted DNA (Figure 4A). Although these values were within standard Taq dependencies, they were notably low for typical endonuclease activity. MSRE function was ultimately preserved by a combination of reserve kinetic capacity and extended incubation times, allowing for near total digestion of analytes. Importantly, this single digestion/qPCR reaction allowed for substantial retention of putative qPCR cfDNA fragments, increasing rare target sensitivity by 2-fold over a two-stage approach.
B. A series of qPCR buffers comprising 0, 2, 4, 6, 8, or 10 mM of MgCl2 was created. The influence of the Mg concentration on qPCR efficiency was evaluated by comparing Cq values of several genomic loci (denoted herein IR, L3032, L3124 and LEI). The results are summarized in Figure 4B. Mg concentrations below 2 mM did not facilitate amplification, as the DNA polymerase that was used is Mg-dependent. When the Mg concentration exceeded 6 mM, a significant slowdown in the reaction was observed.
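The qualitative outcome of part B can be summarized as a simple lookup on the Mg concentration. The sketch below only encodes the two boundaries stated above (below 2 mM and above 6 mM). Treating every concentration between the tested points in the same way as its neighbors is an assumption, and the function name and category labels are hypothetical.

```python
def classify_mg_concentration(mg_mm: float) -> str:
    """Map a Mg concentration (mM) to the qPCR behavior reported in part B."""
    if mg_mm < 0:
        raise ValueError("concentration cannot be negative")
    if mg_mm < 2:
        return "no amplification"      # polymerase is Mg-dependent
    if mg_mm > 6:
        return "slowed amplification"  # significant slowdown observed
    return "amplification supported"


# The six buffer concentrations tested in part B.
for mg in (0, 2, 4, 6, 8, 10):
    print(mg, "mM:", classify_mg_concentration(mg))
```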
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. The means, materials, and steps for carrying out various disclosed chemical structures and functions may take a variety of alternative forms without departing from the invention.
Claims
1. A method for preparing a DNA sample for methylation analysis, the method comprising:
(i) subjecting the DNA sample to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease in a digestion reaction mix supporting cleavage of the DNA sample by the at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) preparing a sequencing library from the restriction endonuclease-treated DNA without purifying the restriction endonuclease-treated DNA from the digestion reaction mix, wherein preparing the sequencing library comprises:
(a) performing end-repair by forming a mixture of the endonuclease-treated DNA and an end-repair reaction mix without purifying the endonuclease-treated DNA prior to forming the mixture, and incubating the mixture for 45 minutes - 4 hours at 15-25°C and subsequently for 20-45 minutes at 60-75°C, to obtain end-repaired DNA; and
(b) ligating sequencing adapters to the end-repaired DNA by mixing an adapter ligation reaction mix with the end-repaired DNA, wherein the adapter ligation mix is mixed to obtain an adapter concentration of 0.08µM-0.4µM and incubating for 45 minutes - 20 hours at 2-20°C, to obtain a library of adapter-ligated DNA for high-throughput sequencing.
2. A method for preparing a cell-free DNA sample for methylation analysis, the method comprising:
(i) subjecting cfDNA extracted from 6-10 ml blood to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease in a digestion reaction mix supporting cleavage of the DNA sample by the at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) preparing a sequencing library for target-specific high-throughput sequencing from the restriction endonuclease-treated DNA without purifying the restriction endonuclease-treated DNA from the digestion reaction mix, wherein preparing the sequencing library comprises:
(a) performing end-repair by forming a mixture of the endonuclease-treated DNA and an end-repair reaction mix without purifying the endonuclease-treated DNA prior to forming the mixture, and incubating the mixture for 45 minutes - 4 hours at 15-25°C and subsequently for 20-45 minutes at 60-75°C, to obtain end-repaired DNA;
(b) ligating sequencing adapters to the end-repaired DNA by mixing an adapter ligation reaction mix with the end-repaired DNA, wherein the adapter ligation mix is mixed to obtain an adapter concentration of 0.08µM-0.4µM, and incubating for 45 minutes - 20 hours at 2-20°C, to obtain adapter-ligated DNA; and
(c) subjecting the adapter-ligated DNA to target capture, to enrich target sequences of interest, thereby obtaining a sequencing library for target-specific high-throughput sequencing providing an average coverage per target of at least 1800 reads per target.
3. A method for analyzing methylation of a DNA sample, the method comprising:
(A) preparing a sequencing library from the DNA sample according to claim 1 or claim 2;
(B) sequencing the sequencing library by a high-throughput sequencing method to provide sequencing data; and
(C) determining from the sequencing data a methylation value for at least one restriction locus.
4. The method of claim 3, wherein the at least one restriction locus is a plurality of restriction loci.
5. The method of any one of claims 1-4, wherein the incubating in step ii(a) is carried out for 45-90 minutes at 15-25°C and subsequently for 20-45 minutes at 60-75°C.
6. The method of any one of claims 1-5, wherein the incubating in step ii(a) is carried out for about 60 minutes at about 20°C and subsequently for about 30 minutes at about 65°C.
7. The method of any one of claims 1-6, wherein the adapter ligation mix is mixed in step ii(b) to obtain an adapter concentration of 0.2µM, and the incubating is for 1-18 hours at 4-18°C.
8. The method of any one of claims 1-7, wherein the adapter ligation mix is mixed in step ii(b) to obtain an adapter concentration of about 0.2µM, and the incubating is for about 16 hours at about 16°C.
9. The method of any one of claims 1-8, wherein the incubating in step ii(a) is carried out for about 60 minutes at about 20°C and subsequently for about 30 minutes at about 65°C, to obtain end-repaired DNA; and the adapter ligation mix is added in step ii(b) to obtain an adapter concentration of 0.2µM, and the incubating is for about 16 hours at about 16°C, to obtain a library of adapter-ligated DNA for high-throughput sequencing.
10. The method of any one of claims 1-9, wherein the at least one methylation-sensitive restriction endonuclease is selected from the group consisting of AciI, HinP1I and HhaI.
11. The method of any one of claims 1-10, wherein the at least one methylation-sensitive or methylation-dependent restriction endonuclease is a plurality of methylation-sensitive or methylation-dependent restriction endonucleases, and wherein the digestion with the plurality of methylation-sensitive or methylation-dependent restriction endonucleases is a simultaneous digestion.
12. The method of any one of claims 1-11, wherein step (i) comprises digestion with a combination of restriction enzymes comprising HinP1I and AciI.
13. The method of any one of claims 1-12, wherein step (i) comprises digestion with a combination of restriction enzymes consisting of HinP1I and AciI.
14. The method of any one of claims 1-13, wherein the DNA sample is cell-free DNA from a human plasma sample.
15. The method of claim 14, wherein the amount of the cell-free DNA is an amount obtained from 6-10 ml of blood.
16. The method of claim 14, wherein the amount of cell-free DNA is between 1-400 ng.
17. The method of any one of claims 1-16, wherein the DNA sample is from a subject suspected of having a disease and/or a subject at risk of developing a disease, and the method comprises detecting methylation changes and determining whether the DNA sample is a healthy or disease DNA sample.
18. The method of claim 17, wherein the disease is cancer.
19. The method of claim 18, wherein the cancer is lung cancer.
20. A reaction mix for adapter ligation comprising 0.08µM-0.4µM sequencing adapters, a DNA ligase and 1-400ng DNA that was subjected to methylation-sensitive or methylation-dependent enzymatic digestion and end-repair.
21. A method for profiling methylation of a DNA sample from a subject, the method comprising:
(i) subjecting the DNA sample to digestion with at least one methylation-sensitive or methylation-dependent restriction endonuclease, to obtain restriction endonuclease-treated DNA; and
(ii) PCR amplifying from the restriction endonuclease-treated DNA at least one restriction locus, wherein the PCR amplification is carried out in the same reaction mix as the digestion, without adjusting the reaction mix between the digestion and the PCR amplification steps.
22. The method of claim 21, wherein the digestion is with AciI and/or HinP1I.
23. The method of claim 21 or claim 22, wherein the PCR is quantitative PCR (qPCR).
24. The method of any one of claims 21-23, wherein the reaction mix comprises between 2-6 mM divalent cation(s); optionally wherein the reaction mix comprises between 2-4 mM divalent cation(s).
25. The method of claim 24, wherein the divalent cation(s) is selected from the group consisting of Mg2+, Mn2+, Ca2+, Fe2+, Co2+, Ni2+, Zn2+, or Cd2+; optionally wherein the divalent cation is magnesium (Mg2+).
26. The method of any one of claims 21-25, wherein the reaction mix comprises between 2-6 mM Mg2+; optionally wherein the reaction mix comprises between 2-4 mM Mg2+.
27. The method of any one of claims 21-26, wherein the reaction mix comprises between 2-6 mM MgCl2; optionally wherein the reaction mix comprises between 2-4 mM MgCl2.
28. The method of any one of claims 21-27, wherein the DNA digestion step is up to 20 hours; optionally wherein the DNA digestion step is up to 16 hours.
29. The method of any one of claims 21-28, wherein the DNA sample is cell-free DNA.
30. The method of any one of claims 21-29, wherein the amplification step comprises coamplifying at least one restriction locus differentially methylated between normal and disease DNA and at least one control locus; optionally wherein the control locus is not digested by the at least one methylation-sensitive or methylation-dependent restriction endonuclease.
31. A combined DNA digestion and PCR reaction mix, wherein the reaction mix comprises 2-6 mM divalent cation(s), at least one methylation-sensitive or methylation-dependent restriction endonuclease, and a DNA polymerase.
32. The reaction mix of claim 31, wherein the divalent cation is magnesium; optionally wherein the reaction mix comprises 2-6 mM MgCl2; such as 2-4 mM MgCl2.
33. The reaction mix of claim 31 or claim 32, comprising AciI and/or HinP1I.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IL309867A IL309867A (en) | 2023-12-31 | 2023-12-31 | Methods for preparing libraries for DNA sequencing |
| IL309867 | 2023-12-31 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025141580A1 true WO2025141580A1 (en) | 2025-07-03 |
Family
ID=96217426
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IL2024/051230 Pending WO2025141580A1 (en) | 2023-12-31 | 2024-12-29 | Methods for preparing dna sequencing libraries |
Country Status (2)
| Country | Link |
|---|---|
| IL (1) | IL309867A (en) |
| WO (1) | WO2025141580A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022107145A1 (en) * | 2020-11-19 | 2022-05-27 | Nucleix Ltd. | Detecting methylation changes in dna samples using restriction enzymes and high throughput sequencing |
| WO2023283591A2 (en) * | 2021-07-07 | 2023-01-12 | The Regents Of The University Of California | Methods of methylation analysis for disease detection |
| WO2023227954A1 (en) * | 2022-05-22 | 2023-11-30 | Nucleix Ltd. | Sample preparation for cell-free dna analysis |
Also Published As
| Publication number | Publication date |
|---|---|
| IL309867A (en) | 2025-07-01 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24911761 Country of ref document: EP Kind code of ref document: A1 |