NL2036501B1 - Method for sequencing nucleic acid molecules and related methods. - Google Patents
Method for sequencing nucleic acid molecules and related methods.
- Publication number
- NL2036501B1
- Authority
- NL
- Netherlands
- Prior art keywords
- dna
- stranded dna
- molecules
- sample
- sequencing
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Biophysics (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention relates to methods for sequencing nucleic acid molecules and to kits useful in such methods. In particular, the invention relates to a method for sequencing nucleic acid molecules that comprises providing a sample comprising a plurality of linear DNA molecules; contacting said sample with a terminal deoxynucleotidyl transferase (TdT) and a mixture of two or more different nucleotides under conditions in which nucleotides in the mixture are sequentially and randomly added to the 3' termini of the linear DNA molecules by the TdT; and denaturing, when present, double-stranded DNA in said sample, thereby generating linear single-stranded DNA molecules comprising sequence identifiers at their 3' termini. The linear DNA molecules comprising the sequence identifiers are circularized with a single-stranded DNA ligase, and the circles in the sample are subjected to rolling circle amplification. The amplification products are subsequently sequenced, whereby the sequence identifiers are used to assemble sequences from the same original nucleic acid molecule in the sample. The invention also relates to kits comprising materials that are useful in performing the methods.
Description
P136423NL00
Title: Method for sequencing nucleic acid molecules and related methods.
The present invention relates to the field of molecular diagnostics, more specifically to methods and kits for barcoding, amplifying, and sequencing a plurality of nucleic acid molecules and the use thereof for diagnostic purposes, particularly in the field of cancer.
Circulating free DNA, often released into the bloodstream by apoptotic and necrotic cells, serves as a rich source of genetic information, including mutations associated with cancer.
The accurate identification and characterization of this circulating free DNA can be used in tailoring precise and personalized cancer treatment strategies. Regrettably, prevailing methodologies are impeded by challenges such as the low abundance of tumor-derived cfDNA, its fragmentation, and the presence of non-tumor-derived DNA, thereby compromising the precision and reliability of sequencing results.
This patent application seeks to introduce a comprehensive solution to the limitations tied to existing techniques, thereby advancing the field of liquid biopsy diagnostics and enhancing the personalized care of cancer patients. The means and methods not only provide enhanced sensitivity, specificity, and overall efficiency of cfDNA analysis in cancer patients but also overcome the hurdles associated with accurate profiling of cfDNA in diverse cancer contexts.
Currently employed technologies for cfDNA sequencing primarily include next-generation sequencing (NGS) methods. These encompass approaches such as targeted sequencing, whole exome sequencing, and whole genome sequencing. Targeted sequencing focuses on specific genomic regions of interest, offering a cost-effective and focused approach, while whole exome sequencing and whole genome sequencing provide a broader scope, enabling comprehensive analysis of the entire exome or genome, respectively.
Despite their utility, existing technologies face limitations in accurately capturing low-abundance tumor-derived cfDNA, and the patent application herein discloses a novel sequencing method that significantly augments the precision and reliability of cfDNA analysis in cancer patients. The subsequent sections provide a detailed exposition of the inventive steps and distinctive features underpinning our improved sequencing methodology.
Circulating free DNA (cfDNA, also known as cell-free DNA) consists of degraded DNA fragments found outside of cells in body fluids such as blood plasma, urine, and cerebrospinal fluid. It is released into the circulation through processes such as cell death (apoptosis and necrosis) and is increasingly studied for its potential as a non-invasive biomarker for various medical conditions, including cancer. cfDNA is typically small because it originates from the fragmentation of DNA released from cells. The size of cfDNA fragments is influenced by various biological processes, the primary ones being apoptosis and necrosis. During apoptosis, DNA is cleaved into fragments as part of the cellular breakdown. Necrosis, which is uncontrolled cell death often due to injury or disease, can also result in the release of DNA fragments. Various enzymes, such as nucleases, are present in the cellular environment, the extracellular environment, and the various fluid streams, such as the bloodstream and urine. These enzymes can further degrade DNA into smaller fragments.
DNA fragments can be released into the bloodstream as part of normal physiological processes, such as tissue turnover and repair. Other processes, such as clearance from the body, also affect the level and the size distribution of cfDNA fragments.
The combination of these factors results in the presence of small DNA fragments in the bloodstream. In the context of liquid biopsy and diagnostic applications, the small size of cfDNA is advantageous because it allows for easier extraction and analysis. The analysis of cfDNA has become a valuable tool in medical research and clinical applications, such as the detection of genetic mutations, monitoring of cancer, and prenatal testing.
The fragments of cfDNA are often double-stranded and relatively short, typically a few hundred base pairs in length. These fragments can have different termini (ends), including blunt ends or staggered ends with overhangs.
The size distribution of cell-free DNA (cfDNA) can vary, but there are general trends observed in its fragmentation. The size range of cfDNA is commonly described in terms of its median or average size and its broader range. Keep in mind that these values can be influenced by factors such as the underlying biology of the source tissue, the specific extraction methods, and the analytical techniques used for measurement.
The median size of cfDNA is often reported to be around 160-180 base pairs (bp). This means that about half of the cfDNA fragments are shorter than this size and half are longer. The broader size range of cfDNA fragments can extend from around 70 to several hundred base pairs. In healthy individuals, cfDNA fragments may vary in size, but they typically fall within this range.
In the context of cfDNA, the organization of histones is disrupted due to the fragmentation of DNA. However, certain studies have explored the possibility of analyzing nucleosome-sized fragments within cfDNA, which could potentially provide information about the chromatin structure of the cells from which the cfDNA originated. The idea is that the sensitivity of DNA to cleavage and fragmentation is not uniformly distributed over the chromosomes and can vary with time, differentiation status, activation status, mutation status, and other factors.
Although many applications are being developed with cfDNA and its usefulness has been established, much can still be improved with respect to retrieving the information that is present in cfDNA. It is not easy to distill the above-mentioned information from the sequences of cfDNA determined with the present methods. For instance, it is difficult to get good information on the exact ends of the cfDNA or the relative abundance of specific cfDNA.
The invention provides a method for sequencing nucleic acid molecules, comprising:
- providing a sample comprising a plurality of linear DNA molecules;
- contacting said sample with a terminal deoxynucleotidyl transferase (TdT) and a mixture of two or more different nucleotides under conditions in which nucleotides in the mixture are sequentially and randomly added to the 3' termini of the linear DNA molecules by the TdT and denaturing, when present, double-stranded DNA in said sample, thereby generating linear single-stranded DNA molecules comprising sequence identifiers at their 3' termini;
- contacting the linear single-stranded DNA molecules comprising the sequence identifiers with a single-stranded DNA ligase under conditions in which the ends of linear single-stranded DNA molecules comprising the sequence identifiers are self-ligated, thereby generating a plurality of circular single-stranded DNA molecules each comprising a DNA molecule and a sequence identifier;
- contacting the plurality of circular single-stranded DNA molecules with a polymerase with strand displacement activity and a primer and performing a rolling circle amplification (RCA) under conditions in which a plurality of amplification products are produced, each comprising one or more copies of a different circular single-stranded DNA molecule from the plurality of circular single-stranded DNA molecules;
- sequencing the plurality of amplification products, thereby generating a plurality of sequences of the plurality of amplification products; and
- identifying one or more sequences of a single-stranded DNA molecule in the sample using a sequence identifier.
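The steps of the method can be illustrated with a small in-silico toy model. The sketch below is only an illustration under stated assumptions: the function names `tdt_tail` and `rolling_circle` are hypothetical, the tail has a fixed length (in a real reaction the number of added nucleotides varies), and all nucleotides are drawn with equal probability.

```python
import random

def tdt_tail(fragment, nucleotides="ACGT", length=8, rng=None):
    """Toy model of TdT: append a random, template-independent tail to the 3' end.

    Assumption: fixed tail length and equal probability for each nucleotide.
    """
    rng = rng or random.Random()
    tail = "".join(rng.choice(nucleotides) for _ in range(length))
    return fragment + tail, tail

def rolling_circle(circle, copies=4, start=0):
    """Toy model of RCA: read the circle `copies` times, starting at offset `start`."""
    rotated = circle[start:] + circle[:start]
    return rotated * copies

rng = random.Random(0)  # fixed seed so the toy run is reproducible
tagged, tag = tdt_tail("GATTACAGATTACA", length=8, rng=rng)

# Self-ligation joins the 3' end of the tag to the 5' end of the fragment,
# so the tagged linear string doubles as the circle written from the 5' end.
concatemer = rolling_circle(tagged, copies=3, start=5)
```

Because the circle is copied more than once, the complete fragment followed by its identifier appears intact inside the concatemer even when copying starts in the middle of the fragment.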
The invention also provides a kit for sequencing a plurality of DNA molecules, wherein the kit comprises a TdT, a single-strand DNA ligase, a polymerase with strand displacement activity, and at least two nucleotides.
The invention also allows for the detection of circular DNA in cfDNA samples.
Circular DNA in human cell-free DNA (cfDNA) samples is a noteworthy topic in molecular biology and genetics. Circular DNA, unlike the more common linear DNA, forms a closed loop and is often associated with extrachromosomal DNA elements. In humans, cfDNA primarily originates from the breakdown of cells and is present in bodily fluids like blood.
The presence of circular DNA next to linear DNA in cfDNA samples is significant because it can provide valuable information about various physiological and pathological states. For instance, circular DNA can be used as a biomarker for detecting and monitoring diseases, including cancers, where it may reflect tumor-specific genetic alterations. Additionally, the study of circular DNA in cfDNA can shed light on aging processes, as its accumulation in cells has been linked to cellular senescence and age-related diseases. Therefore, the presence and analysis of circular DNA in human cfDNA samples have substantial implications for diagnostics, prognostics, and understanding human biology at the molecular level.
Circular DNA present in the cfDNA fraction will not be provided with a sequence identifier by the TdT but it is still present in the reaction, and it will be amplified. The absence of a TdT-synthesized tag in the sequencing reads will then indicate the reads that were generated starting from a circular DNA source.
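The distinction between reads from linear and circular sources can be sketched as a simple check for a non-templated tag. The code below is a toy illustration, not the analysis described in this document: it assumes the repeat unit of a concatemer and the matching target sequence are already known and error-free, whereas real data would need alignment to a reference and tolerance for sequencing errors. The names `find_tag` and `classify_origin` are hypothetical.

```python
def find_tag(repeat_unit, fragment):
    """Return the non-templated tag in a circular repeat unit, or None.

    The unit is treated as circular: every rotation is tried until one
    starts with the known fragment; whatever follows the fragment is the tag.
    """
    for i in range(len(repeat_unit)):
        rotated = repeat_unit[i:] + repeat_unit[:i]
        if rotated.startswith(fragment):
            return rotated[len(fragment):]
    return None

def classify_origin(repeat_unit, fragment):
    """Label a repeat unit by whether it carries a tag next to the fragment."""
    tag = find_tag(repeat_unit, fragment)
    if tag is None:
        return "unassigned"  # fragment not found in the unit
    return "linear" if tag else "circular"

# Fragment GATTACA with tag GGTT, copied starting mid-fragment
print(classify_origin("ACAGGTTGATT", "GATTACA"))
# The same fragment without any tag, copied starting mid-fragment
print(classify_origin("TACAGAT", "GATTACA"))
```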
Figure 1: Schematic representation of classic barcoding strategies and TdT strategies.
Figure 2: Schematic representation of a method of the invention and the ability to distinguish between circular and linear DNA in a sample.
A plurality of nucleic acid molecules refers to a diverse set or a multitude of DNA (deoxyribonucleic acid) or RNA (ribonucleic acid) molecules. The term "plurality" in this context emphasizes the presence of a variety or multiple types of nucleic acid molecules.
These molecules can differ in their sequences, lengths, and functions, reflecting the genetic diversity within a biological system. In a genomic context, a plurality of nucleic acid molecules may encompass the entire genetic material of an organism or be a subset thereof, representing a part of the complexity and heterogeneity of the genetic information encoded within that collection of molecules.
Terminal transferases are enzymes that catalyze the addition of nucleotides to the 3' end (terminus) of a DNA or RNA molecule. Terminal deoxynucleotidyl transferase (TdT) is an enzyme particularly known for its ability to add nucleotides to the 3' ends of DNA in a template-independent manner, meaning that it does not require a DNA template to guide the addition of nucleotides. In humans, the protein is encoded by the DNTT gene. Aliases for the gene or protein in humans are DNA Nucleotidylexotransferase; TDT; Terminal Deoxynucleotidyltransferase; Terminal Addition Enzyme; Terminal Transferase; or EC 2.7.7.31. For the present invention, it is preferred to use the commercially available human TdT. However, when suitably available, it is possible to substitute it with the ortholog from another species. It is also possible to substitute the enzyme with another transferase that has the same functional properties in kind, not necessarily in amount.
TdT is utilized in the labeling of DNA fragments for techniques such as DNA sequencing and in the addition of labeled nucleotides for various labeling experiments.
Important applications of TdT are in cloning and other molecular biology techniques. By adding a homopolymeric tail to the 3' end of a DNA fragment, researchers can create a template for priming the synthesis of a complementary DNA strand using a primer complementary to the homopolymeric tail.
Homopolymeric tailing can be achieved by contacting a DNA molecule with TdT in the presence of only one type of nucleotide. TdT will add the nucleotides to the 3' end of the DNA molecule. Because only one type of nucleotide is added, the resulting sequence is known and can be used to design a primer to synthesize a reverse complementary strand of the DNA molecule.
In the present invention, the TdT is incubated under conditions that allow the addition of two or more different nucleotides to the 3' termini of the plurality of single-stranded DNA molecules. In such circumstances, TdT adds nucleotides depending on the availability of each particular nucleotide in the mixture. The two or more different nucleotides are typically added in equimolar amounts, in which case the inclusion in the nascent sequence is apparently random. The complexity of the newly formed sequences increases with the number of different nucleotides in the mixture and the number of nucleotides that are added to the 3' termini. When TdT is allowed to add only one nucleotide to the 3' end, the complexity of the new sequences is very low: when two different nucleotides are used, for instance A and T, there are only two different ends, specifically ends with an A and ends with a T. The complexity rapidly increases with the number of nucleotides that are added to the 3' ends and the number of different nucleotides in the mixture. For two different nucleotides, the number of different sequences formed is 2^n, where n is the average length of the newly formed sequences. For three different nucleotides there are 3^n different sequences, whereas for four or more the complexity is (4 or more)^n. The two or more different nucleotides are preferably three or more, preferably four, different nucleotides. The nucleotides are preferably A, C, G and T.
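The relation between the number of nucleotide types, the tail length, and the number of possible identifiers can be worked through numerically. The sketch below assumes a fixed tail length n and k equally available nucleotide types; the function names are hypothetical. Having at least as many possible identifiers as molecules does not guarantee that every randomly generated identifier is unique.

```python
def identifier_complexity(num_nucleotide_types, tail_length):
    """Number of distinct identifiers of a given length: k ** n."""
    return num_nucleotide_types ** tail_length

def min_tail_length(num_nucleotide_types, num_molecules):
    """Smallest tail length n for which k ** n >= num_molecules."""
    if num_nucleotide_types < 2:
        raise ValueError("at least two different nucleotides are required")
    n = 0
    while num_nucleotide_types ** n < num_molecules:
        n += 1
    return n

print(identifier_complexity(2, 1))    # A or T at a single position
print(identifier_complexity(4, 10))   # A, C, G, T over ten positions
print(min_tail_length(4, 1_000_000))  # tail length needed for a million molecules
```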
The complexity (or number of different sequences) does not usually need to exactly match the number of sequences in the plurality of single-stranded DNA molecules.
Discrimination of different molecules in the mixture is aided by the increasing complexity of sequences in the plurality. The combination of an identifier and the sequence of the DNA molecule also provides identification information. So even when there are two or more identifiers with the same sequence, the linkage to target DNA molecules with a substantially different sequence allows adequate discrimination and subsequent identification of the target DNA molecules in the plurality of single-stranded DNA molecules. The complexity (number of different sequence identifiers generated) can be increased with an increased number of different nucleotides and increased length of the sequence identifier. The complexity can be increased to such an extent that each sequence identifier attached to the 3' end of the plurality of DNA molecules is unique. This is a nice feature but, as mentioned herein above, not an essential feature.
A method of the invention is particularly suited to sequence the entirety of linear DNA fragments, including the ends of the fragments. Barcoding target DNA or providing DNA with sequence identifiers is typically accomplished by capturing the DNA with a backbone or with adaptors by means of ligation. This typically involves manipulation of the ends of the DNA to allow the efficient ligation of the barcodes, causing the loss of terminal sequence. Other common methods rely on PCR to generate barcoded amplicons. This clearly leads to the loss of sequences outside the amplification range.
Instead, by using a terminal transferase, preferably TdT, to produce unique identifiers and by circularizing the resulting DNA, the method of this invention makes it possible to sequence the target DNAs completely, including the ends of the fragments (Figure 1).
Conditions in which nucleotides in the mixture are sequentially and randomly added to the 3' termini of the DNA molecules by the TdT are known to the person skilled in the art. The human enzyme is well known, and various commercial sources are available. Conditions encompass a suitable buffer for the enzyme to work in, a suitable temperature, and sufficient time to produce sequence identifiers of sufficient complexity at the 3' termini.
Upon provision with sequence identifiers, the DNA molecules comprising the sequence identifiers at their 3' termini are contacted with a single-stranded DNA ligase under conditions in which the ends of the DNA molecules comprising the sequence identifiers are self-ligated, thereby generating a plurality of circular single-stranded DNA molecules each comprising a DNA molecule and a sequence identifier. In a preferred embodiment, the single-stranded DNA ligase is a CircLigase™. CircLigase™ is a thermostable ligase that catalyzes the intramolecular ligation (i.e. circularization) of ssDNA and ssRNA templates. Self-ligation can further be favored over intermolecular ligation by reducing the concentration of DNA molecules in solution, that is, by increasing the volume of the ligation reaction. The ligation reaction is performed using buffer conditions that support efficient ligation of the single strands. If necessary, the DNA molecules comprising the sequence identifiers at their 3' termini are provided with a phosphate group at the 5' end prior to the ligation. This can be done with a suitable polynucleotide kinase (PNK) using the protocol of the manufacturer.
Methods of the invention are particularly suited to produce DNA circles that each have one single-stranded DNA molecule from the plurality of single-stranded DNA molecules and one sequence identifier.
The plurality of circular single-stranded DNA molecules is subsequently contacted with a polymerase with strand displacement activity and one or more primers, whereupon a rolling circle amplification reaction is performed, thereby generating a plurality of amplification products, each comprising one or more copies of a circular single-stranded DNA molecule from the plurality of circular single-stranded DNA molecules.
In embodiments of the invention, leftover linear DNA, if any, is preferably removed prior to the rolling circle amplification. Performing a rolling circle amplification after removing linear DNA typically produces more high molecular weight concatemers of backbone and target DNA.
Methods further include subjecting DNA circles that are produced in the ligation reaction to an amplification reaction, preferably a rolling circle amplification (RCA). Rolling circle amplification produces an ordered array of copies of at least two of said DNA circles.
Rolling circle amplification produces DNA molecules of high molecular weight, which are suited for sequencing. The multiple copies of the same circle and, thus, of the same DNA molecule from the plurality and the same sequence identifier allow for a more accurate sequence determination and reduce sequencing errors.
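How repeated copies of the same circle can reduce sequencing errors may be sketched as a per-position majority vote. This is a minimal illustration under simplifying assumptions: the length of the repeat unit is known, errors are substitutions only (no insertions or deletions), and the function names are hypothetical.

```python
from collections import Counter

def split_copies(concatemer, unit_length):
    """Cut a concatemer read into complete, consecutive copies of the repeat unit."""
    last_start = len(concatemer) - unit_length
    return [concatemer[i:i + unit_length] for i in range(0, last_start + 1, unit_length)]

def consensus(copies):
    """Majority vote per position across equally long copies."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*copies))

# Three full copies of ACGT (the third with one substitution) plus a partial copy
copies = split_copies("ACGTACGTACCTAC", 4)
print(copies)
print(consensus(copies))
```

With three copies, a single substitution in one copy is outvoted by the two correct copies; with only two copies a disagreement cannot be resolved this way.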
Rolling circle amplification has recently been reviewed by Mohsen and Kool (2016) Acc Chem Res. Vol 49(11): pp 2540-2550; published online 2016 Oct 24; doi: 10.1021/acs.accounts.6b00417. The terms rolling circle amplification and rolling circle replication are sometimes used interchangeably in the art. In other instances, rolling circle replication is used to refer to the replication of naturally occurring plasmid and virus genomes. The terms refer to a similar underlying principle, i.e. the repeated copying of the same circular DNA producing a longer nucleic acid molecule with an ordered array of backbone-target nucleic acid copies. Present techniques for rolling circle amplification enable the production of large arrays containing many copies of the produced DNA circles.
Concatemers can have 2 or more copies, preferably 4 or more copies of the produced circles.
Rolling circle amplification is performed by a polymerase with strand displacement activity and requires the usual priming sequence to generate the start. Particular polymerases with strand displacement activity and with high processivity are available to produce concatemers of considerable length. Polymerases with high processivity are polymerases that can polymerize a thousand nucleotides or more without dissociating from the DNA template. They can preferably polymerize two, three, four thousand nucleotides or more without dissociating from the DNA template. Polymerases with high processivity are among others discussed in Kelman et al; 1998: Structure Vol 6; pp 121-125. Rolling circle amplification can yield very high molecular weight concatemers using polymerases with high processivity and strand-displacement capacity, such as phi29 polymerase. This polymerase can polymerize 10 kb or more. High processivity polymerases are, therefore, preferably polymerases that polymerize 10 kb or more without dissociating from the DNA template (Blanco et al; 1989. J. Biol. Chem. 264 (15): 8935-40). The polymerization can be started on a nick in the double-strand DNA, or the DNA can be melted and annealed in the presence of one or more suitable primers. Examples of suitable primers are random hexamer primers, one or more backbone-specific primers, one or more target nucleic acid-specific primers, or a combination thereof. Random primers are typically preferred when target nucleic acid sequences are not known or when a variety of target nucleic acid sequences are to be sequenced. One or more specific primers can be used to sequence specific target nucleic acids of which the base sequence is known. A variant is one or more primers that are specific for one or more particular sequences in a plurality of DNA molecules.
Such primers can be used in different situations, for instance, in the focused sequencing of particular sets of sequences, for instance, sequences known to be associated with certain cancers or certain pathogens. The polymerase is preferably Phi29 DNA Polymerase, Bst DNA Polymerase, or Vent (exo-) DNA Polymerase. In a preferred embodiment, the polymerase is Phi29 DNA Polymerase. The Phi29 DNA Polymerase is preferably EquiPhi29 DNA polymerase. Both are available from Thermo Fisher. EquiPhi29 DNA polymerase is a mutant of Phi29 DNA polymerase with particularly favorable properties.
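The processivity figures given for these polymerases translate into a rough number of circle copies per concatemer. The arithmetic below is only a back-of-the-envelope sketch: it assumes that the polymerase copies a fixed number of nucleotides in one run and ignores primer position and partial copies. The function name and the example circle sizes are illustrative assumptions.

```python
def copies_per_concatemer(polymerized_nt, circle_nt):
    """Whole copies of a circle obtained from one uninterrupted polymerization run."""
    if circle_nt <= 0:
        raise ValueError("circle length must be positive")
    return polymerized_nt // circle_nt

# A 170-nt fragment with a 10-nt identifier gives a 180-nt circle.
print(copies_per_concatemer(10_000, 180))  # polymerase that copies 10 kb per run
print(copies_per_concatemer(1_000, 180))   # polymerase that copies 1,000 nt per run
```

Under these assumptions, short circles of the size typical for cfDNA comfortably reach the preferred four or more copies per concatemer.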
Sequencing methods have evolved over time. The old Sanger sequencing method has been replaced by the now common next-generation sequencing (NGS) methods. These methods have recently been reviewed by Goodwin et al. (2016; Nature Reviews Genetics Volume 17: pp 333-351; doi: 10.1038/nrg.2016.49). The most common NGS methods rely on the sequencing of short stretches of DNA. Sequencing techniques for short stretches of DNA suffer from inherent error profiles. Errors are reduced by independently sequencing multiple copies of the same target sequence. However, for each individual sequence read it is impossible to determine whether a change represents an error or a true mutation. The cumulative evidence across several independent sequence reads allows for the filtering of mutations introduced during amplification and errors in sequencing. Longer target DNAs can also be sequenced with short-read methods. This is typically done by sequencing overlapping fragments that can be aligned to create an assembled longer sequence. This so-called short-read paired-end technique has been very successful in the sequencing of large target nucleic acids and has been instrumental in the various genome projects. The genome projects have revealed that genomes are highly complex with many long repetitive elements, copy number alterations and structural variations. Many of these elements are so long that short-read paired-end technologies are insufficient to resolve them. Long-read sequencing delivers reads in excess of several kilobases and allows for the resolution of these large structural features in whole genomes. Two popular platforms for long-read sequencing are the Pacific Biosciences systems (the RSII and the Sequel) and the Oxford Nanopore systems (MK1 MinION and PromethION). Both are single-molecule sequencers. Both platforms allow reads in excess of 55 kb. However, these systems have even higher error rates than next (second) generation sequencers. These errors can be reduced by increasing the number of times the same target nucleic acid is sequenced (Goodwin et al 2016; doi: 10.1038/nrg.2016.49). For the present invention it is preferred to use a short-read sequencing method, preferably the Illumina Sequencing Platform (Illumina sequencing). Illumina sequencing, also known as sequencing by synthesis, is a widely adopted high-throughput DNA sequencing technology. Illumina sequencing is characterized by its high accuracy, high throughput, and relatively short read lengths. It is widely used for various applications, including whole-genome sequencing, resequencing, metagenomics, and transcriptomics. The technology has played a pivotal role in advancing genomics research and has become a cornerstone in many laboratories for its efficiency and cost-effectiveness.
Determined sequences can be aligned based on the sequence identifiers and, optionally, the determined sequence of a single-stranded DNA molecule. The sequences from the aligned different reads can be used to filter out errors and determine the sequence of one single-stranded DNA molecule from the plurality of single-stranded DNA molecules.
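Grouping reads by sequence identifier, optionally combined with part of the determined sequence, can be sketched as follows. This is a toy illustration: it assumes each read has already been split into an identifier and a target sequence, and the function name `group_reads` and the use of a fixed-length sequence prefix as the optional second key are assumptions made for the example.

```python
from collections import defaultdict

def group_reads(reads, prefix_length=0):
    """Group (identifier, sequence) reads by identifier.

    With prefix_length > 0 the first bases of the sequence are added to the
    key, so molecules that happen to share an identifier are kept apart.
    """
    groups = defaultdict(list)
    for identifier, sequence in reads:
        key = (identifier, sequence[:prefix_length])
        groups[key].append(sequence)
    return dict(groups)

reads = [
    ("AGTC", "GATTACAGG"),
    ("AGTC", "GATTACAGG"),
    ("AGTC", "CCGGAATTC"),  # same identifier, clearly different molecule
    ("TTGA", "CCGGAATTC"),
]

by_identifier = group_reads(reads)
by_identifier_and_sequence = group_reads(reads, prefix_length=4)
print(len(by_identifier), len(by_identifier_and_sequence))
```

The reads within each group can then be compared position by position to filter out errors, and the number of groups gives an estimate of the number of distinct original molecules.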
The plurality of linear single-stranded DNA molecules can be derived from any source. When the source is or contains linear double-stranded DNA, the source can be denatured to create the plurality of single-stranded DNA molecules. The denaturing step can be done before or after the step wherein the sequence identifier is added to the 3' end of the DNA molecules. The denaturing step is preferably done after the sequence identifier is added. By denaturing is meant that the two strands of a double-stranded DNA are separated from each other. This can be done in various ways such as but not limited to a pH shift or a heat treatment. In the present invention it is preferred that the strands are separated by a heat treatment, such as an incubation at 98 °C for 10 minutes.
An RNA source can be copied into DNA and made single-stranded to become a plurality of single-stranded DNA molecules. To this end, it is preferred that a plurality of RNA molecules is contacted with a reverse transcriptase under conditions in which RNA molecules are copied into cDNA molecules, thereby producing said plurality of single-stranded DNA molecules.
A plurality of single-stranded nucleic acid molecules is a plurality of linear single-stranded nucleic acid molecules. A plurality of single-stranded DNA molecules is a plurality of linear single-stranded DNA molecules. A plurality of single-stranded RNA molecules is a plurality of linear single-stranded RNA molecules.
The plurality of single-stranded molecules preferably comprises circulating free DNA. DNA that circulates freely or that is associated with cellular particles in the blood or other bodily fluid samples is typically smaller than 400 nucleotides. Target nucleic acid molecules of such lengths are particularly suited to the methods of the invention. Other samples with relatively small nucleic acid molecules are some types of forensic samples, fossil samples, and samples of nucleic acid isolated from environments that are inherently hostile to nucleic acid molecule integrity, such as stool samples, surface water samples, and other samples rich in microbial organisms. The sample can be any sample comprising one or more nucleic acid molecules of which the sequence is to be determined. One example is a sample comprising tumor DNA. The plurality of single-stranded DNA molecules can also be circulating tumor DNA (ctDNA) or cell-free DNA (cfDNA) present in liquid biopsies, including but not limited to blood, saliva, pleural fluid, or ascites fluid. The plurality of single-stranded DNA molecules can also be single-stranded cDNA derived from messenger RNA, microRNA, CRISPR RNA, non-coding RNA, viral RNA, or other sources of RNA. The plurality of single-stranded DNA molecules can also be derived from genomic DNA, PCR products, plasmid DNA, viral DNA, or other sources. The means and methods of the present invention are particularly suited for the sequencing of short DNA, preferably 400 base pairs or smaller. DNA in the plurality of single-stranded DNA molecules is preferably 400 base pairs or less, more preferably 300 base pairs or less, more preferably 200 base pairs or less, more preferably 150 base pairs or less. The lower limit of the target DNA is preferably 20 base pairs, more preferably 30 base pairs, more preferably 40 base pairs and more preferably 50 base pairs. Any lower limit can be combined with any upper limit.
The size of DNA fragments is given in nucleotides here. This refers, of course, to the number in one strand.
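The combination of a lower and an upper size limit can be illustrated with a short sketch. The function name `select_fragments` and the example sequences are hypothetical and serve only to show how one of the preferred size windows (here 20 to 400 nucleotides, counted on one strand) could be applied to a set of reads or fragments.

```python
from typing import Iterable, List


def select_fragments(seqs: Iterable[str], min_len: int = 20, max_len: int = 400) -> List[str]:
    """Keep sequences whose single-strand length (in nucleotides) lies in [min_len, max_len]."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return [s for s in seqs if min_len <= len(s) <= max_len]


# Hypothetical fragments of 15, 40, 168 and 401 nucleotides.
fragments = ["A" * 15, "ACGT" * 10, "ACGT" * 42, "A" * 401]

kept = select_fragments(fragments, min_len=20, max_len=400)
print([len(s) for s in kept])  # [40, 168]
```

Any other pair of limits named above, for example 50 and 150, can be passed in the same way.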
The invention further provides a kit for sequencing a plurality of DNA molecules, wherein the kit comprises a TdT, a single-strand DNA ligase, a polymerase with strand displacement activity, and at least two nucleotides.
The kit preferably comprises CircLigase as a ligase and a Phi29 Polymerase as the polymerase.
cfDNA holds significant diagnostic potential due to its unique properties and the wealth of information it carries. Released into body fluids such as the bloodstream and urine from various cells, including tumor cells, apoptotic cells, and normal cells, cfDNA represents a non-invasive source of genetic material. Its diagnostic utility lies, among others, in detecting specific genetic alterations, such as mutations, methylation patterns, and copy number variations, reflecting the genomic landscape of the tissues of origin. In cancer diagnostics, the analysis of cfDNA allows for the identification of tumor-specific mutations, enabling early detection, monitoring of treatment response, and the assessment of minimal residual disease. Furthermore, cfDNA can be indicative of other pathological conditions, such as prenatal abnormalities or autoimmune disorders. The ability to obtain genetic information through a simple blood draw makes cfDNA a promising biomarker for diverse diagnostic applications, emphasizing its potential to revolutionize personalized medicine and enhance the precision of disease diagnosis and monitoring.
The sample comprising circulating fluid is preferably a blood sample such as a serum sample, a urine sample, a cerebrospinal fluid sample, a lymph fluid sample, or a saliva sample. The sample is preferably a blood sample, preferably a serum sample.
Example 1: Sequencing Complete cfDNA Fragments for Cancer Diagnostics
The method of the present invention was applied to sequence cfDNA fragments obtained from a group of cancer patients. The objective was to capture the complete sequences of these fragments, including their terminal regions, which are often challenging to sequence but hold crucial information for cancer diagnostics.
Procedure: cfDNA Extraction: cfDNA was extracted from blood samples of patients using a standard cfDNA extraction kit (QIA kit) according to the manufacturer's protocol. The extracted cfDNA predominantly consisted of short, double-stranded DNA fragments.
Terminal Deoxynucleotidyl Transferase (TdT) Treatment: The cfDNA was treated with TdT (obtained from New England Biolabs) and a mixture of four different nucleotides (A, T, C, G; Sigma) according to the manufacturer's protocol. This step randomly added a sequence of nucleotides to the 3' end of each cfDNA fragment, serving as a unique sequence identifier.
Preparation of Single-Stranded DNA: The double-stranded cfDNA was denatured for 10 minutes at 98 °C to generate single-stranded DNA molecules.
Circularization: The modified cfDNA fragments were then treated with the single-strand DNA ligase CircLigase (Biosearch Technologies) according to the manufacturer's protocol, leading to the circularization of these molecules.
Rolling Circle Amplification (RCA): The circularized DNA fragments underwent RCA with Phi29 DNA Polymerase (Thermo Fisher) according to the manufacturer's protocol, to produce linear DNA molecules comprising multiple copies of a cfDNA fragment linked to a sequence identifier.
Sequencing: The amplified products were sequenced using the Illumina Sequencing Platform, allowing for the identification of both the original cfDNA sequences and the unique sequence identifiers.
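The logic of the procedure above (random 3' tailing, circularization, rolling circle amplification, and recovery of the repeated unit from a read) can be sketched in a toy simulation. This is an illustration under simplifying assumptions and not the analysis pipeline of the example: the tail lengths, the function names, and the example fragment are hypothetical, strand complementarity of the RCA product is ignored, and reads are assumed to be error-free.

```python
import random


def tdt_tail(fragment: str, rng: random.Random, min_len: int = 5, max_len: int = 15) -> str:
    """Append a random tail to the 3' end (tail lengths are illustrative assumptions)."""
    n = rng.randint(min_len, max_len)
    return fragment + "".join(rng.choice("ACGT") for _ in range(n))


def rolling_circle(unit: str, product_len: int, start: int = 0) -> str:
    """Read around a circle made of `unit`, starting at offset `start`.

    Simplification: the product is written in the same sense as the circle,
    whereas a real RCA product is complementary to its template.
    """
    return "".join(unit[(start + i) % len(unit)] for i in range(product_len))


def smallest_period(concatemer: str) -> int:
    """Smallest p with concatemer[i] == concatemer[i + p] for every valid i."""
    n = len(concatemer)
    for p in range(1, n + 1):
        if all(concatemer[i] == concatemer[i + p] for i in range(n - p)):
            return p
    return n


rng = random.Random(0)
fragment = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # hypothetical cfDNA fragment
tagged = tdt_tail(fragment, rng)                           # fragment + identifier
product = rolling_circle(tagged, 3 * len(tagged) + 7, start=11)

p = smallest_period(product)
unit = product[:p]  # one full turn of the circle, at an arbitrary rotation

print(p == len(tagged))        # True
print(tagged in unit + unit)   # True: the unit is a rotation of the tagged molecule
```

Because the recovered unit is a rotation of the circle, both original ends of the fragment and the identifier lie inside the repeat, which is why the terminal regions are covered. With real reads, sequencing errors would call for an alignment-based consensus across the repeats instead of exact period matching.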
This procedure is expected to result in:
Complete Sequencing of cfDNA Fragments: The method enabled the sequencing of entire cfDNA fragments, including their terminal regions. This comprehensive approach revealed mutations and alterations at the edges of cfDNA fragments, which are typically missed by conventional methods.
Utility of Terminal Sequences: The terminal regions of cfDNA provided valuable information, such as the fragmentation patterns characteristic of certain types of cancers, and helped in distinguishing between cfDNA from tumor and non-tumor origins.
Example 2: Integration with AI for Enhanced Diagnostics
The sequences obtained, including the terminal regions, can be fed into an AI-based analytical model designed to identify patterns and mutations associated with various cancers. The AI model can then be trained to recognize cancer-specific signatures in the cfDNA sequences, including those present in the sequence edges.
Including these features is expected to bring:
Improved Diagnostic Accuracy: The AI model is expected to use the end-sequence information, together with other features, to distinguish between benign and malignant cfDNA samples.
Personalized Treatment Strategies: The model is also expected to provide insights into the genetic makeup of individual tumors, facilitating personalized treatment planning.
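One way in which end-sequence information could be turned into model input is sketched below. The text does not specify the features used; counting the first and last k bases of each complete fragment (end motifs) is an assumption made here for illustration, and the function name and example fragments are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def end_motif_frequencies(fragments: Iterable[str], k: int = 4) -> Dict[str, float]:
    """Relative frequency of the first and last k bases, pooled over all fragments."""
    counts: Counter = Counter()
    for seq in fragments:
        seq = seq.upper()
        if len(seq) < k:
            continue  # too short to carry a k-base end motif
        counts[seq[:k]] += 1   # motif at the 5' end
        counts[seq[-k:]] += 1  # motif at the 3' end
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()} if total else {}


# Hypothetical complete fragments; the third is shorter than k and is skipped.
sample = ["CCCAGTTTGACC", "CCCATTGACCGA", "ACG"]
print(end_motif_frequencies(sample, k=4))
# {'CCCA': 0.5, 'GACC': 0.25, 'CCGA': 0.25}
```

A per-sample frequency vector of this kind could be passed to a classifier together with other features, such as fragment length or detected mutations.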
Conclusion
The described method represents a significant advancement in the field of cancer diagnostics through liquid biopsies. The ability to sequence complete cfDNA fragments, including their terminal regions, combined with AI-based analysis, offers a promising approach to early cancer detection and personalized treatment strategies.
Materials and Methods
Materials
- Terminal Deoxynucleotidyl Transferase (NEB)
- CircLigase (Biosearch)
- EquiPhi29 DNA Polymerase (Thermo Fisher)
- Exo-resistant Random Primers (Thermo Fisher)
- CoCl2 2.5mM
- dNTPs 10mM
- ATP 10mM
- MnCl2 50mM
- DTT 100mM
- TdT Buffer (10X)
  - 500 mM Potassium Acetate
  - 200 mM Tris-acetate
  - 100 mM Magnesium Acetate
- CircLigase Buffer (10X)
  - MOPS 500 mM
  - KCl 100 mM
  - MgCl2 50 mM
  - DTT 10 mM
- CutSmart Buffer (10X)
  - 500 mM Potassium Acetate
  - 200 mM Tris-acetate
  - 100 mM Magnesium Acetate
  - 1000 µg/ml BSA
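The buffers in the materials list are given as 10X stocks. As a small worked example, the concentrations at 1X follow from dividing each stock concentration by the dilution factor. The dictionary and function names below are hypothetical; the stock values are taken from the list above.

```python
from typing import Dict

# 10X stock compositions in mM, as listed in the materials.
TEN_X_BUFFERS_MM: Dict[str, Dict[str, float]] = {
    "TdT Buffer": {"potassium acetate": 500, "Tris-acetate": 200, "magnesium acetate": 100},
    "CircLigase Buffer": {"MOPS": 500, "KCl": 100, "MgCl2": 50, "DTT": 10},
}


def working_concentrations(stock_mm: Dict[str, float], fold: float = 10.0) -> Dict[str, float]:
    """Concentration of each component after diluting the stock `fold` times."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: conc / fold for name, conc in stock_mm.items()}


print(working_concentrations(TEN_X_BUFFERS_MM["CircLigase Buffer"]))
# {'MOPS': 50.0, 'KCl': 10.0, 'MgCl2': 5.0, 'DTT': 1.0}
```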
Detailed Protocol
1) TdT tagging
cfDNA (0.5ngd) [10Gng
TdT Buffer | 1.5
dNTPs (10mM) | 15
Terminal Transferase | 0.5
Incubate for 20 min at 37 °C.
Heat inactivate for 10 min at 70 °C.
2) cfDNA denaturing
Reagent | Volume (ul)
CircLigase Buffer 10X | 3.5
Incubate for 10 min at 98 °C.
Cool down immediately on ice.
3) ssDNA self-circularization
To the previous reaction, add:
Reagent | Volume (ul)
Previous Reaction | 30
MnCl2 (50mM) | 35
CircLigase | 1
Incubate for 1 h at 60 °C.
Inactivate at 80 °C for 10 min.
4) Rolling Circle Amplification
Add 5 ul of Exo-resistant Random Hexamers (10uM).
Incubate for 10 min at 98 °C.
Cool down slowly to 22 °C.
Pre-mix the following reagents and then add them to the previous reaction:
Reagent | Volume (ul)
H2O | 173
CutSmart | 10
dNTPs (10mM) | 10
DTT (100mM) | 2
EquiPhi29 Poly. | 5
Incubate at 42 °C for 3 h, then keep at 4 °C.
Inactivate at 70 °C for 10 min.
5) Sequencing
This part differs based on the platform used; any platform can be used. The RCA product can be sequenced directly in the case of long-read sequencing, or it can be fragmented to generate short reads. Follow the manufacturer's methods as applicable.
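The timed incubation steps of the detailed protocol can be written down as a simple data structure, for example to estimate the hands-off incubation time. This is a sketch only: the class and variable names are hypothetical, the times and temperatures are those stated in steps 1 to 4, and cool-down periods and the hold at 4 °C are not counted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Step:
    stage: str
    action: str
    minutes: int
    temp_c: int


PROGRAM: List[Step] = [
    Step("TdT tagging", "incubate", 20, 37),
    Step("TdT tagging", "heat inactivate", 10, 70),
    Step("cfDNA denaturing", "denature", 10, 98),
    Step("self-circularization", "ligate", 60, 60),
    Step("self-circularization", "inactivate", 10, 80),
    Step("rolling circle amplification", "heat with primers", 10, 98),
    Step("rolling circle amplification", "amplify", 180, 42),
    Step("rolling circle amplification", "inactivate", 10, 70),
]


def minutes_by_stage(program: List[Step]) -> Dict[str, int]:
    """Sum the incubation minutes of each protocol stage, in protocol order."""
    totals: Dict[str, int] = {}
    for step in program:
        totals[step.stage] = totals.get(step.stage, 0) + step.minutes
    return totals


print(minutes_by_stage(PROGRAM))
# {'TdT tagging': 30, 'cfDNA denaturing': 10, 'self-circularization': 70, 'rolling circle amplification': 200}
print(sum(step.minutes for step in PROGRAM))  # 310
```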
Claims (15)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| NL2036501A NL2036501B1 (en) | 2023-12-12 | 2023-12-12 | Method for sequencing nucleic acid molecules and related methods. |
| PCT/NL2024/050664 WO2025127927A1 (en) | 2023-12-12 | 2024-12-12 | Method for sequencing nucleic acid molecules and related methods |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| NL2036501A NL2036501B1 (en) | 2023-12-12 | 2023-12-12 | Method for sequencing nucleic acid molecules and related methods. |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| NL2036501B1 true NL2036501B1 (en) | 2025-06-20 |
Family
ID=89898189
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| NL2036501A NL2036501B1 (en) | 2023-12-12 | 2023-12-12 | Method for sequencing nucleic acid molecules and related methods. |
Country Status (1)
| Country | Link |
|---|---|
| NL (1) | NL2036501B1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2016053638A1 (en) * | 2014-09-30 | 2016-04-07 | Ge Healthcare Bio-Sciences Corp. | Method for nucleic acid analysis directly from an unpurified biological sample |
| WO2018035170A1 (en) * | 2016-08-15 | 2018-02-22 | Accuragen Holdings Limited | Compositions and methods for detecting rare sequence variants |
| EP3798318A1 (en) * | 2019-09-30 | 2021-03-31 | Diagenode S.A. | A high throughput sequencing method and kit |
- 2023-12-12 NL NL2036501A patent/NL2036501B1/en active
Non-Patent Citations (4)
| Title |
|---|
| BLANCO ET AL., J. BIOL. CHEM., vol. 264, no. 15, 1999, pages 8935 - 40 |
| GOODWIN ET AL., NATURE REVIEWS GENETICS, vol. 17, 2016, pages 333 - 351 |
| KELMAN ET AL., STRUCTURE, vol. 6, 1998, pages 121 - 125 |
| MOHSEN, KOOL, ACC CHEM RES, vol. 49, no. 11, 24 October 2016 (2016-10-24), pages 2540 - 2550 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11214798B2 (en) | Methods and compositions for rapid nucleic acid library preparation | |
| EP3475449B1 (en) | Uses of a cell-free nucleic acid standards | |
| US20200063213A1 (en) | Methods of Amplifying DNA to Maintain Methylation Status | |
| CN117778531A (en) | Molecular library preparation methods and compositions and uses thereof | |
| US20220333186A1 (en) | Method and system for targeted nucleic acid sequencing | |
| WO2016181128A1 (en) | Methods, compositions, and kits for preparing sequencing library | |
| CN110295231A (en) | It is enriched with by selective allele or the sequencing of consumptive use low depth detects and quantitative rare variant | |
| CN113039285B (en) | Liquid Sample Workflow for Nanopore Sequencing | |
| US20170175182A1 (en) | Transposase-mediated barcoding of fragmented dna | |
| US20240301466A1 (en) | Efficient duplex sequencing using high fidelity next generation sequencing reads | |
| JP2023507876A (en) | Detection and analysis of methylation in mammalian DNA | |
| NL2036501B1 (en) | Method for sequencing nucleic acid molecules and related methods. | |
| US11174511B2 (en) | Methods and compositions for selecting and amplifying DNA targets in a single reaction mixture | |
| KR20230124636A (en) | Compositions and methods for highly sensitive detection of target sequences in multiplex reactions | |
| CN114450420A (en) | Compositions and methods for accurate assays in oncology | |
| US12037640B2 (en) | Sequencing an insert and an identifier without denaturation | |
| WO2025127927A1 (en) | Method for sequencing nucleic acid molecules and related methods | |
| WO2024218469A1 (en) | T cell receptor sequencing | |
| US20250109446A1 (en) | Compositions and methods for oncology assays | |
| HK40001895A (en) | Uses of a cell-free nucleic acid standards | |
| HK40001895B (en) | Uses of a cell-free nucleic acid standards | |
| HK40004109B (en) | Method of improved sequencing by strand identification | |
| HK40004109A (en) | Method of improved sequencing by strand identification |