
WO2025101114A1 - TISSUE OF ORIGIN OF cfDNA - Google Patents

TISSUE OF ORIGIN OF cfDNA

Info

Publication number
WO2025101114A1
Authority
WO
WIPO (PCT)
Prior art keywords
cfdna
skin
derived
extracted
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/SE2024/050958
Other languages
French (fr)
Inventor
Erik LEKHOLM
Anders STÅHLBERG
Kerryn ELLIOT
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • the present invention generally relates to processing of circulating free deoxyribonucleic acid (cfDNA) and in particular to determining the tissue of origin of such cfDNA.
  • cfDNA circulating free deoxyribonucleic acid
  • Circulating free deoxyribonucleic acid, also referred to as cell-free DNA in the art, consists of degraded DNA fragments released into body fluids, such as blood plasma, urine, and cerebrospinal fluid.
  • the pool of cfDNA can contain various forms of DNA freely circulating in body fluids, including circulating tumor DNA (ctDNA), cell-free mitochondrial DNA (cf-mtDNA), cell-free fetal DNA (cf-fDNA) and donor-derived cell-free DNA (dd-cfDNA).
  • Elevated levels of cfDNA are observed in certain diseases and medical conditions, including cancer, especially in advanced cancer diseases. Accordingly, cfDNA has been suggested as a biomarker for diagnosis of not only cancer but also other diseases and medical conditions, such as trauma, sepsis, aseptic inflammation, myocardial infarction, stroke, transplantation, diabetes, and sickle cell disease.
  • Mechanisms of cfDNA release include apoptosis, necrosis and neutrophil extracellular trap (NET) activation and release (NETosis).
  • NET neutrophil extracellular trap
  • NETosis NET activation and release
  • the rapidly increased accumulation of ctDNA in blood during tumor development is caused by an excessive DNA release by apoptotic cells and necrotic cells and also secretion of DNA from cancer cells.
  • a disease diagnosis based on cfDNA requires or at least would benefit from determining the tissue of origin of the cfDNA in order to give an accurate diagnosis, such as the location of a primary tumor.
  • Attempts to determine the origin of cfDNA include probing methylation patterns, which are believed to differ depending on the cell type, as disclosed in Warton et al. 2016, and cfDNA fragmentation patterns, which depend on cell-specific histone positioning patterns, as disclosed in Snyder et al. 2016.
  • An aspect of the invention relates to a method for determining tissue of origin of cfDNA.
  • the method comprises extracting cfDNA from a body fluid sample.
  • the method also comprises analyzing the extracted cfDNA for the presence of mutations at at least one E26 transformation-specific (ETS) transcription factor binding site in the extracted cfDNA.
  • the method further comprises determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
  • ETS E26 transformation-specific
  • the present invention predicts whether cfDNA extracted from a body fluid sample is skin-derived.
  • the invention can be used in disease diagnosis, such as detection, of diseases causing release of cfDNA from skin tissue into the circulation, in particular diagnosis of the presence of skin cancer or in the prognosis of skin tumors.
  • Fig. 1 is a flow chart illustrating a method for determining tissue of origin of cfDNA according to an embodiment;
  • Fig. 2 is a flow chart illustrating an embodiment of the analyzing step in Fig. 1;
  • Fig. 3 is a flow chart illustrating an embodiment of the amplifying step in Fig. 2;
  • Fig. 4 is a flow chart illustrating additional optional steps of the method;
  • Fig. 5 schematically illustrates the amplification steps of the embodiment shown in Fig. 3;
  • Fig. 6 illustrates UV exposure detected in cultured cells (A375) using an assay according to the invention.
  • Fig. 7 illustrates skin-derived cfDNA detectable in blood plasma samples from 15 melanoma patients analyzed using an assay according to the invention.
  • cfDNA consists of degraded DNA fragments released into body fluids and can contain various forms of DNA freely circulating in body fluids, including circulating tumor DNA (ctDNA), cell-free mitochondrial DNA (cf-mtDNA), cell-free fetal DNA (cf-fDNA) and donor-derived cell-free DNA (dd-cfDNA).
  • Elevated levels of cfDNA are observed in certain diseases and medical conditions, especially cancer, but also in other diseases and medical conditions, such as trauma, sepsis, aseptic inflammation, myocardial infarction, stroke, transplantation, diabetes, and sickle cell disease. It may be highly valuable to determine the origin of the cfDNA, in particular when using the cfDNA as a biomarker for disease diagnosis. For instance, detection of ctDNA in a body fluid sample from a subject indicates that the subject suffers from a cancer disease. However, the tissue of origin of the cancer disease is not directly evident merely from the detection of such ctDNA in the body fluid sample.
  • the present invention enables determination or at least prediction of the tissue of origin of cfDNA, including ctDNA.
  • the invention can determine or predict whether cfDNA is of skin origin, i.e., is skin-derived cfDNA.
  • the present invention can be used to determine whether ctDNA is of skin origin, i.e., is skin-derived DNA released from skin cancer cells.
  • Skin cancers are either nonmelanoma skin cancer (NMSC) including basal-cell skin cancer (BCC) and squamous-cell skin cancer (SCC), or melanoma.
  • ctDNA accumulates in body fluids, such as blood, during tumor development owing to excessive DNA release by apoptotic cells and necrotic cells. Determining that the ctDNA is of skin origin may assist in diagnosing the subject as suffering from skin cancer.
  • skin diseases may cause release of skin-derived DNA into body fluids due to DNA released into circulation by secretion or following necrosis, apoptosis or other mechanisms of cell death.
  • skin diseases include, but are not limited to, vascular malformations, psoriasis, and systemic lupus erythematosus (SLE).
  • the present invention is based on the finding that ultraviolet (UV) radiation (UVR) or light induces mutations at certain positions in the genome that are highly vulnerable to mutagenesis by UV light.
  • UV radiation UVR
  • these mutations have a characteristic sequence signature, primarily cytosine (C) to thymine (T) substitutions in dipyrimidine sequence contexts, which arise because the main mutagenic mechanism of UV light involves formation of DNA damage in the form of cyclobutane pyrimidine dimers (CPDs).
  • C cytosine
  • T thymine
  • CPDs cyclobutane pyrimidine dimers
  • Some proteins, when bound to genomic DNA, can alter the local propensity for forming DNA damage from UV light.
  • Transcription factors of the E26 transformation-specific (ETS) family, when bound to their specific binding sites or elements, can potently increase CPD formation in response to UV light.
  • The elevated CPD formation at ETS transcription factor binding sites occurs primarily at pyrimidines immediately upstream of the core motif (YYTTCCK) but also, to a lesser extent, at the central dipyrimidine position within the core motif (TTCCK). This phenomenon leads to widespread recurrent mutation hotspots in skin cells at such ETS transcription factor binding sites and a sharply elevated mutation rate at such hotspot sites. In agreement with these ETS transcription factor binding sites being hypersensitive specifically to UV light, ETS hotspot mutations are common in sun-exposed skin cancers as well as normal skin cells, while being completely absent in other, non-UV-exposed tissue.
  • Mutations at these ETS transcription factor binding sites take place exclusively in cells that have been directly exposed to UV light, and it is therefore highly unlikely that these ETS transcription factor binding sites are mutated in cells that have not been exposed to UV light.
  • This means that mutations in such ETS transcription factor binding sites in cfDNA can be used as an indication that the cfDNA is of skin origin, such as being skin-derived ctDNA.
  • An aspect of the invention therefore relates to a method for determining tissue of origin of cfDNA, see Fig. 1.
  • the method comprises extracting cfDNA from a body fluid sample in step S1.
  • the method also comprises analyzing the extracted cfDNA for the presence of mutations at at least one ETS transcription factor binding site, i.e., at one or more such ETS transcription factor binding sites, in the extracted cfDNA in step S2.
  • the next step S3 comprises determining, or at least predicting, the cfDNA to be skin-derived if mutations are detected, based on the analysis in step S2, at the at least one ETS transcription factor binding site.
  • step S3 also comprises determining, or at least predicting, the cfDNA to not be skin-derived if no mutations are detected, based on the analysis in step S2, at the at least one ETS transcription factor binding site.
  • the body fluid sample, from which the cfDNA is extracted in step S1 can be any body fluid sample that comprises such cfDNA.
  • body fluid samples include a saliva sample, a urine sample, a cerebrospinal fluid sample, a blood sample, a blood plasma sample, a blood serum sample, an amniotic fluid sample, a pleural effusion sample, a bronchial lavage sample, a bronchial aspirate sample, a breast milk sample, a tear sample, a seminal fluid sample, a peritoneal fluid sample, a flexural effusion sample, and a colostrum sample.
  • the body fluid is selected from the group consisting of blood, blood plasma, and blood serum.
  • the body fluid sample could be taken from a subject, i.e., is a so-called liquid biopsy taken from a subject.
  • the subject is an animal, preferably a mammal, and more preferably a human.
  • the body fluid sample could also be a processed fluid sample that originates from a subject but has then been processed ex vivo, such as filtered, centrifuged, frozen and thawed, etc.
  • Step S1 in the method shown in Fig. 1 comprises extracting the cfDNA from the body fluid sample.
  • a cfDNA extraction could be done using any known cfDNA extraction process.
  • Illustrative, but nonlimiting, examples of such extraction processes involve using columns, such as spin columns containing silica membranes, or magnetic beads, phenol-chloroform-based processes, filtration-based processes, microfluidic chips, etc.
  • Kits for extraction of cfDNA from body fluid samples are commercially available from various vendors including Qiagen, such as QIAamp® circulating nucleic acid kit; ThermoFisher Scientific, such as MagMAX™ Cell-Free DNA Isolation kit; Promega, such as Maxwell® RSC ccfDNA plasma kit; Omega Bio-Tek, such as Mag-Bind® Blood & Tissue DNA HDQ 96 kit; Active Motif, such as Active Motif® Cell-Free (cfDNA) Purification kit; and Beckman Coulter Life Sciences, such as Apostle MiniMax™ High Efficiency cfDNA Isolation kit.
  • Step S2 in the method shown in Fig. 1 analyzes the cfDNA extracted in step S1 for the presence of mutations at one or more ETS transcription factor binding sites in the extracted cfDNA.
  • the ETS family is one of the largest families of transcription factors that is unique to animals. ETS stands for E26 transformation-specific, also referred to as E-twenty-six or erythroblast transformation specific in the art.
  • Transcription factors of the ETS family i.e., ETS transcription factors
  • K represents guanine (G) or T, i.e., TTCC[G/T]
  • Binding of an ETS transcription factor to such a binding site can make the binding site vulnerable to UV-induced mutations primarily at pyrimidines immediately upstream of the core motif of the binding site but also, typically to a lesser extent, at a central dipyrimidine position within the core motif, i.e., within TTCCK.
  • UV-induced mutations at an ETS transcription factor binding site preferably implies UV-induced mutations within and/or flanking the 5’ end of the ETS transcription factor binding site.
  • step S2 of Fig. 1 comprises analyzing the extracted cfDNA for the presence of mutations at at least one TTCCK or Y1Y2TTCCK sequence in the extracted DNA.
  • K is as defined above and Y1 and Y2 are independently C or T.
  • the dipyrimidines that are particularly mutagenic are CC, CT and TC.
  • TT is generally regarded as being less mutagenic.
  • Y1 and Y2 are independently C or T with the proviso that Y1 and Y2 are not both T.
  • UV-induced mutations at such ETS transcription factor sites typically occur at the following positions: the C at position Y2 in Y1CTTCCK, the C at position Y1 in CY2TTCCK, and the first C of the core motif in NNTTCCK, wherein N represents any of T, C, G or adenine (A).
  • the UV-induced mutations are typically in the form of C-to-T (C>T) mutations.
  • the mutated versions of the above-mentioned sequences are Y1TTTCCK, TY2TTCCK and NNTTTCK.
  • the UV-induced mutations are somatic mutations, i.e., non-inherited mutations.
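  • The motif definition and mutation positions described above can be sketched in code. The following is a minimal illustration, not the assay of the invention: it assumes the read has already been aligned to the reference without insertions or deletions, it only inspects the given strand, and the function names are hypothetical.

```python
import re

# Y1Y2TTCCK: two upstream pyrimidines (C or T) followed by the core motif TTCC[G/T].
# A lookahead is used so that overlapping occurrences are all reported.
ETS_SITE = re.compile(r"(?=([CT][CT]TTCC[GT]))")


def find_ets_sites(reference):
    """Return 0-based start positions of Y1Y2TTCCK sites on the given strand."""
    return [m.start() for m in ETS_SITE.finditer(reference.upper())]


def c_to_t_mutations(reference, read, site_start):
    """Report C>T changes at offsets 0 and 1 (Y1, Y2) and 4 (first core C).

    Assumption: `read` is aligned to `reference` without indels, so the
    same index addresses the same genomic position in both strings.
    """
    hits = []
    for offset in (0, 1, 4):
        pos = site_start + offset
        if reference[pos] == "C" and read[pos] == "T":
            hits.append((offset, "C>T"))
    return hits
```

  • A real pipeline would also need to handle the reverse strand and alignment gaps; those are left out here to keep the sketch short.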
  • Step S2 of Fig. 1 could involve analysis of mutations at one ETS transcription factor binding site in extracted cfDNA.
  • the analysis of step S2 is performed at a plurality, i.e., at least two and more preferably at least three, ETS transcription factor binding sites.
  • step S2 comprises analyzing the extracted cfDNA for the presence of mutations at a plurality of ETS transcription factor binding sites in the extracted cfDNA.
  • step S3 comprises, in this embodiment, determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis in step S2, at at least one ETS transcription factor binding site of the plurality of ETS transcription factor binding sites.
  • step S2 is performed at at least 5 ETS transcription factor binding sites, preferably at at least 10 ETS transcription factor binding sites, and more preferably at at least 15 ETS transcription factor binding sites, such as at at least 20 ETS transcription factor binding sites.
  • each such ETS transcription factor binding site could include up to three UV-induced mutations.
  • step S3 of the method comprises determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis in step S2, at multiple ETS transcription factor binding sites of the plurality of ETS transcription factor binding sites.
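  • The decision in step S3 can be written as a simple rule over per-site results. The sketch below is illustrative only; the function name, the input mapping and the `min_mutated_sites` parameter are assumptions used to cover both the "at least one site" and the "multiple sites" embodiments.

```python
def is_skin_derived(site_mutated, min_mutated_sites=1):
    """Step S3 sketch: call cfDNA skin-derived if enough assayed sites are mutated.

    site_mutated maps a (hypothetical) site label to True if a mutation
    was detected at that ETS transcription factor binding site.
    """
    if min_mutated_sites < 1:
        raise ValueError("min_mutated_sites must be at least 1")
    mutated = sum(1 for is_mutated in site_mutated.values() if is_mutated)
    return mutated >= min_mutated_sites
```

  • With `min_mutated_sites=1` a single mutated site suffices; setting it to 2 or more corresponds to requiring mutations at multiple sites of the panel.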
  • Tumors are, to variable degrees, clonal expansions originating from a single cancer cell.
  • UV-induced mutations in skin cancer tumors tend to exhibit a more binary pattern, with a given UV-induced mutation being either present or absent.
  • the pattern is in each case determined by the genetic makeup of the original skin cancer cell clone that gave rise to the skin cancer tumor mass. This is to be compared to the non-clonal cell population exposed to UV light in Fig. 6, which exhibits broad presence of nearly all UV-induced mutations assayed.
  • the degree of clonality indicated by UV-induced mutations in skin-derived cfDNA can indicate whether the cfDNA originates from normal healthy or at least non-cancerous skin or a clonally expanded tumor cell mass, i.e., skin-derived ctDNA.
  • detection of UV-induced mutations at only a limited number of ETS transcription factor binding sites indicates that the cfDNA is skin-derived ctDNA, while UV-induced mutations detected at a broader spectrum of ETS transcription factor binding sites would instead indicate that the cfDNA is skin-derived non-tumor cfDNA.
  • the method comprises an additional step S4 as shown in Fig. 1.
  • This step S4 comprises determining whether the skin-derived cfDNA is skin-derived ctDNA or skin-derived and non-tumor-derived cfDNA based on a pattern or distribution of mutations at the plurality of ETS transcription factor binding sites in the extracted cfDNA.
  • step S4 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are only present in a limited subset of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA. In another embodiment, step S4 comprises determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA if the mutations are distributed among the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived ctDNA.
  • Skin-derived and non-tumor-derived cfDNA as used herein indicate that the cfDNA is not derived or originating from a skin cancer cell.
  • a panel of a plurality of ETS transcription factor binding sites is assayed for the presence or absence of any UV-induced mutations. If UV-induced mutations are detected in most of these ETS transcription factor binding sites, or in at least a majority thereof, as shown in Fig. 6, then the skin-derived cfDNA is determined or predicted to be skin-derived but not tumor-derived. Correspondingly, if UV-induced mutations are detected in merely one or a limited number, i.e., fewer than a majority, of the assayed ETS transcription factor binding sites, then the skin-derived cfDNA is determined or predicted to be skin-derived ctDNA.
  • the particular threshold value or values to be used for differentiating skin-derived ctDNA from skin-derived and non-tumor-derived cfDNA depend on the number of ETS transcription factor binding sites included in the assay panel. As a general rule, if UV-induced mutations are found in at least a majority of the ETS transcription factor binding sites, then the skin-derived cfDNA is most likely skin-derived and non-tumor-derived cfDNA.
  • the actual threshold value(s) could be determined as shown in the Example section, such as by exposing non-skin-cancer cells in vitro to UV light, extracting genomic DNA from the cells and analyzing the extracted genomic DNA for the presence of mutations at a plurality of ETS transcription factor binding sites.
  • the minimum number of ETS transcription factor binding sites to be mutated by the UV light in order for the DNA to be regarded as being skin-derived and non-tumor-derived cfDNA can be determined from such an analysis.
  • genomic DNA could be extracted from skin cancer cells from subjects and analyzed for the presence of mutations at the plurality of ETS transcription factor binding sites.
  • the maximum number of ETS transcription factor binding sites to contain UV-induced mutations in order for the DNA to be regarded as being skin-derived and tumor-derived cfDNA can be determined from such an analysis.
  • step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 50 % of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
  • step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 40 % of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
  • step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 30 %, preferably less than 25 %, of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
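  • The threshold embodiments of step S4 can be illustrated with a short sketch. The function name, labels and input mapping below are hypothetical; the default of 0.5 mirrors the "less than 50 %" embodiment and can be lowered to 0.4, 0.3 or 0.25 for the other embodiments.

```python
def classify_skin_cfdna(site_mutated, max_tumor_fraction=0.5):
    """Step S4 sketch: classify by the fraction of assayed sites that are mutated.

    site_mutated maps a (hypothetical) site label to True if a UV-induced
    mutation was detected at that ETS transcription factor binding site.
    """
    total = len(site_mutated)
    if total == 0:
        raise ValueError("at least one site must be assayed")
    mutated = sum(1 for is_mutated in site_mutated.values() if is_mutated)
    if mutated == 0:
        return "not skin-derived"
    # Mutations confined to a limited subset suggest a clonal (tumor) origin.
    if mutated / total < max_tumor_fraction:
        return "skin-derived ctDNA"
    return "skin-derived, non-tumor-derived cfDNA"
```

  • The comparison is strict, so a panel with exactly half of its sites mutated falls on the non-tumor side under the 50 % embodiment.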
  • the analysis of the extracted cfDNA in step S2 can be performed by any method or process that is able to detect a specific DNA base at a specific position in the extracted cfDNA.
  • Examples include amplification-based methods, such as polymerase chain reaction (PCR), including quantitative real-time PCR (qPCR), digital PCR (dPCR) and droplet dPCR (ddPCR); sequencing-based methods, such as high-throughput sequencing, next generation sequencing, and SiMSen-Seq; hybridization-based methods including microarrays and variants thereof, such as NanoString’s modified DNA microarrays; and combinations thereof.
  • PCR polymerase chain reaction
  • qPCR quantitative real-time PCR
  • dPCR digital PCR
  • ddPCR droplet dPCR
  • step S2 of Fig. 1 comprises, see Fig. 2, amplifying, in step S10, at least one portion of the extracted cfDNA comprising the at least one ETS transcription factor binding site to form a plurality of amplicons.
  • Step S2 also comprises, in this embodiment, sequencing, in step S11, at least a respective portion of the amplicons to form respective sequence reads.
  • the presence of mutations at the one or more ETS transcription factor binding sites can be determined by analyzing the sequence reads, i.e., determining the particular nucleotide sequence at the one or more ETS transcription factor binding sites to detect any possible UV-induced mutations, typically C>T mutations.
  • step S10 in Fig. 2 comprises, see Figs. 3 and 5, contacting, in step S20, the extracted cfDNA 1 with at least one hairpin barcode forward primer 10 and at least one reverse primer 20.
  • Each hairpin barcode forward primer 10 of the at least one hairpin barcode forward primer 10 comprises, from a 5’ end 11 to a 3’ end 17, a 5’ stem sequence 12, an adapter sequence 13, a unique molecular identifier (UMI) 14, a 3’ stem sequence 15 and a target-specific sequence 16 complementary to a nucleotide sequence upstream 2 of an ETS transcription factor binding site 4 of the at least one ETS transcription factor binding site 4.
  • UMI unique molecular identifier
  • Each reverse primer 20 of the at least one reverse primer 20 comprises, from a 5’ end 21 to a 3’ end 27, an adapter sequence 23 and a target-specific sequence 26 complementary to a nucleotide sequence downstream 3 of the ETS transcription factor binding site 4 of the at least one ETS transcription factor binding site 4.
  • At least a portion of the 5’ stem sequence 12 of the hairpin barcode forward primer 10 is complementary to at least a portion of the 3’ stem sequence 15 of the hairpin barcode forward primer 10.
  • the 5’ stem sequence 12 and the 3’ stem sequence 15 are configured to hybridize to each other at or under a closed annealing temperature and not hybridize to each other at or above an open annealing temperature.
  • Step S10 also comprises, in this embodiment, amplifying, in step S21, the at least one portion of the extracted cfDNA 1 comprising the at least one ETS transcription factor binding site 4 by performing PCR pre-amplification of that portion to form a plurality of barcoded PCR products 50.
  • the PCR preamplification in step S21 has an annealing temperature equal to or less than the closed annealing temperature of the at least one hairpin barcode forward primer 10.
  • Step S10 further comprises, in this embodiment, contacting, in step S22, the plurality of barcoded PCR products 50 with an adapter-specific forward primer 30 and an adapter-specific reverse primer 40 and amplifying, in step S23, the barcoded PCR products 50 by performing PCR amplification on the barcoded PCR products 50 to form a library of amplified barcoded PCR products 60.
  • At least a portion of cycles of the PCR amplification in step S23 has an annealing temperature equal to or greater than the open annealing temperature of the at least one hairpin barcode forward primer 10.
  • UMIs are known to be used to reduce such PCR-induced amplification biases and errors.
  • UMIs can be added to DNA by either ligation- or PCR-based approaches. Ligation-based UMI approaches require that target DNA is captured before the analysis, otherwise, all DNA molecules present in a sample will be analyzed. Another limitation is that DNA molecules are lost due to limited ligation efficiency. In comparison, PCR-based UMI approaches are simpler since no capture step is needed. PCR-based methods are potentially also more sensitive since they do not suffer from ineffective capture and ligation steps.
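  • One common way in which UMIs are used to suppress PCR and sequencing errors is to group reads by UMI and take a per-family consensus. The sketch below illustrates that general idea only; the majority-vote rule, the `min_family_size` parameter and the input format are assumptions and are not taken from the present description.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3):
    """Collapse reads sharing a UMI into one consensus base per UMI family.

    reads: iterable of (umi, base) pairs observed at one genomic position.
    Families smaller than min_family_size, or without a strict majority
    base, are discarded (both choices are illustrative assumptions).
    """
    families = defaultdict(list)
    for umi, base in reads:
        families[umi].append(base)

    consensus = {}
    for umi, bases in families.items():
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count * 2 > len(bases):  # strict majority within the family
            consensus[umi] = base
    return consensus
```

  • Counting consensus bases per UMI family, rather than raw reads, means that each original cfDNA molecule contributes at most once to a mutation call.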
  • all hairpin barcode forward primers 10 comprise the same 5’ stem sequence 12, the same adapter sequence 13 and/or the same 3’ stem sequence 15. However, the hairpin barcode forward primers 10 comprise different UMIs 14.
  • the hairpin barcode forward primers 10 may all have the same target-specific sequence 16 (but different UMIs 14), multiple, i.e., at least two, hairpin barcode forward primers 10 have the same target-specific sequence 16 or the hairpin barcode forward primers 10 may have different target-specific sequences 16.
  • hairpin barcode forward primers 10 that can be used in accordance with the embodiments are shown in Table 1.
  • the 5’ stem sequence 12 consists of the nucleotide sequence GGACACTCTTTCCC (SEQ ID NO: 53) that is complementary to and capable of hybridizing to the 3’ stem sequence 15 consisting of the nucleotide sequence GGGAAAGAGTGTCC (SEQ ID NO: 54)
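  • The complementarity of the two stem sequences can be checked by computing a reverse complement. The helper names below are hypothetical; the two sequences are SEQ ID NO: 53 and SEQ ID NO: 54 as given above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


STEM_5P = "GGACACTCTTTCCC"  # SEQ ID NO: 53
STEM_3P = "GGGAAAGAGTGTCC"  # SEQ ID NO: 54


def stems_pair(stem_5p, stem_3p):
    """True if the 3' stem is the exact reverse complement of the 5' stem."""
    return reverse_complement(stem_5p) == stem_3p.upper()
```

  • For these two sequences the check succeeds, which is consistent with the stems hybridizing to close the hairpin.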
  • NNNNNNNNNN represents the UMI 14.
  • the hairpin barcode forward primer 10 additionally comprises a nucleotide sequence forming the adapter sequence 13. This adapter sequence 13 may be positioned between the 5’ stem sequence 12 and the UMI 14.
  • an adapter sequence 13 of TACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 55) could be used in the examples shown in Table 1.
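  • The 5’ to 3’ layout of a hairpin barcode forward primer (5’ stem, adapter, UMI, 3’ stem, target-specific sequence) can be assembled as a string. This is a sketch under stated assumptions: the function name is hypothetical, the target-specific sequence is a placeholder supplied by the caller rather than an entry from Table 1, and the UMI is drawn uniformly at random with the 10-nucleotide length suggested by NNNNNNNNNN.

```python
import random

STEM_5P = "GGACACTCTTTCCC"                      # SEQ ID NO: 53
ADAPTER = "TACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO: 55
STEM_3P = "GGGAAAGAGTGTCC"                      # SEQ ID NO: 54


def build_hairpin_primer(target_specific, umi_length=10, rng=None):
    """Return (primer, umi) with the layout stem-adapter-UMI-stem-target.

    target_specific is a caller-supplied placeholder; rng allows a seeded
    random.Random instance to be passed for reproducible UMIs.
    """
    rng = rng or random.Random()
    umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
    primer = STEM_5P + ADAPTER + umi + STEM_3P + target_specific.upper()
    return primer, umi
```

  • In practice the random UMI positions are synthesized as degenerate bases rather than generated in software; the code only makes the ordering of the primer elements explicit.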
  • step S20 comprises contacting the extracted cfDNA 1 with at least one pair of a hairpin barcode forward primer 10 and a reverse primer 20 selected from the group consisting of SEQ ID NO: 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12, 13 and 14, 15 and 16, 17 and 18, 19 and 20, 21 and 22, 23 and 24, 25 and 26, 27 and 28, 29 and 30, 31 and 32, 33 and 34, 35 and 36, 37 and 38, 39 and 40, 41 and 42, 43 and 44, 45 and 46, 47 and 48, 49 and 50, and 51 and 52.
  • the extracted cfDNA 1 is contacted with all such pairs of hairpin barcode forward primer 10 and reverse primer 20 as listed in Table 1.
  • the present invention is not limited to the particular hairpin barcode forward primers and reverse primers as listed in Table 1 or to the ETS transcription factor binding sites listed in Table 2. Hence, also, or alternatively, other ETS transcription factor binding sites than the ones listed in Table 2 could be used according to the embodiments.
  • each reverse primer 20 is used together with the hairpin barcode forward primers 10.
  • Each reverse primer 20 then preferably comprises a respective target-specific sequence 26 that is complementary to a respective sequence or region 3 of the cfDNA 1 that is specific for a given ETS transcription factor binding site 4.
  • all reverse primers 20 comprise the same adapter sequence 23.
  • the reverse primers 20 may comprise different target-specific sequences 26.
  • the adapter sequence 13 of the hairpin barcode forward primers 10 is different from the adapter sequence 23 of the at least one reverse primer 20.
  • the UMI 14 is a random n1n2n3...nk sequence.
  • each hairpin barcode forward primer 10 preferably comprises a respective unique UMI 14 having a random sequence that is different from the random sequences of UMIs 14 in other hairpin barcode forward primers 10.
  • the length of the UMIs 14 is preferably selected at least partly based on the number of cfDNA molecules 1 in the sample. For instance, the number of unique UMIs 14 is equal to 4^k for a UMI 14 of length k nucleotides. This number 4^k should preferably be significantly larger than the number of cfDNA molecules 1 in the body fluid sample.
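  • The relationship between UMI length and the number of distinct UMIs can be turned into a small calculation. The sketch below finds the smallest length k for which 4^k exceeds the number of cfDNA molecules by a chosen margin; the function name and the `safety_factor` default of 100 are assumptions, since the description only requires the number to be "significantly larger".

```python
def min_umi_length(n_molecules, safety_factor=100):
    """Smallest k such that 4**k >= safety_factor * n_molecules.

    safety_factor is an illustrative stand-in for "significantly larger";
    it is not a value taken from the description.
    """
    if n_molecules < 1 or safety_factor < 1:
        raise ValueError("n_molecules and safety_factor must be at least 1")
    k = 1
    while 4 ** k < safety_factor * n_molecules:
        k += 1
    return k
```

  • For example, 10,000 molecules with a 100-fold margin require 4^k of at least 1,000,000, which is first reached at k = 10 (4^10 = 1,048,576).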
  • the amplification of the cfDNA in step S21 comprises performing PCR pre-amplification at an annealing temperature equal to or less than the closed annealing temperature of the at least one hairpin barcode forward primer 10. Accordingly, during this PCR pre-amplification in step S21 at least a majority of the hairpin barcode forward primers 10 have an intact hairpin loop, i.e., the 5’ stem sequence 12 is hybridized to the 3’ stem sequence 15. This in turn significantly reduces the amount of non-specific PCR products that may otherwise be generated during the pre-amplification by the random nucleotide sequence of the UMI 14. Hence, the vast majority of the PCR products from the PCR pre-amplification in step S21 are the desired barcoded PCR products 50 corresponding to an amplified portion 2, 3, 4 of the cfDNA 1.
  • the closed annealing temperature is equal to or less than 65°C.
  • the PCR pre-amplification of step S21 is preferably performed at an annealing temperature equal to or less than 65°C.
  • the PCR pre-amplification of step S21 could be performed at an annealing temperature selected within an interval of from 60°C up to 65°C, preferably from 60°C up to 64°C, such as at about 62°C.
  • the PCR pre-amplification of the cfDNA 1 in step S21 is performed using a polymerase, preferably a DNA polymerase, and more preferably a heat-stable DNA polymerase.
  • DNA polymerases that can be used according to the embodiments include Thermus thermophilus (Tth) DNA polymerase, Bacillus stearothermophilus DNA polymerase, Thermus aquaticus (Taq) DNA polymerase, Thermus flavus (Tfl) polymerase, Vent® DNA polymerase, Pfu polymerase, and Escherichia coli DNA polymerase I.
  • the DNA polymerase lacks 5'-nuclease activity.
  • DNA polymerase examples include Klenow fragment of DNA polymerase I, Stoffel fragment of Taq polymerase, Pfu polymerase or Vent® polymerase.
  • the DNA polymerase is a so-called thermoactivated DNA polymerase, also referred to as a hot-start DNA polymerase.
  • DNA polymerases include Takara PRIME STAR GXL polymerase I, Clontech's ADVANTAGE HD Polymerase, NEB Q5® High-Fidelity DNA Polymerases, NEB PHUSION® High-Fidelity DNA Polymerases, ThermoFisher PLATINUM® Taq DNA Polymerase High Fidelity, ThermoFisher ACCUPRIME™ Pfx DNA Polymerase, ThermoFisher ACCUPRIME™ Taq DNA Polymerase High Fidelity, ThermoFisher Phusion™ High Fidelity Polymerase, ThermoFisher Platinum™ SuperFi™ II DNA Polymerase, Promega Pfu DNA Polymerase, and Qiagen HOTSTAR HIFIDELITY Polymerase.
  • the above presented examples of DNA polymerases that can be used in the PCR pre-amplification in step S21 may also be used in the PCR amplification in step S23.
  • step S21 comprises amplifying the nucleic acid molecules 1 by performing 1-20 cycles, preferably 2-15 cycles, and more preferably 2-10 cycles, of PCR pre-amplification of the cfDNA 1 to form the plurality of barcoded PCR products 50.
  • the barcoded PCR products 50 obtained in the PCR pre-amplification are then contacted in step S22 with an adapter-specific forward primer 30 and an adapter-specific reverse primer 40.
  • the adapter-specific forward primer 30 comprises a sequence equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10.
  • the adapter-specific reverse primer 40 comprises a sequence equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
  • the adapter-specific forward primer 30 comprises, from a 5’ end 31 to a 3’ end 34, one of a P5 sequence and a P7 sequence 32 and the sequence 33 equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10.
  • the adapter-specific reverse primer 40 comprises, from a 5’ end 41 to a 3’ end 44, the other of the P5 sequence and the P7 sequence 42 and the sequence 43 equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
  • the adapter-specific reverse primer 40 comprises, from the 5’ end 41 to the 3’ end 44, the other of the P5 sequence and the P7 sequence 42, an index sequence and the sequence 43 equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
  • the P5 and P7 sequences are P5 and P7 ILLUMINA® sequences.
  • the P5 sequence comprises AATGATACGGCGACCACCGA (SEQ ID NO: 56).
  • the P5 sequence could comprise, such as consist of, AATGATACGGCGACCACCGAGATCTACAC (SEQ ID NO: 57).
  • at least one of the 3’ nucleotides of the P5 sequence may be common for the P5 sequence and the following sequence 33.
  • the P7 sequence comprises, preferably consists of, CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 58).
  • the PCR amplification in step S23 is performed at an annealing temperature equal to or greater than the open annealing temperature of the at least one hairpin barcode forward primer 10.
  • the open annealing temperature is at least 70°C.
  • the PCR amplification is performed at a temperature equal to or above 70°C, such as 71°C, 72°C or even higher.
  • the open annealing temperature of the hairpin barcode forward primers 10 is higher than the closed annealing temperature of the hairpin barcode forward primers 10.
  • the open annealing temperature is at least 1°C, preferably at least 2°C, such as at least 3°C or 4°C, and more preferably at least 5°C higher than the closed annealing temperature.
  • step S23 comprises amplifying the barcoded PCR products 50 by performing at least 2, preferably at least 5, more preferably at least 10, such as at least 15, at least 20 or more preferably at least 25 cycles of PCR amplification on the barcoded PCR products 50 to form a library of amplified barcoded PCR products 60.
  • the number of PCR cycles is preferably selected to achieve a sufficient number of amplified barcoded PCR products 60 for sequencing even for cfDNA molecules 1 present in a low copy number in the body fluid sample.
  • the number of PCR cycles in step S23 is higher than the number of PCR cycles in step S21.
  • the method comprises an additional step following step S21 but prior to step S23.
  • This additional step then comprises degrading the polymerase used for PCR pre-amplifying the cfDNA 1 in step S21 prior to amplifying the barcoded PCR products in the PCR amplification of step S23.
  • a protease could be added to the amplification products from the PCR pre-amplification in step S21 to enzymatically degrade the polymerase used for amplifying the cfDNA 1 in the PCR pre-amplification.
  • a degradation of the polymerase once the PCR pre-amplification in step S21 has been completed additionally inhibits formation of non-specific PCR products.
  • sequencing the at least a respective portion of the amplicons in step S11 of Fig. 2 comprises sequencing at least a respective portion of the amplified barcoded PCR products 60 to form respective sequence reads comprising the UMI(s) 64 and an ETS transcription binding site sequence 66.
  • This embodiment also comprises detecting in step S27, see Fig. 4, presence of mutations at the at least one ETS transcription factor binding site 4 in the extracted cfDNA 1 based on the respective sequence reads.
  • the sequencing is achieved by means of at least one sequencing primer.
  • the result of the sequencing is respective sequence reads comprising at least nucleotide sequences of the UMI 64 and ETS transcription factor binding site sequence(s) 66.
  • the sequencing comprises in situ sequencing the at least a portion of the amplified barcoded PCR products 60 immobilized onto a solid support.
  • the preferred P5 and P7 sequences introduced into the amplified barcoded PCR products 60 by means of the adapter-specific forward primers 30 and the adapter-specific reverse primers 40 could be used to immobilize the amplified barcoded PCR products 60 onto the solid support.
  • the solid support preferably comprises immobilized nucleotide sequences complementary to the P5 sequence and/or immobilized nucleotide sequences complementary to the P7 sequence.
  • the in situ sequencing preferably comprises in situ sequencing by synthesis of the at least a portion of the amplified product 60.
  • the adapter-specific forward and reverse primers 30, 40 comprise P5 and P7 sequences, respectively
  • the ILLUMINA® sequencing technology could be used to in situ sequence at least a portion of the amplified barcoded PCR products 60 by synthesis.
  • the amplified barcoded PCR products 60 are immobilized on a flow cell surface designed to present the amplified barcoded PCR sequences 60 in a manner that facilitates access to enzymes while ensuring high stability of surface bound amplified barcoded PCR products 60 and low non-specific binding of fluorescently labeled nucleotides.
  • SBS Sequence By Synthesis
  • dNTP deoxynucleoside triphosphate
  • detecting presence of mutations comprises, see Fig. 4, demultiplexing the sequence reads based on nucleic acid sequences of the UMIs 64 in step S25, mapping the demultiplexed sequence reads to a respective ETS transcription binding site region based on nucleic acid sequences of the at least one ETS transcription binding site sequence 66 in step S26 and detecting presence of mutations at the at least one ETS transcription factor binding site 4 in the extracted cfDNA 1 based on the demultiplexed and mapped sequence reads in step S27.
  • the sequence reads are then demultiplexed in step S25 based on the nucleic acid sequences of the UMIs 64.
  • demultiplexing comprises dividing the sequence reads into groups having a same nucleotide sequence of the UMIs 64, optionally with at most a predefined number of mismatches allowed for nucleotide sequences of UMIs 64 in a same group.
  • the demultiplexed sequence reads are then mapped in step S26 to respective ETS transcription binding site region based on nucleic acid sequences of the ETS transcription binding site sequence 66 in the sequence reads.
  • this mapping comprises dividing the demultiplexed sequence reads into groups having a same nucleotide sequence of the ETS transcription binding site sequences 66, optionally with at most a predefined number of mismatches allowed for nucleotide sequences of ETS transcription binding site sequences 66 in a same group.
  • a kit for determining tissue of origin of cfDNA comprises M>2 primer pairs of a hairpin barcode forward primer 10 and a reverse primer 20.
  • Each hairpin barcode forward primer 10 of the M pairs comprises, from a 5’ end 11 to a 3’ end 17, a 5’ stem sequence 12, an adapter sequence 13, a UMI 14, a 3’ stem sequence 15 and a first target-specific sequence 16 complementary to a nucleotide sequence upstream 2 of an ETS transcription factor binding site 4 in the cfDNA 1.
  • Each reverse primer 20 of the M primer pairs comprises, from a 5’ end 21 to a 3’ end 27, an adapter sequence 23 and a second target-specific sequence 26 complementary to a nucleotide sequence downstream 3 of the ETS transcription factor binding site 4 in the cfDNA.
  • At least a portion of the 5’ stem sequence 12 of the hairpin barcode forward primer 10 is complementary to at least a portion of the 3’ stem sequence 15 of the hairpin barcode forward primer 10.
  • the 5’ stem sequence 12 and the 3’ stem sequence 15 are configured to hybridize to each other at or under a closed annealing temperature and not hybridize to each other at or above an open annealing temperature.
  • the kit also comprises an adapter-specific forward primer 30 comprising a sequence equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10.
  • the kit further comprises an adapter-specific reverse primer 40 comprising a sequence equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
  • the kit of the invention can advantageously be used in the method for determining the tissue of origin of cfDNA.
  • the M primer pairs of the hairpin barcode forward primer 10 and the reverse primer 20 are selected from the group consisting of SEQ ID NO: 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12, 13 and 14, 15 and 16, 17 and 18, 19 and 20, 21 and 22, 23 and 24, 25 and 26, 27 and 28, 29 and 30, 31 and 32, 33 and 34, 35 and 36, 37 and 38, 39 and 40, 41 and 42, 43 and 44, 45 and 46, 47 and 48, 49 and 50, and 51 and 52.
  • the adapter-specific forward primer 30 comprises a nucleotide sequence according to SEQ ID NO: 59 and the adapter-specific reverse primer 40 comprises a nucleotide sequence according to SEQ ID NO: 60.
  • the method and kit of the invention provide valuable information that can be used in the diagnosis of subjects.
  • the presence of skin-derived cfDNA in a body fluid sample taken from a subject may indicate that the subject is suffering from a skin disease causing, for instance, secretion of DNA or apoptosis, necrosis and/or NETosis, and thereby release of DNA from skin cells.
  • presence of skin-derived cfDNA in the body fluid sample may, thus, indicate that the subject is suffering from any of the previously mentioned skin diseases, i.e., skin cancer, vascular malformations, psoriasis, or SLE.
  • the presence of skin-derived ctDNA in the body fluid sample could be an indication that the subject is suffering from skin cancer.
  • the present invention could also be used in the prognosis of a yet non-detected skin tumor, such as when the tumor mass currently is too small to be detected by other means, but still contains a sufficient number of skin cancer cells to release DNA into the circulation.
  • the method comprises predicting a subject, from which the body fluid sample has been extracted, as suffering from skin cancer, such as BCC, SCC or melanoma, preferably melanoma, if the skin-derived cfDNA is determined to be skin-derived ctDNA.
  • the method comprises predicting a subject, from which the body fluid sample has been extracted, as suffering from a non-cancerous skin disease, preferably psoriasis or SLE, if the skin-derived cfDNA is determined to be skin-derived and non-tumor-derived cfDNA.
  • cancer of unknown primary means that cancer spread (secondary tumor) has been found in the body, but the original tissue or organ where the cancer started (primary tumor) is not known.
  • DNA can be extracted from a tumor sample, in particular a secondary tumor sample, the presence of mutations at one or more ETS transcription factor binding sites in the extracted DNA can be analyzed and used to determine whether the primary tumor or cancer is of skin origin (skin cancer) or not. Such information is valuable to guide treatment of the cancer disease.
  • Another aspect of the invention therefore relates to a method of determining a primary cancer for a tumor sample.
  • the method comprises extracting DNA from the tumor sample, preferably from a secondary tumor sample.
  • the method also comprises analyzing the extracted DNA for the presence of mutations at at least one ETS transcription factor binding site in the extracted DNA.
  • the method further comprises determining the primary cancer to be skin cancer if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
  • Primary cancer refers to the original, or first, tumor in the body. Cancer cells from a primary cancer may spread to other parts of the body and form new, or secondary, tumors in a process referred to as metastasis. The secondary tumors are the same type of cancer as the primary cancer. Primary cancer is also referred to as primary tumor in the art. As an illustrative example, cancer cells from the skin (primary cancer or tumor) can spread to form new tumors in the lung (secondary cancer or tumor). The cancer cells in the lung are then just like the ones in the skin.
  • the present method can be used to determine that the primary cancer of the (secondary) tumor in the lung is skin cancer.
  • the method comprises determining the primary cancer to not be skin cancer if no mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site, preferably at at least five, more preferably at at least ten and more preferably at at least 15, at least 20 or at least 25 ETS transcription binding sites.
  • Another application is within forensic sciences and in particular when there is a need to detect the tissue of origin of DNA collected at a crime scene. Hence, in some forensic applications it would be useful to determine whether a DNA sample found at a crime scene is from the skin or not.
  • a further aspect of the invention relates to a forensic DNA analysis method.
  • the method comprises analyzing a DNA sample for the presence of mutations at at least one ETS transcription factor binding site in the DNA.
  • the method further comprises determining the DNA to be skin-derived DNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
  • a DNA sample such as naked DNA
  • a body sample such as skin or hair, or a body fluid sample, such as blood or saliva
  • the DNA present at the crime scene or the body sample is in such minute amounts or in such a damaged state that it is not possible to determine whether it is skin or some other type of tissue.
  • the body (fluid) sample is a mixed sample containing DNA from several different persons. For instance, drops of blood from one person could contain skin cells, and thereby skin-derived DNA, from another person. In such a case, it could be important to determine that most of the DNA extracted from the blood is non-skin derived and thereby from one person but some is actually skin-derived and from another person.
  • the method also comprises extracting DNA from a body sample or a body fluid sample.
  • analyzing the DNA comprises analyzing the extracted DNA for the presence of mutations at at least one ETS transcription factor binding site in the extracted DNA.
  • the determining step comprises determining the extracted DNA to be skin-derived DNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
  • the method comprises determining the DNA to not be skin-derived DNA if no mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site, preferably at at least five, more preferably at at least ten and more preferably at at least 15, at least 20 or at least 25 ETS transcription binding sites.
  • the various embodiments described herein for the method for determining tissue of origin of cfDNA also apply to the method of determining a primary cancer for a tumor sample and to the forensic DNA analysis method.
  • A375 cells were treated with 36 J/m² ultraviolet-C (UVC) light five times a week for 10 weeks. Samples were split when confluent and reseeded at 1:10 approximately every three days and remaining cells were frozen at -20°C for later DNA extraction. DNA was extracted using the DNA Blood and Tissue mini kit spin protocol (Qiagen). Samples were eluted in 200 µl buffer EB (Qiagen). DNA concentration varied between 50 and 100 ng/µl. DNA concentration was measured using a Qubit™ dsDNA High Sensitivity kit (Invitrogen).
  • SiMSen®-seq was performed in two steps (Simsen Diagnostics AB).
  • 10 ng of the extracted DNA samples were amplified with the UV hotspot multiplex primer set (Table 1) for two cycles to label the DNA fragments with a unique molecular index (UMI) barcode.
  • protease treatment to digest the first polymerase and a second PCR reaction using index primers (Table 3) for Illumina® sequencing for 28 cycles.
  • For cfDNA, due to the lower DNA concentration, two 10 µl reactions were performed and pooled prior to clean-up with AMPure® beads.
  • UV ETS hotspot regions (primers are listed in Table 1).
  • the hotspot regions all contained ETS motifs (TTCC[G/T]) flanked at the 5’ side by a dipyrimidine (CC, CT or TC; not TT as it is not mutagenic).
  • the hotspot regions were initially identified based on recurrent somatic mutations in whole genome sequencing data from melanoma. As mutations can arise at up to three key positions flanking or overlapping an ETS transcription factor binding site, there are multiple hotspot positions at each ETS transcription factor binding site. All individual informative hotspot positions are listed in Table 2.
  • the hotspot regions and the assay were verified in cell culture using A375 cells, which were initially negative for all but one of the hotspot mutations covered by this specific panel of primers (Table 1). Cells were exposed to daily doses of UVC light (36 J/m² UVC five days per week) and DNA was harvested after 10 weeks. In UV-exposed DNA, mutations were detected in the vast majority of hotspot regions at expected positions within the ETS motifs, at variant allele frequencies up to 2.7% (Fig. 6).
  • cfDNA was isolated from blood plasma from 15 melanoma patients, 6 of which had a known residual tumor burden. The remaining 9 patients had lesions successfully removed by surgery. Of these, 3 patients had a positive sentinel node biopsy, which is an imperfect predictor of metastatic spread.
  • Skin-derived cfDNA was also found in 2 of the 9 patients without a known tumor burden, both with negative sentinel node biopsy, see Fig. 7. This may indicate presence of undiagnosed residual disease, or alternatively a contribution from healthy skin to cfDNA in blood plasma.
  • skin-derived cfDNA was detectable in blood plasma samples of human subjects on the basis of a UV hotspot-based mutation panel assay.
  • **NNNNNNNNNNNNNN represents UMI barcode
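The UMI sizing rule (4^k possible UMIs of length k) and the UMI-based demultiplexing with an optional mismatch tolerance, both described in the list above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the assay's actual software: the function names, the greedy grouping strategy and the example UMI length of 12 are all hypothetical choices made for the sketch.

```python
def umi_space(k: int) -> int:
    """Number of distinct UMIs of length k over the alphabet A, C, G, T (4^k)."""
    return 4 ** k


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def group_by_umi(reads, max_mismatches=0):
    """Group (umi, sequence) pairs by UMI.

    Assumption: a greedy scheme in which a read joins the first existing
    group whose representative UMI differs by at most `max_mismatches`
    positions; otherwise it starts a new group.
    """
    groups = {}  # representative UMI -> list of read sequences
    for umi, seq in reads:
        for representative in groups:
            if hamming(representative, umi) <= max_mismatches:
                groups[representative].append(seq)
                break
        else:
            groups[umi] = [seq]
    return groups


if __name__ == "__main__":
    # Hypothetical UMI length of 12 nucleotides.
    print(umi_space(12))
    reads = [("ACGT", "read1"), ("ACGA", "read2"), ("TTTT", "read3")]
    print(group_by_umi(reads, max_mismatches=1))
```

With exact matching (`max_mismatches=0`) every distinct UMI forms its own group; allowing one mismatch merges UMIs that differ at a single position, which is one way to tolerate sequencing errors in the barcode.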

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to determination of tissue of origin of cfDNA (1). The method comprises extracting cfDNA (1) from a body fluid sample, analyzing the extracted cfDNA (1) for the presence of mutations at at least one ETS transcription factor binding site (4) in the extracted cfDNA (1) and determining the cfDNA (1) to be skin-derived cfDNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site (4). The present invention finds use in skin disease diagnosis, such as detection of skin cancer.

Description

TISSUE OF ORIGIN OF cfDNA
TECHNICAL FIELD
The present invention generally relates to processing of circulating free deoxyribonucleic acid (cfDNA) and in particular to determining the tissue of origin of such cfDNA.
BACKGROUND
Circulating free deoxyribonucleic acid (cfDNA), also referred to as cell-free DNA in the art, are degraded DNA fragments released to body fluids, such as blood plasma, urine, cerebrospinal fluid, etc. The pool of cfDNA can contain various forms of DNA freely circulating in body fluids, including circulating tumor DNA (ctDNA), cell-free mitochondrial DNA (cf-mtDNA), cell-free fetal DNA (cf-fDNA) and donor-derived cell-free DNA (dd-cfDNA).
Elevated levels of cfDNA are observed in certain diseases and medical conditions, including cancer, especially in advanced cancer diseases. Accordingly, cfDNA has been suggested as a biomarker for diagnosis of not only cancer but also other diseases and medical conditions, such as trauma, sepsis, aseptic inflammation, myocardial infarction, stroke, transplantation, diabetes, and sickle cell disease.
The release of cfDNA into body fluids takes place due to different reasons, including apoptosis, necrosis and neutrophil extracellular traps (NET) activation and release (NETosis). For instance, the rapidly increased accumulation of ctDNA in blood during tumor development is caused by an excessive DNA release by apoptotic cells and necrotic cells and also secretion of DNA from cancer cells.
In some applications, a disease diagnosis based on cfDNA requires, or at least would benefit from, determining the tissue of origin of the cfDNA in order to give an accurate diagnosis, such as the location of a primary tumor. Attempts to perform such a determination of the origin of cfDNA include probing methylation patterns, which are believed to differ depending on the cell type as disclosed in Warton et al. 2016, or cfDNA fractionation patterns, which depend on cell-specific histone positioning patterns as disclosed in Snyder et al. 2016.
There is still a need for determining the tissue of origin of cfDNA and in particular for a technology that enables determining whether the cfDNA is skin-derived cfDNA.
SUMMARY
It is a general objective to determine whether cfDNA is skin-derived cfDNA.
This and other objectives are met by embodiments disclosed herein.
The present invention is defined in the independent claim. Further embodiments of the invention are defined in the dependent claims.
An aspect of the invention relates to a method for determining tissue of origin of cfDNA. The method comprises extracting cfDNA from a body fluid sample. The method also comprises analyzing the extracted cfDNA for the presence of mutations at at least one E26 transformation-specific (ETS) transcription factor binding site in the extracted cfDNA. The method further comprises determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
The present invention predicts whether cfDNA extracted from a body fluid sample is skin-derived. The invention can be used in disease diagnosis, such as detection, of diseases causing release of cfDNA from skin tissue into the circulation, in particular diagnosis of the presence of skin cancer or in the prognosis of skin tumors.
BRIEF DESCRIPTION OF DRAWINGS
The embodiments, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:
Fig. 1 is a flow chart illustrating a method for determining tissue of origin of cfDNA according to an embodiment;
Fig. 2 is a flow chart illustrating an embodiment of the analyzing step in Fig. 1;
Fig. 3 is a flow chart illustrating an embodiment of the amplifying step in Fig. 2;
Fig. 4 is a flow chart illustrating additional optional steps of the method;
Fig. 5 schematically illustrates the amplification steps of the embodiment shown in Fig. 3;
Fig. 6 illustrates UV exposure detected in cultured cells (A375) using an assay according to the invention; and
Fig. 7 illustrates skin-derived cfDNA detectable in blood plasma samples from 15 melanoma patients analyzed using an assay according to the invention.
DETAILED DESCRIPTION
The present invention generally relates to processing of circulating free deoxyribonucleic acid (cfDNA) and in particular to determining the tissue of origin of such cfDNA. cfDNA are degraded DNA fragments released into body fluids and can contain various forms of DNA freely circulating in body fluids, including circulating tumor DNA (ctDNA), cell-free mitochondrial DNA (cf-mtDNA), cell-free fetal DNA (cf-fDNA) and donor-derived cell-free DNA (dd-cfDNA). Elevated levels of cfDNA are observed in certain diseases and medical conditions, especially cancer, but also in other diseases and medical conditions, such as trauma, sepsis, aseptic inflammation, myocardial infarction, stroke, transplantation, diabetes, and sickle cell disease. It may be highly valuable to determine the origin of the cfDNA, in particular when using the cfDNA as a biomarker for disease diagnosis. For instance, detection of ctDNA in a body fluid sample from a subject indicates that the subject suffers from a cancer disease. However, the tissue of origin of the cancer disease is not directly evident merely by the detection of such ctDNA in the body fluid sample.
The present invention enables determination or at least prediction of the tissue of origin of cfDNA, including ctDNA. In particular, the invention can determine or predict whether cfDNA is of skin origin, i.e., is skin-derived cfDNA. As an illustrative, but non-limiting, example, the present invention can be used to determine whether ctDNA is of skin origin, i.e., is skin-derived DNA released from skin cancer cells. Skin cancers are either nonmelanoma skin cancer (NMSC), including basal-cell skin cancer (BCC) and squamous-cell skin cancer (SCC), or melanoma. ctDNA accumulates in body fluids, such as blood, during tumor development, which causes an excessive DNA release by apoptotic cells and necrotic cells. Determination of the ctDNA to be of skin origin may assist in the diagnosis of the subject as suffering from skin cancer.
Skin diseases other than skin cancer may also cause release of skin-derived DNA into body fluids, with DNA being released into circulation by secretion or following necrosis, apoptosis or other mechanisms of cell death. Such other skin diseases include, but are not limited to, vascular malformations, psoriasis, and systemic lupus erythematosus (SLE).
The present invention is based on the finding that ultraviolet (UV) radiation (UVR) or light induces mutations at certain positions in the genome that are highly vulnerable to mutagenesis by UV light. Specifically, these mutations have a characteristic sequence signature, primarily cytosine (C) to thymine (T) substitutions in dipyrimidine sequence contexts, which arise due to the main mutagenic mechanism of UV light involving formation of DNA damage in the form of cyclobutane pyrimidine dimers (CPDs). Some proteins, when bound to genomic DNA, can alter the local propensity for forming DNA damage from UV light. In particular, transcription factors of the E26 transformation-specific (ETS) family, when bound to their specific binding sites or elements, can potently increase CPD formation in response to UV light. Increased damage formation at ETS transcription factor binding sites occurs primarily at pyrimidines immediately upstream of the core motif (YYTTCCK) but also, to a lesser extent, at the central dipyrimidine position within the core motif (TTCCK). This phenomenon leads to widespread recurrent mutation hotspots in skin cells at such ETS transcription factor binding sites and a sharply elevated mutation rate at such hotspot sites. In agreement with these ETS transcription factor binding sites being hypersensitive specifically to UV light, ETS hotspot mutations are common in sun-exposed skin cancers as well as normal skin cells, while being completely absent in other non-UV exposed tissue. Importantly, mutations at ETS transcription factor binding sites take place exclusively in cells that have been directly exposed to UV light and it is therefore highly unlikely that these ETS transcription factor binding sites are mutated in cells that have not been exposed to UV light. This means that mutations in such ETS transcription factor binding sites in cfDNA can be used as an indication that the cfDNA is of skin origin, such as being skin-derived ctDNA.
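The UV signature described above, a C to T substitution in a dipyrimidine context, can be checked for a single substitution as in the following minimal sketch. It rests on simplifying assumptions that are not taken from the source: the input is a three-nucleotide reference context centred on the mutated base and read 5' to 3', and a G to A change flanked by a purine is treated as the same event seen on the opposite strand. The function name is hypothetical.

```python
PYRIMIDINES = {"C", "T"}
PURINES = {"A", "G"}


def is_uv_signature(ref_context: str, alt: str) -> bool:
    """Return True if a substitution looks like a UV-type C>T change.

    ref_context: three reference bases (5'->3') centred on the mutated base.
    alt: the observed alternate base at the central position.
    """
    if len(ref_context) != 3:
        raise ValueError("ref_context must contain exactly three bases")
    five_prime, ref, three_prime = ref_context.upper()
    alt = alt.upper()
    if ref == "C" and alt == "T":
        # C>T with at least one neighbouring pyrimidine (dipyrimidine context).
        return five_prime in PYRIMIDINES or three_prime in PYRIMIDINES
    if ref == "G" and alt == "A":
        # Same event on the opposite strand: G>A with a neighbouring purine.
        return five_prime in PURINES or three_prime in PURINES
    return False


if __name__ == "__main__":
    print(is_uv_signature("TCG", "T"))  # C>T preceded by T
    print(is_uv_signature("ACG", "T"))  # C>T with no neighbouring pyrimidine
```

A filter of this kind only tests the sequence context of a substitution; whether the position is one of the informative ETS hotspot positions is a separate question.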
An aspect of the invention therefore relates to a method for determining tissue of origin of cfDNA, see Fig. 1. The method comprises extracting cfDNA from a body fluid sample in step S1. The method also comprises analyzing the extracted cfDNA for the presence of mutations at at least one ETS transcription factor binding site, i.e., at one or more such ETS transcription factor binding sites, in the extracted cfDNA in step S2. The next step S3 comprises determining, or at least predicting, the cfDNA to be skin-derived if mutations are detected, based on the analysis in step S2, at the at least one ETS transcription factor binding site. In an embodiment, step S3 also comprises determining, or at least predicting, the cfDNA to not be skin-derived if no mutations are detected, based on the analysis in step S2, at the at least one ETS transcription factor binding site.
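The decision in step S3 can be sketched as a simple rule applied to the per-site results of step S2. The sketch below is illustrative only and makes assumptions not stated in the source: the input is a mapping from a site identifier to the number of distinct cfDNA molecules carrying a mutation at that site, and the `min_mutant_molecules` threshold and the site names are hypothetical.

```python
def call_tissue_of_origin(site_mutant_molecules, min_mutant_molecules=1):
    """Classify cfDNA from per-site mutation counts (sketch of step S3).

    site_mutant_molecules: dict mapping an ETS binding site identifier to
        the number of molecules supporting a mutation at that site.
    min_mutant_molecules: hypothetical threshold for calling a site mutated.
    Returns a (label, mutated_sites) tuple.
    """
    mutated_sites = sorted(
        site
        for site, count in site_mutant_molecules.items()
        if count >= min_mutant_molecules
    )
    label = "skin-derived" if mutated_sites else "not skin-derived"
    return label, mutated_sites


if __name__ == "__main__":
    # Hypothetical site names and counts.
    print(call_tissue_of_origin({"site_A": 0, "site_B": 3, "site_C": 0}))
    print(call_tissue_of_origin({"site_A": 0, "site_B": 0}))
```

In practice the threshold would depend on the error rate of the sequencing assay; requiring more than one supporting molecule is one way to guard against spurious calls.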
The body fluid sample, from which the cfDNA is extracted in step S1, can be any body fluid sample that comprises such cfDNA. Illustrative, but non-limiting, examples of such body fluid samples include a saliva sample, a urine sample, a cerebrospinal fluid sample, a blood sample, a blood plasma sample, a blood serum sample, an amniotic fluid sample, a pleural effusion sample, a bronchial lavage sample, a bronchial aspirate sample, a breast milk sample, a tear sample, a seminal fluid sample, a peritoneal fluid sample, a flexural effusion sample, and a colostrum sample. In a preferred embodiment, the body fluid is selected from the group consisting of blood, blood plasma, and blood serum.
The body fluid sample could be taken from a subject, i.e., is a so-called liquid biopsy taken from a subject. The subject is an animal, preferably a mammal, and more preferably a human. The body fluid sample could also be a processed fluid sample that originates from a subject but has then been processed ex vivo, such as filtered, centrifuged, frozen and thawed, etc.
Step S1 in the method shown in Fig. 1 comprises extracting the cfDNA from the body fluid sample. Such a cfDNA extraction could be done using any known cfDNA extraction process. Illustrative, but non-limiting, examples of such extraction processes involve using columns, such as spin columns containing silica membranes, or magnetic beads, phenol-chloroform-based processes, filtration-based processes, microfluidic chips, etc. Kits for extraction of cfDNA from body fluid samples are commercially available from various vendors including Qiagen, such as QIAamp® circulating nucleic acid kit; ThermoFisher Scientific, such as MagMAX™ Cell-Free DNA Isolation kit; Promega, such as Maxwell® RSC ccfDNA plasma kit; Omega Bio-Tek, such as Mag-Bind® Blood & Tissue DNA HDQ 96 kit; Active Motif, such as Active Motif® Cell-Free (cfDNA) Purification kit; and Beckman Coulter Life Sciences, such as Apostle MiniMax™ High Efficiency cfDNA Isolation kit.
Step S2 in the method shown in Fig. 1 analyzes the cfDNA extracted in step S1 for the presence of mutations at one or more ETS transcription factor binding sites in the extracted cfDNA. The ETS family is one of the largest families of transcription factors that is unique to animals. ETS stands for E26 transformation-specific, also referred to as E-twenty-six or erythroblast transformation specific in the art. Transcription factors of the ETS family, i.e., ETS transcription factors, bind to specific binding sites or elements in the genomic DNA: TTCCK, wherein K represents guanine (G) or T, i.e., TTCC[G/T]. Binding of an ETS transcription factor to such a binding site can make the binding site vulnerable to UV-induced mutations primarily at pyrimidines immediately upstream of the core motif of the binding site but also, typically to a lesser extent, at a central dipyrimidine position within the core motif, i.e., within TTCCK.
UV-induced mutations at an ETS transcription factor binding site as used herein preferably implies UV-induced mutations within and/or flanking the 5’ end of the ETS transcription factor binding site.
Hence, in an embodiment, step S2 of Fig. 1 comprises analyzing the extracted cfDNA for the presence of mutations at at least one TTCCK or Y1Y2TTCCK sequence in the extracted DNA. K is as defined above and Y1 and Y2 are independently C or T. The dipyrimidines that are particularly mutagenic are CC, CT and TC. Thus, TT is generally regarded as being less mutagenic. Hence, in an embodiment, Y1 and Y2 are independently C or T with the proviso that Y1 and Y2 are not both T.
This means that UV-induced mutations at such ETS transcription factor binding sites typically occur at the following positions: the C following Y1 in Y1CTTCCK, the C preceding Y2 in CY2TTCCK, and the first C of the central CC in NNTTCCK, wherein N represents any of T, C, G or adenine (A).
The UV-induced mutations are typically in the form of C-to-T (C>T) mutations. Hence, in such a case the mutated versions of the above-mentioned sequences are Y1TTTCCK, TY2TTCCK and NNTTTCK, respectively.
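The motif definitions above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the function names are hypothetical, only the given strand is scanned (a real panel would also consider the reverse complement and genomic coordinates), and the proviso that Y1 and Y2 are not both T is exposed as an optional flag.

```python
import re

# IUPAC codes used in the text: Y = C or T, K = G or T.
# The lookahead lets overlapping candidate sites be reported.
ETS_SITE = re.compile(r"(?=([CT][CT]TTCC[GT]))")


def find_ets_sites(seq, exclude_tt_flank=True):
    """Return 0-based start positions of Y1Y2TTCCK matches on the given strand."""
    hits = []
    for match in ETS_SITE.finditer(seq.upper()):
        site = match.group(1)
        if exclude_tt_flank and site[:2] == "TT":
            continue  # proviso: Y1 and Y2 are not both T
        hits.append(match.start())
    return hits


def c_to_t_variants(site):
    """List (index, mutated site) for C>T changes at Y1, Y2 and the first central C."""
    variants = []
    for i in (0, 1, 4):  # Y1, Y2, and the first C of the central CC in TTCCK
        if site[i] == "C":
            variants.append((i, site[:i] + "T" + site[i + 1:]))
    return variants
```

For the site CCTTCCG, `c_to_t_variants` yields TCTTCCG, CTTTCCG and CCTTTCG, matching the three mutated forms listed above.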
The UV-induced mutations are somatic mutations, i.e., non-inherited mutations.
Step S2 of Fig. 1 could involve analysis of mutations at one ETS transcription factor binding site in extracted cfDNA. However, in a preferred embodiment, the analysis of step S2 is performed at a plurality, i.e., at least two and more preferably at least three, ETS transcription factor binding sites.
Hence, in an embodiment, step S2 comprises analyzing the extracted cfDNA for the presence of mutations at a plurality of ETS transcription factor binding sites in the extracted cfDNA. Step S3 comprises, in this embodiment, determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis in step S2, at at least one ETS transcription factor binding site of the plurality of ETS transcription factor binding sites.
In an embodiment, the analysis of step S2 is performed at at least 5 ETS transcription factor binding sites, preferably at at least 10 ETS transcription factor binding sites, and more preferably at at least 15 ETS transcription factor binding sites, such as at at least 20 ETS transcription factor binding sites. As is disclosed above, each such ETS transcription factor binding site could include up to three UV-induced mutations.
UV-induced mutations typically occur and thereby accumulate at more than one ETS transcription factor binding site in genomic DNA of skin-origin, such as from a skin cell. This means that skin-derived cfDNA may comprise mutations at multiple, i.e., two or more, ETS transcription factor binding sites. In such a case, step S3 of the method comprises determining the cfDNA to be skin-derived cfDNA if mutations are detected, based on the analysis in step S2, at multiple ETS transcription factor binding sites of the plurality of ETS transcription factor binding sites.
Tumors are, to variable degrees, clonal expansions originating from a single cancer cell. This means that UV-induced mutation patterns in skin cancer tumors will differ from non-clonally expanded cell masses, such as healthy human skin. Specifically, UV-induced mutations in skin cancer tumors tend to exhibit a more binary pattern, with a given UV-induced mutation being either present or absent. The pattern is in each case determined by the genetic makeup of the original skin cancer cell clone that gave rise to the skin cancer tumor mass. This is to be compared to the non-clonal cell population exposed to UV light in Fig. 6, which exhibits broad presence of nearly all UV-induced mutations assayed. Hence, the degree of clonality indicated by UV-induced mutations in skin-derived cfDNA can indicate whether the cfDNA originates from normal healthy or at least non-cancerous skin or a clonally expanded tumor cell mass, i.e., skin-derived ctDNA. Specifically, detection of UV-induced mutations at only a limited number of ETS transcription factor binding sites indicates that the cfDNA is skin-derived ctDNA, while UV-induced mutations detected at a broader spectrum of ETS transcription factor binding sites would instead indicate that the cfDNA is skin-derived non-tumor cfDNA.
In an embodiment, the method comprises an additional step S4 as shown in Fig. 1. This step S4 comprises determining whether the skin-derived cfDNA is skin-derived ctDNA or skin-derived and non-tumor-derived cfDNA based on a pattern or distribution of mutations at the plurality of ETS transcription factor binding sites in the extracted cfDNA.
In an embodiment, step S4 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are only present in a limited subset of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA. In another embodiment, step S4 comprises determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA if the mutations are distributed among the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived ctDNA.
Skin-derived and non-tumor-derived cfDNA as used herein indicate that the cfDNA is not derived or originating from a skin cancer cell.
In an example, a panel of a plurality of ETS transcription factor binding sites, such as shown in Table 2, is assayed for the presence or absence of any UV-induced mutations. If UV-induced mutations are detected in most of these ETS transcription factor binding sites, or in at least a majority thereof, as shown in Fig. 6, then the skin-derived cfDNA is determined or predicted to be skin-derived but not tumor-derived cfDNA. Correspondingly, if UV-induced mutations are detected in merely one or a limited number, i.e., less than a majority, of the assayed ETS transcription factor binding sites then the skin-derived cfDNA is determined or predicted to be skin-derived ctDNA.
The particular threshold value or values to be used for differentiating the skin-derived ctDNA from skin-derived and non-tumor-derived cfDNA depends on the number of ETS transcription factor binding sites included in the assay panel. As a general rule, if UV-induced mutations are found in at least a majority of the ETS transcription factor binding sites then the skin-derived cfDNA is most likely skin-derived and non-tumor-derived cfDNA.
The actual threshold value(s) could be determined as shown in the Example section, such as by exposing non-skin-cancer cells in vitro to UV light, extracting genomic DNA from the cells and analyzing the extracted genomic DNA for the presence of mutations at a plurality of ETS transcription factor binding sites. In such a case, the minimum number of ETS transcription factor binding sites to be mutated by the UV light in order for the DNA to be regarded as being skin-derived and non-tumor-derived cfDNA can be determined from such an analysis.
Correspondingly, genomic DNA could be extracted from skin cancer cells from subjects and analyzed for the presence of mutations at the plurality of ETS transcription factor binding sites. In such a case, the maximum number of ETS transcription factor binding sites to contain UV-induced mutations in order for the DNA to be regarded as being skin-derived and tumor-derived cfDNA can be determined from such an analysis.
In an embodiment, step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 50 % of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
In another embodiment, step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 40 % of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
In a further embodiment, step S4 in Fig. 1 comprises determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 30 %, preferably less than 25 %, of the plurality of ETS transcription factor binding sites in the extracted cfDNA and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
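The decision logic of steps S3 and S4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name, the dictionary input (site label mapped to whether a UV-induced mutation was detected) and the returned labels are hypothetical, and the default cut-off of 0.5 simply mirrors the "less than 50 %" embodiment.

```python
def classify_cfdna(mutated_by_site, ctdna_max_fraction=0.5):
    """Classify cfDNA from per-site mutation calls on an ETS binding site panel.

    mutated_by_site: dict mapping a site label to True/False (mutation detected).
    ctdna_max_fraction: fraction of mutated sites below which the skin-derived
        cfDNA is called ctDNA (0.5, 0.4, 0.3 or 0.25 in the embodiments).
    """
    if not mutated_by_site:
        raise ValueError("panel must contain at least one site")
    n_mutated = sum(1 for hit in mutated_by_site.values() if hit)
    if n_mutated == 0:
        return "not skin-derived"  # step S3, no mutation at any assayed site
    fraction = n_mutated / len(mutated_by_site)
    if fraction < ctdna_max_fraction:
        return "skin-derived ctDNA"  # step S4, limited subset of sites mutated
    return "skin-derived non-tumor cfDNA"  # step S4, mutations broadly distributed
```

With a hypothetical 26-site panel, 3 mutated sites would give "skin-derived ctDNA" and 20 mutated sites would give "skin-derived non-tumor cfDNA" at the default cut-off.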
The analysis of the extracted cfDNA in step S2 can be performed by any method or process that is able to detect a specific DNA base at a specific position in the extracted cfDNA. Illustrative, but non-limiting, examples of such methods include amplification-based methods, such as polymerase chain reaction (PCR), including quantitative real-time PCR (qPCR), digital PCR (dPCR) and droplet dPCR (ddPCR); sequencing-based methods, such as high-throughput sequencing, next generation sequencing, and SiMSen-Seq; hybridization-based methods including microarrays and variants thereof, such as NanoString’s modified DNA microarrays; and combinations thereof. Amplification-based methods have the advantage of being fast and cost-effective, while sequencing-based methods may provide more accurate quantification and better sensitivity.
In an embodiment, step S2 of Fig. 1 comprises, see Fig. 2, amplifying, in step S10, at least one portion of the extracted cfDNA comprising the at least one ETS transcription factor binding site to form a plurality of amplicons. Step S2 also comprises, in this embodiment, sequencing, in step S11, at least a respective portion of the amplicons to form respective sequence reads. In such an embodiment, the presence of mutations at the one or more ETS transcription factor binding sites can be determined by analyzing the sequence reads, i.e., determining the particular nucleotide sequence at the one or more ETS transcription factor binding sites to detect any possible UV-induced mutations, typically C>T mutations.
In an embodiment, step S10 in Fig. 2 comprises, see Figs. 3 and 5, contacting, in step S20, the extracted cfDNA 1 with at least one hairpin barcode forward primer 10 and at least one reverse primer 20. Each hairpin barcode forward primer 10 of the at least one hairpin barcode forward primer 10 comprises, from a 5’ end 11 to a 3’ end 17, a 5’ stem sequence 12, an adapter sequence 13, a unique molecular identifier (UMI) 14, a 3’ stem sequence 15 and a target-specific sequence 16 complementary to a nucleotide sequence upstream 2 of an ETS transcription factor binding site 4 of the at least one ETS transcription factor binding site 4. Each reverse primer 20 of the at least one reverse primer 20 comprises, from a 5’ end 21 to a 3’ end 27, an adapter sequence 23 and a target-specific sequence 26 complementary to a nucleotide sequence downstream 3 of the ETS transcription factor binding site 4 of the at least one ETS transcription factor binding site 4. At least a portion of the 5’ stem sequence 12 of the hairpin barcode forward primer 10 is complementary to at least a portion of the 3’ stem sequence 15 of the hairpin barcode forward primer 10. The 5’ stem sequence 12 and the 3’ stem sequence 15 are configured to hybridize to each other at or under a closed annealing temperature and not hybridize to each other at or above an open annealing temperature. Step S10 also comprises, in this embodiment, amplifying, in step S21, the at least one portion of the extracted cfDNA 1 comprising the at least one ETS transcription factor binding site 4 by performing PCR pre-amplification of the at least one portion of the extracted cfDNA 1 comprising the at least one ETS transcription factor binding site 4 to form a plurality of barcoded PCR products 50. The PCR pre-amplification in step S21 has an annealing temperature equal to or less than the closed annealing temperature of the at least one hairpin barcode forward primer 10.
Step S10 further comprises, in this embodiment, contacting, in step S22, the plurality of barcoded PCR products 50 with an adapter-specific forward primer 30 and an adapter-specific reverse primer 40 and amplifying, in step S23, the barcoded PCR products 50 by performing PCR amplification on the barcoded PCR products 50 to form a library of amplified barcoded PCR products 60. At least a portion of cycles of the PCR amplification in step S23 has an annealing temperature equal to or greater than the open annealing temperature of the at least one hairpin barcode forward primer 10.
Amplification steps traditionally used to amplify DNA sequences, including cfDNA sequences, introduce PCR-induced amplification biases and errors, which effectively prevent accurate mutation determination and quantification. UMIs are known to be used to reduce such PCR-induced amplification biases and errors. UMIs can be added to DNA by either ligation- or PCR-based approaches. Ligation-based UMI approaches require that target DNA is captured before the analysis; otherwise, all DNA molecules present in a sample will be analyzed. Another limitation is that DNA molecules are lost due to limited ligation efficiency. In comparison, PCR-based UMI approaches are simpler since no capture step is needed. PCR-based methods are potentially also more sensitive since they do not suffer from ineffective capture and ligation steps. However, introduction of UMIs into PCR primers may cause massive formation of non-specific PCR products caused by the random nucleotide sequence of UMIs. This problem is solved, according to the embodiment described above and shown in Figs. 3 and 5, by shielding the UMI 14 in a secondary structure in the hairpin barcode forward primers 10. Hence, in order to minimize the formation of non-specific PCR products, the UMI 14 is protected inside a hairpin loop that opens and closes its secondary structure in a temperature-dependent manner.
In an embodiment, all hairpin barcode forward primers 10 comprise the same 5’ stem sequence 12, the same adapter sequence 13 and/or the same 3’ stem sequence 15. However, the hairpin barcode forward primers 10 comprise different UMIs 14. The hairpin barcode forward primers 10 may all have the same target-specific sequence 16 (but different UMIs 14), multiple, i.e., at least two, hairpin barcode forward primers 10 may have the same target-specific sequence 16, or the hairpin barcode forward primers 10 may have different target-specific sequences 16.
Examples of hairpin barcode forward primers 10 that can be used in accordance with the embodiments are shown in Table 1. In these illustrative examples, the 5’ stem sequence 12 consists of the nucleotide sequence GGACACTCTTTCCC (SEQ ID NO: 53) that is complementary to and capable of hybridizing to the 3’ stem sequence 15 consisting of the nucleotide sequence GGGAAAGAGTGTCC (SEQ ID NO: 54). In these examples, NNNNNNNNNNNN represents the UMI 14. The hairpin barcode forward primer 10 additionally comprises a nucleotide sequence forming the adapter sequence 13. This adapter sequence 13 may be positioned between the 5’ stem sequence 12 and the UMI 14. It is, however, also possible that all or a 3’ portion of the nucleotides of the 5’ stem sequence 12 also form part of the adapter sequence 13. For instance, an adapter sequence 13 of TACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 55) could be used in the examples shown in Table 1.
In an embodiment, step S20 comprises contacting the extracted cfDNA 1 with at least one pair of a hairpin barcode forward primer 10 and a reverse primer 20 selected from the group consisting of SEQ ID NO: 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12, 13 and 14, 15 and 16, 17 and 18, 19 and 20, 21 and 22, 23 and 24, 25 and 26, 27 and 28, 29 and 30, 31 and 32, 33 and 34, 35 and 36, 37 and 38, 39 and 40, 41 and 42, 43 and 44, 45 and 46, 47 and 48, 49 and 50, and 51 and 52. In an embodiment, the extracted cfDNA 1 is contacted with all such pairs of hairpin barcode forward primer 10 and reverse primer 20 as listed in Table 1.
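The complementarity of the two stem sequences can be checked with a short Python sketch. The helper names are hypothetical, the adapter is modeled as an optional insert between the 5’ stem and the UMI (only one of the arrangements described above), and any target-specific sequence passed in is a placeholder rather than a sequence from Table 1.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

STEM_5P = "GGACACTCTTTCCC"  # SEQ ID NO: 53
STEM_3P = "GGGAAAGAGTGTCC"  # SEQ ID NO: 54


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_hairpin_primer(target_specific, adapter_between="", umi_length=12):
    """Assemble 5' stem, optional adapter, UMI placeholder, 3' stem and target sequence."""
    if reverse_complement(STEM_5P) != STEM_3P:
        raise ValueError("stem sequences are not reverse complements")
    return STEM_5P + adapter_between + "N" * umi_length + STEM_3P + target_specific
```

Here `reverse_complement(STEM_5P)` equals `STEM_3P`, which is the property that lets the two stems hybridize and enclose the UMI in a hairpin loop.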
The present invention is not limited to the particular hairpin barcode forward primers and reverse primers as listed in Table 1 or to the ETS transcription factor binding sites listed in Table 2. Hence, also, or alternatively, other ETS transcription factor binding sites than the ones listed in Table 2 could be used according to the embodiments.
In an embodiment, multiple different reverse primers 20 are used together with the hairpin barcode forward primers 10. Each reverse primer 20 then preferably comprises a respective target-specific sequence 26 that is complementary to a respective sequence or region 3 of the cfDNA 1 that is specific for a given ETS transcription factor binding site 4.
In an embodiment, all reverse primers 20 comprise the same adapter sequence 23. However, the reverse primers 20 may comprise different target-specific sequences 26.
Illustrative, but non-limiting, examples of reverse primers 20 that can be used in accordance with the embodiments are listed in Table 1.
In an embodiment, the adapter sequence 13 of the hairpin barcode forward primers 10 is different from the adapter sequence 23 of the at least one reverse primer 20.
In an embodiment, the UMI 14 is a random n1n2n3...nk sequence. In this embodiment, ni, i=1...k, is one of A, T, C and G, and k is preferably from 6 up to 18, more preferably from 10 up to 15, such as 12. Hence, each hairpin barcode forward primer 10 preferably comprises a respective unique UMI 14 having a random sequence that is different from the random sequences of UMIs 14 in other hairpin barcode forward primers 10. The length of the UMIs 14 is preferably selected at least partly based on the number of cfDNA molecules 1 in the sample. For instance, the number of unique UMIs 14 is equal to 4^k for a UMI 14 of length k nucleotides. This number 4^k should preferably be significantly larger than the number of cfDNA molecules 1 in the body fluid sample.
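The relationship between UMI length and the number of distinct UMIs can be made concrete with a small Python sketch. The function names are hypothetical, and the `safety_factor` of 100 is an assumed stand-in for "significantly larger", which the text does not quantify.

```python
def umi_diversity(k):
    """Number of distinct UMIs of length k over the alphabet A, T, C, G."""
    return 4 ** k


def min_umi_length(n_molecules, safety_factor=100):
    """Smallest k such that 4**k is at least safety_factor times n_molecules."""
    k = 1
    while umi_diversity(k) < n_molecules * safety_factor:
        k += 1
    return k
```

A UMI of length 12 gives 4^12 = 16,777,216 distinct sequences. Under the assumed safety factor, a sample with 10,000 cfDNA molecules would call for a UMI of at least 10 nucleotides.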
The amplification of the cfDNA in step S21 comprises performing PCR pre-amplification at an annealing temperature equal to or less than the closed annealing temperature of the at least one hairpin barcode forward primer 10. Accordingly, during this PCR pre-amplification in step S21 at least a majority of the hairpin barcode forward primers 10 have an intact hairpin loop, i.e., the 5’ stem sequence 12 is hybridized to the 3’ stem sequence 15. This in turn significantly reduces the amount of non-specific PCR products that may otherwise be caused during the pre-amplification by the random nucleotide sequence of the UMI 14. Hence, the vast majority of the PCR products from the PCR pre-amplification in step S21 are the desired barcoded PCR products 50 corresponding to an amplified portion 2, 3, 4 of the cfDNA 1.
In an embodiment, the closed annealing temperature is equal to or less than 65°C. Hence, in such an embodiment, the PCR pre-amplification of step S21 is preferably performed at an annealing temperature equal to or less than 65°C. For instance, the PCR pre-amplification of step S21 could be performed at an annealing temperature selected within an interval of from 60°C up to 65°C, preferably from 60°C up to 64°C, such as at about 62°C.
The PCR pre-amplification of the cfDNA 1 in step S21 is performed using a polymerase, preferably a DNA polymerase, and more preferably a heat-stable DNA polymerase. Non-limiting, but illustrative, examples of DNA polymerases that can be used according to the embodiments include Thermus thermophilus (Tth) DNA polymerase, Bacillus stearothermophilus DNA polymerase, Thermus aquaticus (Taq) DNA polymerase, Thermus flavus (Tfl) polymerase, Vent® DNA polymerase, Pfu polymerase, and Escherichia coli DNA polymerase I. In some embodiments, the DNA polymerase lacks 5'-nuclease activity. Examples of such polymerases include Klenow fragment of DNA polymerase I, Stoffel fragment of Taq polymerase, Pfu polymerase or Vent® polymerase. In an embodiment, the DNA polymerase is a so-called thermoactivated DNA polymerase, also referred to as a hot-start DNA polymerase. Specific examples of DNA polymerases include Takara PRIME STAR GXL polymerase I, Clontech's ADVANTAGE HD Polymerase, NEB Q5® High-Fidelity DNA Polymerases, NEB PHUSION® High-Fidelity DNA Polymerases, ThermoFisher PLATINUM® Taq DNA Polymerase High Fidelity, ThermoFisher ACCUPRIME™ Pfx DNA Polymerase, ThermoFisher ACCUPRIME™ Taq DNA Polymerase High Fidelity, ThermoFisher Phusion™ High Fidelity Polymerase, ThermoFisher Platinum™ SuperFi™ II DNA Polymerase, Promega Pfu DNA Polymerase, and Qiagen HOTSTAR HIFIDELITY Polymerase. The above presented examples of DNA polymerases that can be used in the PCR pre-amplification in step S21 may also be used in the PCR amplification in step S23.
In an embodiment, step S21 comprises amplifying the cfDNA 1 by performing 1-20 cycles, preferably 2-15 cycles, and more preferably 2-10 cycles, of PCR pre-amplification of the cfDNA 1 to form the plurality of barcoded PCR products 50. Hence, it is generally preferred to perform a rather low number of PCR cycles in the PCR pre-amplification. This low number of PCR cycles, together with protecting UMIs 14 within hairpin loops, significantly reduces the amount of non-specific PCR products produced in the PCR pre-amplification.
The barcoded PCR products 50 obtained in the PCR pre-amplification are then contacted in step S22 with an adapter-specific forward primer 30 and an adapter-specific reverse primer 40. In an embodiment, the adapter-specific forward primer 30 comprises a sequence equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10. Correspondingly, the adapter-specific reverse primer 40 comprises a sequence equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
In a particular embodiment, the adapter-specific forward primer 30 comprises, from a 5’ end 31 to a 3’ end 34, one of a P5 sequence and a P7 sequence 32 and the sequence 33 equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10. In this particular embodiment, the adapter-specific reverse primer 40 comprises, from a 5’ end 41 to a 3’ end 44, the other of the P5 sequence and the P7 sequence 42 and the sequence 43 equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20. In a particular embodiment, the adapter-specific reverse primer 40 comprises, from the 5’ end 41 to the 3’ end 44, the other of the P5 sequence and the P7 sequence 42, an index sequence and the sequence 43 equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20.
In an embodiment, the P5 and P7 sequences are P5 and P7 ILLUMINA® sequences. In an embodiment, the P5 sequence comprises AATGATACGGCGACCACCGA (SEQ ID NO: 56). For instance, the P5 sequence could comprise, such as consist of, AATGATACGGCGACCACCGAGATCTACAC (SEQ ID NO: 57). In this latter example, at least one of the 3’ nucleotides of the P5 sequence may be common for the P5 sequence and the following sequence 33. Correspondingly, in an embodiment, the P7 comprises, preferably consists of, CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 58).
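One possible reading of the statement that 3’ nucleotides of the P5 sequence may be shared with the following sequence 33 is that the two sequences are joined through their overlapping ends. The Python sketch below illustrates that reading only; the function name is hypothetical, and the choice of SEQ ID NO: 55 as sequence 33 is an assumption made for the example.

```python
P5_LONG = "AATGATACGGCGACCACCGAGATCTACAC"          # SEQ ID NO: 57
ADAPTER = "TACACTCTTTCCCTACACGACGCTCTTCCGATCT"     # SEQ ID NO: 55


def join_with_overlap(left, right):
    """Join two sequences, merging the longest suffix of left that is a prefix of right.

    Returns the joined sequence and the number of shared nucleotides.
    """
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return left + right[n:], n
    return left + right, 0


primer_5p_part, shared = join_with_overlap(P5_LONG, ADAPTER)
```

In this example the five nucleotides TACAC are shared, so the joined sequence is five nucleotides shorter than a plain concatenation.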
The PCR amplification in step S23 is performed at an annealing temperature equal to or greater than the open annealing temperature of the at least one hairpin barcode forward primer 10. At such an annealing temperature, at least a significant portion of the barcoded PCR products 50 are in an open structure, i.e., there is no significant hairpin loop formation by hybridization of complementary stem portions of the barcoded PCR products 50. In a particular embodiment, the open annealing temperature is at least 70°C. In such a particular embodiment, the PCR amplification is performed at a temperature equal to or above 70°C, such as 71°C, 72°C or even higher.
In an embodiment, the open annealing temperature of the hairpin barcode forward primers 10 is higher than the closed annealing temperature of the hairpin barcode forward primers 10. In a particular embodiment, the open annealing temperature is at least 1°C, preferably at least 2°C, such as at least 3°C or 4°C, and more preferably at least 5°C higher than the closed annealing temperature.
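The temperature constraints on the two PCR stages can be summarized as a simple consistency check. The sketch below is illustrative only: the function and parameter names are hypothetical, temperatures are in degrees Celsius, and `min_gap` defaults to the smallest difference between open and closed annealing temperature mentioned above.

```python
def check_annealing(closed_t, open_t, preamp_t, amp_t, min_gap=1.0):
    """Return a list of violated constraints for a two-stage hairpin UMI protocol."""
    problems = []
    if open_t - closed_t < min_gap:
        problems.append("open temperature must exceed closed temperature by min_gap")
    if preamp_t > closed_t:
        problems.append("pre-amplification (S21) must anneal at or below closed temperature")
    if amp_t < open_t:
        problems.append("amplification (S23) must anneal at or above open temperature")
    return problems
```

For example, a closed temperature of 65, an open temperature of 70, pre-amplification at 62 and amplification at 72 satisfy all three constraints, so the function returns an empty list.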
In an embodiment, step S23 comprises amplifying the barcoded PCR products 50 by performing at least 2, preferably at least 5, more preferably at least 10, such as at least 15, at least 20 or more preferably at least 25 cycles of PCR amplification on the barcoded PCR products 50 to form a library of amplified barcoded PCR products 60. The number of PCR cycles is preferably selected to achieve a sufficient number of amplified barcoded PCR products 60 for sequencing even for cfDNA molecules 1 present in a low copy number in the body fluid sample.
In a general embodiment, the number of PCR cycles in step S23 is higher than the number of PCR cycles in step S21.
In an optional, but preferred, embodiment, the method comprises an additional step following step S21 but prior to step S23. This additional step then comprises degrading the polymerase used for PCR pre-amplifying the cfDNA 1 in step S21 prior to amplifying the barcoded PCR products in the PCR amplification of step S23.
Various techniques could be used to degrade polymerases including, but not limited to, heat treatment, chemical treatment, addition of a protease, sample dilution and enzymatic treatment. For instance, a protease could be added to the amplification products from the PCR pre-amplification in step S21 to enzymatically degrade the polymerase used for amplifying the cfDNA 1 in the PCR pre-amplification. Such a degradation of the polymerase once the PCR pre-amplification in step S21 has been completed additionally inhibits formation of non-specific PCR products.
In an embodiment, sequencing the at least a respective portion of the amplicons in step S11 of Fig. 2 comprises sequencing at least a respective portion of the amplified barcoded PCR products 60 to form respective sequence reads comprising the UMI(s) 64 and an ETS transcription binding site sequence 66. This embodiment also comprises detecting in step S27, see Fig. 4, presence of mutations at the at least one ETS transcription factor binding site 4 in the extracted cfDNA 1 based on the respective sequence reads.
The sequencing is achieved by means of at least one sequencing primer. The result of the sequencing is respective sequence reads comprising at least nucleotide sequences of the UMI 64 and ETS transcription factor binding site sequence(s) 66.
In a particular embodiment, the sequencing comprises in situ sequencing the at least a portion of the amplified barcoded PCR products 60 immobilized onto a solid support. For instance, the preferred P5 and P7 sequences introduced into the amplified barcoded PCR products 60 by means of the adapter-specific forward primers 30 and the adapter-specific reverse primers 40 could be used to immobilize the amplified barcoded PCR products 60 onto the solid support. In such an embodiment, the solid support preferably comprises immobilized nucleotide sequences complementary to the P5 sequence and/or immobilized nucleotide sequences complementary to the P7 sequence.
The in situ sequencing preferably comprises in situ sequencing by synthesis of the at least a portion of the amplified product 60. For instance, if the adapter-specific forward and reverse primers 30, 40 comprise P5 and P7 sequences, respectively, the ILLUMINA® sequencing technology could be used to in situ sequence at least a portion of the amplified barcoded PCR products 60 by synthesis. In more detail, the amplified barcoded PCR products 60 are immobilized on a flow cell surface designed to present the amplified barcoded PCR sequences 60 in a manner that facilitates access to enzymes while ensuring high stability of surface bound amplified barcoded PCR products 60 and low non-specific binding of fluorescently labeled nucleotides.
Sequence By Synthesis (SBS) uses two or four fluorescently labeled nucleotides to sequence the amplified barcoded PCR products 60 on the flow cell surface in parallel. During each sequencing cycle, a single labeled deoxynucleoside triphosphate (dNTP) is added to the nucleic acid chain. The nucleotide label serves as a terminator for polymerization so after each dNTP incorporation, the fluorescent dye is imaged to identify the base and then enzymatically cleaved to allow incorporation of the next nucleotide.
As an illustrative, but non-limiting, example ILLUMINA® MiniSeq™ system could be used to sequence the amplified barcoded PCR products 60. In an embodiment, detecting presence of mutations comprises, see Fig. 4, demultiplexing the sequence reads based on nucleic acid sequences of the UMIs 64 in step S25, mapping the demultiplexed sequence reads to a respective ETS transcription binding site region based on nucleic acid sequences of the at least one ETS transcription binding site sequence 66 in step S26 and detecting presence of mutations at the at least one ETS transcription factor binding site 4 in the extracted cfDNA 1 based on the demultiplexed and mapped sequence reads in step S27.
The sequence reads are thus demultiplexed in step S25 based on the nucleic acid sequences of the UMIs 64. In an embodiment, such demultiplexing comprises dividing the sequence reads into groups having a same nucleotide sequence of the UMIs 64, optionally with at most a predefined number of mismatches allowed for nucleotide sequences of UMIs 64 in a same group. The demultiplexed sequence reads are then mapped in step S26 to a respective ETS transcription factor binding site region based on nucleic acid sequences of the ETS transcription factor binding site sequence 66 in the sequence reads. In an embodiment, this mapping comprises dividing the demultiplexed sequence reads into groups having a same nucleotide sequence of the ETS transcription factor binding site sequences 66, optionally with at most a predefined number of mismatches allowed for nucleotide sequences of ETS transcription factor binding site sequences 66 in a same group.
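The UMI-based grouping of step S25 can be sketched in Python as follows. This is a simplified illustration, not the disclosed pipeline: reads are assumed to be pre-parsed into (UMI, binding site sequence) pairs, the grouping is a greedy first-match scheme, and the majority-vote consensus with a minimum family size (`min_reads`) is an assumption added for the example, since the text does not specify how reads within a group are combined.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def group_by_umi(reads, max_mismatches=0):
    """Group (umi, site_seq) reads by UMI, tolerating up to max_mismatches.

    Each read joins the first existing group whose representative UMI is
    within the mismatch limit; otherwise it starts a new group.
    """
    groups = {}
    for umi, site_seq in reads:
        for rep in groups:
            if len(rep) == len(umi) and hamming(rep, umi) <= max_mismatches:
                groups[rep].append(site_seq)
                break
        else:
            groups[umi] = [site_seq]
    return groups


def consensus_calls(groups, min_reads=3):
    """Majority-vote sequence per UMI group; small groups are skipped (assumption)."""
    calls = {}
    for umi, seqs in groups.items():
        if len(seqs) >= min_reads:
            calls[umi] = Counter(seqs).most_common(1)[0][0]
    return calls
```

Each consensus sequence can then be compared with the reference sequence of its ETS transcription factor binding site to detect C>T mutations in step S27.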
Another aspect of the invention relates to a kit for determining tissue of origin of cfDNA. The kit comprises M>2 primer pairs of a hairpin barcode forward primer 10 and a reverse primer 20. Each hairpin barcode forward primer 10 of the M pairs comprises, from a 5’ end 11 to a 3’ end 17, a 5’ stem sequence 12, an adapter sequence 13, a UMI 14, a 3’ stem sequence 15 and a first target-specific sequence 16 complementary to a nucleotide sequence upstream 2 of an ETS transcription factor binding site 4 in the cfDNA 1. Each reverse primer 20 of the M primer pairs comprises, from a 5’ end 21 to a 3’ end 27, an adapter sequence 23 and a second target-specific sequence 26 complementary to a nucleotide sequence downstream 3 of the ETS transcription factor binding site 4 in the cfDNA. At least a portion of the 5’ stem sequence 12 of the hairpin barcode forward primer 10 is complementary to at least a portion of the 3’ stem sequence 15 of the hairpin barcode forward primer 10. The 5’ stem sequence 12 and the 3’ stem sequence 15 are configured to hybridize to each other at or under a closed annealing temperature and not hybridize to each other at or above an open annealing temperature. The kit also comprises an adapter-specific forward primer 30 comprising a sequence equal to or complementary to, preferably equal to, the adapter sequence 13 of the hairpin barcode forward primers 10. The kit further comprises an adapter-specific reverse primer 40 comprising a sequence equal to or complementary to, preferably equal to, the adapter sequence 23 of the reverse primers 20. The kit of the invention can advantageously be used in the method for determining the tissue of origin of cfDNA.
Various embodiments discussed in the foregoing in connection with the method also apply mutatis mutandis to the kit.
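The 5’ to 3’ layout of a hairpin barcode forward primer can be sketched in Python as follows. This is only an illustration: the stem, adapter and target-specific sequences are invented placeholders and not sequences from Table 1, the function names are hypothetical, and the sketch assumes for simplicity that the two stems are fully complementary, whereas the description only requires that at least a portion of each stem is complementary to the other.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_hairpin_primer(stem5: str, adapter: str, umi_length: int, target: str) -> str:
    """Assemble, 5' to 3': 5' stem, adapter, UMI, 3' stem, target-specific sequence.

    The UMI is written as a run of N (degenerate bases). The 3' stem is taken
    as the full reverse complement of the 5' stem (simplifying assumption).
    """
    stem3 = reverse_complement(stem5)
    return stem5 + adapter + "N" * umi_length + stem3 + target


def stems_can_pair(stem5: str, stem3: str) -> bool:
    """True if the two stems are perfectly complementary and can close the hairpin."""
    return stem3 == reverse_complement(stem5)


# Placeholder sequences for illustration only
stem5 = "GGACAC"
primer = build_hairpin_primer(stem5, adapter="AAAA", umi_length=12, target="CCCC")
stem3 = reverse_complement(stem5)
```

In the closed configuration the paired stems hide the adapter and UMI in the loop, which is why the pre-amplification is run at or below the closed annealing temperature.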
In an embodiment, the M primer pairs of the hairpin barcode forward primer 10 and the reverse primer 20 are selected from the group consisting of SEQ ID NO: 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12, 13 and 14, 15 and 16, 17 and 18, 19 and 20, 21 and 22, 23 and 24, 25 and 26, 27 and 28, 29 and 30, 31 and 32, 33 and 34, 35 and 36, 37 and 38, 39 and 40, 41 and 42, 43 and 44, 45 and 46, 47 and 48, 49 and 50, and 51 and 52.
In an embodiment, the adapter-specific forward primer 30 comprises a nucleotide sequence according to SEQ ID NO: 59 and the adapter-specific reverse primer 40 comprises a nucleotide sequence according to SEQ ID NO: 60.
The method and kit of the invention provide valuable information that can be used in the diagnosis of subjects. For instance, the presence of skin-derived cfDNA in a body fluid sample taken from a subject may indicate that the subject is suffering from a skin disease causing, for instance, secretion of DNA or apoptosis, necrosis and/or NETosis, and thereby release of DNA from skin cells. Hence, presence of skin-derived cfDNA in the body fluid sample may indicate that the subject is suffering from any of the previously mentioned skin diseases, i.e., skin cancer, vascular malformations, psoriasis, or SLE. In a particular embodiment, the presence of skin-derived ctDNA in the body fluid sample could be an indication that the subject is suffering from skin cancer. The present invention could also be used in the prognosis of a yet non-detected skin tumor, such as when the tumor mass currently is too small to be detected by other means, but still contains a sufficient number of skin cancer cells to release DNA into the circulation.
The above-described embodiment of differentiating the skin-derived cfDNA between skin-derived ctDNA and skin-derived and non-tumor cfDNA could also be used as a basis for predicting whether the subject is suffering from a skin cancer or may be suffering from another skin disease or condition causing release of DNA from non-cancerous cells. In an embodiment, the method comprises predicting a subject, from which the body fluid sample has been extracted, as suffering from skin cancer, such as BCC, SCC or melanoma, preferably melanoma, if the skin-derived cfDNA is determined to be skin-derived ctDNA.
In an embodiment, the method comprises predicting a subject, from which the body fluid sample has been extracted, as suffering from a non-cancerous skin disease, preferably psoriasis or SLE, if the skin-derived cfDNA is determined to be skin-derived and non-tumor-derived cfDNA.
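The differentiation between skin-derived ctDNA and skin-derived, non-tumor-derived cfDNA can be sketched as a simple decision rule. The sketch below assumes the per-site results are available as a mapping from a site label to a boolean, and uses the 50 % limit recited in the claims as the default threshold; the function name, the site labels and the returned strings are hypothetical.

```python
def classify_skin_cfdna(site_mutated: dict, ctdna_max_fraction: float = 0.5) -> str:
    """Classify cfDNA from per-site mutation calls.

    site_mutated maps a (hypothetical) site label to True if a mutation was
    detected at that ETS transcription factor binding site.
    """
    total = len(site_mutated)
    if total == 0:
        raise ValueError("at least one binding site must be analyzed")
    mutated = sum(1 for hit in site_mutated.values() if hit)
    if mutated == 0:
        return "no skin-derived cfDNA detected"
    # Mutations confined to a limited subset of sites suggest a clonal tumor origin
    if mutated / total < ctdna_max_fraction:
        return "skin-derived ctDNA"
    # Mutations distributed among the sites suggest non-tumor skin cells
    return "skin-derived, non-tumor-derived cfDNA"


# Hypothetical panel of 26 sites
few_sites = {f"site{i}": i < 3 for i in range(26)}    # 3 of 26 mutated
many_sites = {f"site{i}": i < 20 for i in range(26)}  # 20 of 26 mutated
no_sites = {f"site{i}": False for i in range(26)}
```

Lowering `ctdna_max_fraction` to 0.4 or 0.3 corresponds to the more preferred limits.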
The above described technology of determining the tissue of origin of cfDNA can also be used in other applications where there is a need to determine the tissue of origin. As an example, cancer of unknown primary (CUP) means that cancer spread (secondary tumor) has been found in the body, but the original tissue or organ where the cancer started (primary tumor) is not known. In such a case, DNA can be extracted from a tumor sample, in particular a secondary tumor sample, the presence of mutations at one or more ETS transcription factor binding sites in the extracted DNA can be analyzed and used to determine whether the primary tumor or cancer is of skin origin (skin cancer) or not. Such information is valuable to guide treatment of the cancer disease.
Another aspect of the invention therefore relates to a method of determining a primary cancer for a tumor sample. The method comprises extracting DNA from the tumor sample, preferably from a secondary tumor sample. The method also comprises analyzing the extracted DNA for the presence of mutations at at least one ETS transcription factor binding site in the extracted DNA. The method further comprises determining the primary cancer to be skin cancer if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
Primary cancer as used herein refers to the original, or first, tumor in the body. Cancer cells from a primary cancer may spread to other parts of the body and form new, or secondary, tumors in a process referred to as metastasis. The secondary tumors are the same type of cancer as the primary cancer. Primary cancer is also referred to as primary tumor in the art. As an illustrative example, cancer cells from the skin (primary cancer or tumor) can spread to form new tumors in the lung (secondary cancer or tumor). The cancer cells in the lung are then just like the ones in the skin. This means that if mutations at one or more ETS transcription factor binding sites are detected in DNA extracted from a tumor sample taken from the secondary tumor in the lung, then the present method can be used to determine that the primary cancer of the (secondary) tumor in the lung is skin cancer. In an embodiment, the method comprises determining the primary cancer to not be skin cancer if no mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site, preferably at at least five, more preferably at at least ten and more preferably at at least 15, at least 20 or at least 25 ETS transcription binding sites.
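The determination of the primary cancer can be written as a short decision function. This is a sketch under stated assumptions: the per-site results are given as a mapping from a site label to a boolean, the default of five sites for a negative call follows the preferred minimum mentioned above, and the "inconclusive" outcome for panels with fewer sites is an assumption of this sketch rather than part of the described method.

```python
def determine_primary_cancer(site_has_mutation: dict, min_sites_for_negative: int = 5) -> str:
    """Decide whether the primary cancer of a tumor sample is skin cancer.

    site_has_mutation maps a (hypothetical) site label to True if a mutation
    was detected at that ETS transcription factor binding site.
    """
    if not site_has_mutation:
        raise ValueError("at least one binding site must be analyzed")
    if any(site_has_mutation.values()):
        return "skin cancer"
    if len(site_has_mutation) >= min_sites_for_negative:
        return "not skin cancer"
    # Assumption of this sketch: too few sites analyzed to support a negative call
    return "inconclusive"


lung_metastasis = {"site1": False, "site2": True, "site3": False}
negative_panel = {f"site{i}": False for i in range(5)}
small_panel = {f"site{i}": False for i in range(3)}
```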
Another application is within forensic sciences and in particular when there is a need to detect the tissue of origin of DNA collected at a crime scene. Hence, in some forensic applications it would be useful to determine whether DNA samples found at a crime scene are from the skin or not.
A further aspect of the invention relates to a forensic DNA analysis method. The method comprises analyzing a DNA sample for the presence of mutations at at least one ETS transcription factor binding site in the DNA. The method further comprises determining the DNA to be skin-derived DNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
Thus, in this method a DNA sample, such as naked DNA, is detected at a crime scene and there is a need to determine whether the DNA in the DNA sample is skin-derived DNA or not. In many forensic cases, a body sample, such as skin or hair, or a body fluid sample, such as blood or saliva, is present at the crime scene so the source of the sample is readily known. However, in some situations, only DNA is present at the crime scene, or the body sample is present in such minute amounts or in such a damaged state that it is not possible to determine whether it is skin or some other type of tissue. Another example would be that the body (fluid) sample is a mixed sample containing DNA from several different persons. For instance, drops of blood from one person could contain skin cells, and thereby skin-derived DNA, from another person. In such a case, it could be important to determine that most of the DNA extracted from the blood is non-skin-derived and thereby from one person but some is actually skin-derived and from another person.
In an embodiment, the method also comprises extracting DNA from a body sample or a body fluid sample. In such an embodiment, analyzing the DNA comprises analyzing the extracted DNA for the presence of mutations at at least one ETS transcription factor binding site in the extracted DNA. In this embodiment, the determining step comprises determining the extracted DNA to be skin-derived DNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site.
In an embodiment, the method comprises determining the DNA to not be skin-derived DNA if no mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site, preferably at at least five, more preferably at at least ten and more preferably at at least 15, at least 20 or at least 25 ETS transcription binding sites.
The various embodiments described herein for the method for determining tissue of origin of cfDNA also apply to the method of determining a primary cancer for a tumor sample and to the forensic DNA analysis method.
EXAMPLES
Materials and methods
Extraction of DNA from UV treated A375 cells
A375 cells were treated with 36 J/m² ultraviolet-C (UVC) light five times a week for 10 weeks. Samples were split when confluent and reseeded at 1:10 approximately every three days and remaining cells were frozen at -20°C for later DNA extraction. DNA was extracted using the DNA Blood and Tissue mini kit spin protocol (Qiagen). Samples were eluted in 200 µl buffer EB (Qiagen). DNA concentration varied between 50 and 100 ng/µl. DNA concentration was measured using a Qubit™ dsDNA High Sensitivity kit (Invitrogen).
Extraction of cfDNA
For each patient (n=15), 8 × 225 µl frozen plasma samples were thawed at room temperature (20-25°C), pooled and DNA was extracted using the QIAamp® Circulating Nucleic Acid kit (Qiagen) vacuum protocol. Samples were eluted in 50 µl buffer AVE (Qiagen) and concentrated to approximately 10 µl using a Vivacon® 500 column with molecular weight cutoff 30,000 Da (Sartorius). Total DNA extracted varied between 1 and 2.5 ng/µl. DNA concentration was measured using a Qubit™ dsDNA High Sensitivity kit (Invitrogen).
SiMSen®-seq
SiMSen®-seq was performed in two steps (Simsen Diagnostics AB). First, 10 ng of the extracted DNA samples were amplified with the UV hotspot multiplex primer set (Table 1) for two cycles to label the DNA fragments with a unique molecular index (UMI) barcode. This was followed by protease treatment to digest the first polymerase and a second PCR reaction using index primers (Table 3) for Illumina® sequencing for 28 cycles. For cfDNA, due to the lower DNA concentration, two 10 µl reactions were performed and pooled prior to clean-up with AMPure® beads. Equimolar amounts of each SiMSen®-seq library were pooled and sequencing was performed on a MiniSeq™ sequencing system (Illumina) using a high output 150 cycle kit. Fastq files obtained from the MiniSeq™ sequencing system, after demultiplexing, were processed while accounting for UMI barcodes to produce error-corrected consensus reads (minimum 3× oversampling) as previously described in Elliott et al. 2018. Samples were called positive for a given mutation if more than one consensus read supported the mutation.
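The UMI-based error correction and the calling rule can be illustrated with a small Python sketch. It is not the pipeline of Elliott et al. 2018: the per-position majority vote, the read representation as `(umi, sequence)` tuples and all function names are assumptions made for illustration. Only the two thresholds are taken from the text above, namely at least three reads per UMI family and more than one supporting consensus read for a positive call.

```python
from collections import Counter, defaultdict


def consensus_reads(reads, min_family_size=3):
    """Collapse reads sharing a UMI into one consensus sequence per UMI family.

    Families with fewer than min_family_size reads (3x oversampling) are
    discarded. The consensus is a per-position majority vote (assumption).
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if len({len(s) for s in seqs}) != 1:
            continue  # this sketch only handles equal-length reads
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def is_positive(consensus, position, alt_base, min_supporting=2):
    """Call a mutation if more than one consensus read carries alt_base at position."""
    supporting = sum(1 for seq in consensus.values() if seq[position] == alt_base)
    return supporting >= min_supporting


# Hypothetical reads; the reference is CTTCCG and the variant is C>T at index 3
reads = (
    [("AAA", "CTTTCG")] * 3
    + [("CCC", "CTTTCG"), ("CCC", "CTTTCG"), ("CCC", "CTTCCG")]  # one read with an error
    + [("GGG", "CTTCCG")] * 3
    + [("TTT", "CTTTCG")] * 2  # family too small, discarded
)
consensus = consensus_reads(reads)
```

Because each consensus read represents one original DNA molecule, requiring two of them guards against a single polymerase error introduced during barcoding.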
Results
A multiplexed error-corrected amplicon sequencing assay (SiMSen®-seq) was used to assay 26 UV ETS hotspot regions (primers are listed in Table 1). The hotspot regions all contained ETS motifs (TTCC[G/T]) flanked at the 5’ side by a dipyrimidine (CC, CT or TC; not TT as it is not mutagenic). The hotspot regions were initially identified based on recurrent somatic mutations in whole genome sequencing data from melanoma. As mutations can arise at up to three key positions flanking or overlapping an ETS transcription factor binding site, there are multiple hotspot positions at each ETS transcription factor binding site. All individual informative hotspot positions are listed in Table 2.
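The motif definition can be expressed as a regular expression. The following Python sketch scans a sequence for TTCC[G/T] preceded directly at the 5’ side by CC, CT or TC; it searches the given strand only, the function name is hypothetical and the example sequence is invented.

```python
import re

# Dipyrimidine (CC, CT or TC, but not TT) directly 5' of the ETS motif TTCC[G/T].
# The lookahead makes the search report overlapping matches as well.
HOTSPOT = re.compile(r"(?=((?:CC|CT|TC)TTCC[GT]))")


def find_hotspot_motifs(sequence: str):
    """Return (0-based start, matched 7-mer) for every motif on the given strand."""
    return [(m.start(), m.group(1)) for m in HOTSPOT.finditer(sequence.upper())]


# Invented example: one CC-flanked motif, one TT-flanked motif (excluded),
# and one TC-flanked motif
example = "AACCTTCCGATTTTCCGAATCTTCCTA"
hits = find_hotspot_motifs(example)
```

To cover both strands, the same search could be repeated on the reverse complement of the sequence.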
The hotspot regions and the assay were verified in cell culture using A375 cells, which were initially negative for all but one of the hotspot mutations covered by this specific panel of primers (Table 1). Cells were exposed to daily doses of UVC light (36 J/m² UVC five days per week) and DNA was harvested after 10 weeks. In UV-exposed DNA, mutations were detected in the vast majority of hotspot regions at expected positions within the ETS motifs, at variant allele frequencies up to 2.7% (Fig. 6). No mutations were detected at hotspot regions in DNA from non-exposed control A375 cells apart from the single preexisting UV hotspot mutation, which was clonally present at an allele frequency close to 50%, explained by the fact that A375 is a melanoma cell line (Fig. 6). These results establish this assay as a useful tool for determining a history of UV exposure in DNA and for detection of UV-induced mutations at ETS transcription factor binding sites.
The assay was then tested to determine whether UV-exposed cfDNA could be detected in human blood. Accordingly, cfDNA was isolated from blood plasma from 15 melanoma patients, 6 of which had a known residual tumor burden. The remaining 9 patients had lesions successfully removed by surgery. Of these, 3 patients had a positive sentinel node biopsy, which is an imperfect predictor of metastatic spread.
All 15 liquid biopsy samples were analyzed using the assay established above for the A375 cell culture. Skin cell-derived cfDNA, as indicated by the presence of mutations in investigated hotspot regions, was detected in 6 of the 15 samples, and in 4 of the 6 samples originating from patients having a known residual tumor burden, see Fig. 7.
Skin-derived cfDNA was also found in 2 of the 9 patients without a known tumor burden, both with negative sentinel node biopsy, see Fig. 7. This may indicate presence of undiagnosed residual disease, or alternatively a contribution from healthy skin to cfDNA in blood plasma.
In summary, skin-derived cfDNA was detectable in blood plasma samples of human subjects on the basis of a UV hotspot-based mutation panel assay.
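The detection rates reported above can be recomputed from the stated counts. The short Python sketch below only restates the numbers from the Results; the dictionary keys are descriptive labels chosen here.

```python
from fractions import Fraction

# (samples with skin-derived cfDNA detected, samples analyzed), from the Results
counts = {
    "all patients": (6, 15),
    "known residual tumor burden": (4, 6),
    "no known tumor burden": (2, 9),
}

rates = {group: Fraction(pos, total) for group, (pos, total) in counts.items()}
percent = {group: round(float(rate) * 100, 1) for group, rate in rates.items()}

for group, value in percent.items():
    print(f"{group}: {value} %")
```

The two subgroups add up to the full cohort, with 4 + 2 = 6 positive samples out of 6 + 9 = 15 patients.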
Table 1 - UV hotspot multiplex primer set
Figure imgf000024_0001
Figure imgf000025_0001
Figure imgf000026_0001
Figure imgf000027_0002
*F - forward primer (5’→3’); R - reverse primer (5’→3’)
**NNNNNNNNNNNN represents UMI barcode
Table 2 - Hotspot positions flanking or overlapping ETS transcription factor binding sites
Figure imgf000027_0001
Figure imgf000028_0001
Figure imgf000029_0001
*Human reference genome hg38 (GRCh38), NCBI RefSeq assembly GCF_000001405.26, 17 December 2013
Table 3 - Index primers
Figure imgf000029_0002
*NNNNNN represents Illumina® Index adapter sequences
The embodiments described above are to be understood as a few illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible. The scope of the present invention is, however, defined by the appended claims.
REFERENCES
Elliott, K., M. Bostrom, S. Filges, M. Lindberg, J. Van den Eynden, A. Stahlberg, E. Larsson (2018). "Elevated pyrimidine dimer formation at distinct genomic bases underlies promoter mutation hotspots in UV-exposed cancers" PLoS Genet 14(12): e1007849
Snyder, M. W., M. Kircher, A. J. Hill, R. M. Daza and J. Shendure (2016). "Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin" Cell 164(1-2): 57-68
Warton, K., K. L. Mahon and G. Samimi (2016). "Methylated circulating tumor DNA in blood: power in cancer prognosis and response" Endocr Relat Cancer 23(3): R157-171

Claims

1. A method for determining tissue of origin of circulating free deoxyribonucleic acid (cfDNA) (1), the method comprises: extracting (S1) cfDNA (1) from a body fluid sample; analyzing (S2) the extracted cfDNA (1) for the presence of mutations at at least one E26 transformation-specific (ETS) transcription factor binding site (4) in the extracted cfDNA (1); and determining (S3) the cfDNA (1) to be skin-derived cfDNA if mutations are detected, based on the analysis, at the at least one ETS transcription factor binding site (4).
2. The method according to claim 1, wherein analyzing (S2) the extracted cfDNA (1) comprises analyzing (S2) the extracted cfDNA (1) for the presence of mutations at a plurality of ETS transcription factor binding sites (4) in the extracted cfDNA (1); and determining (S3) the cfDNA (1) to be skin-derived cfDNA comprises determining (S3) the cfDNA (1) to be skin-derived cfDNA if mutations are detected, based on the analysis, at at least one ETS transcription factor binding site (4) of the plurality of ETS transcription factor binding sites (4).
3. The method according to claim 2, further comprising determining (S4) whether the skin-derived cfDNA is skin-derived circulating tumor DNA (ctDNA) or skin-derived and non-tumor-derived cfDNA based on a pattern of mutations at the plurality of ETS transcription factor binding sites (4) in the extracted cfDNA (1).
4. The method according to claim 3, wherein determining (S4) whether the skin-derived cfDNA is skin-derived ctDNA or skin-derived and non-tumor-derived cfDNA comprises: determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present only in a limited subset of the plurality of ETS transcription factor binding sites (4) in the extracted cfDNA (1); and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
5. The method according to claim 4, wherein determining (S4) whether the skin-derived cfDNA is skin-derived ctDNA or skin-derived and non-tumor-derived cfDNA comprises: determining the skin-derived cfDNA to be skin-derived ctDNA if the mutations are present in less than 50 %, preferably less than 40 %, and more preferably less than 30 %, of the plurality of ETS transcription factor binding sites (4) in the extracted cfDNA (1); and otherwise determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA.
6. The method according to any one of claims 3 to 5, wherein determining (S4) whether the skin-derived cfDNA is skin-derived ctDNA or skin-derived and non-tumor-derived cfDNA comprises: determining the skin-derived cfDNA to be skin-derived and non-tumor-derived cfDNA if the mutations are distributed among the plurality of ETS transcription factor binding sites (4) in the extracted cfDNA (1); and otherwise determining the skin-derived cfDNA to be skin-derived ctDNA.
7. The method according to any one of claims 1 to 6, wherein analyzing (S2) the extracted cfDNA (1) comprises analyzing (S2) the extracted cfDNA (1) for the presence of mutations within and/or flanking a 5’ end of the at least one ETS transcription factor binding site (4) in the extracted cfDNA (1).
8. The method according to any one of claims 1 to 7, wherein the at least one ETS transcription binding site (4) comprises the sequence TTCCK; and
K represents G or T.
9. The method according to claim 8, wherein analyzing (S2) the extracted cfDNA (1) comprises analyzing (S2) the extracted cfDNA (1) for the presence of mutations at at least one TTCCK or Y1Y2TTCCK sequence in the extracted cfDNA (1); and
Y1 and Y2 are independently C or T, preferably Y1 and Y2 are independently C or T with the proviso that Y1 and Y2 are not both T.
10. The method according to any one of claims 1 to 9, wherein analyzing (S2) the extracted cfDNA (1) comprises: amplifying (S10) at least one portion of the extracted cfDNA (1) comprising the at least one ETS transcription binding site (4) to form a plurality of amplicons (60); and sequencing (S11) at least a respective portion of the amplicons (60) to form respective sequence reads.
11. The method according to claim 10, wherein amplifying (S10) the at least one portion of the extracted cfDNA (1) comprises: contacting (S20) the extracted cfDNA (1) with at least one hairpin barcode forward primer (10) and at least one reverse primer (20), wherein each hairpin barcode forward primer (10) of the at least one hairpin barcode forward primer (10) comprises, from a 5’ end (11) to a 3’ end (17), a 5’ stem sequence (12), an adapter sequence (13), a unique molecular identifier (UMI) (14), a 3’ stem sequence (15) and a target-specific sequence (16) complementary to a nucleotide sequence upstream (2) of an ETS transcription factor binding site (4) of the at least one ETS transcription binding site (4); each reverse primer (20) of the at least one reverse primer (20) comprises, from a 5’ end (21) to a 3’ end (27), an adapter sequence (23) and a target-specific sequence (26) complementary to a nucleotide sequence downstream (3) of the ETS transcription factor binding site (4) of the at least one ETS transcription binding site (4); at least a portion of the 5’ stem sequence (12) of the hairpin barcode forward primer (10) is complementary to at least a portion of the 3’ stem sequence (15) of the hairpin barcode forward primer (10), the 5’ stem sequence (12) and the 3’ stem sequence (15) are configured to hybridize to each other at or under a closed annealing temperature and not hybridize to each other at or above an open annealing temperature; amplifying (S21) the at least one portion of the extracted cfDNA (1) comprising the at least one ETS transcription binding site (4) by performing polymerase chain reaction (PCR) pre-amplification of the at least one portion of the extracted cfDNA (1) comprising the at least one ETS transcription binding site (4) to form a plurality of barcoded PCR products (50), wherein the PCR pre-amplification has an annealing temperature equal to or less than the closed annealing temperature of the at least one hairpin barcode forward primer (10); contacting (S22) the plurality of barcoded PCR products (50) with an adapter-specific forward primer (30) and an adapter-specific reverse primer (40); and amplifying (S23) the barcoded PCR products (50) by performing PCR amplification on the barcoded PCR products (50) to form a library of amplified barcoded PCR products (60), wherein at least a portion of cycles of the PCR amplification has an annealing temperature equal to or greater than the open annealing temperature of the at least one hairpin barcode forward primer (10).
12. The method according to claim 11, wherein contacting (S20) the extracted cfDNA (1) comprises contacting (S20) the extracted cfDNA (1) with at least one pair of a hairpin barcode forward primer (10) and a reverse primer (20) selected from the group consisting of SEQ ID NO: 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12, 13 and 14, 15 and 16, 17 and 18, 19 and 20, 21 and 22, 23 and 24, 25 and 26, 27 and 28, 29 and 30, 31 and 32, 33 and 34, 35 and 36, 37 and 38, 39 and 40, 41 and 42, 43 and 44, 45 and 46, 47 and 48, 49 and 50, and 51 and 52.
PCT/SE2024/050958 2023-11-10 2024-11-08 TISSUE OF ORIGIN OF cfDNA Pending WO2025101114A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE2351292-4 2023-11-10
SE2351292 2023-11-10

Publications (1)

Publication Number Publication Date
WO2025101114A1 true WO2025101114A1 (en) 2025-05-15

Family

ID=93563283

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2024/050958 Pending WO2025101114A1 (en) 2023-11-10 2024-11-08 TISSUE OF ORIGIN OF cfDNA

Country Status (1)

Country Link
WO (1) WO2025101114A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200377956A1 (en) * 2017-08-07 2020-12-03 The Johns Hopkins University Methods and materials for assessing and treating cancer

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
ANDERS STÅHLBERG ET AL: "Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing", NUCLEIC ACIDS RESEARCH, vol. 44, no. 11, 7 April 2016 (2016-04-07), GB, pages e105 - e105, XP055417872, ISSN: 0305-1048, DOI: 10.1093/nar/gkw224 *
DE SUBHAJYOTI ET AL: "Signatures Beyond Oncogenic Mutations in Cell-Free DNA Sequencing for Non-Invasive, Early Detection of Cancer", FRONTIERS IN GENETICS, vol. 12, 14 October 2021 (2021-10-14), Switzerland, XP093244528, ISSN: 1664-8021, Retrieved from the Internet <URL:https://pmc.ncbi.nlm.nih.gov/articles/PMC8551553/pdf/fgene-12-759832.pdf> DOI: 10.3389/fgene.2021.759832 *
DOUGLAS A. MATA: "Prevalence of UV Mutational Signatures Among Cutaneous Primary Tumors", JAMA NETWORK OPEN, vol. 5, no. 3, 23 March 2022 (2022-03-23), pages 1 - 4, XP093166842, ISSN: 2574-3805, DOI: 10.1001/jamanetworkopen.2022.3833 *
ELLIOTT KERRYN ET AL: "Elevated pyrimidine dimer formation at distinct genomic bases underlies promoter mutation hotspots in UV-exposed cancers", PLOS GENETICS, vol. 14, no. 12, 26 December 2018 (2018-12-26), US, pages e1007849, XP093244451, ISSN: 1553-7404, Retrieved from the Internet <URL:https://journals.plos.org/plosgenetics/article/file?id=10.1371/journal.pgen.1007849&type=printable> DOI: 10.1371/journal.pgen.1007849 *
ELLIOTT, K., M. BOSTROM, S. FILGES, M. LINDBERG, J. VAN DEN EYNDEN, A. STAHLBERG, E. LARSSON: "Elevated pyrimidine dimer formation at distinct genomic bases underlies promoter mutation hotspots in UV-exposed cancers", PLOS GENET, vol. 14, no. 12, 2018, pages e1007849
PFEIFER GERD P.: "Mechanisms of UV-induced mutations and skin cancer", GENOME INSTABILITY & DISEASE, vol. 1, no. 3, 19 March 2020 (2020-03-19), pages 99 - 113, XP093244910, ISSN: 2524-7662, Retrieved from the Internet <URL:https://pmc.ncbi.nlm.nih.gov/articles/PMC8477449/pdf/42764_2020_Article_9.pdf> DOI: 10.1007/s42764-020-00009-8 *
SNYDER, M. W., M. KIRCHER, A. J. HILL, R. M. DAZA, J. SHENDURE: "Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin", CELL, vol. 164, no. 1-2, 2016, pages 57 - 68
WARTON, K., K. L. MAHON, G. SAMIMI: "Methylated circulating tumor DNA in blood: power in cancer prognosis and response", ENDOCR RELAT CANCER, vol. 23, no. 3, 2016, pages R157 - 171, XP055723832, DOI: 10.1530/ERC-15-0369

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24808761

Country of ref document: EP

Kind code of ref document: A1