WO2025224260A1 - Target enrichment - Google Patents
Target enrichment
- Publication number
- WO2025224260A1 (application PCT/EP2025/061253)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- primer
- acid molecule
- previous
- nucleotide
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
Definitions
- the present invention relates to the fields of molecular biology and biotechnology and particularly, although not exclusively, to methods of enriching DNA molecules.
- MRD: minimal residual disease
- radiological imaging has several limitations: a) it exposes patients to radiation repeatedly (every 3-6 months); b) patients with additional health complications find it difficult to metabolise contrast dyes; c) it does not provide molecular information; and, most importantly, d) it does not measure minimal residual disease and is hence unable to intercept relapse or recurrence early.
- Circulating tumor DNA (ctDNA) consists of fragments of DNA circulating in the bloodstream of cancer patients that originate from malignant tumor tissue or from circulating tumor cells.
- ctDNA carries mutations and epigenetic changes characteristic of the tumor from which it originates.
- Recent work in the field has established the importance of ctDNA as a biomarker of MRD in cancer patients and underpinned the importance of ctDNA based non-invasive MRD tests in early interception of recurrence/relapse (Nagasaka, M et al. Molecular Cancer. 2021).
- ctDNA-based MRD measurement outperforms radiological imaging by 6 months to 1 year in detecting early signs of recurrence, which in turn leads to better survival outcomes (Kim, T. et al. Thorac Cancer. 2019).
- the use of ctDNA-based MRD tests in regular clinical practice is limited by prohibitively high costs and low sensitivity.
- a typical plasma sample is flooded with cell free DNA (cfDNA) shed by normal cells (e.g. haematopoietic cells) of the body. Mutant ctDNA shed by tumour cells is a small fraction (0.1-10%) of this cfDNA sample. Therefore, detection of ctDNA via liquid biopsy is a ‘needle in a haystack’ problem.
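- As a rough illustration of the 'needle in a haystack' problem, the sketch below estimates how many mutant copies of a single locus a cfDNA input might contain. Only the 0.1-10% mutant fraction comes from the text above; the input mass and the figure of roughly 3.3 pg per haploid human genome are assumptions made for illustration.

```python
import math

# Assumption: one haploid human genome weighs roughly 3.3 pg.
PG_PER_HAPLOID_GENOME = 3.3


def expected_mutant_copies(cfdna_ng: float, mutant_fraction: float) -> float:
    """Expected number of mutant copies of one locus in the cfDNA input."""
    genome_equivalents = cfdna_ng * 1000.0 / PG_PER_HAPLOID_GENOME
    return genome_equivalents * mutant_fraction


def prob_at_least_one(mean_copies: float) -> float:
    """Poisson probability that at least one mutant copy is present."""
    return 1.0 - math.exp(-mean_copies)


if __name__ == "__main__":
    # 10 ng input is a hypothetical value, not taken from the disclosure.
    for fraction in (0.001, 0.01, 0.1):
        copies = expected_mutant_copies(cfdna_ng=10.0, mutant_fraction=fraction)
        print(f"{fraction:.1%}: ~{copies:.1f} mutant copies, "
              f"P(>=1) = {prob_at_least_one(copies):.3f}")
```

- At the low end of the range only a handful of mutant molecules are expected per locus, which is why losses during enrichment matter so much.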
- Current approaches to liquid biopsy that measure somatic mutations use target capture panels to enrich genetic regions of interest that are known to bear cancer causing mutations followed by shot-gun sequencing.
- the present disclosure provides a method of enriching a variant nucleic acid molecule.
- the present disclosure also provides a method of enriching a variant nucleic acid molecule associated with a disease.
- the present disclosure also provides a method of detecting a mutation in a population of nucleic acids.
- the present disclosure provides a method of determining whether a sample comprises a variant nucleic acid molecule.
- the sample is a liquid biopsy sample.
- the liquid biopsy sample is a blood sample, a plasma sample, a saliva sample, or a urine sample. In some embodiments, the liquid biopsy sample is derived from a blood sample, a plasma sample, a saliva sample, or a urine sample.
- the present disclosure also provides a method of detecting the presence of or recurrence of cancer. In another aspect, the present disclosure provides a method of detecting the presence of or recurrence of metastasis.
- the present disclosure also provides a method of detecting minimal residual disease.
- the present disclosure also provides a method of selecting a subject for therapy.
- the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation. In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising differential methylation.
- the method comprises:
- the nucleic acid is DNA or RNA. In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the nucleic acid molecule is a DNA molecule or an RNA molecule. In some embodiments, the nucleic acid is a DNA molecule. In some embodiments, the nucleic acid is an RNA molecule. In some embodiments, the variant nucleic acid molecule is a variant DNA molecule or a variant RNA molecule. In some embodiments, the variant nucleic acid molecule is a variant DNA molecule. In some embodiments, the variant nucleic acid molecule is a variant RNA molecule. In some embodiments, the method of enriching a variant nucleic acid molecule is a method of enriching a variant DNA molecule.
- the method comprises:
- a variant nucleic acid molecule may alternatively be described as a mutant nucleic acid molecule, a mutant allele, or an allele.
- a variant DNA molecule may alternatively be described as a mutant DNA molecule, a mutant allele, or an allele.
- the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule.
- the primer comprises a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and the corresponding wild-type nucleic acid molecule.
- the primer comprises a complementary region.
- the complementary region is complementary to the variant nucleic acid molecule and the corresponding wild-type nucleic acid molecule.
- the complementary region is between 16-28 nucleotides long.
- the primer is between 18-30 nucleotides long.
- primer extension occurs through the action of a polymerase.
- a polymerase is provided at step (a), (b), (c), or (d). In some embodiments, a polymerase is provided to the reaction mixture at step (b) or (d).
- the exonuclease has 5'- 3' exonuclease activity.
- the nuclease has single-strand-specific nuclease activity.
- the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
- the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
- the method comprises a method of enriching a variant DNA molecule comprising a mutation, wherein the method comprises:
- the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant DNA molecule and a corresponding wild-type DNA molecule
- the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
- the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
- contacting nucleic acid molecules in the reaction mixture with a nuclease, optionally wherein the nuclease has single-strand-specific nuclease activity;
- bioinformatic analysis is performed to analyse data generated through sequencing
- the population of nucleic acid molecules is provided as a library of nucleic acid molecules.
- library preparation comprises a single-stranded library preparation approach and/or the addition of an individual Unique Molecular Identifier (UMI) to each strand of the nucleic acid.
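- Tagging each strand with its own UMI lets reads derived from the same original molecule be grouped during analysis. The sketch below shows one common way to use such tags, a per-position majority vote within each UMI family; the `(umi, start, sequence)` read layout is a hypothetical simplification and is not taken from the disclosure.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse reads that share a UMI and start position into one consensus.

    `reads` is an iterable of (umi, start, sequence) tuples; this layout is a
    hypothetical stand-in for parsed sequencing records.
    """
    families = defaultdict(list)
    for umi, start, seq in reads:
        families[(umi, start)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        length = min(len(s) for s in seqs)
        # Majority vote at each position across the read family.
        consensus[key] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

- A base that appears in only one read of a family is outvoted, which is how UMI families help separate PCR or sequencing errors from true variants.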
- the population of nucleic acid molecules is comprised within or isolated from a liquid biopsy sample.
- the population of nucleic acid molecules comprises cell-free DNA (cfDNA).
- the population of nucleic acid molecules comprises or is suspected of comprising circulating tumor DNA (ctDNA).
- ctDNA: circulating tumor DNA
- the variant nucleic acid molecule is a single nucleotide variant DNA molecule.
- the mutation is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
- the variant nucleic acid molecule comprises nucleic acid altered through bisulphite conversion.
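- Bisulphite treatment converts unmethylated cytosine to uracil (read as T after amplification) while methylated cytosine is protected, so a methylation difference becomes a sequence difference. A minimal in-silico sketch follows; the function name and the set-of-positions input are illustrative choices.

```python
def bisulphite_convert(seq: str, methylated: set[int]) -> str:
    """Return the sequence as read after bisulphite conversion and PCR.

    Unmethylated cytosines are read as T; cytosines at the 0-based positions
    listed in `methylated` are protected and remain C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )
```

- For example, `bisulphite_convert("ACGCGT", {1})` and `bisulphite_convert("ACGCGT", set())` differ only at position 1, which is the kind of difference an allele-distinguishing primer could target.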
- the primer is a variant sequence specific primer and/or an allele distinguishing primer.
- the nucleotide that is specific for the mutation of the variant nucleic acid molecule is present on the 3’ portion of the primer.
- the 3’ portion of the primer is the portion of the primer which is closer to the 3’ end than the 5’ end of the primer.
- the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 5 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 4 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 3 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 2 nucleotides of the final nucleotide on the 3’ end of the primer.
- the nucleotide that is specific for the mutation of the variant nucleic acid molecule is the final nucleotide on the 3' end of the primer.
- the mismatched non-complementary nucleotide is not complementary to the variant nucleic acid molecule or the corresponding wild-type nucleic acid molecule.
- the mismatched non-complementary nucleotide is present on the 3’ portion of the primer.
- the mismatched non-complementary nucleotide is present within 5 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 4 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 3 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 2 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
- the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
- the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule, at the 5’ side of the mutation of the variant nucleic acid molecule.
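- The primer layout described above (mutation-specific base at the 3' end, deliberate mismatch immediately 5' of it) can be sketched as follows. This is a minimal illustration under stated assumptions: the template is given as the strand the primer anneals to, and the mismatch base is picked arbitrarily rather than by any stability model.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def build_primer(variant_template: str, variant_pos: int, length: int = 20) -> str:
    """Build an allele-distinguishing primer against `variant_template`.

    `variant_template` is the strand the primer anneals to (written 5'->3')
    and `variant_pos` is the 0-based index of the mutated base on that strand.
    """
    if not 18 <= length <= 30:
        raise ValueError("primer length must be 18-30 nucleotides")
    site = variant_template[variant_pos:variant_pos + length]
    if len(site) < length:
        raise ValueError("template too short downstream of the variant")
    primer = reverse_complement(site)
    # primer[-1] pairs with the mutated base; primer[-2] is its 5' neighbour.
    matched = primer[-2]
    # Arbitrary illustrative choice: first base that differs from the match.
    mismatch = next(b for b in "ACGT" if b != matched)
    return primer[:-2] + mismatch + primer[-1]
```

- Because the wild-type and variant templates differ only at `variant_pos`, the substituted base is non-complementary to both, while the 3' terminal base pairs only with the variant.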
- the primer comprises an affinity tag at the 5' end.
- the primer is biotinylated at the 5’ end. In some embodiments, the primer is 18-30 nucleotides long.
- the primer is at least 18 nucleotides long. In some embodiments, the primer is at least 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29 nucleotides long.
- the primer is at most 30 nucleotides long. In some embodiments, the primer is at most 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides long.
- primers are not provided as primer pairs.
- the incubation is performed under conditions that allow hybridization of the primer with its target sequence.
- the primer sequence enables primer extension following binding to the variant nucleic acid molecule. In some embodiments, the primer sequence prevents primer extension following binding to a wild-type nucleic acid molecule.
- the primer sequence enables primer extension following binding to the variant nucleic acid molecule, and prevents primer extension following binding to a corresponding wild-type nucleic acid molecule.
- primer extension occurs more frequently upon base pairing with the variant nucleic acid molecule, compared with base pairing with the corresponding wild-type nucleic acid molecule.
- primer extension occurs when the primer is bound to the variant nucleic acid molecule, but not when the primer is bound to the corresponding wild-type nucleic acid molecule.
- the nuclease/exonuclease does not digest variant nucleic acid molecules as these molecules are comprised within double stranded duplex nucleic acid after primer extension.
- the nuclease/exonuclease does digest corresponding wild-type nucleic acid molecules as these molecules are single stranded, because primer extension did not occur.
- the nuclease/exonuclease is single-strand specific.
- the nuclease/exonuclease does not have 3'- 5' exonuclease activity.
- the step of contacting the population of nucleic acid molecules in the reaction mixture with a nuclease specifically enriches the variant nucleic acid.
- a single strand specific nuclease digests the 5’ un-extended end of the wild-type nucleic acid molecule and the 3’ single stranded overhang in both the variant nucleic acid molecule and the wild-type nucleic acid molecule. This step reduces the wild-type nucleic acid molecule to a short 30-40 bp double stranded fragment which is removed by purification.
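- The depletion logic above can be pictured with a toy model: an extended primer leaves a long duplex, an unextended primer leaves only its own footprint, and a size-based cleanup discards the short product. The all-or-nothing extension rule and the `min_len` cutoff are simplifying assumptions, not parameters from the disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def primer_extends(primer: str, template: str, site_start: int) -> bool:
    """Toy rule: extension needs the primer's 3' terminal base to be paired.

    The primer's 3' end pairs with template[site_start]; the template is
    written 5'->3'. Real polymerases discriminate kinetically, not absolutely.
    """
    return primer[-1] == COMPLEMENT[template[site_start]]


def duplex_length_after_nuclease(primer: str, template: str, site_start: int) -> int:
    """Length of the double-stranded fragment left after single-strand digestion."""
    if primer_extends(primer, template, site_start):
        # Extension copies the template up to its 5' end; only the
        # template's 3' overhang is single stranded and digested.
        return site_start + len(primer)
    # No extension: only the primer footprint is double stranded.
    return len(primer)


def survives_cleanup(duplex_len: int, min_len: int = 50) -> bool:
    """`min_len` is a hypothetical size cutoff for the purification step."""
    return duplex_len >= min_len
```

- With a 30-nucleotide primer annealed 100 bases from the template's 5' end, the variant yields a 130 bp duplex and the wild type a 30 bp duplex, so only the former passes the cleanup in this model.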
- a 5’ - 3’ exonuclease digests the 5’ unextended end of the wild-type nucleic acid molecule.
- the method of the invention comprises nuclease-mediated background wild-type nucleic acid molecule depletion.
- the method does not comprise use of a primer comprising modification(s) which make the primer and/or extension product resistant to exonuclease digestion (e.g. phosphorothioate linkage).
- the reaction mixture comprises a polymerase.
- Polymerases are enzymes that catalyze the synthesis of nucleic acid chains.
- the polymerase may comprise 5'- 3' exonuclease activity.
- the polymerase may comprise 3'- 5' exonuclease activity.
- the polymerase comprises no 5'- 3' exonuclease activity.
- the polymerase comprises no 3'- 5' exonuclease activity.
- the polymerase comprises no exonuclease activity.
- the polymerase is a Taq polymerase.
- the polymerase is a mutated Taq DNA polymerase with maximum 3’ mismatch distinguishing ability.
- the polymerase is AptaTaq Δexo.
- the polymerase is a high-specificity polymerase.
- the polymerase is optimised for SNP analysis.
- the reaction mixture is cooled after incubation of the reaction mixture with a polymerase.
- purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang comprises the use of a binding partner for an affinity tag present on the primer.
- purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang comprises pull down with a streptavidin coated magnetic bead. In some embodiments, purifying duplex nucleic acid molecules comprising a variant nucleic acid molecule comprises the use of a binding partner for an affinity tag present on the primer.
- purifying duplex nucleic acid molecules comprising a variant nucleic acid molecule comprises pull down with a streptavidin coated magnetic bead.
- NaOH is used to release the variant nucleic acid molecule from the binding partner-affinity tag complex.
- purification may comprise the use of a biotin-tagged primer and streptavidin as the binding partner.
- An NaOH-based method may be used to release the enriched variant nucleic acid into the solution from the biotinylated primer-streptavidin complex.
- purified duplex DNA comprises the variant DNA molecule and does not comprise a corresponding wild-type DNA molecule.
- purified duplex DNA comprises a greater ratio of variant to corresponding wild-type DNA molecules, compared to the population of DNA molecules provided at the start of the method.
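- One simple way to quantify such a change is the fold change in the variant to wild-type ratio. The helper below is an illustrative sketch; its name and the example counts in the note are not from the disclosure.

```python
def fold_enrichment(variant_before: float, wt_before: float,
                    variant_after: float, wt_after: float) -> float:
    """Fold change in the variant : wild-type ratio across the enrichment."""
    if min(variant_before, wt_before, wt_after) <= 0:
        raise ValueError("counts must be positive")
    return (variant_after / wt_after) / (variant_before / wt_before)
```

- For instance, going from 1 variant per 999 wild-type molecules to 10 variant per 90 wild-type molecules corresponds to `fold_enrichment(1, 999, 10, 90)`, about 111-fold. The ratio is undefined when no wild-type molecules remain, which the function reports as an error.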
- the method further comprises a step of amplifying the variant nucleic acid/DNA molecule after purifying the duplex nucleic acid/DNA comprising a single stranded 3’ overhang.
- the method further comprises a step of amplifying the variant nucleic acid molecule after purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule.
- the amplification step may be performed using any suitable amplification method, for example the amplification step may comprise PCR amplification.
- the amplification step may be performed using nested PCR reactions.
- the amplification step may comprise two separate PCR reactions.
- the amplification step allows for: a) further nucleic acid enrichment, b) adding the 3’ adapter (which was lost during the nuclease digestion step) back to the nucleic acid molecule, and c) amplifying the enriched variant nucleic acid molecule.
- the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3' adapter. In some embodiments, the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3' adapter and a forward primer (e.g. a P7 forward primer). In some embodiments, the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3’ adapter, a P7 forward primer, and a reverse primer.
- the 3’ adapter, forward primer and/or reverse primer may be any suitable primer known to the skilled person.
- the 3’ adapter, forward primer and/or reverse primer is a 3’ adapter, forward primer and/or reverse primer disclosed herein.
- the method further comprises a step of DNA sequencing.
- sequencing is next generation sequencing (NGS).
- NGS: next generation sequencing
- the method further comprises bioinformatic analysis.
- the method further comprises bioinformatic analysis of sequence data.
- the method further comprises bioinformatic analysis of NGS data.
- purifying duplex DNA comprising a single stranded 3' overhang comprises pulldown with a binding partner for an affinity tag present on the primer.
- the method comprises a step of designing a primer.
- the disclosure provides a method of designing a primer.
- the step of designing a primer comprises the identification of a target variant nucleic acid sequence and a corresponding wild-type nucleic acid sequence.
- the step of designing a primer comprises the use of a quantitative stability model.
- the quantitative stability model is described in Panjkovich, A. et al. (Bioinformatics, 2005), which is hereby incorporated by reference in its entirety.
- the quantitative stability model is used to predict the destabilizing effect of each mutation/mismatch pair within a given primer.
- the quantitative stability model is used to determine an optimum primer sequence.
- the method comprises a first step of designing a primer to enable primer extension following binding to the variant nucleic acid molecule.
- the method comprises a first step of designing a primer to prevent primer extension following binding to a wild-type nucleic acid molecule.
- the method comprises designing a primer to enable primer extension following binding to the variant nucleic acid molecule, and prevent primer extension following binding to a corresponding wild-type nucleic acid molecule.
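- A design step of this kind can be organised as "enumerate candidates, then rank them". The sketch below enumerates every single deliberate mismatch within the three bases preceding the 3' end and delegates ranking to a caller-supplied `score` function. That function is only a placeholder for a quantitative stability model; no model parameters are reproduced here.

```python
from typing import Callable, Iterator


def candidate_primers(matched_primer: str,
                      max_offset: int = 3) -> Iterator[tuple[str, int, str]]:
    """Yield (primer, offset, base) for each single deliberate mismatch.

    `matched_primer` is fully complementary to the variant template, with the
    mutation-specific base at its 3' end. `offset` counts bases 5' of that end.
    """
    for offset in range(1, max_offset + 1):
        idx = len(matched_primer) - 1 - offset
        for base in "ACGT":
            if base != matched_primer[idx]:
                yield (matched_primer[:idx] + base + matched_primer[idx + 1:],
                       offset, base)


def pick_primer(matched_primer: str,
                score: Callable[[str, int, str], float]) -> str:
    """Return the candidate with the highest score.

    `score` is a caller-supplied stand-in for a quantitative stability model.
    """
    return max(candidate_primers(matched_primer), key=lambda c: score(*c))[0]
```

- With three positions and three alternative bases each there are nine candidates per target. A real scoring function would estimate how strongly each mismatch destabilises the wild-type duplex relative to the variant duplex.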
- the invention includes the combination of the aspects and preferred features described except where such a combination is clearly impermissible or expressly avoided.
- FIG. 1 Schematic diagram of primers. 5’ biotinylated (circle) mutation allele specific primers consist of a 30 bp complementary region (large bar) with the mutation to be detected (dark bar) at the 3’ end and a deliberate mismatch (light bar) at one of the 3 preceding bases of the 3’ end. The primer extends differentially depending on the presence or absence of a mutation in the template DNA.
- Figure 2 Schematic diagram showing one specific strategy for altered DNA capture and library preparation.
- the cartoon shows a step-by-step workflow for capture of altered allele from a background of native nucleic acid sequences for next generation sequencing.
- Star represents a DNA alteration/mutation/variation
- small circle represents biotin label
- large circle represents magnetic bead coated with streptavidin.
- FIG. 3 Flowchart showing bioinformatics analysis pipeline
- Figure 4 Schematic representation of allele-specific primer extension. Allele-specific biotinylated primers (purple, with red stars and circles indicating sequence specificity and biotin, respectively) are annealed to adapter-ligated double-stranded DNA fragments. In the presence of a matched target mutation, allele-specific extension occurs (green arrow), enabling downstream capture. In the case of a wild-type allele, no extension occurs (red “No extension” label), and the unextended primers remain annealed with 3' and 5’ overhangs. Treatment with a single-strand-specific nuclease selectively digests the single strand overhangs on the WT and mutant allele fragments. The wild type allele is subsequently removed during bead-based cleanup. Only extended DNA fragments with intact 5’ ends are retained, enabling high-specificity enrichment of target alleles.
- FIG. 5 & Figure 6 Representative electropherogram traces illustrating fragment size distribution in the method of this disclosure and the conventional enrichment method, respectively. These figures depict electropherograms generated using a fragment analyser.
- the x-axis represents fragment size in base pairs (bp), while the y-axis corresponds to fluorescence intensity.
- Two labelled markers frame the primary region of interest: a Lower Marker (LM) at the left (around 50-100 bp) and an Upper Marker (UM) at the right (around 7000 bp), serving as internal size references.
- LM: Lower Marker
- UM: Upper Marker
- a cluster of peaks spanning approximately 100-800 bp is evident, indicative of the dominant fragment populations in the respective electropherograms.
- Figure 7 Comparison of sequencing efficiency and target coverage between the methods of this disclosure and conventional methods.
- Top panel Total sequencing reads (in millions) generated from the disclosed workflow (2Strands) and two conventional enrichment workflows (Conventional-1 and Conventional-2).
- Bottom panel Mean target coverage achieved in each experiment. Despite significantly lower sequencing depth, the method of this disclosure yielded markedly higher mean target coverage, highlighting its superior enrichment efficiency. Note the broken y-axis to illustrate the large dynamic range in coverage.
- Figure 8 Enhanced on target coverage by the disclosed method compared to conventional methods.
- the top row (2Strands) illustrates read pileups obtained using the disclosed allele-specific enrichment probes, demonstrating strong, focused coverage at variant-containing regions.
- the middle and bottom rows show data from two independent conventional primer extension target capture experiments, which display lower and less specific coverage across the same loci as demonstrated by the broader peaks.
- the increased signal intensity and sharp enrichment peaks in the top row highlight the method’s superiority. Genome coordinate ranges are indicated above each panel.
- Figure 9 Superior enrichment of mutations by the disclosed allele-specific technology. Mutation detection performance at two clinically relevant loci: EGFR L858R (G>T) and PIK3CA E545K (G>A). Visualized using IGV, the read alignments illustrate variant allele frequencies (VAFs) and coverage levels for each method. At both loci, 2Strands-Exp2 (top panels, black) shows markedly enhanced variant detection sensitivity and significantly higher local sequencing depth (e.g., 11.1% VAF at 2131x coverage for EGFR L858R; 12.9% VAF at 551x for PIK3CA E545K).
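- VAF is simply the fraction of reads at a locus that support the variant. The sketch below shows the calculation and its inverse; the function names are illustrative, and the implied read counts are approximate because the reported VAFs are rounded.

```python
def variant_allele_frequency(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a locus that support the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def implied_alt_reads(vaf: float, depth: int) -> int:
    """Approximate variant-supporting read count implied by a VAF and depth."""
    return round(vaf * depth)
```

- Applied to the figures quoted above, 11.1% at 2131x implies on the order of 237 variant-supporting reads, and 12.9% at 551x on the order of 71.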
- VAFs: variant allele frequencies
- FIG. 10 Comparison of SNP Detection Sensitivity across down-sampled read depths.
- the graph illustrates the fraction of single nucleotide polymorphisms (SNPs) detected (y-axis) as a function of the total number of sequencing reads after down-sampling, in millions (x-axis).
- the disclosed workflow: black line
- the conventional primer extension target approach: grey line
- Each point represents the average of 10 replicate read down-samplings, and the vertical/horizontal grid lines serve as visual aids.
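- A down-sampling analysis of this kind can be sketched as follows: draw a random subset of reads, count which SNPs are still supported, and average over replicates. The flat per-read representation, the `min_reads` detection threshold, and the fixed seed are assumptions made for illustration.

```python
import random


def fraction_detected(read_targets, all_snps, n_reads,
                      min_reads=1, replicates=10, seed=0):
    """Mean fraction of SNPs detected after random down-sampling.

    `read_targets` lists, per read, the SNP it supports (or None); this flat
    representation is a hypothetical simplification of aligned reads.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(replicates):
        # Sample reads without replacement to mimic a shallower run.
        sample = rng.sample(read_targets, n_reads)
        counts = {}
        for snp in sample:
            if snp is not None:
                counts[snp] = counts.get(snp, 0) + 1
        detected = sum(1 for snp in all_snps if counts.get(snp, 0) >= min_reads)
        fractions.append(detected / len(all_snps))
    return sum(fractions) / len(fractions)
```

- Calling the function over a range of `n_reads` values gives one curve per workflow, comparable in spirit to the plot described above.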
- FIG. 11 Effect of Single-Strand specific Nuclease (SSN) on target molecule with 3’ and 5’ overhangs.
- the top panel shows a schematic representation of the target molecule (EGFR L858R locus) and the qPCR strategy for detection of the 5’ and 3’ overhangs.
- the bar chart illustrates how single-strand nuclease (SSN) affects the measured level of the target molecule (y-axis) by quantitative real time PCR.
- the x-axis labels represent distinct SSN volumes added (0 µL, 0.1 µL, 0.5 µL, and 1 µL). At the baseline (0 µL SSN), the target molecule is detected by qPCR both upstream and downstream of the capture probe.
- Figure 12 Comparative qPCR amplification profiles using allele-specific (AS) and non-allele-specific (NAS) KRAS G12D, EGFR L858R and NRAS Q61K primers, respectively.
- Each figure comprises three sub-panels (top-left, top-right, and bottom-centre) demonstrating amplification of 5% vs 0%, 1% vs 0% and 0.1% vs 0% mutation allele frequency cfDNA reference material.
- Each panel depicts real-time quantitative PCR (qPCR) amplification curves under different primer conditions.
- the x-axis of each panel indicates the number of PCR cycles, while the y-axis (RFU) represents the fluorescent signal corresponding to accumulating amplification products.
- AS and NAS primers amplify the target region at different cycle thresholds demonstrating that the AS primers are specific for the low frequency target mutations present in the reference sample.
- the AS primers amplify the WT template with 0% mutation only at very high cycle numbers or with very low efficiency, in some cases (KRAS G12D) distinguishing even 0.1% allele frequency from 0%.
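- A gap in cycle threshold (Ct) between two reactions can be converted into an approximate fold difference in starting signal. The sketch below uses the standard exponential amplification relationship and assumes both reactions share the same efficiency; no Ct values from the figures are reproduced.

```python
def fold_discrimination(ct_wild_type: float, ct_mutant: float,
                        efficiency: float = 1.0) -> float:
    """Fold difference in starting signal implied by a Ct gap.

    Assumes the same amplification efficiency for both reactions;
    `efficiency` = 1.0 means perfect doubling each cycle.
    """
    return (1.0 + efficiency) ** (ct_wild_type - ct_mutant)
```

- Under perfect doubling, a wild-type reaction that crosses the threshold 10 cycles later than the mutant-containing reaction corresponds to roughly a 1000-fold difference.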
- the present invention is based on the development of an innovative method to accurately enrich very-low frequency altered DNA sequences (targets) from a background of native DNA sequences.
- the enrichment is not PCR amplicon based; amplicon-based enrichment is often error prone and subject to non-specific amplification.
- the method uses an allele distinguishing biotinylated primer that extends only upon base pairing with the intended DNA mutation sequence (e.g., a single nucleotide variant, Indel, deletion, Fusion, or DNA altered via bisulphite conversion).
- primer extension forms a stable duplex with the original altered DNA fragment which can be pulled down via a streptavidin coated magnetic bead.
- the original mutated DNA fragment (not the extension product) is then amplified and sequenced. This method allows enrichment of variant sequences several-fold above the background DNA sequence, thereby increasing the signal-to-noise ratio.
- the method of enrichment according to the present disclosure combines sensitivity and affordability.
- Data-driven plasma-only panel design not only ensures faster turnaround but also representation of molecular features typical of recurrent/metastatic tumours.
- the current approach is therefore an effort to obtain the ideal combination that would make liquid biopsy based MRD monitoring of cancer patients a practical solution for regular clinical practice.
- the ultrasensitive nature of the assay should allow expansion of the clinical utility of this technology beyond the four solid tumours (lung, colorectal, bladder and breast) that currently have MRD solutions on offer.
- the invention may be used in the assessment of highly aggressive diseases (e.g., metastatic diseases) with high recurrence, such as pancreatic cancer, ovarian cancer, hepatocellular carcinoma, and thyroid, gastrointestinal, liver, head and neck, and other cancers, which remain a major challenge in the field of oncology.
- Probe hybridization-based target capture, though highly specific, is a time-consuming method. The disclosed method is expected to yield a greater reduction in sequencing depth requirement and superior sensitivity due to the combination of single stranded library preparation and nuclease-aided allele-specific primer extension target enrichment. Primer extension-based target capture assures specificity at the same level as probe hybridization while reducing turnaround time and the cost of oligonucleotide probe/primer synthesis, and while generating more balanced and on-target sequencing reads.
- variant nucleic acid molecule may be used to describe a nucleic acid molecule comprising an alteration (or mutation), compared to a corresponding nucleic acid molecule (e.g., a wild-type nucleic acid molecule).
- variant nucleic acid molecule may alternatively be phrased as: mutant nucleic acid molecule, mutated nucleic acid molecule, nucleic acid molecule comprising a mutation, a mutant allele, an alternative allele, or an allele.
- variant DNA molecule may be used to describe a DNA molecule comprising an alteration (or mutation), compared to a corresponding DNA molecule (e.g., a wild-type DNA molecule).
- variant DNA molecule may alternatively be phrased as: mutant DNA molecule, mutated DNA molecule, DNA molecule comprising a mutation, a mutant allele, an alternative allele, or an allele.
- mutation refers to a difference in a nucleic acid sequence (e.g. DNA or RNA) in a sample compared to a reference.
- a mutation may be a single nucleotide variant (SNV), multiple nucleotide variants, a deletion mutation, an insertion mutation, a translocation, a missense mutation, a fusion, etc. Mutations may be identified using sequence data.
- An “indel mutation” refers to an insertion and/or deletion of bases in a nucleotide sequence (e.g. DNA or RNA) of an organism.
- the mutation (or sequence variation) is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
- a mutation is a somatic mutation.
- a “somatic mutation” is a mutation that is present in a tumour or modified cell (or genetic material derived therefrom), but not in a corresponding (matched) normal or non-modified cell.
- Nucleic acid may be DNA or RNA. In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid may be a population of DNA molecules (e.g., DNA in a sample or DNA extracted/purified from a sample).
- sample refers to a biological sample.
- the sample comprises cfDNA, including a sample of a biological fluid, or an extract therefrom.
- the sample may be a urine sample, a blood, plasma or serum sample, or a sample derived therefrom.
- the sample may be one which has been freshly obtained from the subject or may be one which has been processed and/or stored prior to making a determination (e.g. frozen, fixed or subjected to one or more purification, enrichment or extraction steps, including centrifugation).
- the sample is a liquid biopsy sample, or a sample derived from a liquid biopsy.
- Liquid biopsies are described in detail in the art. See, e.g., Poulet et al., Acta Cytol 63(6): 449-455 (2019), Chen and Zhao, Hum Genomics 13(1): 34 (2019).
- a liquid biopsy sample comprises Cell-free DNA (cfDNA).
- cfDNA Cell-free DNA
- cfDNA is fragmented DNA that is shed into circulation through biological processes like apoptosis, necrosis, and active secretion.
- the cfDNA found in biological fluids can originate from different cell types. In cancer patients, a small proportion of cfDNA originates from tumour cells; this tumour-derived cfDNA is referred to herein as circulating tumour DNA (ctDNA).
- the term cfDNA encompasses ctDNA. However, cfDNA without ctDNA can also be referred to herein as non-tumour derived cfDNA.
- Non-tumour derived cfDNA constitutes the entirety of a cfDNA sample from a subject who does not have cancer (also referred to herein as “healthy subject”).
- Non-tumour cfDNA is also expected to represent a proportion of a cfDNA sample from a subject who does have cancer (also referred to herein as “cancer subject” or “cancer patient”).
- the sample may be derived from one or more of the above biological samples.
- the sample may comprise a nucleic acid library generated from the biological sample and may optionally be a barcoded or otherwise tagged nucleic acid library.
- a plurality of samples may be taken from a single patient, e.g. serially during a course of treatment.
- a plurality of samples may be taken from a plurality of patients.
- a nucleic acid library comprises nucleic acid labelled with a unique molecular identifier (UMI).
- UMI unique molecular identifier
- a nucleic acid library comprises nucleic acid labelled with two UMIs.
- the sample may comprise a DNA library generated from the biological sample and may optionally be a barcoded or otherwise tagged DNA library.
- a DNA library comprises DNA labelled with a unique molecular identifier (UMI).
- a DNA library comprises DNA labelled with two UMIs.
- a primer is a short single-stranded nucleic acid used by living organisms in the initiation of nucleic acid synthesis.
- a synthetic primer may also be referred to as an oligo, short for oligonucleotide.
- DNA polymerase (responsible for DNA replication) enzymes are only capable of adding nucleotides to the 3’- end of an existing nucleic acid, requiring a primer be bound to the template before DNA polymerase can begin a complementary strand.
- Synthetic primers are chemically synthesized oligonucleotides, usually of DNA, which can be designed to anneal/hybridize/bind with a specific site/sequence on a target nucleic acid molecule. In solution, the primer spontaneously hybridizes with the template through Watson-Crick base pairing before being extended by DNA polymerase.
- the method comprises a first step of designing a primer.
- the step of designing a primer comprises the identification of a target variant DNA sequence and a corresponding wild-type DNA sequence.
- the step of designing a primer comprises the use of a quantitative stability model.
- the quantitative stability model is described in Panjkovich, A. et al. (Bioinformatics, 2005), which is hereby incorporated by reference in its entirety.
- the quantitative stability model is used to predict the destabilizing effect of each mutation/mismatch pair within a given primer.
- the quantitative stability model is used to determine an optimum primer sequence.
- the method comprises a first step of designing a primer to enable primer extension following binding to the variant DNA molecule. In some embodiments, the method comprises a first step of designing a primer to prevent primer extension following binding to a wild-type DNA molecule. In some embodiments, the method comprises designing a primer to enable primer extension following binding to the variant DNA molecule, and prevent primer extension following binding to a corresponding wild-type DNA molecule.
- Nucleases are enzymes that cleave polynucleotides into nucleic acids of smaller units.
- the nuclease is single-strand specific. In some embodiments, the nuclease has single-strand-specific nuclease activity. Single-strand-specific nucleases are known to the skilled person and are reviewed e.g. in Desai et al. (FEMS microbiology reviews 2003 26(5): 457-491), which is hereby incorporated by reference. Single-strand-specific nucleases exhibit high selectivity for single-stranded nucleic acids and single-stranded regions in double-stranded nucleic acids. In some embodiments, the single-strand specific nuclease has exonuclease activity. In some embodiments, the single-strand specific nuclease has endonuclease activity.
- the single-strand specific nuclease has exonuclease activity and endonuclease activity.
- Examples of known single-strand-specific nucleases include S1 nuclease, P1 nuclease, N. crassa mycelia nuclease, N. crassa conidia nuclease, BAL 31 slow form nuclease, BAL 31 fast form nuclease, U. maydis α nuclease, U. maydis nuclease, Nuclease Bh1, Aspergillus nuclease, Physarum nuclease, SP nuclease, Mung bean nuclease, Wheat chloroplast nuclease, Rye germ ribosomes Nuclease I, Pea seeds nuclease, Tobacco nuclease I, Alfalfa seedling nucleases (e.g. acid, neutral), SK nuclease, hen liver nuclease, rat liver nuclei nuclease, and mouse mitochondria nuclease.
- the nuclease is an exonuclease.
- Exonucleases are enzymes that work by cleaving nucleotides one at a time from the end (exo) of a polynucleotide chain. A hydrolysing reaction that breaks phosphodiester bonds at either the 3' or the 5' end occurs. Exonucleases are reviewed by Manils et al. (Cells. 2022 Jul; 11 (14): 2157), which is hereby incorporated by reference in its entirety.
- Exonuclease I breaks apart single-stranded DNA in a 3' → 5' direction, releasing deoxyribonucleoside 5'-monophosphates one after another. It does not cleave DNA strands without terminal 3'-OH groups because they are blocked by phosphoryl or acetyl groups.
- Exonuclease II is associated with DNA polymerase I, which contains a 5' exonuclease that clips off the RNA primer contained immediately upstream from the site of DNA synthesis in a 5' → 3' manner.
- Exonuclease III has four catalytic activities: 3' to 5' exodeoxyribonuclease activity, which is specific for double-stranded DNA, RNase activity, 3' phosphatase activity, and AP endonuclease activity.
- Exonuclease IV adds a water molecule, so it can break the bond of an oligonucleotide to nucleoside 5' monophosphate.
- Exonuclease V is a 3' to 5' hydrolysing enzyme that catalyses linear double-stranded DNA and single-stranded DNA, which requires Ca2+. This enzyme is extremely important in the process of homologous recombination.
- Exonuclease VIII is a 5' to 3' dimeric protein that does not require ATP or any gaps or nicks in the strand, but requires a free 5' OH group to carry out its function.
- the exonuclease is single-strand specific. In some embodiments, the exonuclease has 5'- 3' exonuclease activity. In some embodiments, the exonuclease has 3'- 5' exonuclease activity. In some embodiments, the exonuclease does not have 5'- 3' exonuclease activity. In some embodiments, the exonuclease does not have 3'- 5' exonuclease activity.
- the 3'-5' exonucleases are reviewed by Shevelev and Hubscher (Nature Reviews Molecular Cell Biology volume 3, pages 364-376. 2002), which is hereby incorporated by reference in its entirety.
- NGS next-generation sequencing
- Sequencing of the enriched nucleic acids can be achieved using sequencing by ligation or sequencing by synthesis. Sequencing by synthesis relies on a DNA polymerase to incorporate four reversible terminator-bound dNTPs. One base is added per cycle and the fluorescently labelled reversible terminator is imaged as each dNTP is added. Sequencing by ligation instead uses the mismatch sensitivity of DNA ligase to distinguish the sequence of interest and incorporates a pool of fluorescently labelled oligonucleotides of varying lengths. Sequencing by ligation has high accuracy but may encounter problems with palindromic sequences.
- sequence data refers to information that is indicative of the presence and/or amount of genomic material in a sample that has a particular sequence.
- Such information may be obtained using sequencing technologies, such as e.g. next generation sequencing (NGS, such as e.g. whole exome sequencing (WES), whole genome sequencing (WGS, including shallow whole genome sequencing, sWGS), or sequencing of captured genomic loci (targeted or panel sequencing)), or using array technologies, such as e.g. SNP arrays, or other molecular counting assays.
- WES whole exome sequencing
- WGS whole genome sequencing
- sWGS shallow whole genome sequencing
- the sequence data may comprise a count of the number of sequencing reads (also referred to as “sequence reads” or “sequence read data”) that have a particular sequence.
- sequence data may comprise a signal (e.g. an intensity value) that is indicative of the number of sequences in the sample that have a particular sequence, for example by comparison to an appropriate control.
- Sequence data may be mapped to a reference sequence, for example a reference genome, using methods known in the art. Thus, counts of sequencing reads or equivalent non-digital signals may be associated with a particular genomic location.
- Sequence reads data may be provided or obtained directly, e.g., by sequencing the cfDNA sample or library or by obtaining or being provided with sequencing data that has already been generated, for example by retrieving sequence read data from a non-volatile or volatile computer memory, data store or network location.
- the sequencing may be paired-end sequencing.
- the sequence reads may be in a suitable data format, such as FASTQ, SAM or BAM.
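- As an illustration of the FASTQ format mentioned above, the following is a minimal sketch of a reader for four-line FASTQ records. The function name `read_fastq` and the example record are hypothetical and are not part of the disclosed workflow; multi-line FASTQ records and compressed files are not handled.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        header = header.rstrip("\n")
        if not header:          # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


# Hypothetical single-record example held in memory.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
```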
- the sequence read data e.g., FASTQ files
- the sequence data files may be processed using one or more tools selected from FastQC v0.11.5 and a tool to remove adaptor sequences (e.g. cutadapt v1.9.1).
- the sequence reads e.g. trimmed sequence reads
- the human reference genome GRCh37 for a human subject.
- read or “sequencing read” may be taken to mean the sequence that has been read from one molecule and read once. Each molecule can be read any number of times, depending on the sequencing performed.
- sequence data is data from sWGS, WGS, WES, or any capture panels including custom capture panels.
Subjects
- the words “subject” and “patient” are used herein interchangeably.
- the subject may be a mammalian subject, such as a human subject or an animal model or pet, such as e.g. a mouse, rat, rabbit, horse, dog, etc.
- the subject may be a subject who has been diagnosed as having or being at risk of developing cancer.
- a cancer may be selected from: bladder cancer, gastric cancer, oesophageal cancer, breast cancer, colorectal cancer, cervical cancer, ovarian cancer, endometrial cancer, kidney cancer (renal cell), lung cancer (small cell, non-small cell and mesothelioma), brain cancer (gliomas, astrocytomas, glioblastomas), melanoma, lymphoma, small bowel cancers (duodenal and jejunal), leukemia, pancreatic cancer, hepatobiliary tumours, germ cell cancers, prostate cancer, head and neck cancers, thyroid cancer and sarcomas.
- the cancer may be selected from: glioblastoma, melanoma, renal cancer, lung cancer, pancreatic cancer, breast cancer, gastric cancer, colorectal cancer, bile duct cancer, and ovarian cancer.
- a “fragment”, “variant” or “homologue” of a nucleic acid molecule or protein may optionally be characterised as having at least 50%, preferably one of 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to the sequence of the reference nucleic acid molecule or protein. Fragments, variants, isoforms and homologues may be characterised by the ability to perform a function performed by the reference molecule.
- Pairwise and multiple sequence alignment for the purpose of determining percent identity between two or more amino acid or nucleic acid sequences can be achieved in various ways known to a person of skill in the art, for instance, using publicly available computer software such as ClustalOmega (Soding, J. 2005, Bioinformatics 21, 951-960), T-coffee (Notredame et al. 2000, J. Mol. Biol. (2000) 302, 205-217), Kalign (Lassmann and Sonnhammer 2005, BMC Bioinformatics, 6(298)) and MAFFT (Katoh and Standley 2013, Molecular Biology and Evolution, 30(4) 772-780) software.
- the default parameters e.g. for gap penalty and extension penalty, are preferably used.
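- By way of illustration only, percent identity can be computed from an existing pairwise alignment as sketched below. The function name and the choice of denominator (all alignment columns in which at least one sequence has a residue) are assumptions; the alignment tools named above may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue            # column carries no residue in either sequence
        columns += 1
        if a == b:              # neither is a gap here, since both-gap was skipped
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns")
    return 100.0 * matches / columns
```

- For example, under this definition two aligned 4-mers that differ at one position share 75% identity.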
- a “fragment” generally refers to a fraction of the reference nucleic acid molecule or protein.
- a “variant” generally refers to a molecule having a nucleic acid molecule or amino acid sequence comprising one or more substitutions, insertions, deletions, or other modifications relative to the sequence of the reference nucleic acid molecule or protein, but retaining a considerable degree of sequence identity (e.g. at least 60%) to the sequence of the reference nucleic acid molecule or protein.
- An “isoform” generally refers to a variant of the reference nucleic acid molecule or protein expressed by the same species as the species of the reference nucleic acid molecule or protein.
- a “homologue” generally refers to a variant of the reference nucleic acid molecule or protein produced by a different species as compared to the species of the reference nucleic acid molecule or protein.
- a “fragment” may be of any length (by number of amino acids), although may optionally be at least 25% of the length of the reference nucleic acid molecule or protein (that is, the nucleic acid molecule or protein from which the fragment is derived) and may have a maximum length of one of 50%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the length of the reference nucleic acid molecule or protein.
- a method of enriching a variant nucleic acid molecule from a population of nucleic acid molecules comprising:
- the primer comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule;
- nucleic acid molecule is a DNA molecule.
- 5. The method according to any previous para, wherein the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 5 nucleotides of the final nucleotide on the 3’ end of the primer.
- nucleotide that is specific for the mutation of the variant nucleic acid molecule is the final nucleotide on the 3’ end of the primer.
- mismatched non-complementary nucleotide is present within 5 nucleotides of the nucleotide that is specific for the mutation of the variant DNA molecule.
- mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
- mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule, at the 5’ side of the mutation of the variant nucleic acid molecule.
- the population of nucleic acid molecules is provided as a library of nucleic acid molecules, optionally wherein the library of nucleic acid molecules is prepared through a method comprising a single-stranded library preparation approach and/or the addition of individual Unique Molecular Identifiers to each strand of the nucleic acid.
- the population of nucleic acid molecules and/or the liquid biopsy sample comprise cell-free DNA (cfDNA).
- the variant nucleic acid molecule is a single nucleotide variant DNA molecule.
- the mutation is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
- a method of determining whether a liquid biopsy sample comprises a variant nucleic acid molecule comprising the method according to any one of paras 1 to 27.
- a method of detecting the presence of or recurrence of cancer in a subject comprising the method according to any one of paras 1 to 27.
- a method of detecting minimal residual disease in a subject comprising the method according to any one of paras 1 to 27.
- a method of designing a primer comprising:
- each primer sequence comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule,
- the panel consists of 200-500 carefully curated cancer-associated DNA ‘alterations’ (single nucleotide variants or SNVs, indels, fusions, copy number variations or CNVs, and methylation changes). These ‘alterations’ may be hotspots, or frequent mutations that are not classified as hotspots but occur in a large proportion of patients in a given cancer type as analysed from genomic datasets of metastatic/recurrent tumours (Martinez-Jimenez, F, et al, 2023). Alternatively, these are high-frequency clonal tumour-specific mutations identified from whole genome or whole exome sequencing of a given patient. Known biomarkers for current targeted treatments and immunotherapy are carefully curated from literature and included in the panel selection.
- the methods of the present invention use unique allele-specific primers for specifically capturing altered DNA fragments (targets) from a background of native DNA fragments in the same genomic region.
- Primers (18-40bp long) are biotinylated at the 5’ end and the alteration (SNVs, indels, fusions, CNVs, bisulphite converted sequences) is placed at the 3’ end.
- An intentional mismatch is introduced at one of the last 3 bases before the single nucleotide ‘sequence alteration’ (Little, S. Curr. Protoc. Hum genetics, 2001).
- Quantitative stability model (Panjkovich, A.
- the panel of pooled oligonucleotides is designed such that the primers within the pool do not form secondary structures with themselves or each other and each has an annealing temperature of 60°C (Figure 1).
- Primers are also designed such that there is no complementarity to the Illumina sequencing adapters.
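- The primer layout described above (alteration at the 3’ end, an intentional mismatch at one of the last 3 bases before it, length 18-40 bases) can be sketched as follows for a single nucleotide variant. This is a simplified illustration, not the disclosed design procedure: the function name, the arbitrary `PLACEHOLDER_SWAP` substitution table and the top-strand orientation are assumptions. Selecting the mismatch with the quantitative stability model, the 5’ biotin, melting temperature, and secondary-structure and adapter-complementarity checks are not modelled.

```python
# Arbitrary placeholder substitutions; a real design would pick the mismatch
# using a stability model rather than a fixed table.
PLACEHOLDER_SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def design_as_primer(ref_seq: str, var_pos: int, alt_base: str,
                     length: int = 25, mismatch_offset: int = 1) -> str:
    """Return a 5'->3' primer (top-strand sense) whose 3' base is the variant base.

    ref_seq         : wild-type top-strand sequence around the variant
    var_pos         : 0-based index of the variant position in ref_seq
    alt_base        : variant (altered) base
    mismatch_offset : distance (1-3) of the intentional mismatch from the 3' base
    """
    ref_seq, alt_base = ref_seq.upper(), alt_base.upper()
    if not 18 <= length <= 40:
        raise ValueError("primer length must be 18-40 bases")
    if mismatch_offset not in (1, 2, 3):
        raise ValueError("mismatch must be within the last 3 bases before the variant")
    if ref_seq[var_pos] == alt_base:
        raise ValueError("alt_base equals the wild-type base")
    start = var_pos - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    primer = list(ref_seq[start:var_pos + 1])
    primer[-1] = alt_base                       # variant-specific 3' terminal base
    mm = len(primer) - 1 - mismatch_offset      # position of the intentional mismatch
    primer[mm] = PLACEHOLDER_SWAP[primer[mm]]   # differs from both WT and variant templates
    return "".join(primer)
```

- Because the wild-type and variant templates share the base at the intentional mismatch position, the substituted base is mismatched against both; on a wild-type template the primer then carries two 3’-proximal mismatches, while on the variant template it carries one.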
- Table 1 shows allele-specific primers against 32 alterations on the +ve and -ve strands used for target capture in 3 separate pools. The pools were constructed to ensure minimum self and cross-primer complementarity. Naturally occurring germline variations are investigated in the 10 base pairs upstream of the target “alteration” and degenerate primers are designed if necessary.
- Table 1. Allele specific primers
- cfDNA contains both mono-nucleosomal fragments around 167bp as well as shorter (<100bp) non-nucleosomal DNA fragments that are perhaps associated with DNA protection by other proteins such as transcription factors (Troll, C.J. et al, BMC Genomics 2019). It has been shown that tumour-derived ctDNA exhibits an increased proportion of short fragments (Guo, J. et al BMC Genomics 2020). Conventional dsDNA library preparation involves end polishing and blunt-end ligation. End-polishing obscures the native termini and changes the true length of the cfDNA fragment (Troll, C.J. et al, BMC Genomics 2019), which is a limitation for studying ctDNA fragment length.
- Blunt-end ligation is an inefficient method which is unable to convert shorter single-stranded or nicked double-stranded DNA to sequencing-ready libraries (Troll, C.J. et al, BMC Genomics 2019).
- single-stranded library preparation approach, e.g. IDT xGen ssDNA and low-input DNA library preparation
- individual Unique Molecular Identifiers are added to the two strands of the DNA.
- Early addition of unique molecular identifier allows efficient PCR error correction and identification of each strand of DNA.
- Amplified ctDNA libraries (100-1000 ng) from the above step are used as input for nuclease-aided allele-specific primer extension target enrichment (NAAS-PETE).
- COT1 Human DNA is added 100x in excess of the primers to block repeat elements such as Alu and LINE that are common in ctDNA.
- 0.5 µM of biotinylated primer pool, 2 mM MgCl2, 0.2 mM dNTP mix and appropriate PCR buffer (at 1X final concentration) are added to the reaction mix.
- a mutated Taq DNA polymerase with maximum 3’ mismatch distinguishing ability and no exonuclease activity (e.g., AptaTaq Δexo, from Roche Custom Biotech) is used for the primer extension reaction.
- the reaction may be performed using the following conditions in an appropriate thermocycler: lid temperature 105°C, denaturation at 95°C for 30 secs, primer annealing at 60°C for 10 min and extension at 72°C for 2 mins, after which the reaction is quickly cooled to 4°C.
- thermocycling may be performed with the lid temperature set to 105°C.
- the cycling conditions include initial denaturation at 95°C for 2 minutes, primer annealing at 60°C for 10 minutes 30 seconds, and extension at 68-72°C for 30 seconds to 1 minute, followed by rapid cooling to 4°C.
- the reaction is then purified with AMPure XP beads (Beckman Coulter), or Illumina purification beads, according to manufacturer’s protocol in order to remove unannealed biotinylated primers from the mix.
- a nuclease purification step is next performed.
- Strategy 1: To the purified DNA, 1 µl (30 units) of the single-stranded DNA specific exonuclease RecJf (NEB), which removes bases exclusively in the 5’-3’ direction, is added along with NEBuffer2 (final conc. 1X) and incubated for 20 minutes at 37°C. RecJf helps to remove the DNA fragments that bear the WT allele sequences. These fragments have a primer annealed to them but not extended. Hence, these WT alleles have a 30bp double-stranded region (annealed primer) and a 5’ single-stranded overhang which can be digested by RecJf, thereby removing the 5’ sequencing adapter.
- the altered alleles are double stranded duplex DNA generated by primer extension with a 3’ single stranded overhang. This 3’ overhang remains unaffected by RecJf.
- Strategy 2: 0.1 µL of a single-strand-specific nuclease (SSN) is added to the purified DNA to digest 5' and 3' single-stranded overhangs. This step selectively eliminates DNA fragments containing wild-type (WT) alleles, which are characterized by annealed but un-extended primers, resulting in ~30-40 bp double-stranded regions with single-stranded overhangs that are susceptible to nuclease digestion.
- SSN single-strand-specific nuclease
- the nuclease purification step is therefore a critical step that ensures specific capture of the enriched altered allele over the WT allele.
- the altered allele-specific primer extension product duplexed with the original altered DNA template is then captured on a streptavidin-coated magnetic bead (e.g., Dynabeads). This must then be amplified using an appropriate method.
- Amplification strategy 1: Altered DNA with sequencing adaptors on both ends is amplified using universal primers against Illumina adapters and a high-fidelity polymerase (e.g., KAPA HiFi HotStart library amplification kit) using 6-10 cycles to generate NGS-ready libraries. It is important to note that the primer extension product has an adaptor at only one end and hence does not get amplified by the universal primers (Figure 2). This ensures none of the alterations identified by sequencing are artificially introduced by the allele-specific primers used for target enrichment.
- the amplified library is purified once more by DNA binding beads (e.g. Ampure XP) following manufacturer’s protocol, subjected to standard quality control measures for concentration and size distribution measurement followed by next generation sequencing.
- Amplification strategy 2: The original template is selectively eluted from the beads using a defined protocol (Vuokko, T et al, NAR 1992). A 3' adapter is then added to the enriched fragment via PCR using a P7 forward primer and a hybrid reverse primer that overlaps the original capture primer and includes part of the sequencing adapter (Table 5). The PCR product is purified with magnetic beads, and a second PCR is performed using P7 and P5 primers to generate the final amplified library, which can be sequenced on an appropriate NGS platform (Figure 4).
- Sequencing reads in the FASTQ files undergo pre-processing steps to ensure data quality and integrity. Initially, read quality filtering is performed to discard low-quality reads, followed by adapter sequence removal to eliminate adapter contamination. Subsequently, unique molecular identifier (UMI) error correction is conducted to mitigate errors introduced during sequencing. Reads are then grouped by their UMIs to collapse duplicates and reduce PCR amplification bias. Reads without a UMI are discarded. A preprocessing report is generated to summarize the outcomes of these procedures and assess data quality.
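- The grouping and collapsing step can be illustrated with the minimal sketch below. It assumes that reads arrive as hypothetical (UMI, sequence) pairs, discards reads without a UMI, and collapses each UMI group to a per-position majority consensus. UMI error correction, quality filtering and adapter trimming are deliberately omitted, and the majority-vote consensus is an assumption rather than the disclosed procedure.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def collapse_by_umi(reads: Iterable[Tuple[Optional[str], str]]) -> Dict[str, str]:
    """Group reads by UMI and return one majority-vote consensus sequence per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        if not umi:             # reads without a UMI are discarded
            continue
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)   # compare only positions present in all reads
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


# Hypothetical reads: three PCR duplicates (one with an error), one singleton, one without UMI.
example_reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),
    ("GGC", "TTTT"),
    ("", "CCCC"),
]
collapsed = collapse_by_umi(example_reads)
```

- In this example the single discordant base in the "AAT" group is outvoted by its two duplicates, which is the sense in which early UMI tagging supports PCR error correction.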
- reads are aligned against a FASTA file containing the genomic sequences of the target mutations.
- Reads with an exact match against these mutation sequences are recorded in a BAM file dedicated to the reads with mutations. Additionally, reads that do not have an exact match but align with up to 3 mismatches and still contain the target mutation enriched by our test are also preserved in the mutation BAM file. Allowing mismatches is to account for naturally occurring germline variation in humans. Reads that fail to align to the Fasta file containing the target mutations are aligned against the human reference genome and subsequently stored in a separate BAM file.
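- The read-sorting rule described above can be sketched as follows, under strong simplifying assumptions: the read and the mutation-bearing target sequence are compared position by position without gaps and at equal length, whereas a real pipeline would use an aligner. The function name, the return labels and the `mut_index` parameter are hypothetical.

```python
def classify_read(read: str, target: str, mut_index: int, max_mismatches: int = 3) -> str:
    """Return 'mutation' if the read belongs in the mutation BAM, else 'reference_pass'.

    read, target : ungapped sequences of equal length
    mut_index    : 0-based position of the target mutation within `target`
    """
    if len(read) != len(target):
        return "reference_pass"
    mismatches = [i for i, (r, t) in enumerate(zip(read, target)) if r != t]
    if not mismatches:
        return "mutation"                          # exact match to the mutation sequence
    if len(mismatches) <= max_mismatches and mut_index not in mismatches:
        return "mutation"                          # tolerated mismatches, mutation still present
    return "reference_pass"                        # to be aligned against the reference genome
```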
- Reads in BAM files are piled up to calculate allele frequencies and sequencing coverage. This pileup data is utilized for variant calling, and the results are stored in a VCF file. Furthermore, a report is generated, containing the sequencing coverage for the region directly adjacent to the target mutations, along with differences in allele frequencies between the target enriched mutations and the reference wild-type alleles ( Figure 5, Figure 6).
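- As a small worked example of the allele frequency calculation, the sketch below derives coverage and allele frequency at one position from per-base pileup counts. The function name and the example counts are hypothetical and are not data from the present disclosure.

```python
from typing import Dict, Tuple


def allele_frequency(base_counts: Dict[str, int], alt_base: str) -> Tuple[int, float]:
    """Return (coverage, allele frequency of alt_base) at a single pileup position."""
    depth = sum(base_counts.values())
    if depth == 0:
        return 0, 0.0
    return depth, base_counts.get(alt_base, 0) / depth


# Hypothetical pileup: 950 reads support the wild-type base A, 50 support the variant base G.
depth, vaf = allele_frequency({"A": 950, "G": 50}, "G")
```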
- VAF variant allele frequency
- a panel of primers was designed following the innovative design strategy of this invention, against the mutations in Table 1 as positive control, and an additional 6 mutations well known in NSCLC but not present in the reference standard were chosen as negative controls to test specificity and allele differentiating ability of the assay (Table 2).
- Table 3 List of mutations used as negative control
- the above example is set up to consider the following hypotheses: a) The method of enrichment according to the present disclosure enriches mutations >100X compared to conventional target enrichment strategies. b) The method of enrichment according to the present disclosure requires ~50-100X fewer sequencing reads to achieve the same coverage per read.
- Conventional target enrichment is determined for each mutation listed in Table 2.
- the minimum sequencing depth required to call a given mutation using the method of enrichment according to the present disclosure versus conventional approach at each VAF dilution is determined.
- the limit of detection (LoD) of the current assay is determined by diluting the cfDNA reference standard to 0.01%, 0.001% and 0.0001% in comparison to the said conventional approach.
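- To put these dilutions in context, the sketch below estimates how many mutant copies would be expected in a given mass of input DNA at each VAF. It assumes roughly 3.3 pg of DNA per haploid human genome (a commonly used approximation) and a hypothetical 30 ng input; neither value is taken from the present disclosure, and the function name is illustrative.

```python
HAPLOID_GENOME_PG = 3.3  # assumed approximate mass of one haploid human genome, in picograms


def expected_mutant_copies(input_ng: float, vaf_percent: float) -> float:
    """Expected number of mutant haploid genome copies in `input_ng` of DNA at a given VAF (%)."""
    genome_equivalents = input_ng * 1000.0 / HAPLOID_GENOME_PG  # ng -> pg -> haploid copies
    return genome_equivalents * vaf_percent / 100.0


# Hypothetical 30 ng input at the three dilutions named above.
for vaf in (0.01, 0.001, 0.0001):
    copies = expected_mutant_copies(30.0, vaf)
    print(f"{vaf}% VAF: {copies:.3f} expected mutant copies in 30 ng")
```

- Under these assumptions the expected number of mutant copies falls below one at the lowest dilutions, so the amount of input DNA, and not only the enrichment chemistry, bounds the achievable limit of detection.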
- VAF 5% variant allele frequency
- cfDNA reference standard Horizon, #HD780
- Workflow according to the present disclosure: 500 ng of adapter-ligated library was incubated with 0.1 µM of a biotinylated pool of allele-specific primers (see Table 4) in 1X NEBuffer r2.1. The mixture was subjected to thermal cycling at 94°C for 2 minutes, followed by 60°C for 5 minutes, and subsequently cooled to room temperature.
- Table 4 Exemplary allele specific primers
- Table 5 Exemplary PCR1 and PCR2 primers
- the enriched library produced according to the workflow of the present disclosure was sequenced on an Illumina NovaSeq 6000, generating 15 million reads.
- the conventionally enriched samples were sequenced on an Illumina NextSeq 2000, generating 80 million reads per sample.
- the workflow of the present disclosure achieved a mean on-target coverage of 38,000X with only 15 million reads. Despite significantly lower sequencing depth, the workflow of the present disclosure yielded markedly higher mean on-target coverage, highlighting its superior enrichment efficiency.
- Figure 8 shows Integrated Genome Viewer (IGV) screenshots of 4 representative targets on Chr 7 EGFR loci demonstrating that the workflow of the present disclosure produces sharp, well-defined on-target peaks, in contrast to the broader, less specific peaks observed with the conventional method. This difference in read distribution explains the high on-target coverage seen with the workflow of the present disclosure.
- IGV Integrated Genome Viewer
- Figure 9 demonstrates allele-specific enrichment and higher read coverage of 2 representative clinically relevant target mutations using the workflow of the present disclosure, as compared to the conventional protocol.
- Figure 10 shows that the workflow of the present disclosure enables stable detection of target mutations with as few as 2 million reads, while the conventional primer extension target capture method shows a progressively lower fraction of mutations detected as sequencing reads are down-sampled. Taken together, these data demonstrate the superiority of the workflow of the present disclosure over conventional primer extension target capture methods.
- Figure 11 illustrates the qPCR strategy.
- the lower panels present a bar graph showing that, in the absence of nuclease treatment, the primer-annealed DNA fragment with 3' and 5’ overhangs could be captured using streptavidin beads, as evidenced by detectable qPCR signal both upstream and downstream of the capture primer.
- treatment with single-strand-specific nuclease led to complete loss of this qPCR signal, indicating efficient digestion of the 3’ and 5’ single-stranded overhangs and removal by DNA beads purification.
- thermostable exo- Taq polymerase compared to the exo- Klenow fragment used previously, we performed qPCR using adapter-ligated cfDNA libraries prepared from Horizon cfDNA reference standards at 5%, 1%, 0.1%, and 0% variant allele frequencies (VAF). Allele-specific (AS) reverse primers from the 2Strands target capture probe set (Table 1) were used, along with nondiscriminating reverse primers or non-allele specific primers (NAS) and common forward primers (Table 7).
- AS Allele-specific reverse primers from the 2Strands target capture probe set
- NAS non-allele specific primers
- Table 7 common forward primers
- Each qPCR reaction included 5 ng of adapter-ligated library (5%, 1%, 0.1% or 0%), 0.25 µM of each primer, 0.2 mM dNTPs, 0.2 U of Taq polymerase, 1X reaction buffer, and SYBR Green dye (1:10,000 dilution), in a final volume of 25 µL.
- Thermal cycling was carried out in a Bio-Rad CFX96 Real-Time PCR system using the following conditions: initial denaturation at 94°C for 2 minutes, followed by 40 cycles of 94°C for 15 seconds, 60°C for 30 seconds, and 68°C for 30 seconds.
- Figures 9-11 show the ability of the capture primers and a thermostable Taq enzyme to effectively and specifically distinguish mutant alleles from the wild-type allele even at low frequencies (5%, 1% and 0.1%), thereby underpinning the effectiveness of the disclosed allele-specific primer design approach and of the exo- Taq in distinguishing between alleles, and demonstrating the robustness of the method in allele distinction.
Abstract
The present disclosure provides a method of enriching a variant nucleic acid molecule from a population of nucleic acid molecules, said method comprising contacting a population of nucleic acid molecules with a primer in a reaction mixture, incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule, incubation of the reaction mixture for primer extension through the action of a polymerase, contacting the population of nucleic acid molecules in the reaction mixture with a nuclease, and purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule. The present disclosure also provides a method of designing a primer.
Description
TARGET ENRICHMENT
This application claims priority from SG 10202401187W filed 25 April 2024, the contents and elements of which are herein incorporated by reference for all purposes.
Field of the Invention
The present invention relates to the fields of molecular biology and biotechnology and particularly, although not exclusively, to methods of enriching DNA molecules.
Background
Cancer is a major economic and public health burden worldwide. It is the second leading cause of death globally. Recurrence/relapse is the major cause of cancer-related deaths. Minimal residual disease (MRD) is defined as a small number of cancer cells left in a patient after curative-intent treatment that causes recurrence/relapse of cancer (Pantel, K. et al. Nat Rev Clin Oncol. 2019). Early interception of relapse/recurrence can greatly benefit patients and improve overall survival (Medina, J.E. et al. J Immunotherapy Cancer, 2023). Radiological imaging (MRI and CT scans) is currently the gold standard for monitoring patients after primary treatment and adjuvant therapy to assess response to treatment and to surveil for recurrence. However, radiological imaging has several limitations: a) it exposes patients multiple times (every 3-6 months) to radiation, b) patients with additional health complications find it difficult to metabolise contrast dyes, c) it does not provide molecular information and most importantly d) it does not measure minimal residual disease and is hence unable to intercept relapse or recurrence earlier than it occurs.
Circulating tumor DNA (ctDNA) consists of fragments of DNA circulating in the bloodstream of cancer patients that originate from malignant tumor tissue or from circulating tumor cells. ctDNA carries mutations and epigenetic changes characteristic of the tumor from which it originates. Recent work in the field has established the importance of ctDNA as a biomarker of MRD in cancer patients and underpinned the importance of ctDNA based non-invasive MRD tests in early interception of recurrence/relapse (Nagasaka, M et al. Molecular Cancer. 2021). In fact, ctDNA based MRD measurement outperforms radiological imaging by 6 months to 1 year in detecting early signs of recurrence, which in turn leads to better survival outcome (Kim, T. et al. Thorac Cancer 2019).
Despite these advantages, adoption of ctDNA based MRD tests in regular clinical practice is limited due to prohibitively high costs and low sensitivity. A typical plasma sample is flooded with cell-free DNA (cfDNA) shed by normal cells (e.g. haematopoietic cells) of the body. Mutant ctDNA shed by tumour cells is a small fraction (0.1-10%) of this cfDNA sample. Therefore, detection of ctDNA via liquid biopsy is a ‘needle in a haystack’ problem. Current approaches to liquid biopsy that measure somatic mutations use target capture panels to enrich genetic regions of interest that are known to bear cancer-causing mutations, followed by shotgun sequencing. These sequencing libraries contain DNA fragments from the regions of interest irrespective of whether they originate from tumour or normal cells and irrespective of whether the enriched DNA fragment bears a mutation or not. These technologies rely on ultra-deep sequencing (5000-10,000X) to detect mutations with confidence (Medina, J.E. et al. J Immunotherapy Cancer, 2023). The need for ultra-deep sequencing is one of the major cost drivers of liquid biopsy.
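The dependence on sequencing depth can be illustrated with a simple binomial model. The following is a minimal sketch and not part of the disclosed method: it assumes that each read covering a locus independently carries the mutation with probability equal to the allele frequency, and it ignores sequencing errors and duplicate reads. The function names and the example depths and allele frequency are chosen for illustration only.

```python
from math import comb


def expected_mutant_reads(depth: int, allele_fraction: float) -> float:
    """Expected number of mutant reads at one locus under the binomial model."""
    return depth * allele_fraction


def prob_at_least(depth: int, allele_fraction: float, min_reads: int) -> float:
    """Probability of observing at least `min_reads` mutant reads at one locus.

    Assumes independent reads and no sequencing error (illustrative only).
    Intended for small values of `min_reads`.
    """
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Example values: 0.1% allele frequency at 100X, 5,000X and 10,000X depth,
    # requiring at least 3 supporting reads (a hypothetical calling threshold).
    for depth in (100, 5_000, 10_000):
        print(depth, expected_mutant_reads(depth, 0.001), prob_at_least(depth, 0.001, 3))
```

Under these assumptions, a 0.1% allele frequency yields on average only 5 mutant reads at 5,000X depth, which illustrates why approaches that sequence wild-type and mutant fragments alike depend on very deep sequencing.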
Current technologies achieve a limit of detection ranging between 0.01-0.4% allele frequency (Medina, J.E. et al. J Immunotherapy Cancer, 2023) with ultra-deep sequencing. One approach (Natera and Foundation Medicine) uses whole genome sequencing of primary tumours to design a panel of mutations personal to a patient for subsequent disease monitoring. However, this approach is both time-consuming and expensive. Furthermore, extensive research in cancer genomics has clearly established that there is significant molecular evolution of a tumour with disease progression and often primary and metastatic or secondary tumors may have distinct molecular features (Zhong, L et al. Sig Transduct Target Ther, 2023). The approach of the present disclosure, described below, was developed to achieve an ideal combination of sensitivity, affordability, and turnaround time to make this a practical test for clinical practice.
Summary of the Invention
In one aspect, the present disclosure provides a method of enriching a variant nucleic acid molecule.
The present disclosure also provides a method of enriching a variant nucleic acid molecule associated with a disease.
The present disclosure also provides a method of detecting a mutation in a population of nucleic acids.
In another aspect, the present disclosure provides a method of determining whether a sample comprises a variant nucleic acid molecule. In some embodiments, the sample is a liquid biopsy sample.
In some embodiments, the liquid biopsy sample is a blood sample, a plasma sample, a saliva sample, or a urine sample. In some embodiments, the liquid biopsy sample is derived from a blood sample, a plasma sample, a saliva sample, or a urine sample.
The present disclosure also provides a method of detecting the presence of or recurrence of cancer. In another aspect, the present disclosure provides a method of detecting the presence of or recurrence of metastasis.
The present disclosure also provides a method of detecting minimal residual disease.
The present disclosure also provides a method of selecting a subject for therapy.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising differential methylation.
In some embodiments, the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease;
(f) purifying duplex nucleic acid molecules.
In some embodiments, the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension;
(e) contacting nucleic acid molecules in the reaction mixture with an exonuclease;
(f) purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang.
In some embodiments, the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease;
(f) purifying duplex nucleic acid molecules comprising a variant nucleic acid molecule.
In some embodiments, the nucleic acid is DNA or RNA. In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid is RNA. In some embodiments, the nucleic acid molecule is a DNA molecule or an RNA molecule. In some embodiments, the nucleic acid is a DNA molecule. In some embodiments, the nucleic acid is an RNA molecule. In some embodiments, the variant nucleic acid molecule is a variant DNA molecule or a variant RNA molecule. In some embodiments, the variant nucleic acid molecule is a variant DNA molecule. In some embodiments, the variant nucleic acid molecule is a variant RNA molecule.
In some embodiments, the method of enriching a variant nucleic acid molecule is a method of enriching a variant DNA molecule.
In some embodiments, the method comprises:
(a) provision of a population of DNA molecules;
(b) contacting the population of DNA molecules with a primer to form a reaction mixture,
(c) incubation of the reaction mixture for hybridization of the primer with a DNA molecule;
(d) further incubation of the reaction mixture to enable primer extension;
(e) contacting DNA molecules in the reaction mixture with an exonuclease;
(f) purifying duplex DNA comprising a single stranded 3’ overhang.
In some embodiments, the method comprises:
(a) provision of a population of DNA molecules;
(b) contacting the population of DNA molecules with a primer to form a reaction mixture,
(c) incubation of the reaction mixture for hybridization of the primer with a DNA molecule;
(d) further incubation of the reaction mixture to enable primer extension;
(e) contacting DNA molecules in the reaction mixture with a nuclease;
(f) purifying duplex DNA comprising a variant DNA molecule.
A variant nucleic acid molecule may alternatively be described as a mutant nucleic acid molecule, a mutant allele, or an allele. A variant DNA molecule may alternatively be described as a mutant DNA molecule, a mutant allele, or an allele.
In some embodiments, the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule.
In some embodiments, the primer comprises a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and the corresponding wild-type nucleic acid molecule.
In some embodiments, the primer comprises a complementary region.
In some embodiments, the complementary region is complementary to the variant nucleic acid molecule and the corresponding wild-type nucleic acid molecule.
In some embodiments, the complementary region is between 16-28 nucleotides long.
In some embodiments, the primer is between 18-30 nucleotides long.
In some embodiments, primer extension occurs through the action of a polymerase.
In some embodiments, a polymerase is provided at step (a), (b), (c), or (d).
In some embodiments, a polymerase is provided to the reaction mixture at step (b) or (d).
In some embodiments, the exonuclease has 5'- 3' exonuclease activity.
In some embodiments, the nuclease has single-strand-specific nuclease activity.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with an exonuclease, wherein the exonuclease has 5'- 3' exonuclease activity;
(f) purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang.
In some embodiments, the method comprises a method of enriching a variant DNA molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of DNA molecules;
(b) contacting the population of DNA molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant DNA molecule and a corresponding wild-type DNA molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a DNA molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with an exonuclease, wherein the exonuclease has 5'- 3' exonuclease activity;
(f) purifying duplex DNA comprising a single stranded 3’ overhang.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with an exonuclease, wherein the exonuclease has 5'- 3' exonuclease activity;
(f) purifying duplex nucleic acid molecule comprising a single stranded 3’ overhang;
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecule.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with an exonuclease, wherein the exonuclease has 5'- 3' exonuclease activity;
(f) purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang;
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules;
(h) sequencing the amplified nucleic acid molecule.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease, optionally wherein the nuclease has single-strand-specific nuclease activity;
(f) purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule.
In some embodiments, the method comprises a method of enriching a variant DNA molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of DNA molecules;
(b) contacting the population of DNA molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant DNA molecule and a corresponding wild-type DNA molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a DNA molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease, optionally wherein the nuclease has single-strand-specific nuclease activity;
(f) purifying duplex DNA comprising the variant DNA molecule.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant DNA molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease, optionally wherein the nuclease has single-strand-specific nuclease activity;
(f) purifying duplex nucleic acid molecule comprising the variant nucleic acid molecule;
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecule.
In some embodiments, the method comprises a method of enriching a variant nucleic acid molecule comprising a mutation, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer to form a reaction mixture, wherein the primer comprises a nucleotide which is complementary to the mutation of the variant nucleic acid molecule and a mismatched nucleotide which is non-complementary to the variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule,
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture to enable primer extension through the action of a polymerase;
(e) contacting nucleic acid molecules in the reaction mixture with a nuclease, optionally wherein the nuclease has single-strand-specific nuclease activity;
(f) purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule;
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules;
(h) sequencing the amplified nucleic acid molecule.
In some embodiments, bioinformatic analysis is performed to analyse data generated through sequencing.
In some embodiments, the population of nucleic acid molecules is provided as a library of nucleic acid molecules.
In some embodiments, library preparation comprises a single stranded library preparation approach and/or the addition of an individual Unique Molecular Identifier to each strand of the nucleic acid.
In some embodiments, the population of nucleic acid molecules is comprised within or isolated from a liquid biopsy sample.
In some embodiments, the population of nucleic acid molecules comprises cell-free DNA (cfDNA).
In some embodiments, the population of nucleic acid molecules comprises or is suspected of comprising circulating tumor DNA (ctDNA).
In some embodiments, the variant nucleic acid molecule is a single nucleotide variant DNA molecule.
In some embodiments, the mutation is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
In some embodiments, the variant nucleic acid molecule comprises nucleic acid altered through bisulphite conversion.
In some embodiments, the primer is a variant sequence specific primer and/or an allele distinguishing primer.
In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is present on the 3’ portion of the primer. The 3’ portion of the primer is the portion of the primer which is closer to the 3’ end than the 5’ end of the primer.
In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 5 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 4 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 3 nucleotides of the final nucleotide on the 3’ end of the primer. In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 2 nucleotides of the final nucleotide on the 3’ end of the primer.
In some embodiments, the nucleotide that is specific for the mutation of the variant nucleic acid molecule is the final nucleotide on the 3' end of the primer.
In some embodiments, the mismatched non-complementary nucleotide is not complementary to the variant nucleic acid molecule or the corresponding wild-type nucleic acid molecule.
In some embodiments, the mismatched non-complementary nucleotide is present on the 3’ portion of the primer.
In some embodiments, the mismatched non-complementary nucleotide is present within 5 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 4 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 3 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule. In some embodiments, the mismatched non-complementary nucleotide is present within 2 nucleotides of the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
In some embodiments, the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
In some embodiments, the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule, at the 5’ side of the mutation of the variant nucleic acid molecule.
In some embodiments, the primer comprises an affinity tag at the 5' end.
In some embodiments, the primer is biotinylated at the 5’ end.
In some embodiments, the primer is 18-30 nucleotides long.
In some embodiments, the primer is at least 18 nucleotides long. In some embodiments, the primer is at least 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29 nucleotides long.
In some embodiments, the primer is at most 30 nucleotides long. In some embodiments, the primer is at most 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides long.
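The primer features described above (a mutation-specific nucleotide at the 3’ end, a deliberately mismatched nucleotide adjacent to it on the 5’ side, and an overall length of 18-30 nucleotides) can be sketched in code. The following is a minimal illustration under stated assumptions and is not the primer design method of the disclosure: the function names, the example sequence and the rule used to choose the mismatched base (the first base in A, C, G, T order that differs from the reference) are hypothetical, and the primer is written as identical to the given strand, so it would anneal to the complementary strand. A practical design would also consider melting temperature and the destabilising strength of each mismatch.

```python
BASES = "ACGT"


def design_allele_specific_primer(reference: str, variant_index: int,
                                  variant_base: str, length: int = 20) -> str:
    """Return a primer (5'->3') ending on the variant base.

    reference     : wild-type sequence of the strand the primer is identical to
    variant_index : 0-based position of the substituted nucleotide
    variant_base  : base present in the variant molecule at that position
    """
    if not 18 <= length <= 30:
        raise ValueError("primer length must be 18-30 nucleotides")
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    if variant_base == reference[variant_index]:
        raise ValueError("variant base equals the reference base")

    primer = list(reference[start:variant_index + 1])
    primer[-1] = variant_base  # 3' terminal nucleotide matches the mutation
    # Adjacent nucleotide on the 5' side: mismatched to both variant and wild type.
    primer[-2] = next(b for b in BASES if b != primer[-2])
    return "".join(primer)


def count_mismatches(primer: str, target: str) -> int:
    """Number of positions at which primer and same-length target differ."""
    return sum(p != t for p, t in zip(primer, target))


if __name__ == "__main__":
    reference = "GATTACA" * 4  # hypothetical wild-type sequence
    primer = design_allele_specific_primer(reference, 27, "G")
    wild_type_site = reference[8:28]
    variant_site = wild_type_site[:-1] + "G"
    print(primer, count_mismatches(primer, variant_site),
          count_mismatches(primer, wild_type_site))
```

In this sketch the primer carries one mismatch against the variant sequence and two against the wild-type sequence near the 3’ end, which is the asymmetry the allele-distinguishing design relies on.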
In some embodiments, primers are not provided as primer pairs.
In some embodiments, the incubation is performed under conditions that allow hybridization of the primer with its target sequence.
In some embodiments, the primer sequence enables primer extension following binding to the variant nucleic acid molecule. In some embodiments, the primer sequence prevents primer extension following binding to a wild-type nucleic acid molecule.
In some embodiments, the primer sequence enables primer extension following binding to the variant nucleic acid molecule, and prevents primer extension following binding to a corresponding wild-type nucleic acid molecule.
In some embodiments, primer extension occurs more frequently upon base pairing with the variant nucleic acid molecule, compared with base pairing with the corresponding wild-type nucleic acid molecule.
In some embodiments, primer extension occurs when the primer is bound to the variant nucleic acid molecule, but not when the primer is bound to the corresponding wild-type nucleic acid molecule.
In some embodiments, the nuclease/exonuclease does not digest variant nucleic acid molecules as these molecules are comprised within double stranded duplex nucleic acid after primer extension.
In some embodiments, the nuclease/exonuclease does digest corresponding wild-type nucleic acid molecules as these molecules are single stranded, because primer extension did not occur.
In some embodiments, the nuclease/exonuclease is single-strand specific.
In some embodiments, the nuclease/exonuclease does not have 3'- 5' exonuclease activity.
In the present method, the step of contacting the population of nucleic acid molecules in the reaction mixture with a nuclease specifically enriches the variant nucleic acid. In some embodiments, a single strand specific nuclease digests the 5’ un-extended end of the wild-type nucleic acid molecule and the 3’ single stranded overhang in both the variant nucleic acid molecule and the wild-type nucleic acid molecule. This step reduces the wild-type nucleic acid molecule to a short 30-40 bp double stranded fragment which is removed by purification. In some embodiments, a 5’ - 3’ exonuclease digests the 5’ un-extended end of the wild-type nucleic acid molecule. This step reduces the wild-type nucleic acid molecule to a short 30-40 bp double stranded fragment with a 3’ single stranded overhang which is removed by purification. That is, the method of the invention comprises nuclease-mediated depletion of background wild-type nucleic acid molecules. In some embodiments, the method does not comprise use of a primer comprising modification(s) which make the primer and/or extension product resistant to exonuclease digestion (e.g. phosphorothioate linkage).
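The depletion logic described above can be expressed as a toy model. The following is a simplified sketch under stated assumptions, not an implementation of the disclosed workflow: it treats primer extension as occurring only on variant templates, represents the wild-type remnant as a fixed 35 bp duplex (within the 30-40 bp range given above), and models purification as a simple size cutoff. The class name, function names and the 50 bp cutoff are hypothetical.

```python
from dataclasses import dataclass

# Assumptions for illustration only (not values prescribed by the disclosure):
PRIMER_FOOTPRINT_BP = 35  # within the 30-40 bp range given in the text
SIZE_CUTOFF_BP = 50       # hypothetical size cutoff of the purification step


@dataclass(frozen=True)
class PrimedTemplate:
    allele: str       # "variant" or "wild-type"
    extended_bp: int  # duplex length if the primer is fully extended


def duplex_length_after_nuclease(template: PrimedTemplate) -> int:
    """Length of the double-stranded region left after nuclease treatment.

    Idealised rule: the primer is extended only on variant templates, so only
    those keep a long duplex; on wild-type templates just the primer footprint
    stays double stranded.
    """
    if template.allele == "variant":
        return template.extended_bp
    return PRIMER_FOOTPRINT_BP


def purify(templates):
    """Keep templates whose remaining duplex exceeds the size cutoff."""
    return [t for t in templates if duplex_length_after_nuclease(t) > SIZE_CUTOFF_BP]


if __name__ == "__main__":
    # Hypothetical population: 1 variant molecule among 999 wild-type molecules.
    population = [PrimedTemplate("variant", 120)] + [PrimedTemplate("wild-type", 120)] * 999
    kept = purify(population)
    print(len(population), "->", len(kept), {t.allele for t in kept})
```

The model is deliberately idealised: in practice some extension on wild-type templates may occur, so depletion would be expected to be incomplete rather than absolute.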
In some embodiments, the reaction mixture comprises a polymerase. Polymerases are enzymes that catalyze the synthesis of nucleic acid chains. The polymerase may comprise 5'- 3' exonuclease activity. The polymerase may comprise 3'- 5' exonuclease activity.
In some embodiments, the polymerase comprises no 5'- 3' exonuclease activity.
In some embodiments, the polymerase comprises no 3'- 5' exonuclease activity.
In some embodiments, the polymerase comprises no exonuclease activity.
In some embodiments, the polymerase is a Taq polymerase.
In some embodiments, the polymerase is a mutated Taq DNA polymerase with maximum 3’ mismatch distinguishing ability.
In some embodiments, the polymerase is AptaTaq Δexo.
In some embodiments, the polymerase is a high-specificity polymerase.
In some embodiments, the polymerase is optimised for SNP analysis.
In some embodiments, the reaction mixture is cooled after incubation of the reaction mixture with a polymerase.
In some embodiments, purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang comprises the use of a binding partner for an affinity tag present on the primer.
In some embodiments, purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang comprises pull down with a streptavidin coated magnetic bead.
In some embodiments, purifying duplex nucleic acid molecules comprising a variant nucleic acid molecule comprises the use of a binding partner for an affinity tag present on the primer.
In some embodiments, purifying duplex nucleic acid molecules comprising a variant nucleic acid molecule comprises pull down with a streptavidin coated magnetic bead.
In some embodiments, NaOH is used to release the variant nucleic acid molecule from the binding partner-affinity tag complex. For example, purification may comprise the use of a biotin-tagged primer and streptavidin as the binding partner. An NaOH-based method may be used to release the enriched variant nucleic acid into the solution from the biotinylated primer-streptavidin complex.
In some embodiments, purified duplex DNA comprises the variant DNA molecule and does not comprise a corresponding wild-type DNA molecule.
In some embodiments, purified duplex DNA comprises a greater ratio of variant DNA molecules to corresponding wild-type DNA molecules, compared to the population of DNA molecules provided at the start of the method.
In some embodiments, the method further comprises a step of amplifying the variant nucleic acid/DNA molecule after purifying the duplex nucleic acid/DNA comprising a single stranded 3’ overhang.
In some embodiments, the method further comprises a step of amplifying the variant nucleic acid molecule after purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule.
The amplification step may be performed using any suitable amplification method, for example the amplification step may comprise PCR amplification. The amplification step may be performed using nested PCR reactions. The amplification step may comprise two separate PCR reactions. The amplification step allows for: a) further nucleic acid enrichment, b) adding the 3’ adapter (which was lost during the nuclease digestion step) back to the nucleic acid molecule, and c) amplifying the enriched variant nucleic acid molecule.
In some embodiments, the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3' adapter. In some embodiments, the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3' adapter and a forward primer (e.g. a P7 forward primer). In some embodiments, the step of amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules comprises contacting the purified duplex nucleic acid molecules with a 3’ adapter, a P7 forward primer, and a reverse primer. The 3’ adapter, forward primer and/or reverse primer may be any suitable adapter or primer known to the skilled person. In some embodiments, the 3’ adapter, forward primer and/or reverse primer is a 3’ adapter, forward primer and/or reverse primer disclosed herein.
In some embodiments, the method further comprises a step of DNA sequencing.
In some embodiments, sequencing is next generation sequencing (NGS).
In some embodiments, the method further comprises bioinformatic analysis.
In some embodiments, the method further comprises bioinformatic analysis of sequence data.
In some embodiments, the method further comprises bioinformatic analysis of NGS data.
In some embodiments, purifying duplex DNA comprising a single stranded 3' overhang comprises pulldown with a binding partner for an affinity tag present on the primer.
In some embodiments, the method comprises a step of designing a primer.
In a further aspect, the disclosure provides a method of designing a primer.
In some embodiments, the step of designing a primer comprises the identification of a target variant nucleic acid sequence and a corresponding wild-type nucleic acid sequence.
In some embodiments, the step of designing a primer comprises the use of a quantitative stability model.
In some embodiments, the quantitative stability model is described in Panjkovich, A. et al. (Bioinformatics, 2005), which is hereby incorporated by reference in its entirety.
In some embodiments, the quantitative stability model is used to predict the destabilizing effect of each mutation/mismatch pair within a given primer.
In some embodiments, the quantitative stability model is used to determine an optimum primer sequence.
In some embodiments, the method comprises a first step of designing a primer to enable primer extension following binding to the variant nucleic acid molecule.
In some embodiments, the method comprises a first step of designing a primer to prevent primer extension following binding to a wild-type nucleic acid molecule.
In some embodiments, the method comprises designing a primer to enable primer extension following binding to the variant nucleic acid molecule, and prevent primer extension following binding to a corresponding wild-type nucleic acid molecule.
The invention includes the combination of the aspects and preferred features described except where such a combination is clearly impermissible or expressly avoided.
Summary of the Figures
Embodiments and experiments illustrating the principles of the invention will now be discussed with reference to the accompanying figures in which:
Figure 1 : Schematic diagram of primers. 5’ biotinylated (circle) mutation allele specific primers consist of a 30 bp complementary region (large bar) with the mutation to be detected (dark bar) at the 3’ end and a deliberate mismatch (light bar) at one of the 3 preceding bases of the 3’ end. The primer extends differentially depending on the presence or absence of a mutation in the template DNA.
Figure 2: Schematic diagram showing one specific strategy for altered DNA capture and library preparation. The cartoon shows a step-by-step workflow for capture of altered allele from a background of native nucleic acid sequences for next generation sequencing. Star represents a DNA alteration/mutation/variation, small circle represents biotin label and large circle represents magnetic bead coated with streptavidin.
Figure 3: Flowchart showing bioinformatics analysis pipeline
Figure 4: Schematic representation of allele-specific primer extension. Allele-specific biotinylated primers (purple, with red stars and circles indicating sequence specificity and biotin, respectively) are annealed to adapter-ligated double-stranded DNA fragments. In the presence of a matched target mutation, allele-specific extension occurs (green arrow), enabling downstream capture. In the case of a wild-type allele, no extension occurs (red “No extension” label), and the unextended primers remain annealed with 3' and 5’ overhangs. Treatment with a single-strand-specific nuclease selectively digests the single-stranded overhangs on the WT and mutant allele fragments. The wild-type allele is subsequently removed during bead-based cleanup. Only extended DNA fragments with intact 5’ ends are retained, enabling high-specificity enrichment of target alleles.
Figure 5 & Figure 6: Representative electropherogram traces illustrating fragment size distribution in the method of this disclosure and the conventional enrichment method respectively. These figures depict electropherograms generated using a fragment analyser. The x-axis represents fragment size in base pairs (bp), while the y-axis corresponds to fluorescence intensity. Two labelled markers frame the primary region of interest: a Lower Marker (LM) at the left (around 50-100 bp) and an Upper Marker (UM) at the right (around 7000 bp), serving as internal size references. A cluster of peaks spanning approximately 100-800 bp is evident, indicative of the dominant fragment populations in the respective electropherograms.
Figure 7: Comparison of sequencing efficiency and target coverage between the methods of this disclosure and conventional methods. Top panel: Total sequencing reads (in millions) generated from the disclosed workflow (2Strands) and two conventional enrichment workflows (Conventional-1 and Conventional-2). Bottom panel: Mean target coverage achieved in each experiment. Despite significantly lower sequencing depth, the method of this disclosure yielded markedly higher mean target coverage, highlighting its superior enrichment efficiency. Note the broken y-axis to illustrate the large dynamic range in coverage.
Figure 8: Enhanced on-target coverage by the disclosed method compared to conventional methods. Representative Integrative Genomics Viewer (IGV) screenshots showing read coverage at four genomic loci on chromosome 7 (EGFR locus). The top row (2Strands) illustrates read pileups obtained using the disclosed allele-specific enrichment probes, demonstrating strong, focused coverage at variant-containing regions. In contrast, the middle and bottom rows show data from two independent conventional primer extension target capture experiments, which display lower and less specific coverage across the same loci as demonstrated by the broader peaks. The increased signal intensity and sharp enrichment peaks in the top row highlight the method’s superiority. Genome coordinate ranges are indicated above each panel.
Figure 9: Superior enrichment of mutations by the disclosed allele-specific technology. Mutation detection performance at two clinically relevant loci: EGFR L858R (G>T) and PIK3CA E545K (G>A). Visualized using IGV, the read alignments illustrate variant allele frequencies (VAFs) and coverage levels for each method. At both loci, 2Strands-Exp2 (top panels, black) shows markedly enhanced variant detection sensitivity and significantly higher local sequencing depth (e.g., 11.1% VAF at 2131x coverage for EGFR L858R; 12.9% VAF at 551x for PIK3CA E545K). In contrast, the conventional primer extension target capture method (lighter grey) yielded lower coverage and VAFs (4.8-5.1%) or failed to detect the mutation (0% for PIK3CA E545K in Run2). These results demonstrate the selective enrichment and improved detection limits of the disclosed approach for mutations in circulating tumor DNA.
Figure 10: Comparison of SNP detection sensitivity across down-sampled read depths. The graph illustrates the fraction of single nucleotide polymorphisms (SNPs) detected (y-axis) as a function of the total number of sequencing reads after down-sampling, in millions (x-axis). The disclosed workflow (2Strands; black line) remains consistently at 100% SNP detection over the tested range of down-sampled reads, whereas the conventional primer extension target capture method (grey line) exhibits a progressive decline in the fraction of SNPs detected when the total read depth is reduced from 15 to 2 million reads. This demonstrates the robustness of the disclosed method compared to the conventional method at lower read depths. Each point represents the average of 10 replicate read down-samplings, and the vertical/horizontal grid lines serve as visual aids.
Figure 11: Effect of single-strand-specific nuclease (SSN) on target molecule with 3’ and 5’ overhangs. The top panel shows a schematic representation of the target molecule (EGFR L858R locus) and the qPCR strategy for detection of the 5’ and 3’ overhangs. In the bottom panel the bar chart illustrates how single-strand-specific nuclease (SSN) affects the measured level of the target molecule (y-axis) by quantitative real-time PCR. The x-axis labels represent distinct SSN volumes added (0 µL, 0.1 µL, 0.5 µL, and 1 µL). At the baseline (0 µL SSN), the target molecule is detected by qPCR both upstream and downstream of the capture probe. Upon adding SSN, the signal disappears, indicating degradation or cleavage of the single-stranded overhangs under investigation. The error bar on the baseline column depicts the standard deviation (or standard error) from replicate assays. Overall, these data confirm the enzymatic activity of the SSN used.
Figure 12, Figure 13 & Figure 14: Comparative qPCR amplification profiles using allele-specific (AS) and non-allele-specific (NAS) KRAS G12D, EGFR L858R and NRAS Q61K primers respectively. Each figure comprises three sub-panels (top-left, top-right, and bottom-centre) demonstrating amplification of 5% vs 0%, 1% vs 0% and 0.1% vs 0% mutation allele frequency cfDNA reference material. Each panel depicts real-time quantitative PCR (qPCR) amplification curves under different primer conditions. The x-axis of each panel indicates the number of PCR cycles, while the y-axis (RFU) represents the fluorescent signal corresponding to accumulating amplification products. AS and NAS primers amplify the target region at different cycle thresholds, demonstrating that the AS primers are specific for the low-frequency target mutations present in the reference sample. Importantly, the AS primers amplify the WT template with 0% mutation only at very high cycle numbers or with very low efficiency, in some cases (KRAS G12D) distinguishing even a 0.1% allele frequency from 0%. Overall, these amplification curves underscore the utility of the disclosed allele-specific primer design in distinguishing between alleles and illustrate the difference in amplification kinetics when non-allele-specific primers are employed. This clear separation in Ct values and signal intensities highlights the invention’s robust approach to allele detection.
Detailed Description of the Invention
Aspects and embodiments of the present invention will now be discussed with reference to the accompanying figures. Further aspects and embodiments will be apparent to those skilled in the art. All documents mentioned in this text are incorporated herein by reference.
The present invention is based on the development of an innovative method to accurately enrich very-low frequency altered DNA sequences (targets) from a background of native DNA sequences.
In some embodiments, the enrichment is not PCR amplicon based; PCR amplicon-based enrichment can often be error prone and subject to non-specific amplification. In some embodiments, the method uses an allele-distinguishing biotinylated primer that extends only upon base pairing with the intended DNA mutation sequence (e.g., a single nucleotide variant, indel, deletion, fusion, or DNA altered via bisulphite conversion).
In some embodiments, primer extension forms a stable duplex with the original altered DNA fragment which can be pulled down via a streptavidin coated magnetic bead. The original mutated DNA fragment (not the extension product) is then amplified and sequenced. This method allows enrichment of variant sequences several-fold above background DNA sequence, thereby increasing the signal-to-noise ratio.
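By way of illustration only, the enrichment of variant sequences above background may be quantified as the change in the variant:wild-type ratio before and after enrichment. The following is a minimal sketch; the function names and the read counts in the usage note are hypothetical and are not taken from the experiments described herein.

```python
def variant_allele_fraction(variant_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def fold_enrichment(vaf_before: float, vaf_after: float) -> float:
    """Variant:wild-type ratio after enrichment divided by the ratio before.

    Both inputs are variant allele fractions strictly between 0 and 1.
    """
    if not 0 < vaf_before < 1 or not 0 < vaf_after < 1:
        raise ValueError("VAFs must lie strictly between 0 and 1")
    ratio_before = vaf_before / (1 - vaf_before)  # variant : wild-type
    ratio_after = vaf_after / (1 - vaf_after)
    return ratio_after / ratio_before
```

For example, with hypothetical inputs of 10 variant reads in 10,000 before enrichment (0.1%) and 100 variant reads in 1,000 after enrichment (10%), `fold_enrichment(0.001, 0.1)` evaluates to approximately 111.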
The method of enrichment according to the present disclosure combines sensitivity and affordability. Data-driven plasma-only panel design not only ensures faster turnaround but also representation of molecular features typical of recurrent/metastatic tumours. The current approach, therefore, is an effort to obtain the ideal combination that would make liquid biopsy based MRD monitoring of cancer patients a practical solution for regular clinical practice.
The ultrasensitive nature of the assay should allow expansion of the clinical utility of this technology beyond the four solid tumours (lung, colorectal, bladder and breast) that currently have MRD solutions on offer. The invention may be used in the assessment of highly aggressive diseases (e.g., metastatic diseases) with high recurrence such as pancreatic cancer, ovarian, hepatocellular carcinoma, thyroid and gastrointestinal, liver, head and neck, and other cancers which remain a major challenge in the field of oncology.
Single stranded library preparation with dual incorporation of UMIs followed by nuclease-aided allele-specific primer extension target enrichment for NGS has not been demonstrated before. Allele-distinguishing primers have been used for low-throughput genotyping of single genes (primarily naturally occurring germline SNPs) via PCR based methods, but not for high-throughput allele-specific target capture for an NGS based liquid biopsy assay. The only known high-throughput method for mutation allele-specific target enrichment uses double stranded probes for hybridization and allele-specific target enrichment followed by Illumina sequencing (Gydush, G. et al, Nat Biomed Eng 2022). This work has been shown to yield a 50X reduction in sequence depth requirement for variant detection from cfDNA (Gydush, G. et al, Nat Biomed Eng 2022). Probe hybridization-based target capture, though highly specific, is a time-consuming method. The disclosed method is expected to yield a greater reduction in sequence depth requirement and a superior sensitivity due to the combination of single stranded library preparation and nuclease-aided allele-specific primer extension target enrichment. Primer extension-based target capture assures specificity at the same level as probe hybridization while reducing turnaround time and the cost of oligonucleotide probe/primer synthesis, and while generating more balanced and on-target sequencing reads.
Variant nucleotides
The term variant nucleic acid molecule may be used to describe a nucleic acid molecule comprising an alteration (or mutation), compared to a corresponding nucleic acid molecule (e.g., a wild-type nucleic acid molecule). The term variant nucleic acid molecule may alternatively be phrased as: mutant nucleic acid molecule, mutated nucleic acid molecule, nucleic acid molecule comprising a mutation, a mutant allele, an alternative allele, or an allele.
The term variant DNA molecule may be used to describe a DNA molecule comprising an alteration (or mutation), compared to a corresponding DNA molecule (e.g., a wild-type DNA molecule). The term variant DNA molecule may alternatively be phrased as: mutant DNA molecule, mutated DNA molecule, DNA molecule comprising a mutation, a mutant allele, an alternative allele, or an allele.
The term “mutation” refers to a difference in a nucleic acid sequence (e.g. DNA or RNA) in a sample compared to a reference. For example, a mutation may be a single nucleotide variant (SNV), multiple nucleotide variants, a deletion mutation, an insertion mutation, a translocation, a missense mutation, a fusion, etc. Mutations may be identified using sequence data. An "indel mutation" (or simply “indel”) refers to an insertion and/or deletion of bases in a nucleotide sequence (e.g. DNA or RNA) of an organism. In some embodiments, the mutation (or sequence variation) is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
In some embodiments, a mutation is a somatic mutation. A “somatic mutation” is a mutation that is present in a tumour or modified cell (or genetic material derived therefrom), but not in a corresponding (matched) normal or non-modified cell.
Population of nucleic acid molecules
Nucleic acid may be DNA or RNA. In some embodiments, the nucleic acid is DNA. In some embodiments, the nucleic acid may be a population of DNA molecules (e.g., DNA in a sample or DNA extracted/purified from a sample).
A “sample” as used herein refers to a biological sample. In some embodiments, the sample comprises cfDNA, including a sample of a biological fluid, or an extract therefrom. Within the context of the present invention, the sample may be a urine sample, a blood, plasma or serum sample, or a sample derived therefrom. The sample may be one which has been freshly obtained from the subject or may be one which has been processed and/or stored prior to making a determination (e.g. frozen, fixed or subjected to one or more purification, enrichment or extractions steps, including centrifugation).
In some embodiments, the sample is a liquid biopsy sample, or a sample derived from a liquid biopsy. Liquid biopsies are described in detail in the art. See, e.g., Poulet et al., Acta Cytol 63(6): 449-455 (2019), Chen and Zhao, Hum Genomics 13(1): 34 (2019). In some embodiments, a liquid biopsy sample comprises Cell-free DNA (cfDNA).
Cell-free DNA (cfDNA) is fragmented DNA that is shed into circulation through biological processes like apoptosis, necrosis, and active secretion. The cfDNA found in biological fluids can originate from different cell types. In cancer patients, a small proportion of cfDNA originates from tumour cells, and this tumour-derived cfDNA is referred to herein as circulating tumour DNA (ctDNA). The term cfDNA encompasses ctDNA. However, cfDNA without ctDNA can also be referred to herein as non-tumour derived cfDNA. Non-tumour derived cfDNA constitutes the entirety of a cfDNA sample from a subject who does not have cancer (also referred to herein as “healthy subject”). Non-tumour cfDNA is also expected to represent a proportion of a cfDNA sample from a subject who does have cancer (also referred to herein as “cancer subject” or “cancer patient”).
The sample may be derived from one or more of the above biological samples. For example, the sample may comprise a nucleic acid library generated from the biological sample and may optionally be a barcoded or otherwise tagged nucleic acid library. A plurality of samples may be taken from a single patient, e.g. serially during a course of treatment. Moreover, a plurality of samples may be taken from a plurality of patients. In some embodiments, a nucleic acid library comprises nucleic acid labelled with a unique molecular identifier (UMI). In some embodiments, a nucleic acid library comprises nucleic acid labelled with two UMIs.
The sample may comprise a DNA library generated from the biological sample and may optionally be a barcoded or otherwise tagged DNA library. A plurality of samples may be taken from a single patient, e.g. serially during a course of treatment. Moreover, a plurality of samples may be taken from a plurality of patients. In some embodiments, a DNA library comprises DNA labelled with a unique molecular identifier (UMI). In some embodiments, a DNA library comprises DNA labelled with two UMIs.
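By way of illustration only, reads from a library labelled with two UMIs may be grouped into read families before variant calling. The following is a minimal sketch assuming that each read has already been reduced to a tuple of alignment start position, 5’ UMI, 3’ UMI and sequence; that tuple layout, the function name and the simple majority-vote consensus are assumptions made for illustration and do not describe any particular pipeline.

```python
from collections import Counter, defaultdict


def collapse_read_families(reads):
    """Group reads sharing (start, umi_5p, umi_3p) and build one consensus each.

    `reads` is an iterable of (start, umi_5p, umi_3p, sequence) tuples.
    Returns a dict mapping each family key to its consensus sequence.
    """
    families = defaultdict(list)
    for start, umi_5p, umi_3p, sequence in reads:
        families[(start, umi_5p, umi_3p)].append(sequence)

    consensus = {}
    for key, sequences in families.items():
        # Only positions covered by every read in the family are considered.
        length = min(len(s) for s in sequences)
        # Majority vote per position; ties go to the base seen first.
        consensus[key] = "".join(
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

In this sketch a base that appears in only one read of a family, such as a sequencing error, is outvoted by the other reads carrying the same UMI pair.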
Primers
A primer is a short single-stranded nucleic acid used by living organisms in the initiation of nucleic acid synthesis. A synthetic primer may also be referred to as an oligo, short for oligonucleotide. DNA polymerase enzymes (responsible for DNA replication) are only capable of adding nucleotides to the 3’-end of an existing nucleic acid, requiring that a primer be bound to the template before DNA polymerase can begin a complementary strand.
Synthetic primers are chemically synthesized oligonucleotides, usually of DNA, which can be designed to anneal/hybridize/bind with a specific site/sequence on a target nucleic acid molecule. In solution, the primer spontaneously hybridizes with the template through Watson-Crick base pairing before being extended by DNA polymerase.
In some embodiments, the method comprises a first step of designing a primer. In some embodiments, the step of designing a primer comprises the identification of a target variant DNA sequence and a corresponding wild-type DNA sequence. In some embodiments, the step of designing a primer comprises the use of a quantitative stability model. In some embodiments, the quantitative stability model is described in Panjkovich, A. et al. (Bioinformatics, 2005), which is hereby incorporated by reference in its entirety. In some embodiments, the quantitative stability model is used to predict the destabilizing effect of each mutation/mismatch pair within a given primer. In some embodiments, the quantitative stability model is used to determine an optimum primer sequence. In some embodiments, the method comprises a first step of designing a primer to enable primer extension following binding to the variant DNA molecule. In some embodiments, the method comprises a first step of designing a primer to prevent primer extension following binding to a wild-type DNA molecule. In some embodiments, the method comprises designing a primer to enable primer extension following binding to the variant DNA molecule, and prevent primer extension following binding to a corresponding wild-type DNA molecule.
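By way of illustration only, candidate allele-specific primers having the layout described for Figure 1 (a 30-base complementary region, the variant base at the 3’ end, and one deliberate mismatch at one of the three preceding bases) may be enumerated as follows. This is a minimal sketch: the function name and arguments are hypothetical, the primer is written in the same orientation as the variant-containing strand (so that it anneals to the complementary strand), and the ranking of candidates with a quantitative stability model is not implemented.

```python
def candidate_primers(variant_seq: str, variant_index: int, length: int = 30):
    """Enumerate primers ending on the variant base with one deliberate mismatch.

    variant_seq   -- variant-containing sequence, written 5'->3'
    variant_index -- 0-based index of the variant base in variant_seq
    length        -- primer length in bases
    Returns a list of (offset_from_3prime_end, primer_sequence) tuples.
    """
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough sequence upstream of the variant")
    perfect = variant_seq[start:variant_index + 1].upper()

    candidates = []
    for offset in (1, 2, 3):  # the three bases preceding the 3' end
        position = len(perfect) - 1 - offset
        for base in "ACGT":
            if base != perfect[position]:
                primer = perfect[:position] + base + perfect[position + 1:]
                candidates.append((offset, primer))
    return candidates
```

Each call returns nine candidates (three positions, three alternative bases each); which of them best permits extension on the variant template while preventing extension on the wild-type template would then be assessed separately, for example with the stability model referred to above.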
Nucleases
Nucleases are enzymes that cleave polynucleotides into nucleic acids of smaller units.
In some embodiments, the nuclease is single-strand specific. In some embodiments, the nuclease has single-strand-specific nuclease activity. Single-strand-specific nucleases are known to the skilled person and are reviewed e.g. in Desai et al. (FEMS microbiology reviews 2003 26(5): 457-491), which is hereby incorporated by reference. Single-strand-specific nucleases exhibit high selectivity for single-stranded nucleic acids and single-stranded regions in double-stranded nucleic acids. In some embodiments, the single-strand specific nuclease has exonuclease activity. In some embodiments, the single-strand specific nuclease has endonuclease activity. In some embodiments, the single-strand specific nuclease has exonuclease activity and endonuclease activity. Examples of known single-strand-specific nucleases include S1 nuclease, P1 nuclease, N. crassa mycelia nuclease, N. crassa conidia nuclease, BAL 31 slow form nuclease, BAL 31 fast form nuclease, U. maydis a nuclease, U. maydis nuclease, Nuclease Bh1, Aspergillus nuclease, Physarum nuclease, SP nuclease, Mung bean nuclease, Wheat chloroplast nuclease, Rye germ ribosomes Nuclease I, Pea seeds nuclease, Tobacco nuclease I, Alfalfa seedling nucleases (e.g. acid, neutral), SK nuclease, hen liver nuclease, rat liver nuclei nuclease, and mouse mitochondria nuclease.
In some embodiments, the nuclease is an exonuclease. Exonucleases are enzymes that work by cleaving nucleotides one at a time from the end (exo) of a polynucleotide chain. A hydrolysing reaction that breaks phosphodiester bonds at either the 3' or the 5' end occurs. Exonucleases are reviewed by Manils et al. (Cells. 2022 Jul; 11 (14): 2157), which is hereby incorporated by reference in its entirety.
Different exonucleases are known. For example, Exonuclease I breaks apart single-stranded DNA in a 3' → 5' direction, releasing deoxyribonucleoside 5'-monophosphates one after another. It does not cleave DNA strands without terminal 3'-OH groups because they are blocked by phosphoryl or acetyl groups. Exonuclease II is associated with DNA polymerase I, which contains a 5' exonuclease that clips off the RNA primer contained immediately upstream from the site of DNA synthesis in a 5' → 3' manner. Exonuclease III has four catalytic activities: 3' to 5' exodeoxyribonuclease activity, which is specific for double-stranded DNA, RNase activity, 3' phosphatase activity, and AP endonuclease activity. Exonuclease IV adds a water molecule, so it can break the bond of an oligonucleotide to nucleoside 5' monophosphate. Exonuclease V is a 3' to 5' hydrolysing enzyme that catalyses linear double-stranded DNA and single-stranded DNA, which requires Ca2+. This enzyme is extremely important in the process of homologous recombination. Exonuclease VIII is a 5' to 3' dimeric protein that does not require ATP or any gaps or nicks in the strand, but requires a free 5' OH group to carry out its function.
In some embodiments, the exonuclease is single-strand specific. In some embodiments, the exonuclease has 5'-3' exonuclease activity. In some embodiments, the exonuclease has 3'-5' exonuclease activity. In some embodiments, the exonuclease does not have 5'-3' exonuclease activity. In some embodiments, the exonuclease does not have 3'-5' exonuclease activity. The 3'-5' exonucleases are reviewed by Shevelev and Hubscher (Nature Reviews Molecular Cell Biology volume 3, pages 364-376. 2002), which is hereby incorporated by reference in its entirety.
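By way of illustration only, the effect of a single-strand-specific nuclease on a template with an annealed primer, as described for Figure 4, may be modelled in a highly simplified way. In the following sketch the template is written 5’ to 3’, the primer is assumed to anneal to a contiguous site on it, and an extended primer is assumed to copy the template all the way to its 5’ end; the function name and coordinate convention are hypothetical, and no enzyme kinetics are modelled.

```python
def surviving_template(template: str, site_start: int, site_end: int,
                       extended: bool) -> str:
    """Return the template segment protected from single-strand digestion.

    template   -- template strand, written 5'->3'
    site_start -- 0-based start of the primer binding site on the template
    site_end   -- end (exclusive) of the primer binding site on the template
    extended   -- True if the primer was extended by the polymerase

    The primer's 3' end pairs with template[site_start], so extension runs
    towards index 0 (the template's 5' end). Everything 3' of the binding
    site stays single stranded and is digested in both cases.
    """
    if not 0 <= site_start < site_end <= len(template):
        raise ValueError("primer binding site must lie within the template")
    duplex_start = 0 if extended else site_start
    return template[duplex_start:site_end]
```

In this toy model an extended (variant) template keeps its 5’ end and loses only the 3’ overhang, whereas an unextended (wild-type) template is trimmed on both sides down to the primer binding site.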
Sequencing
In some embodiments, following enrichment, sequencing is performed. In some embodiments, following enrichment, DNA sequencing is performed. In some embodiments, the DNA sequencing is next-generation sequencing (NGS). NGS technologies have been previously described (Levy et al. PLoS Biol 5, e254 (2007); Wheeler et al. Nature 452:872-876 (2008); Bentley et al., Nature 456:53-59 (2008)).
Sequencing of the enriched nucleic acids can be achieved using sequencing by ligation or sequencing by synthesis. Sequencing by synthesis relies on a DNA polymerase to incorporate four reversible terminator-bound dNTPs. One base is added per cycle and the fluorescently labelled reversible terminator is imaged as each dNTP is added. Sequencing by ligation instead uses the mismatch sensitivity of DNA ligase to distinguish the sequence of interest and incorporate a pool of fluorescently labelled oligonucleotides of varying lengths. Sequencing by ligation has high accuracy but may encounter problems with palindromic sequences.
The term “sequence data” refers to information that is indicative of the presence and/or amount of genomic material in a sample that has a particular sequence. Such information may be obtained using sequencing technologies, such as e.g. next generation sequencing (NGS, such as e.g. whole exome sequencing (WES), whole genome sequencing (WGS, including shallow whole genome sequencing, sWGS), or sequencing of captured genomic loci (targeted or panel sequencing)), or using array technologies, such as e.g. SNP arrays, or other molecular counting assays. When NGS technologies are used, the sequence data may comprise a count of the number of sequencing reads (also referred to as “sequence reads” or “sequence read data”) that have a particular sequence. When non-digital technologies are used such as array technology, the sequence data may comprise a signal (e.g. an intensity value) that is indicative of the number of sequences in the sample that have a particular sequence, for example by comparison to an appropriate control. Sequence data may be mapped to a reference sequence, for example a reference genome, using methods known in the art. Thus, counts of sequencing reads or equivalent non-digital signals may be associated with a particular genomic location. Sequence read data may be provided or obtained directly, e.g., by sequencing the cfDNA sample or library or by obtaining or being provided with sequencing data that has already been generated, for example by retrieving sequence read data from a non-volatile or volatile computer memory, data store or network location. The sequencing may be paired-end sequencing. The sequence reads may be in a suitable data format, such as FASTQ, SAM or BAM. The sequence read data, e.g., FASTQ files, may be subjected to one or more processing or clean-up steps prior to or as part of the step of collapsing reads into read families.
For example, the sequence data files may be processed using one or more tools such as FastQC v0.11.5 and a tool to remove adaptor sequences (e.g. cutadapt v1.9.1). The sequence reads (e.g. trimmed sequence reads) may be aligned to an appropriate reference genome (or may have been previously aligned to an appropriate reference sequence, e.g. in the case of SAM/BAM files), for example, the human reference genome GRCh37 for a human subject. As used herein “read” or “sequencing read” may be taken to mean the sequence that has been read from one molecule and read once. Each molecule can be read any number of times, depending on the sequencing performed. In embodiments, the sequence data is data from sWGS, WGS, WES, or any capture panels including custom capture panels.
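By way of illustration only, a count of the number of sequencing reads that have a particular sequence may be obtained directly from FASTQ-formatted read data. The following is a minimal sketch assuming uncompressed FASTQ with exactly four lines per record; the function name and the example reads are hypothetical.

```python
import io


def count_reads_with_sequence(fastq_handle, query: str):
    """Return (total_reads, reads_containing_query) for four-line FASTQ records."""
    query = query.upper()
    total = 0
    matching = 0
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 == 1:  # the second line of each record holds the bases
            total += 1
            if query in line.strip().upper():
                matching += 1
    return total, matching


# Two made-up reads; only the first contains the query "GTAC".
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTTTGGGG\n+\nIIIIIIII\n"
)
total, matching = count_reads_with_sequence(example, "GTAC")
```

In practice such counts would normally be taken from aligned reads at a defined genomic location rather than by substring search, which is used here only to keep the sketch self-contained.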
Subjects
The words “subject” and “patient” are used herein interchangeably. The subject may be a mammalian subject, such as a human subject or an animal model or pet, such as e.g. a mouse, rat, rabbit, horse, dog, etc. The subject may be a subject who has been diagnosed as having or being at risk of developing cancer. A cancer may be selected from: bladder cancer, gastric cancer, oesophageal cancer, breast cancer, colorectal cancer, cervical cancer, ovarian cancer, endometrial cancer, kidney cancer (renal cell), lung cancer (small cell, non-small cell and mesothelioma), brain cancer (gliomas, astrocytomas, glioblastomas), melanoma, lymphoma, small bowel cancers (duodenal and jejunal), leukemia, pancreatic cancer, hepatobiliary tumours, germ cell cancers, prostate cancer, head and neck cancers, thyroid cancer and sarcomas. The cancer may be selected from: glioblastoma, melanoma, renal cancer, lung cancer, pancreatic cancer, breast cancer, gastric cancer, colorectal cancer, bile duct cancer, and ovarian cancer.
General definitions
As used herein, a “fragment”, “variant” or “homologue” of a nucleic acid molecule or protein may optionally be characterised as having at least 50%, preferably one of 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to the sequence of the reference nucleic acid molecule or protein. Fragments, variants, isoforms and homologues may be characterised by the ability to perform a function performed by the reference molecule.
Pairwise and multiple sequence alignment for the purpose of determining percent identity between two or more amino acid or nucleic acid sequences can be achieved in various ways known to a person of skill in the art, for instance, using publicly available computer software such as ClustalOmega (Soding, J. 2005, Bioinformatics 21, 951-960), T-coffee (Notredame et al. 2000, J. Mol. Biol. (2000) 302, 205-217), Kalign (Lassmann and Sonnhammer 2005, BMC Bioinformatics, 6(298)) and MAFFT (Katoh and Standley 2013, Molecular Biology and Evolution, 30(4) 772-780) software. When using such software, the default parameters, e.g. for gap penalty and extension penalty, are preferably used.
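By way of illustration only, percent identity can be computed from two sequences that have already been aligned (for example by one of the tools listed above). The following is a minimal sketch; the function name `percent_identity` and the convention of dividing the number of identical columns by the number of alignment columns (ignoring columns that are gaps in both sequences) are assumptions, and other conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: identical columns / all columns, where columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column of two gaps carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


# Example: 3 of 4 aligned positions are identical.
print(percent_identity("ACGT", "ACGA"))
```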
A “fragment” generally refers to a fraction of the reference nucleic acid molecule or protein. A “variant” generally refers to a molecule having a nucleic acid molecule or amino acid sequence comprising one or more substitutions, insertions, deletions, or other modifications relative to the sequence of the reference nucleic acid molecule or protein, but retaining a considerable degree of sequence identity (e.g. at least 60%) to the sequence of the reference nucleic acid molecule or protein. An “isoform” generally refers to a variant of the reference nucleic acid molecule or protein expressed by the same species as the species of the reference nucleic acid molecule or protein. A “homologue” generally refers to a variant of the reference nucleic acid molecule or protein produced by a different species as compared to the species of the reference nucleic acid molecule or protein.
A “fragment” may be of any length (by number of nucleotides or amino acids), although it may optionally be at least 25% of the length of the reference nucleic acid molecule or protein (that is, the nucleic acid molecule or protein from which the fragment is derived) and may have a maximum length of one of 50%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% of the length of the reference nucleic acid molecule or protein.
Numbered statements
The following numbered paragraphs (paras) describe particular aspects and embodiments of the present disclosure:
1. A method of enriching a variant nucleic acid molecule from a population of nucleic acid molecules, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer in a reaction mixture, wherein the primer comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule;
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture for primer extension through the action of a polymerase;
(e) contacting the population of nucleic acid molecules in the reaction mixture with an exonuclease, wherein the exonuclease has 5'- 3' exonuclease activity;
(f) purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang.
2. The method according to para 1, further comprising the following step:
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules.
3. The method according to para 1 or para 2, further comprising the following step: sequencing amplified nucleic acid molecules, optionally followed by bioinformatic analysis of sequence data.
4. The method according to any previous para, wherein the nucleic acid molecule is a DNA molecule.
5. The method according to any previous para, wherein the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 5 nucleotides of the final nucleotide on the 3’ end of the primer.
6. The method according to any previous para, wherein the nucleotide that is specific for the mutation of the variant nucleic acid molecule is the final nucleotide on the 3’ end of the primer.
7. The method according to any previous para, wherein the mismatched non-complementary nucleotide is present on the 3’ portion of the primer.
8. The method according to any previous para, wherein the mismatched non-complementary nucleotide is present within 5 nucleotides of the nucleotide that is specific for the mutation of the variant DNA molecule.
9. The method according to any previous para, wherein the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
10. The method according to any previous para, wherein the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule, at the 5’ side of the mutation of the variant nucleic acid molecule.
11. The method according to any previous para, wherein the primer comprises an affinity tag at the 5' end.
12. The method according to any previous para, wherein the primer is biotinylated at the 5’ end.
13. The method according to any previous para, wherein the primer is 18-30 nucleotides long.
14. The method according to any previous para, wherein the population of nucleic acid molecules is provided as a library of nucleic acid molecules, optionally wherein the library of nucleic acid molecules is prepared through a method comprising a single stranded library preparation approach and/or the addition of an individual Unique Molecular Identifier to each strand of the nucleic acid.
15. The method according to any previous para, wherein the population of nucleic acid molecules are comprised within or isolated from a liquid biopsy sample.
16. The method according to any previous para, wherein the population of nucleic acid molecules and/or the liquid biopsy sample comprise cell-free DNA (cfDNA).
17. The method according to any previous para, wherein the variant nucleic acid molecule is a single nucleotide variant DNA molecule.
18. The method according to any previous para, wherein the mutation is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
19. The method according to any previous para, wherein the variant nucleic acid molecule comprises DNA altered through bisulphite conversion.
20. The method according to any previous para, wherein the exonuclease is single-strand specific.
21. The method according to any previous para, wherein the exonuclease does not have 3'- 5' exonuclease activity.
22. The method according to any previous para, wherein a polymerase is provided to the reaction mixture during step (b).
23. The method according to any previous para, wherein the polymerase comprises no 5'- 3' exonuclease activity.
24. The method according to any previous para, wherein the polymerase comprises no 3'- 5' exonuclease activity.
25. The method according to any previous para, wherein the polymerase comprises no exonuclease activity.
26. The method according to any previous para, wherein purifying duplex nucleic acid molecules comprising a single stranded 3’ overhang comprises pull-down with a binding partner for an affinity tag present on the primer.
27. The method according to any previous para, wherein the method comprises a first step of designing a primer.
28. A method of determining whether a liquid biopsy sample comprises a variant nucleic acid molecule, the method comprising the method according to any one of paras 1 to 27.
29. A method of detecting the presence of or recurrence of cancer in a subject, the method comprising the method according to any one of paras 1 to 27.
30. A method of detecting minimal residual disease in a subject, the method comprising the method according to any one of paras 1 to 27.
31. A method of designing a primer, the method comprising:
(a) the identification of a target variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule;
(b) generating primer nucleotide sequences, wherein each primer sequence comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule;
(c) employing a model to predict the destabilizing effect of a mismatch within a primer.
32. The method according to para 31, further comprising the following step:
(d) identifying a primer sequence which enables primer extension in the presence of a polymerase upon binding to the variant nucleic acid molecule, and does not enable primer extension in the presence of a polymerase upon binding to a corresponding wild-type nucleic acid molecule.
***
The features disclosed in the foregoing description, or in the following claims, or in the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for obtaining the disclosed results, as appropriate, may, separately, or in any combination of such features, be utilised for realising the invention in diverse forms thereof.
While the invention has been described in conjunction with the exemplary embodiments described above, many equivalent modifications and variations will be apparent to those skilled in the art when given this disclosure. Accordingly, the exemplary embodiments of the invention set forth above are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the invention.
For the avoidance of any doubt, any theoretical explanations provided herein are provided for the purposes of improving the understanding of a reader. The inventors do not wish to be bound by any of these theoretical explanations.
Any section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
Throughout this specification, including the claims which follow, unless the context requires otherwise, the word “comprise” and “include”, and variations such as “comprises”, “comprising”, and “including” will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent “about,” it will be understood that the particular value forms another embodiment. The term “about” in relation to a numerical value is optional and means for example +/- 10%.
Examples
Panel Design
The panel consists of 200-500 carefully curated cancer-associated DNA ‘alterations’ (single nucleotide variants or SNVs, indels, fusions, copy number variations or CNVs, and methylation changes). These ‘alterations’ may be hotspots, or frequent mutations that are not classified as hotspots but occur in a large proportion of patients in a given cancer type, as analysed from genomic datasets of metastatic/recurrent tumours (Martinez-Jimenez, F, et al, 2023). Alternatively, these are high frequency clonal tumour specific mutations identified from whole genome or whole exome sequencing of a given patient. Known biomarkers for current targeted treatments and immunotherapy are carefully curated from literature and included in the panel selection. Mutations that are physically close (within 5 base pairs) to common germline variants in our target ethnic population are annotated to design primers and probes suitable to our target ethnic population. Variants well known to arise from clonal haematopoiesis of indeterminate potential (CHIP), which is a major confounder of plasma-based MRD tests, are excluded. The panel serves as an ‘off-the-shelf’ panel for MRD measurement in particular solid tumours of choice.
Differential methylation has been shown to be a valuable biomarker in circulating tumour DNA (Luo, H. et al, Trends Mol Med May 2021). Bisulphite conversion converts unmethylated cytosines to uracil while leaving methylated cytosines unchanged. The resulting sequence difference is treated as a single nucleotide alteration in DNA and can be targeted using methods of the present invention.
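The effect of bisulphite conversion on a single strand can be sketched in silico as follows. This is an illustrative sketch only: the function name `bisulphite_convert` and the representation of methylation as a set of 0-based positions are assumptions, only the given strand is modelled, and uracil is written as T because it is read as thymine after amplification and sequencing.

```python
def bisulphite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Sequence of one strand as read after bisulphite conversion and PCR.

    Unmethylated C is converted to U (read as T); C at any index listed in
    methylated_positions is left unchanged.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# The cytosine at index 1 is methylated and survives; the one at index 3 does not.
print(bisulphite_convert("ACGCGT", methylated_positions={1}))
```

Comparing the converted methylated and unmethylated sequences yields the single-base differences that an allele-specific primer could target.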
Primer Design
The methods of the present invention use unique allele-specific primers for specifically capturing altered DNA fragments (targets) from a background of native DNA fragments in the same genomic region. Primers (18-40 bp long) are biotinylated at the 5’ end and the alteration (SNVs, indels, fusions, CNVs, bisulphite converted sequences) is placed at the 3’ end. An intentional mismatch is introduced at one of the last 3 bases before the single nucleotide ‘sequence alteration’ (Little, S. Curr. Protoc. Hum. Genet., 2001). A quantitative stability model (Panjkovich, A. et al, Bioinformatics 2005) is utilised to predict the destabilizing effect of each ‘alteration’/deliberate-mismatch pair within a given primer. The ‘alteration’ and the intentional mismatch at the 3’ end destabilise the primer sufficiently to block primer extension upon base pairing with the WT allele while allowing specific extension on the matching mutant allele. The best allele-distinguishing primer for a given ‘alteration’ is selected by balancing stability of the hybrid for efficient primer extension on the ‘altered’ allele against sufficient destabilization on the WT allele to achieve high selectivity. The panel of pooled oligonucleotides is designed such that the primers within the pool do not form secondary structures with themselves or each other and each has an annealing temperature of 60°C (Figure 1). Primers are also designed such that there is no complementarity to the Illumina sequencing adapters. Table 1 shows allele-specific primers against 32 alterations on the +ve and -ve strands used for target capture in 3 separate pools. The pools were constructed to ensure minimum self and cross-primer complementarity. Naturally occurring germline variations are investigated in the 10 base pairs upstream of the target “alteration”, and degenerate primers are designed if necessary.
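The candidate-generation part of this design can be sketched as below. This is a minimal illustration under stated assumptions, not the design software used: the function name `candidate_primers`, the default length of 20 nucleotides and the input convention (the variant allele written 5'→3' in the sense of the primer) are hypothetical. The sketch only enumerates primers with the altered base at the 3' terminus and one deliberate substitution at one of the three preceding positions; it does not implement the stability model, melting temperature, secondary-structure or adapter-complementarity checks described above.

```python
from typing import List, Tuple

BASES = "ACGT"


def candidate_primers(
    variant_seq: str, variant_index: int, length: int = 20
) -> List[Tuple[str, int, str]]:
    """Enumerate allele-specific primer candidates.

    variant_seq   : variant allele, 5'->3', in the same sense as the primer
    variant_index : 0-based index of the altered base in variant_seq
    Returns tuples of (primer, mismatch offset from the 3' base, substituted base).
    """
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    perfect = variant_seq[start:variant_index + 1].upper()
    candidates = []
    for offset in (1, 2, 3):  # one of the last 3 bases before the alteration
        pos = len(perfect) - 1 - offset
        for base in BASES:
            if base != perfect[pos]:
                # Outside the altered base WT and variant are identical, so
                # this substitution mismatches both templates.
                primer = perfect[:pos] + base + perfect[pos + 1:]
                candidates.append((primer, offset, base))
    return candidates


for primer, offset, base in candidate_primers("ACGTACGTACGTACGTACGTACGA", 23):
    print(primer, offset, base)
```

Each of the nine candidates would then need to be scored, for example with a stability model, before one is selected.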
Table 1. Allele specific primers
Library Preparation
cfDNA contains both mono-nucleosomal fragments of around 167 bp and shorter (<100 bp) non-nucleosomal DNA fragments that are perhaps associated with DNA protection by other proteins such as transcription factors (Troll, C.J. et al, BMC Genomics 2019). It has been shown that tumour-derived ctDNA exhibits an increased proportion of short fragments (Guo, J. et al BMC Genomics 2020). Conventional dsDNA library preparation involves end polishing and blunt-end ligation. End polishing obscures the native termini and changes the true length of the cfDNA fragment (Troll, C.J. et al, BMC Genomics 2019), which is a limitation for studying ctDNA fragment length. Blunt-end ligation is an inefficient method which is unable to convert shorter single-stranded or nicked double-stranded DNA to sequencing-ready libraries (Troll, C.J. et al, BMC Genomics 2019). To overcome this problem and to maximise the conversion of ctDNA into sequencing libraries, a single-stranded library preparation approach (e.g. IDT xGen ssDNA and low input DNA library preparation) is used. In addition, to improve specificity, 2 individual Unique Molecular Identifiers are added to the two strands of the DNA. Early addition of unique molecular identifiers allows efficient PCR error correction and identification of each strand of DNA.
Nuclease Aided Primer Extension Allele-specific Target Enrichment
Amplified ctDNA libraries (100-1000 ng) from the above step are used as input for nuclease-aided allele-specific primer extension target enrichment (NAAS-PETE). COT1 Human DNA is added at 100x excess over the primers to block repeat elements such as Alu and LINE that are common in ctDNA. 0.5 µM of biotinylated primer pool, 2 mM MgCl2, 0.2 mM dNTP mix and an appropriate PCR buffer (at 1X final concentration) are added to the reaction mix. A mutated Taq DNA polymerase with maximum 3’ mismatch distinguishing ability and no exonuclease activity (e.g., AptaTaq Δexo, from Roche Custom Biotech) is used for the primer extension reaction.
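The volumes needed to reach these final concentrations follow from the dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentrations (10 µM primer pool, 25 mM MgCl2, 10 mM dNTP mix, 10X buffer) and the 50 µL reaction volume are assumptions chosen for the example and are not stated for this reaction.

```python
def volume_from_stock(final_conc: float, stock_conc: float, reaction_volume_ul: float) -> float:
    """Volume of stock (in µL) giving final_conc in reaction_volume_ul.

    final_conc and stock_conc must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc < 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the final mix")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_VOLUME_UL = 50.0  # assumed reaction volume
# (final concentration, assumed stock concentration), same unit within each pair
components = {
    "biotinylated primer pool (µM)": (0.5, 10.0),
    "MgCl2 (mM)": (2.0, 25.0),
    "dNTP mix (mM)": (0.2, 10.0),
    "PCR buffer (X)": (1.0, 10.0),
}
for name, (final, stock) in components.items():
    print(f"{name}: {volume_from_stock(final, stock, REACTION_VOLUME_UL):.2f} µL")
```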
The reaction may be performed using the following conditions in an appropriate thermocycler: lid temperature 105°C, denaturation at 95°C for 30 seconds, primer annealing at 60°C for 10 minutes and extension at 72°C for 2 minutes, after which the reaction is quickly cooled to 4°C. Alternatively, thermocycling may be performed with the lid temperature set to 105°C and cycling conditions that include initial denaturation at 95°C for 2 minutes, primer annealing at 60°C for 10 minutes 30 seconds, and extension at 68-72°C for 30 seconds to 1 minute, followed by rapid cooling to 4°C. The reaction is then purified with AMPure XP beads (Beckman Coulter), or Illumina purification beads, according to the manufacturer’s protocol in order to remove unannealed biotinylated primers from the mix.
A nuclease purification step is next performed.
Strategy 1: To the purified DNA, 1 µL (30 units) of the single-stranded DNA specific exonuclease RecJf (NEB), which removes bases exclusively in the 5’-3’ direction, is added along with NEBuffer 2 (final conc. 1X) and incubated for 20 minutes at 37°C. RecJf helps to remove the DNA fragments that bear the WT allele sequences. These fragments have a primer annealed to them but not extended. Hence, these WT alleles have a 30 bp double-stranded region (annealed primer) and a 5’ single-stranded overhang which can be digested by RecJf, thereby removing the 5’ sequencing adapter. The altered alleles, on the other hand, are double-stranded duplex DNA generated by primer extension with a 3’ single-stranded overhang. This 3’ overhang remains unaffected by RecJf.
Strategy 2: 0.1 µL of a single-strand-specific nuclease (SSN) is added to the purified DNA to digest 5' and 3' single-stranded overhangs. This step selectively eliminates DNA fragments containing wild-type (WT) alleles, which are characterized by annealed but un-extended primers, resulting in ~30-40 bp double-stranded regions with single-stranded overhangs that are susceptible to nuclease digestion. Only a short 30-40 bp double-stranded WT fragment remains, which is removed during the next bead purification. In contrast, extended fragments corresponding to altered alleles form >90 bp double-stranded duplexes with a 3' single-stranded overhang. While the nuclease removes the overhang, the core extended duplex remains intact for downstream processing. These fragments undergo another round of bead purification and streptavidin pull-down to enrich for the altered allele population while depleting WT sequences.
The nuclease purification step, therefore, is a critical step that ensures specific capturing of the enriched altered allele over WT allele.
The altered allele specific primer extension product duplexed with the original altered DNA template is then captured on a streptavidin coated magnetic bead (e.g., Dynabeads). The captured material must then be amplified using an appropriate method.
Amplification strategy 1: Altered DNA with sequencing adaptors on both ends is amplified using universal primers against Illumina adapters and a high-fidelity polymerase (e.g., KAPA HiFi HotStart library amplification kit) using 6-10 cycles to generate NGS-ready libraries. It is important to note that the primer extension product has an adaptor only at one end and hence does not get amplified by the universal primers (Figure 2). This ensures that none of the alterations identified by sequencing are artificially introduced by the allele-specific primers used for target enrichment. The amplified library is purified once more with DNA binding beads (e.g. AMPure XP) following the manufacturer’s protocol, subjected to standard quality control measures for concentration and size distribution measurement, and then subjected to next generation sequencing.
Amplification strategy 2: The original template is selectively eluted from the beads using a defined protocol (Vuokko, T et al, NAR 1992). A 3' adapter is then added to the enriched fragment via PCR using a P7 forward primer and a hybrid reverse primer that overlaps the original capture primer and includes part of the sequencing adapter (Table 5). The PCR product is purified with magnetic beads, and a second PCR is performed using P7 and P5 primers to generate the final amplified library, which can be sequenced on an appropriate NGS platform (Figure 4).
Bioinformatics Analysis
Sequencing reads in the FastQ files undergo pre-processing steps to ensure data quality and integrity. Initially, read quality filtering is performed to discard low-quality reads, followed by adapter sequence removal to eliminate adapter contamination. Subsequently, unique molecular identifier (UMI) error correction is conducted to mitigate errors introduced during sequencing. Reads are then grouped by their UMIs to collapse duplicates and reduce PCR amplification bias. Reads without a UMI are discarded. A preprocessing report is generated to summarize the outcomes of these procedures and assess data quality.
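The grouping and collapsing step can be sketched as follows. This is a simplified illustration, not the pipeline used: the function name `collapse_by_umi` and the input format (pairs of UMI and read sequence) are assumptions, grouping is by exact UMI only (no UMI error correction and no use of alignment position), and the consensus is a simple per-position majority vote.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse reads sharing a UMI into one consensus sequence per UMI.

    reads: iterable of (umi, sequence) pairs; reads whose UMI is empty or
    None are discarded.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        if not umi:
            continue  # reads without a UMI are discarded
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)  # compare the shared prefix only
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACTT"),  # a PCR or sequencing error outvoted by its duplicates
    ("GGC", "TTTT"),
    (None, "CCCC"),   # no UMI: discarded
]
print(collapse_by_umi(reads))
```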
Following pre-processing, clean reads are aligned against a Fasta file containing the genomic sequences of the target mutations. Reads with an exact match against these mutation sequences are recorded in a BAM file dedicated to the reads with mutations. Additionally, reads that do not have an exact match but align with up to 3 mismatches and still contain the target mutation enriched by our test are also preserved in the mutation BAM file. Mismatches are allowed in order to account for naturally occurring germline variation in humans. Reads that fail to align to the Fasta file containing the target mutations are aligned against the human reference genome and subsequently stored in a separate BAM file.
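The read-sorting rule described above can be sketched as follows. This is a deliberately simplified illustration: a real aligner performs gapped, position-aware alignment, whereas this sketch compares a read with an equal-length mutation reference without gaps. The function names and the three labels returned are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify_read(read: str, mutant_ref: str, mutation_index: int,
                  max_mismatches: int = 3) -> str:
    """Return 'exact', 'mismatch_tolerated' or 'unmatched'.

    'exact' and 'mismatch_tolerated' reads would go to the mutation BAM file;
    'unmatched' reads would be aligned against the human reference genome.
    """
    if len(read) != len(mutant_ref):
        return "unmatched"
    if read == mutant_ref:
        return "exact"
    keeps_mutation = read[mutation_index] == mutant_ref[mutation_index]
    if keeps_mutation and hamming(read, mutant_ref) <= max_mismatches:
        return "mismatch_tolerated"
    return "unmatched"


ref = "ACGTACGTAC"  # toy mutation sequence, target base at index 5
for read in ("ACGTACGTAC", "TCGTACGTAC", "ACGTAGGTAC", "TGCAACGTAC"):
    print(read, classify_read(read, ref, mutation_index=5))
```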
Reads in BAM files are piled up to calculate allele frequencies and sequencing coverage. This pileup data is utilized for variant calling, and the results are stored in a VCF file. Furthermore, a report is generated, containing the sequencing coverage for the region directly adjacent to the target mutations, along with differences in allele frequencies between the target enriched mutations and the reference wild-type alleles (Figure 5, Figure 6).
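The allele-frequency calculation at a single position can be sketched as below. The sketch assumes the pileup has already been reduced to the string of bases observed at that position; the function name `allele_frequencies` is hypothetical and no base-quality or strand filtering is applied.

```python
from collections import Counter


def allele_frequencies(pileup_bases: str):
    """Return (depth, {base: frequency}) for the bases piled up at one position.

    Characters other than A, C, G and T (for example N) are ignored.
    """
    counts = Counter(b for b in pileup_bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return 0, {}
    return depth, {base: n / depth for base, n in counts.items()}


# Toy pileup: 95 reads support T, 5 support G, 3 are uncalled.
depth, freqs = allele_frequencies("T" * 95 + "G" * 5 + "N" * 3)
print(depth, freqs)
```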
Example 2 - Exemplary process
20 ng of multiplex cfDNA reference standard from Horizon (#HD780) is used as a starting material for library preparation. This reference standard bears eight well-characterized mutations at known variant allele frequencies (VAF): 5%, 1%, 0.1% and 0% (WT) (Table 2).
A panel of primers was designed following the innovative design strategy of this invention, against the mutations in Table 1 as positive control, and an additional 6 mutations well known in NSCLC but not present in the reference standard were chosen as negative controls to test specificity and allele differentiating ability of the assay (Table 2).
Table 2 Mutations and exact nucleotide changes present in the input sample
Table 3: List of mutations used as negative control
In addition to this, a panel of conventional probes (not targeting specific DNA alterations but regions in the 7 genes listed in Tables 2 and 3) was designed for comparison. The final libraries were sequenced on an Illumina platform (e.g. NextSeq550 or NextSeq2000) to get ~120 Gb of data. The data was analysed as described in the methods section.
The above example is set up to consider the following hypotheses: a) The method of enrichment according to the present disclosure enriches mutations >100X compared to conventional target enrichment strategies. b) The method of enrichment according to the present disclosure requires ~50-100X fewer sequencing reads to achieve the same coverage per read.
Fold enrichment between the number of reads supporting a DNA alteration in the method of enrichment according to the present disclosure versus conventional target enrichment is determined for each mutation listed in Table 2. The minimum sequencing depth required to call a given mutation using the method of enrichment according to the present disclosure versus the conventional approach at each VAF dilution is determined. The limit of detection (LoD) of the current assay is determined by diluting the cfDNA reference standard to 0.01%, 0.001% and 0.0001% in comparison to the said conventional approach.
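One way the fold enrichment for a mutation might be computed is sketched below. Normalising the supporting-read count by the total number of reads in each library is an assumption made so that libraries sequenced to different depths can be compared; the function name and the read counts in the example are hypothetical and are not results.

```python
def fold_enrichment(alt_reads_new: int, total_reads_new: int,
                    alt_reads_conv: int, total_reads_conv: int) -> float:
    """Ratio of the fraction of alteration-supporting reads in the new
    workflow to the same fraction in the conventional workflow."""
    if total_reads_new <= 0 or total_reads_conv <= 0:
        raise ValueError("total read counts must be positive")
    if alt_reads_conv <= 0:
        raise ValueError("fold enrichment is undefined without supporting "
                         "reads in the conventional library")
    fraction_new = alt_reads_new / total_reads_new
    fraction_conv = alt_reads_conv / total_reads_conv
    return fraction_new / fraction_conv


# Hypothetical counts for one mutation, chosen only to illustrate the arithmetic.
print(fold_enrichment(5_000, 1_000_000, 50, 1_000_000))
```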
Example 3
20 ng of a 5% variant allele frequency (VAF) cfDNA reference standard (Horizon, #HD780) was used as input for sequencing library preparation with the Illumina Cell-Free DNA Prep Kit, following the manufacturer’s instructions.
Workflow according to the present disclosure: 500 ng of adapter-ligated library was incubated with 0.1 µM of a biotinylated pool of allele-specific primers (see Table 4) in 1X NEBuffer r2.1. The mixture was subjected to thermal cycling at 94°C for 2 minutes, followed by 60°C for 5 minutes, and subsequently cooled to room temperature. Thereafter, 20 U of exo- Klenow Fragment (NEB), 0.1 mM dNTPs, and 8 mU of apyrase (NEB) were added. The final reaction volume was adjusted to 50 µL and incubated at 28°C for 15 minutes. The resulting primer extension products were purified using Illumina purification beads per the manufacturer’s protocol. The purified product was then treated with 0.1 µL of a single-strand-specific nuclease, followed by a second round of bead purification. Target alleles were subsequently captured using Dynabeads™ Streptavidin for target enrichment. Two rounds of PCR were performed to add the 3' sequencing adapter and amplify the enriched library. PCR1 and PCR2 were conducted using NEB Ultra II Q5 Master Mix for 7-10 and 4-5 cycles, respectively. Primer sequences are listed in Table 5.
Table 4 - Exemplary allele specific primers
Table 5 - Exemplary PCR1 and PCR2 primers
Conventional Workflow: An equivalent amount of adapter-ligated library was incubated with 0.1 µM of a biotinylated primer pool designed upstream of the target variants (Table 4). Primer extension products were captured using streptavidin beads and amplified using Ultra II Q5 Master Mix under standard PCR conditions.
The enriched library produced according to the workflow of the present disclosure was sequenced on an Illumina NovaSeq 6000, generating 15 million reads. The conventionally enriched samples were sequenced on an Illumina NextSeq 2000, generating 80 million reads per sample.
As shown in Figure 7, the workflow of the present disclosure achieved a mean on-target coverage of 38,000X with only 15 million reads. Despite significantly lower sequencing depth, the workflow of the present disclosure yielded markedly higher mean on-target coverage, highlighting its superior enrichment efficiency.
Figure 8 shows Integrated Genome Viewer (IGV) screenshots of 4 representative targets on Chr 7 EGFR loci demonstrating that the workflow of the present disclosure produces sharp, well-defined on-target peaks, in contrast to the broader, less specific peaks observed with the conventional method. This difference in read distribution explains the high on-target coverage seen with the workflow of the present disclosure.
Figure 9 demonstrates allele-specific enrichment and higher read coverage of 2 representative clinically relevant target mutations using the workflow of the present disclosure, as compared to the conventional protocol. Figure 10 shows that the workflow of the present disclosure enables stable detection of target mutations with as few as 2 million reads, while the conventional primer extension target capture method shows a progressively lower fraction of mutations detected as the sequencing reads are down-sampled. Taken together, these data demonstrate the superiority of the workflow of the present disclosure over conventional primer extension target capture methods.
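A down-sampling analysis of the kind summarised for Figure 10 can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually performed: each read is reduced to the label of the target mutation it supports (or None), a target counts as detected when at least `min_support` sampled reads support it, and the function name, threshold, seed and toy labels are all hypothetical.

```python
import random
from collections import Counter


def detected_fraction(read_targets, n_reads, targets, min_support=1, seed=0):
    """Fraction of targets detected after randomly keeping n_reads reads.

    read_targets: list with one entry per read, naming the target mutation
                  that read supports, or None if it supports none.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(read_targets, min(n_reads, len(read_targets)))
    support = Counter(t for t in sample if t is not None)
    detected = sum(1 for t in targets if support[t] >= min_support)
    return detected / len(targets)


# Toy data: two of three targets have supporting reads.
reads = ["target_1"] * 50 + ["target_2"] * 2 + [None] * 48
targets = ["target_1", "target_2", "target_3"]
for n in (100, 50, 10):
    print(n, detected_fraction(reads, n, targets))
```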
Example 4
To test the benefit and demonstrate the activity of the single-strand-specific nuclease in digesting 3’ and 5’ single-stranded overhangs, we treated a primer-annealed but non-extended, adapter-ligated cfDNA reference library (5% VAF) with varying amounts of nuclease (0, 0.1 µL, 0.5 µL, and 1 µL) at 30°C for 30 minutes. Following digestion, the reactions were purified using Illumina purification beads, and biotinylated targets were captured using Dynabeads™ Streptavidin for Target Enrichment.
To assess the effect of nuclease digestion, we performed qPCR both upstream and downstream of one of the allele-specific capture primers targeting the EGFR L858R mutation. Primer sequences are listed in Table 6. qPCR was carried out using Bio-Rad’s SsoAdvanced™ Universal SYBR Green Supermix on a Bio-Rad CFX96 Real-Time PCR System, following the manufacturer’s protocol.
Table 6 - Primer sequences
Figure 11 (top panel) illustrates the qPCR strategy. The lower panels present a bar graph showing that, in the absence of nuclease treatment, the primer-annealed DNA fragment with 3' and 5’ overhangs could be captured using streptavidin beads, as evidenced by detectable qPCR signal both upstream and downstream of the capture primer. In contrast, treatment with single-strand-specific nuclease led to complete loss of this qPCR signal, indicating efficient digestion of the 3’ and 5’ single-stranded overhangs and removal of the remaining fragments by bead purification.
Example 5
To evaluate the use of a thermostable exo- Taq polymerase compared to the exo- Klenow fragment used previously, we performed qPCR using adapter-ligated cfDNA libraries prepared from Horizon cfDNA reference standards at 5%, 1%, 0.1%, and 0% variant allele frequencies (VAF). Allele-specific (AS) reverse primers from the 2Strands target capture probe set (Table 1) were used, along with non-discriminating reverse primers or non-allele specific primers (NAS) and common forward primers (Table 7).
Table 7 - Primer sequences
Each qPCR reaction included 5 ng of adapter-ligated library (5%, 1%, 0.1% or 0%), 0.25 µM of each primer, 0.2 mM dNTPs, 0.2 U of Taq polymerase, 1X reaction buffer, and SYBR Green dye (1:10,000 dilution), in a final volume of 25 µL. Thermal cycling was carried out in a Bio-Rad CFX96 Real-Time PCR system using the following conditions: initial denaturation at 94°C for 2 minutes, followed by 40 cycles of 94°C for 15 seconds, 60°C for 30 seconds, and 68°C for 30 seconds.
Figures 9-11 show the ability of the capture primers and a thermostable Taq enzyme to effectively and specifically distinguish mutant alleles from the wild-type allele even at low frequencies (5%, 1% and 0.1%). This underpins the effectiveness of the disclosed allele-specific primer design approach and of the exo- Taq in distinguishing between alleles, and demonstrates the robustness of our method in allele distinction.
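The text does not state how the qPCR signal was quantified. One common way to summarise allele discrimination, shown here purely as an assumption, is the difference in threshold cycle (ΔCt) between the allele-specific (AS) and non-allele-specific (NAS) reactions on the same library, with the AS signal relative to NAS approximated as 2^(-ΔCt) when the amplification efficiency is close to 100%. The function names and the Ct values in the example are hypothetical and are not measured data.

```python
def delta_ct(ct_allele_specific: float, ct_non_allele_specific: float) -> float:
    """Ct(AS) minus Ct(NAS) for the same library."""
    return ct_allele_specific - ct_non_allele_specific


def relative_signal(delta: float, efficiency: float = 2.0) -> float:
    """Approximate AS signal relative to NAS.

    efficiency is the assumed fold-amplification per cycle (2.0 = 100%).
    """
    return efficiency ** (-delta)


# Hypothetical Ct values, used only to show the arithmetic.
d = delta_ct(28.0, 24.0)
print(d, relative_signal(d))
```

On this convention, a library with no mutant template (0% VAF) would be expected to show a much larger ΔCt than a library carrying the mutation.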
References
A number of publications are cited above in order to more fully describe and disclose the invention and the state of the art to which the invention pertains. Full citations for these references are provided below. The entirety of each of these references is incorporated herein.
1. Pantel, K., Alix-Panabieres, C. Liquid biopsy and minimal residual disease — latest advances and implications for cure. Nat Rev Clin Oncol 16, 409-424 (2019).
2. Arslan, S., Garcia, F.J., Guo, M. et al. Sequencing by avidity enables high accuracy with low reagent consumption. Nat Biotechnol (2023).
3. Briggs AW. Rapid retrieval of DNA target sequences by primer extension capture. Methods Mol Biol. 2011;772:145-54. doi: 10.1007/978-1-61779-228-1_8. PMID: 22065436.
4. Chen X, Chang CW, Spoerke JM, Yoh KE, Kapoor V, Baudo C, Aimi J, Yu M, Liang-Chu MMY, Suttmann R, Huw LY, Gendreau S, Cummings C, Lackner MR. Low-pass Whole-genome Sequencing of Circulating Cell-free DNA Demonstrates Dynamic Changes in Genomic Copy Number in a Squamous Lung Cancer Clinical Cohort. Clin Cancer Res. 2019 Apr 1;25(7):2254-2263. doi: 10.1158/1078-0432.CCR-18-1593. Epub 2019 Jan 7. PMID: 30617129.
5. Florent Mouliere and Nitzan Rosenfeld. Circulating tumor-derived DNA is shorter than somatic DNA in plasma. PNAS 112 (11).
6. Gydush G, Nguyen E, Bae JH, Blewett T, Rhoades J, Reed SC, Shea D, Xiong K, Liu R, Yu F, Leong KW, Choudhury AD, Stover DG, Tolaney SM, Krop IE, Christopher Love J, Parsons HA, Mike Makrigiorgos G, Golub TR, Adalsteinsson VA. Massively parallel enrichment of low-frequency alleles enables duplex sequencing at low depth. Nat Biomed Eng. 2022 Mar;6(3):257-266. doi: 10.1038/s41551-022-00855-9. Epub 2022 Mar 17. PMID: 35301450; PMCID: PMC9089460.
7. Jeffreys AJ, May CA. DNA enrichment by allele-specific hybridization (DEASH): a novel method for haplotyping and for detecting low-frequency base substitutional variants and recombinant DNA molecules. Genome Res. 2003 Oct;13(10):2316-24. doi: 10.1101/gr.1214603. PMID: 14525930; PMCID: PMC403713.
8. Little S. Amplification-refractory mutation system (ARMS) analysis of point mutations. Curr Protoc Hum Genet. 2001 May;Chapter 9:Unit 9.8. doi: 10.1002/0471142905.hg0908s07. PMID: 18428319.
9. Lone, S.N., Nisar, S., Masoodi, T. et al. Liquid biopsy: a step closer to transform diagnosis, prognosis and future of cancer treatments. Mol Cancer 21, 79 (2022). https://doi.org/10.1186/s12943-022-01543-7
10. Medina JE, Dracopoli NC, Bach PB, Lau A, Scharpf RB, Meijer GA, Andersen CL, Velculescu VE. Cell-free DNA approaches for cancer early detection and interception. J Immunother Cancer. 2023 Sep;11(9):e006013. doi: 10.1136/jitc-2022-006013. PMID: 37696619; PMCID: PMC10496721.
11. Naik T, Sharda M, C P L, Virbhadra K, Pandit A. High-quality single amplicon sequencing method for illumina MiSeq platform using pool of 'N' (0-10) spacer-linked target specific primers without PhiX spike-in. BMC Genomics. 2023 Mar 23;24(1):141. doi: 10.1186/s12864-023-09233-4. PMID: 36959538; PMCID: PMC10037784.
12. Szymanski JJ, Sundby RT, Jones PA, Srihari D, Earland N, Harris PK, Feng W, Qaium F, Lei H, Roberts D, Landeau M, Bell J, Huang Y, Hoffman L, Spencer M, Spraker MB, Ding L, Widemann BC, Shern JF, Hirbe AC, Chaudhuri AA. Cell-free DNA ultra-low-pass whole genome sequencing to distinguish malignant peripheral nerve sheath tumor (MPNST) from its benign precursor lesion: A cross-sectional study. PLoS Med. 2021 Aug 31;18(8):e1003734. doi: 10.1371/journal.pmed.1003734. PMID: 34464388; PMCID: PMC8407545.
13. Zhang Y, Yao Y, Xu Y, Li L, Gong Y, Zhang K, Zhang M, Guan Y, Chang L, Xia X, Li L, Jia S, Zeng Q. Pan-cancer circulating tumor DNA detection in over 10,000 Chinese patients. Nat Commun. 2021 Jan 4;12(1):11. doi: 10.1038/s41467-020-20162-8. Erratum in: Nat Commun. 2021 Feb 10;12(1):1048. PMID: 33397889; PMCID: PMC7782482.
14. Nagasaka, M., Uddin, M.H., Al-Hallak, M.N. et al. Liquid biopsy for therapy monitoring in early-stage non-small cell lung cancer. Mol Cancer 20, 82 (2021).
15. Kim T, Kim EY, Lee SH, Kwon DS, Kim A, Chang YS. Presence of mEGFR ctDNA predicts a poor clinical outcome in lung adenocarcinoma. Thorac Cancer. 2019 Dec;10(12):2267-2273. doi: 10.1111/1759-7714.13219. Epub 2019 Oct 24. PMID: 31647198; PMCID: PMC6885440.
16. Luo H, Wei W, Ye Z, Zheng J, Xu RH. Liquid Biopsy of Methylation Biomarkers in Cell-Free DNA. Trends Mol Med. 2021 May;27(5):482-500. doi: 10.1016/j.molmed.2020.12.011. Epub 2021 Jan 23. PMID: 33500194.
17. Troll CJ, Kapp J, Rao V, Harkins KM, Cole C, Naughton C, Morgan JM, Shapiro B, Green RE. A ligation-based single-stranded library preparation method to analyze cell-free DNA and synthetic oligos. BMC Genomics. 2019 Dec 27;20(1):1023. doi: 10.1186/s12864-019-6355-0. PMID: 31881841; PMCID: PMC6935139.
18. Zhong, L., Zhao, Z. & Zhang, X. Genetic differences between primary and metastatic cancer: a pan-cancer whole-genome comparison study. Sig Transduct Target Ther 8, 363 (2023).
19. Martinez-Jimenez, F., Movasati, A., Brunner, S.R. et al. Pan-cancer whole-genome comparison of primary and metastatic solid tumours. Nature 618, 333-341 (2023).
20. Alejandro Panjkovich, Francisco Melo, Comparison of different melting temperature calculation methods for short DNA sequences, Bioinformatics, Volume 21, Issue 6, March 2005, Pages 711-722.
21. Vuokko T. Tormanen, Piotr M. Swidershi, Bruce E. Kaplan, Gerd P. Pfeifer, Arthur D. Riggs, Extension product capture improves genomic sequencing and DNase I footprinting by ligation-mediated PCR, Nucleic Acids Research, Volume 20, Issue 20, 25 October 1992, Pages 5487-5488, https://doi.org/10.1093/nar/20.20.5487
For standard molecular biology techniques, see Sambrook, J., Russel, D.W. Molecular Cloning, A Laboratory Manual. 3 ed. 2001, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press.
Claims
1. A method of enriching a variant nucleic acid molecule from a population of nucleic acid molecules, wherein the method comprises:
(a) provision of a population of nucleic acid molecules;
(b) contacting the population of nucleic acid molecules with a primer in a reaction mixture, wherein the primer comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule;
(c) incubation of the reaction mixture for hybridization of the primer with a nucleic acid molecule;
(d) further incubation of the reaction mixture for primer extension through the action of a polymerase;
(e) contacting the population of nucleic acid molecules in the reaction mixture with a nuclease;
(f) purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule.
2. The method according to claim 1, further comprising the following step:
(g) amplifying the variant nucleic acid molecule from the purified duplex nucleic acid molecules.
3. The method according to claim 1 or claim 2, further comprising the following step: sequencing amplified nucleic acid molecules, optionally followed by bioinformatic analysis of sequence data.
4. The method according to any previous claim, wherein the nucleic acid molecule is a DNA molecule.
5. The method according to any previous claim, wherein the nucleotide that is specific for the mutation of the variant nucleic acid molecule is within 5 nucleotides of the final nucleotide on the 3’ end of the primer.
6. The method according to any previous claim, wherein the nucleotide that is specific for the mutation of the variant nucleic acid molecule is the final nucleotide on the 3’ end of the primer.
7. The method according to any previous claim, wherein the mismatched non-complementary nucleotide is present on the 3’ portion of the primer.
8. The method according to any previous claim, wherein the mismatched non-complementary nucleotide is present within 5 nucleotides of the nucleotide that is specific for the mutation of the variant DNA molecule.
9. The method according to any previous claim, wherein the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule.
10. The method according to any previous claim, wherein the mismatched non-complementary nucleotide is adjacent to the nucleotide that is specific for the mutation of the variant nucleic acid molecule, at the 5’ side of the mutation of the variant nucleic acid molecule.
11. The method according to any previous claim, wherein the primer comprises an affinity tag at the 5’ end.
12. The method according to any previous claim, wherein the primer is biotinylated at the 5’ end.
13. The method according to any previous claim, wherein the primer is 18-30 nucleotides long.
14. The method according to any previous claim, wherein the population of nucleic acid molecules is provided as a library of nucleic acid molecules, optionally wherein the library of nucleic acid molecules is prepared through a method comprising a single stranded library preparation approach and/or the addition of an individual Unique Molecular Identifier to each strand of the nucleic acid.
15. The method according to any previous claim, wherein the population of nucleic acid molecules are comprised within or isolated from a liquid biopsy sample.
16. The method according to any previous claim, wherein the population of nucleic acid molecules and/or the liquid biopsy sample comprise cell-free DNA (cfDNA).
17. The method according to any previous claim, wherein the variant nucleic acid molecule is a single nucleotide variant DNA molecule.
18. The method according to any previous claim, wherein the mutation is a substitution, insertion, deletion, insertion-deletion (indel), a bisulphite converted sequence, and/or a fusion.
19. The method according to any previous claim, wherein the variant nucleic acid molecule comprises DNA altered through bisulphite conversion.
20. The method according to any previous claim, wherein the nuclease is single-strand specific.
21. The method according to any previous claim, wherein the nuclease is an exonuclease.
22. The method according to any previous claim, wherein the nuclease has 5'- 3' exonuclease activity.
23. The method according to any previous claim, wherein the nuclease does not have 3'- 5' exonuclease activity.
24. The method according to any previous claim, wherein a polymerase is provided to the reaction mixture during step (b).
25. The method according to any previous claim, wherein the polymerase comprises no 5'- 3' exonuclease activity.
26. The method according to any previous claim, wherein the polymerase comprises no 3'- 5' exonuclease activity.
27. The method according to any previous claim, wherein the polymerase comprises no exonuclease activity.
28. The method according to any previous claim, wherein purifying duplex nucleic acid molecules comprising the variant nucleic acid molecule comprises pull-down with a binding partner for an affinity tag present on the primer.
29. The method according to any previous claim, wherein the method comprises a first step of designing a primer.
30. A method of determining whether a liquid biopsy sample comprises a variant nucleic acid molecule, the method comprising the method according to any one of claims 1 to 29.
31. A method of detecting the presence of or recurrence of cancer in a subject, the method comprising the method according to any one of claims 1 to 29.
32. A method of detecting minimal residual disease in a subject, the method comprising the method according to any one of claims 1 to 29.
33. A method of designing a primer, the method comprising:
(a) the identification of a target variant nucleic acid molecule and a corresponding wild-type nucleic acid molecule;
(b) generating primer nucleotide sequences, wherein each primer sequence comprises a mismatched nucleotide and a nucleotide which is complementary to a mutation in the variant nucleic acid molecule, wherein the mismatched nucleotide is non-complementary to the variant nucleic acid molecule and non-complementary to a corresponding wild-type nucleic acid molecule;
(c) employing a model to predict the destabilizing effect of a mismatch within a primer.
34. The method according to claim 33, further comprising the following step:
(d) identifying a primer sequence which enables primer extension in the presence of a polymerase upon binding to the variant nucleic acid molecule, and does not enable primer extension in the presence of a polymerase upon binding to a corresponding wild-type nucleic acid molecule.
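By way of non-limiting illustration only, and forming no part of the claims, the generation of candidate primer sequences recited in step (b) of claim 33 can be sketched in Python for the single-nucleotide case in which the mutation-specific nucleotide is the 3’ terminal nucleotide and the mismatched nucleotide is adjacent to it on its 5’ side. The sketch makes several assumptions that are not in the text: sequences are written 5’ to 3’ in the same orientation as the primer (so a primer base that differs from the sequence is non-complementary to the strand the primer anneals to), the function names `candidate_primers` and `mismatches` and the example sequences are hypothetical, and the predictive model of step (c) is not implemented.

```python
BASES = "ACGT"


def candidate_primers(variant, wild_type, mut_index, length=20):
    """Enumerate primers ending at the mutation with one extra deliberate mismatch.

    variant / wild_type: equal-length sequences, 5'->3', primer orientation.
    mut_index: 0-based position of the single-nucleotide difference.
    """
    if len(variant) != len(wild_type):
        raise ValueError("sequences must have equal length")
    if variant[mut_index] == wild_type[mut_index]:
        raise ValueError("sequences do not differ at mut_index")
    start = mut_index - length + 1
    if start < 0:
        raise ValueError("not enough sequence 5' of the mutation")

    core = variant[start:mut_index + 1]  # 3' terminal base is the variant base
    mm = len(core) - 2                   # position adjacent to it, on its 5' side
    excluded = {variant[mut_index - 1], wild_type[mut_index - 1]}
    return [core[:mm] + b + core[mm + 1:] for b in BASES if b not in excluded]


def mismatches(primer, template, mut_index):
    """Count positions where the primer differs from the region it covers."""
    region = template[mut_index - len(primer) + 1:mut_index + 1]
    return sum(p != t for p, t in zip(primer, region))


# Hypothetical sequences for illustration only (not from this document).
wild_type = "TTGACCGTAGCTAGGCATCGAACTGG"
variant = wild_type[:21] + "G" + wild_type[22:]  # A>G substitution at index 21

for primer in candidate_primers(variant, wild_type, mut_index=21):
    print(primer,
          mismatches(primer, variant, 21),     # 1: the deliberate mismatch only
          mismatches(primer, wild_type, 21))   # 2: deliberate mismatch + 3' base
```

Each candidate differs from the variant sequence at one position and from the wild-type sequence at two, which is the asymmetry the claimed primer relies on. Ranking the candidates would require a destabilization model and experimental confirmation of extension behaviour, neither of which is attempted here.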
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SG10202401187W | 2024-04-24 | | |
| SG10202401187W | 2024-04-24 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025224260A1 (en) | 2025-10-30 |
Family
ID=95560171
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2025/061253 Pending WO2025224260A1 (en) | 2024-04-24 | 2025-04-24 | Target enrichment |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025224260A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20030104372A1 (en) * | 1996-12-23 | 2003-06-05 | Pyrosequencing Ab. | Allele specific primer extension |
| US20170022551A1 (en) * | 2015-02-11 | 2017-01-26 | Zhitong LIU | Methods and compositions for reducing non-specific amplification products |
| WO2018013710A1 (en) * | 2016-07-12 | 2018-01-18 | F. Hoffman-La Roche Ag | Primer extension target enrichment |
| US20180094316A1 (en) * | 2000-02-07 | 2018-04-05 | Illumina, Inc. | Multiplex nucleic acid reactions |
| WO2024013241A1 (en) * | 2022-07-14 | 2024-01-18 | F. Hoffmann-La Roche Ag | Variant allele enrichment by unidirectional dual probe primer extension |
Non-Patent Citations (47)
| Title |
|---|
| "Current Protocols in Human Genetics", 1 May 2001, JOHN WILEY & SONS, INC., Hoboken, NJ, USA, ISBN: 978-0-471-14290-4, ISSN: 1934-8266, article STEPHEN LITTLE: "Amplification-Refractory Mutation System (ARMS) Analysis of Point Mutations", XP055254522, DOI: 10.1002/0471142905.hg0908s07 * |
| ALEJANDRO PANJKOVICHFRANCISCO MELO: "Comparison of different melting temperature calculation methods for short DNA sequences", BIOINFORMATICS, vol. 21, 6 March 2005 (2005-03-06), pages 711 - 722 |
| ARSLAN, S.GARCIA, F.J.GUO, M. ET AL.: "Sequencing by avidity enables high accuracy with low reagent consumption.", NAT BIOTECHNOL, 2023 |
| BENTLEY ET AL., NATURE, vol. 456, 2008, pages 872 - 876 |
| BRIGGS AW.: "Rapid retrieval of DNA target sequences by primer extension capture", METHODS MOL BIOL., vol. 772, 2011, pages 145 - 54 |
| CHEN XCHANG CWSPOERKE JMYOH KEKAPOOR VBAUDO CAIMI JYU MLIANG-CHU MMYSUTTMANN R: "Low-pass Whole-genome Sequencing of Circulating Cell-free DNA Demonstrates Dynamic Changes in Genomic Copy Number in a Squamous Lung Cancer Clinical Cohort.", CLIN CANCER RES., vol. 25, no. 7, 1 April 2019 (2019-04-01), pages 2254 - 2263 |
| CHENZHAO, HUM GENOMICS, vol. 13, no. 1, 2019, pages 34 |
| DARBEHESHTI FARZANEH ET AL: "Pre-PCR Mutation-Enrichment Methods for Liquid Biopsy Applications", CANCERS, vol. 14, no. 13, 27 June 2022 (2022-06-27), CH, pages 3143, XP093292682, ISSN: 2072-6694, Retrieved from the Internet <URL:https://www.mdpi.com/2072-6694/14/13/3143/pdf> DOI: 10.3390/cancers14133143 * |
| DESAI ET AL., FEMS MICROBIOLOGY REVIEWS, vol. 26, no. 5, 2003, pages 457 - 491 |
| FLORENT MOULIERENITZAN ROSENFELD: "Circulating tumor-derived DNA is shorter than somatic DNA in plasma.", PNAS, vol. 112, pages 11 |
| GUO, J. ET AL., BMC GENOMICS, 2020 |
| GYDUSH GNGUYEN EBAE JHBLEWETT TRHOADES JREED SCSHEA DXIONG KLIU RYU F: "Massively parallel enrichment of low-frequency alleles enables duplex sequencing at low depth.", NAT BIOMED ENG., vol. 6, no. 3, March 2022 (2022-03-01), pages 257 - 266, XP037766525, DOI: 10.1038/s41551-022-00855-9 |
| GYDUSH, G. ET AL., NAT BIOMED ENG, 2022 |
| JEFFREYS AJMAY CA: "DNA enrichment by allele-specific hybridization (DEASH): a novel method for haplotyping and for detecting low-frequency base substitutional variants and recombinant DNA molecules.", GENOME RES., vol. 13, no. 10, October 2003 (2003-10-01), pages 2316 - 24, XP002553628, DOI: 10.1101/gr.1214603 |
| KATOHSTANDLEY, MOLECULAR BIOLOGY AND EVOLUTION, vol. 30, no. 4, 2013, pages 772 - 780 |
| KIM TKIM EYLEE SHKWON DSKIM ACHANG YS: "Presence of mEGFR ctDNA predicts a poor clinical outcome in lung adenocarcinoma.", THORAC CANCER., vol. 10, no. 12, December 2019 (2019-12-01), pages 2267 - 2273 |
| KIM, T. ET AL., THORAC CANCER, 2019 |
| LASSMANNSONNHAMMER, BMC BIOINFORMATICS, vol. 298, 2005, pages 6 |
| LEVY ET AL., PLOS BIOL, vol. 55, 2007, pages e254 |
| LITTLE S.: "Amplification-refractory mutation system (ARMS) analysis of point mutations.", CURR PROTOC HUM GENET., May 2001 (2001-05-01) |
| LITTLE, S, CURR. PROTOC. HUM GENETICS, 2001 |
| LONE, S.N.NISAR, S.MASOODI, T. ET AL.: "Liquid biopsy: a step closer to transform diagnosis, prognosis and future of cancer treatments.", MOL CANCER, vol. 21, 2022, pages 79, Retrieved from the Internet <URL:https://doi.org/10.1186/s12943-022-01543-7> |
| LUO HWEI WYE ZZHENG JXU RH: "Liquid Biopsy of Methylation Biomarkers in Cell-Free DNA", TRENDS MOL MED., vol. 27, no. 5, May 2021 (2021-05-01), pages 482 - 500, XP086555411, DOI: 10.1016/j.molmed.2020.12.011 |
| LUO, H. ET AL., TRENDS MOL MED MAY, 2021 |
| MANILS, CELLS., vol. 11, no. 14, July 2022 (2022-07-01), pages 2157 |
| MARKOU ATHINA ET AL: "Nuclease-Assisted Minor Allele Enrichment Using Overlapping Probes-Assisted Amplification-Refractory Mutation System: An Approach for the Improvement of Amplification-Refractory Mutation System-Polymerase Chain Reaction Specificity in Liquid Biopsies", ANALYTICAL CHEMISTRY, vol. 91, no. 20, 20 September 2019 (2019-09-20), pages 13105 - 13111, XP093292653, ISSN: 0003-2700, Retrieved from the Internet <URL:https://pubs.acs.org/doi/pdf/10.1021/acs.analchem.9b03325?ref=article_openPDF> DOI: 10.1021/acs.analchem.9b03325 * |
| MARTÍNEZ-JIMÉNEZ, FMOVASATI, A.BRUNNER, S.R. ET AL.: "Pan-cancer whole-genome comparison of primary and metastatic solid tumours.", NATURE, vol. 618, 2023, pages 333 - 341 |
| MEDINA JE, DRACOPOLI NC, BACH PB, LAU A, SCHARPF RB, MEIJER GA, ANDERSEN CL, VELCULESCU VE: "Cell-free DNA approaches for cancer early detection and interception.", J IMMUNOTHER CANCER, vol. 11, no. 9, September 2023 (2023-09-01), pages e006013 |
| MEDINA, J.E. ET AL., J IMMUNOTHERAPY CANCER, 2023 |
| NAGASAKA, M ET AL., MOLECULAR CANCER., 2021 |
| NAGASAKA, M.UDDIN, M.HAL-HALLAK, M.N ET AL.: "Liquid biopsy for therapy monitoring in early-stage non-small cell lung cancer.", MOL CANCER, vol. 20, 2021, pages 82 |
| NAIK T, SHARDA M, C P L, VIRBHADRA K, PANDIT A.: "High-quality single amplicon sequencing method for illumina MiSeq platform using pool of 'N' (0-10) spacer-linked target specific primers without PhiXspike-in. ", BMC GENOMICS, vol. 24, no. 1, 23 March 2023 (2023-03-23), pages 141, XP021316271, DOI: 10.1186/s12864-023-09233-4 |
| NAT COMMUN, vol. 12, no. 1, 10 February 2021 (2021-02-10), pages 1048 |
| NOTREDAME ET AL., J. MOL. BIOL., vol. 302, 2000, pages 205 - 217 |
| PANJKOVICH, A. ET AL., BIOINFORMATICS, vol. 21, 2005, pages 951 - 960 |
| PANTEL, K. ET AL., NAT REV CLIN ONCOL., 2019 |
| PANTEL, K.ALIX-PANABIÈRES, C.: "Liquid biopsy and minimal residual disease - latest advances and implications for cure.", NAT REV CLIN ONCOL, vol. 16, 2019, pages 409 - 424, XP036815955, DOI: 10.1038/s41571-019-0187-3 |
| POULET ET AL., ACTA CYTOL, vol. 63, no. 6, 2019, pages 449 - 455 |
| SAMBROOK, J.RUSSEL, D.W.: "Molecular Cloning, A Laboratory Manual.", 2001, COLD SPRING HARBOR LABORATORY PRESS |
| SHEVELEVHÜBSCHER, NATURE REVIEWS MOLECULAR CELL BIOLOGY, vol. 3, 2002, pages 364 - 376 |
| SZYMANSKI JJ, SUNDBY RT, JONES PA, SRIHARI D, EARLAND N, HARRIS PK, FENG W, QAIUM F, LEI H, ROBERTS D, LANDEAU M, BELL J, HUANG Y,: "Cell-free DNA ultra-low-pass whole genome sequencing to distinguish malignant peripheral nerve sheath tumor (MPNST) from its benign precursor lesion: A cross-sectional study.", PLOS MED., vol. 18, no. 8, 31 August 2021 (2021-08-31), pages e1003734 |
| TROLL CJKAPP JRAO VHARKINS KMCOLE CNAUGHTON CMORGAN JMSHAPIRO BGREEN RE.: "A ligation-based single-stranded library preparation method to analyze cell-free DNA and synthetic oligos.", BMC GENOMICS, vol. 20, no. 1, 27 December 2019 (2019-12-27), pages 1023, XP055848197, DOI: 10.1186/s12864-019-6355-0 |
| TROLL, C.J. ET AL., BMC GENOMICS, 2019 |
| VUOKKO T. TÖRMÄNEN, PIOTR M. SWIDERSHI, BRUCE E. KAPLAN, GERD P. PFEIFER, ARTHUR D. RIGGS: "Extension product capture improves genomic sequencing and DNase I footprinting by ligation-mediated PCR", NUCLEIC ACIDS RESEARCH, vol. 20, no. 20, 25 October 1992 (1992-10-25), pages 5487 - 5488, XP001152679 |
| VUOKKO, T ET AL., NAR, 1992 |
| ZHANG YYAO YXU YLI LGONG YZHANG KZHANG MGUAN YCHANG LXIA X: "Pan-cancer circulating tumor DNA detection in over 10,000 Chinese patients.", NAT COMMUN., vol. 12, no. 1, 4 January 2021 (2021-01-04), pages 11 |
| ZHONG, L.ZHAO, ZZHANG, X.: "Genetic differences between primary and metastatic cancer: a pan-cancer whole-genome comparison study.", SIG TRANSDUCT TARGET THER, vol. 8, 2023, pages 363 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25721874; Country of ref document: EP; Kind code of ref document: A1 |