WO2022146773A1 - Methods and compositions for sequencing and fusion detection using ligation tail adapters (LTA) - Google Patents
Methods and compositions for sequencing and fusion detection using ligation tail adapters (LTA)
- Publication number
- WO2022146773A1 (PCT/US2021/064534)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleotides
- sequence
- lta
- nucleic acid
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
Definitions
- the present disclosure relates to the field of molecular biology. More particularly, it relates to methods and compositions useful for the detection, amplification, and quantification of nucleic acid molecules.
- PCR Polymerase chain reaction
- the resulting products can be used for further sequencing and detecting DNA variants.
- two primers specifically designed to bind to the breakpoints of target DNA segments are required.
- two separate target DNA sequences that are intended to bind to the primers must be conserved.
- chromosomes may exhibit large scale DNA structural variants (e.g., fusions and translocations).
- this application provides a method of preparing a nucleic acid template into a DNA library for sequencing, the method comprising the steps of: (a) mixing a nucleic acid template, a Ligation Tail Adapter (LTA) molecule, a DNA ligase, and reagents for DNA ligase activity to form a first mixture; where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand, which together form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (b) subjecting the first mixture to a suitable temperature to allow for DNA ligation to form a ligation product mixture; (c) introducing to the ligation product mixture from step (b) a target-specific Outer Forward Primer (OFP) and a Splint, a DNA polymerase, and reagents for DNA polymerase activity to form a second mixture; where the OFP comprises a nucleic acid sequence that can specifically anne
- this application provides a method for preparing a nucleic acid for sequencing, the method comprising: (i) ligating a Ligation Tail Adapter (LTA) molecule to a nucleic acid comprising a known target nucleotide sequence to produce a ligation product, where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand, and where the LTA-top strand and the LTA-bottom strand form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (ii) amplifying the ligation product using a first target-specific primer that specifically anneals to the known target nucleotide sequence and a Splint, where the Splint comprises a 5' end subsequence that is not complementary to a subsequence of the LTA, and a 3' end subsequence that is complementary to the DT region to produce an amplification product; and (LTA) molecule to
- this application provides a method for preparing a nucleic acid for sequencing, the method comprising: (i) ligating a Ligation Tail Adapter (LTA) molecule to a nucleic acid comprising a known target nucleotide sequence to produce a ligation product, where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand, and where the LTA-top strand and the LTA-bottom strand form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (ii) amplifying the ligation product using a first target-specific primer that specifically anneals to the known target nucleotide sequence and a Splint, where the Splint comprises a 5' end subsequence that is not complementary to a subsequence of the LTA, and a 3' end subsequence that is complementary to the DT region to produce an a
- LTA Ligation Tail
- this application provides a method of determining the nucleotide sequence contiguous to a known target nucleotide sequence, the method comprising: (a) ligating a target nucleic acid molecule comprising the known target nucleotide sequence with a universal Ligation Tail Adapter (LTA), where the universal LTA comprises a non-amplification strand and an amplification strand to produce a ligation product; (b) amplifying a portion of the target nucleic acid molecule and the amplification strand of the universal LTA with a Splint and a first target-specific primer from the ligation product to produce a first amplicon; (c) amplifying a portion of the first amplicon with a Universal Primer (UP) and a second target-specific primer to produce a second amplicon; and (d) sequencing the second amplicon using a first sequencing primer and a second sequencing primer; where the universal LTA comprises a ligatable Double-Stranded End (DSE), where the universal L
- this application provides a method for determining the nucleotide sequence contiguous to a known target nucleotide sequence of 10 or more nucleotides, the method comprising: (i) ligating a universal Ligation Tail Adapter (LTA) to a nucleic acid molecule comprising the known target nucleotide sequence to produce a ligation product; (ii) amplifying the ligation product via polymerase chain reaction using a Splint that specifically anneals to the universal LTA, and a first target-specific primer that specifically anneals to the known target nucleotide sequence to produce a first amplification product; (iii) amplifying the first amplification product via polymerase chain reaction using a Splint-specific primer and a second target-specific primer, where the second target-specific primer is nested relative to the first target-specific primer to produce a second amplification product; and (iv) sequencing the second amplification product using a first sequencing primer and
- LTA Universal Ligation
- FIG. 1 A schematic illustration of an example of adjacent nested PCR reaction scheme using LTAs.
- LTA molecules are ligated to template DNA molecules.
- the half arrows in the figure indicate the 3' end of a DNA strand.
- Letter P outlined with a circle means this DNA base is modified with a phosphate.
- Letter ‘T’ indicates a T base overhang.
- An LTA molecule comprises a LTA-top strand and a LTA-bottom strand.
- the DSE is a double-stranded DNA.
- Two PCR steps are shown.
- the first PCR step (an outer PCR step) uses an OFP as a forward primer and a Splint as a reverse primer.
- the OFP has over 90% complementarity within a binding site of the template sequence and the Splint has over 90% complementarity within a binding site in the reverse complementary strand of the LTA-top strand.
- the second PCR step uses an IFP as a forward primer and a UP as a reverse primer.
- the IFP has over 90% complementarity within a binding site in the template that lies 3' downstream of the OFP binding site.
- the UP has over 90% complementarity within a binding site in the Splint strand.
- the resulting amplicon is termed the Nested PCR Amplicon.
- FIG. 2 Schematic illustration of exemplary structures of the LTA.
- An LTA comprises a double-stranded DNA DSE and a double-stranded DT region with a single-stranded UMI loop between the DSE and the DT region
- An LTA comprises a DSE with a length of 5 - 50 bp.
- the LTA-top strand comprises a 3' overhang which cannot bind to the LTA-bottom strand.
- the LTA-bottom strand does not contain any single strand DNA that can bind to the LTA-top strand
- the structure is similar to the structure of (b) but with a Unique Molecular Identifier (UMI) or sample barcode (SB) between the DSE and the 5' overhang of the LTA-top strand
- UMI Unique Molecular Identifier
- SB sample barcode
- the LTA-bottom strand comprises a 5' overhang that cannot bind to the LTA-top strand and in turn constitutes a DT region
- the structure is similar to the structure of (c) but with a UMI or an SB between the double-stranded DNA part and the 5' overhang of the LTA-bottom strand.
- An LTA comprises an LTA-top strand and an LTA-bottom strand which can form a DSE region.
- the LTA-top strand comprises the 3' overhang that cannot bind to LTA-bottom strand.
- the LTA-bottom strand comprises a 5' overhang that cannot bind to the LTA-top strand
- the structure is similar to the structure of (1) but with a UMI or an SB between the DSE and the 5' overhang of the LTA-bottom strand
- the structure is similar to the structure of (1) but with a UMI or an SB between the DSE and the 3' overhang of the LTA-top strand.
- FIG. 3 Exemplary composition of an 8-base UMI used in an LTA molecule. (a) A UMI sequence is added between the DSE and the DT region. (b) The UMI can be a mixture of fixed UMIs with a minimum pairwise Hamming distance. For example, UMI 1 and UMI 2 differ at position 2 and position 5, and all other bases are the same. As another example, UMI 2 and UMI 3 differ at position 2 and position 7, and all other bases are the same. (c) The UMI can be a mixture of fixed UMIs with a minimum pairwise Levenshtein distance. UMI 1 and UMI 2 each have 8 bases, and they match each other after one base gap is filled in.
- UMI 1 and UMI 2 have the same bases in positions 1, 3, 4, 5, 6, 7, and 9, UMI 1 position 2 matches with a gap in UMI 2, and UMI 2 position 8 matches with a gap in UMI 1.
- the Randomer UMI that can be H and/or G compositions.
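The two UMI distance measures named for FIG. 3 can be sketched as follows. This is a minimal illustration only: the UMI sequences are invented for the example (the figure's actual sequences are not reproduced in the text), and the function names `hamming`, `levenshtein`, and `min_pairwise` are hypothetical helpers, not part of the disclosure.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length UMIs differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion from a
                            curr[j - 1] + 1,          # insertion into a
                            prev[j - 1] + (x != y)))  # match or substitution
        prev = curr
    return prev[-1]


def min_pairwise(umis, dist):
    """Smallest distance over all pairs in a fixed UMI set."""
    return min(dist(a, b) for a, b in combinations(umis, 2))


# Invented 8-base UMIs: UMI 1 vs UMI 2 differ at positions 2 and 5,
# UMI 2 vs UMI 3 differ at positions 2 and 7 (1-based), as in panel (b).
UMI_SET = ["ACGTACGT", "ATGTCCGT", "AGGTCCAT"]

# Invented pair for panel (c): each 8-base UMI aligns to the other
# with one gap on each side (9 alignment columns in total).
GAPPED_A, GAPPED_B = "ACGTACGT", "AGTACGAT"
```

With these made-up sequences, `min_pairwise(UMI_SET, hamming)` gives the minimum pairwise Hamming distance of the set, and `levenshtein(GAPPED_A, GAPPED_B)` counts the one deletion plus one insertion that the gapped alignment implies.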
- FIG. 4 Exemplary schematic illustration of the binding structures of a Splint and an LTA-top strand.
- a Splint can have a 5' overhang that cannot bind to the LTA-top strand with a length of 1 - 100 nt.
- Splint-1 can bind to the single-stranded DNA of the LTA-top strand from the second base following the DSE region to the 3' end of the LTA-top strand, where the binding region has a length of, for example, 10 nucleotides
- Splint-2 has a 3' region that is reverse complementary to the entire DT region of the LTA-top strand and has a 5' overhang region
- the 3' end of the Splint-3 binds to the LTA-top strand starting from second base of the 5' end to the second base of the LTA-top strand in DSE region 3' end.
- the 3' end of Splint-4 binds to the 5' end of the DSE region of the LTA-top strand
- the 5' end of Splint-5 comprises a hairpin structure.
- FIG. 5 Exemplary schematic illustration of the binding positions of the OFP and IFP.
- An OFP and an IFP can each have over 90% identity or complementarity to the reverse complementary strand of the template.
- the IFP binding position is generally in the 5' upstream compared to the OFP binding position, (a) The 5' end of an IFP tiles to the 3' end of an OFP. (b) The 5' end of an IFP is located at a distance of about 1 - 50 nt away from the 3' end of an OFP. (c) The 5' end of an IFP has a 1 - 40 bp overlap with the 3' end of an OFP.
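The three OFP/IFP arrangements in FIG. 5 (tiled, separated by a gap, or overlapping) can be told apart from two coordinates. The sketch below is an assumption-laden illustration: it uses 0-based, half-open coordinates measured along the direction of primer extension, and the function names are hypothetical.

```python
def classify_nested_primers(ofp_end: int, ifp_start: int) -> tuple[str, int]:
    """Classify the IFP position relative to the OFP.

    ofp_end   -- index one past the 3'-most base of the OFP (assumed convention)
    ifp_start -- index of the 5'-most base of the IFP (assumed convention)
    Returns the arrangement name and its size in nucleotides.
    """
    offset = ifp_start - ofp_end
    if offset == 0:
        return ("tiled", 0)        # panel (a): IFP 5' end abuts the OFP 3' end
    if offset > 0:
        return ("gap", offset)     # panel (b): IFP starts some distance away
    return ("overlap", -offset)    # panel (c): IFP 5' end overlaps the OFP 3' end


def within_figure_range(kind: str, size: int) -> bool:
    """Check the size against the ranges stated for FIG. 5 (gap 1-50 nt, overlap 1-40 bp)."""
    if kind == "tiled":
        return size == 0
    if kind == "gap":
        return 1 <= size <= 50
    if kind == "overlap":
        return 1 <= size <= 40
    return False
```

For instance, an OFP ending at index 125 and an IFP starting at index 140 would be reported as a 15 nt gap, which falls inside the stated 1 - 50 nt range.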
- FIG. 6 Exemplary schematic illustration of a UP binding to a Splint.
- the UP-1 sequence comprises a sequence that is over 70% homologous to the reverse complementary sequence of the 5' overhang of the Splint.
- the 3' end of the UP-1 starts from the second base to the end of the 5' overhang of the Splint.
- the UP-1 has a single-stranded DNA oligo that cannot bind to the Splint.
- the UP-2 has over 90% identity to the reverse complement of the Splint 5' overhang.
- the UP-3 has over 90% identity to the reverse complement of the Splint, where the 3' end of UP-3 matches the 5' end of the single-stranded DNA part of the LTA and the 5' end of UP-3 matches the 3' end of the Splint.
- FIG. 7 Exemplary schematic illustration of a nested PCR amplicon used for NGS library preparation.
- Nested PCR amplicon is used in this example as an insert template for an adapter PCR.
- the inner AD-FP adaptor forward primer
- the UP primer acts as the reverse primer and undergoes a 2-cycle PCR for adding the adapter to the template. This step also can be merged with the Figure 1 inner PCR step (i.e., the PCR step with the UP and the IFP).
- An index PCR follows, using Illumina sequencing index primer P5 as the forward primer and P7 as the reverse primer. The final products can be sequenced on the Illumina platform using paired-end sequencing.
- Figure 8 Exemplary schematic illustration of a design scheme for detecting unknown FGFR2 RNA fusions.
- the IFP here is shown to have a 5-20 nt gap to the breakpoint.
- the OFP binds to the upstream strand of the known partner.
- Figure 9 Exemplary read distributions of the WT library and the fusion library aligned to the genome. All reads are trimmed of the sequencing adapter, aligned to the hg19 reference genome, and filtered to remove reads shorter than 50 bp.
- (a) Read distribution aligned to the gene FGFR2. Read counts are shown on a log10 scale, and the axis runs from 0 to 400,000. The reads are distributed across exons 16, 17, 18, and 19.
- fusion reads are found in exon 3 of the gene BICC1 which is from gBlock2.
- fusion reads are aligned to exon 2 of gene GAB2 and
- the fusion reads are aligned to exon 5 and exon 6 of the gene AHCYL1.
- FIG. 10 Exemplary NGS results from RNA fusion detection.
- RNA reference material is ordered from SeraCare.
- the RNA reference material contains a NTRK1 gene fusion and three other gene exons as fusion partners. All the three gene fusion partners are found from (a) to (c).
- NTRK1 gBlocks are mixed with NA18537 to form the control sample, which contains the LMNA exon 2 sequence but does not include the TPM3 exon 7 partner or the TGF exon 5 partner.
- the reads are mapped to TPM3 exon 7 in SeraCare RNA samples but not so in the control sample, since the control sample does not have this fusion partner.
- the gBlocks mixed with NA18537 does not detect any reads, since there is no fusion template in TPM3 genes.
- Panel (b) shows that the reads map to TGF exon 5 in the SeraCare sample but not in the control sample, since the control sample does not have this fusion partner.
- In panel (c), the LMNA exon 2 reads appear in both the SeraCare RNA and the control sample, since both have this fusion partner. All the fusion partners are detected in the reference RNA sample.
- Figure 11 Exemplary schematic illustration of design schemes for detecting unknown DNA fusion
- One design group contains an OFP and an IFP to target one region. The gap between different groups ranges from 0 nt to 100 nt. The groups will tile the whole intron region.
- (b) The known exon can be in the fusion downstream partner, so that we can design primers to target the 5' intron and target in the negative strand.
- Figure 12 Exemplary DNA fusion reads from NGS results.
- the H2228 sample has been validated to have an EML4-ALK fusion.
- the WT reads only contain the ALK intron 19 sequence.
- FIG. 13 Exemplary schematic illustration of ABDA workflow for fusion detection.
- the use of LTA and Splint molecules is essentially as illustrated in Figure 1.
- the left side workflow shows the ABDA for WT detection.
- the Blocker will bind to the WT template.
- the IFP cannot bind to the template, which inhibits the PCR reaction.
- the template comprises a gene fusion sequence
- the Blocker cannot bind to the template perfectly, and the IFP can displace the Blocker so that the fusion template can be enriched via a PCR reaction.
- the sequencing adapter is added to the amplicons through two cycles of PCR.
- FIG. 14 Exemplary schematic illustration of having UMI in the LTA-bottom strand.
- an OFP amplifies the template, and a UMI or SB is added to the DNA template positive strand. Then, the OFP functions as a forward primer and the Splint as a reverse primer in the first PCR reaction.
- an IFP as a forward primer and a non-complementary portion of the Splint as a reverse primer are used to amplify for several cycles.
- the reverse primer can further include a sequencing adapter primer sequence for adding sequencing index.
- Figure 15 Exemplary on-target rates of the Y adapter and the Ligation Tail Adapter (LTA). Five primers are designed to target different exons of the FGFR2 gene. On-target rates are compared between the Y adapter and the LTA. Three different reverse primers, UP-1, UP-2, and UP-3 (see Figure 6), are used for PCR amplification. All three reverse primers achieve a higher on-target rate than the Y adapter.
- any composition provided herein is specifically envisioned for use with any applicable method provided herein.
- any and all combinations of the members that make up that grouping of alternatives are specifically envisioned. For example, if an item is selected from a group consisting of A, B, C, and D, the inventors specifically envision each alternative individually (e.g., A alone, B alone, etc.), as well as combinations such as A, B, and D; A and C; B and C; etc.
- the term “and/or” when used in a list of two or more items means any one of the listed items by itself or in combination with any one or more of the other listed items.
- the expression “A and/or B” is intended to mean either or both of A and B - i.e., A alone, B alone, or A and B in combination.
- the expression “A, B and/or C” is intended to mean A alone, B alone, C alone, A and B in combination, A and C in combination, B and C in combination, or A, B, and C in combination.
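The reading of "and/or" given above amounts to every non-empty combination of the listed items. A minimal sketch that enumerates them, using a hypothetical helper name `and_or`:

```python
from itertools import combinations


def and_or(items):
    """Return every non-empty combination of the listed items, smallest first."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


# "A and/or B" covers A alone, B alone, or A and B in combination.
two_items = and_or(["A", "B"])

# "A, B and/or C" covers the seven combinations listed in the text.
three_items = and_or(["A", "B", "C"])
```

For n listed items this yields 2^n - 1 combinations, which matches the three cases spelled out for two items and the seven cases spelled out for three.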
- a range is understood to be inclusive of the edges of the range as well as any number between the defined edges of the range.
- “between 1 and 10” includes any number between 1 and 10, as well as the number 1 and the number 10.
- a compound or “at least one compound” may include a plurality of compounds, including mixtures thereof.
- plural refers to any number greater than one.
- “approximately,” in reference to a number, means plus and/or minus 10% of that number.
- approximately 100 means anywhere from 90 to 110.
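The numeric conventions defined above (inclusive range edges, and "approximately" meaning plus or minus 10%) can be written out as two small checks. The function names are hypothetical and serve only to illustrate the arithmetic.

```python
def in_range(value, low, high):
    """Inclusive range test: both edges count as being in the range."""
    return low <= value <= high


def approximately(number, percent=10):
    """Return the (low, high) bounds for 'approximately number' at +/- percent."""
    low = number * (100 - percent) / 100
    high = number * (100 + percent) / 100
    return (low, high)


# "between 1 and 10" includes 1 and 10 themselves.
edges_included = in_range(1, 1, 10) and in_range(10, 1, 10)

# "approximately 100" spans 90 to 110.
bounds_100 = approximately(100)
```

A value can then be tested against an "approximately" statement with `in_range(value, *approximately(number))`.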
- LTA Ligation Tail Adapter
- LTA Ligatable Tail Adapter
- An amplification strand and a non-amplification strand differ in that the former (or a portion thereof) is included in an amplification product, while the latter is not.
- an LTA-top strand or a non-amplification strand comprises a 5' duplex portion.
- an LTA-top strand or an amplification strand comprises an unpaired 5' portion, a 3' duplex portion, a 3' T overhang, and nucleic acid sequences identical to first and second sequencing primers.
- the duplex portions of an LTA-top strand and an LTA-bottom strand are substantially complementary and form the first ligatable duplex end (e.g., DSE) comprising a 3' T overhang, and the duplex portion is of sufficient length to remain in duplex form at the ligation temperature.
- an LTA-top strand comprises a phosphorylated nucleotide at its 5' end.
- an LTA-bottom strand comprises a thymine at its 3' end.
- an LTA-top strand comprises a phosphorylated nucleotide at its 5' end, and an LTA-bottom strand comprises a thymine at its 3' end.
- the second nucleotide position from the 3' end of an LTA-bottom strand comprises a phosphorothioate bond modification.
- a “DNA Tail” or “DT” region refers to a 3' region of an LTA-top strand or an LTA-bottom strand.
- a DT region comprises a single-stranded DNA.
- a DT region comprises a double-stranded DNA.
- a DT region comprises a double-stranded sequence comprising between 1 and 30 nucleotide mismatches.
- a DT region comprises a double-stranded sequence comprising between 1 and 25 nucleotide mismatches.
- a DT region comprises a double-stranded sequence comprising between 1 and 20 nucleotide mismatches.
- a DT region comprises a double-stranded sequence comprising between 1 and 15 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 1 and 10 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 1 and 5 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 5 and 10 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 10 and 15 nucleotide mismatches.
- a DT region comprises a double-stranded sequence comprising between 15 and 20 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 20 and 25 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 25 and 30 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 5 and 25 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 10 and 20 nucleotide mismatches.
- a DT region comprises a double-stranded sequence comprising between 5 and 30 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 10 and 30 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 15 and 30 nucleotide mismatches. In an aspect, a DT region comprises a double-stranded sequence comprising between 20 and 30 nucleotide mismatches.
- a “Double-Stranded End” or “DSE” refers to a region of double-stranded DNA formed between an LTA-top strand and an LTA-bottom strand.
- the forward strand of a DSE begins from the 5' end of an LTA-top strand
- the reverse strand of the DSE begins from the second nucleotide position from the 3' end of an LTA-bottom strand.
- a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 90 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 80 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 70 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 60 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 50 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 40 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 30 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 20 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 5 nucleotides and 10 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 10 nucleotides and 20 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 15 nucleotides and 25 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 20 nucleotides and 30 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 25 nucleotides and 35 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 30 nucleotides and 40 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 35 nucleotides and 45 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 40 nucleotides and 50 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 45 nucleotides and 55 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 50 nucleotides and 60 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 55 nucleotides and 65 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 60 nucleotides and 70 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 65 nucleotides and 75 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 70 nucleotides and 80 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 75 nucleotides and 85 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 10 nucleotides and 80 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 15 nucleotides and 70 nucleotides.
- a DSE is a double-stranded DNA molecule comprising a length of between 20 nucleotides and 60 nucleotides. In an aspect, a DSE is a double-stranded DNA molecule comprising a length of between 25 nucleotides and 50 nucleotides.
- the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is at least 90%. In an aspect, the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is at least 92.5%. In an aspect, the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is at least 95%. In an aspect, the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is at least 97.5%.
- the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is at least 99%. In an aspect, the percent identity between an LTA-top strand and the reverse complement of the LTA-bottom strand in a DSE region is 100%.
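The percent-identity criterion for the DSE can be illustrated with a short calculation. This is a sketch under stated assumptions: both inputs are the paired DSE portions only (any overhang such as the 3' T is excluded by the caller), they have equal length, and identity is counted position by position without gaps. The sequences and function names are invented for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100 * matches / len(a)


def dse_identity(top_dse: str, bottom_dse: str) -> float:
    """Identity between the LTA-top DSE portion and the reverse complement
    of the LTA-bottom DSE portion (both given 5' to 3')."""
    return percent_identity(top_dse, reverse_complement(bottom_dse))


# Invented 10-nt DSE portions: a perfect duplex and one with a single mismatch.
TOP = "ACGTACGTAC"
BOTTOM_PERFECT = "GTACGTACGT"
BOTTOM_ONE_MISMATCH = "GTACGTACGA"
```

Under these assumptions, the perfect pair scores 100% and the single-mismatch pair scores 90%, which sits at the lowest threshold listed above.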
- a “Splint” refers to a nucleic acid molecule that is capable of bridging together two nucleic acid molecules (e.g., a first nucleic acid molecule and a second nucleic acid molecule) that do not share sequence complementarity and are not normally physically linked to each other.
- a Splint comprises at least 90% complementarity to a first nucleic acid molecule and at least 90% complementarity to a second nucleic acid molecule.
- a Splint is a primer.
- a Splint comprises a 5' sequence that does not bind with an LTA.
- a Splint comprises a 5' sequence that comprises between 1 nucleotide and 100 nucleotides.
- a Splint comprises a 5' sequence that comprises between 3 nucleotides and 5 nucleotides.
- a Splint comprises a 5' sequence that comprises between 5 nucleotides and 7 nucleotides.
- a Splint comprises a 5' sequence that comprises between 7 nucleotides and 10 nucleotides.
- a Splint comprises a 5' sequence that comprises between 10 nucleotides and 15 nucleotides.
- a Splint comprises a 5' sequence that comprises between 15 nucleotides and 20 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 20 nucleotides and 25 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 25 nucleotides and 30 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 35 nucleotides and 40 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 40 nucleotides and 45 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 45 nucleotides and 50 nucleotides.
- a Splint comprises a 5' sequence that comprises between 50 nucleotides and 55 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 55 nucleotides and 60 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 60 nucleotides and 65 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 65 nucleotides and 70 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 70 nucleotides and 75 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 75 nucleotides and 80 nucleotides.
- a Splint comprises a 5' sequence that comprises between 80 nucleotides and 85 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 85 nucleotides and 90 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 90 nucleotides and 95 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 90 nucleotides and 100 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 95 nucleotides and 100 nucleotides.
- a Splint comprises a 5' sequence that comprises between 3 nucleotides and 10 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 5 nucleotides and 15 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 10 nucleotides and 20 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 15 nucleotides and 25 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 20 nucleotides and 30 nucleotides.
- a Splint comprises a 5' sequence that comprises between 25 nucleotides and 35 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 35 nucleotides and 45 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 5 nucleotides and 25 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 15 nucleotides and 50 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 20 nucleotides and 55 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 35 nucleotides and 60 nucleotides.
- a Splint comprises a 5' sequence that comprises between 50 nucleotides and 75 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises between 65 nucleotides and 90 nucleotides.
- a Splint comprises a 5' sequence that comprises at least 3 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 5 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 7 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 10 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 15 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 20 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 25 nucleotides.
- a Splint comprises a 5' sequence that comprises at least 30 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 35 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 40 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 45 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 50 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 55 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 60 nucleotides.
- a Splint comprises a 5' sequence that comprises at least 65 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 70 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 75 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 80 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 85 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises at least 90 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 3 nucleotides.
- a Splint comprises a 5' sequence that comprises less than or equal to 5 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 7 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 10 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 15 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 20 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 25 nucleotides.
- a Splint comprises a 5' sequence that comprises less than or equal to 30 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 35 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 40 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 45 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 50 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 55 nucleotides.
- a Splint comprises a 5' sequence that comprises less than or equal to 60 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 65 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 70 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 75 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 80 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 85 nucleotides. In an aspect, a Splint comprises a 5' sequence that comprises less than or equal to 90 nucleotides.
- a Splint comprises a 3' sequence capable of binding to a single-stranded 3' overhang of an LTA-top strand from the second nucleotide position following the DSE to at least the 10th nucleotide position on the 3' end of a DT region.
- a Splint comprises a 3' sequence capable of binding to a 3' overhang of an LTA-top strand from the first nucleotide position following a DSE.
- a Splint comprises a 3' sequence capable of binding to a DSE region starting from the second nucleotide position on the 5' end of an LTA-top strand to the second nucleotide position of the 3' end of the DSE.
- a Splint comprises a 3' region capable of binding to a DSE region starting from the first nucleotide position of the 5' region of an LTA-top strand.
- a Splint binds to an LTA-bottom strand instead of an LTA-top strand.
- a Splint comprises, in order from 5' to 3', a first sequence, a second sequence, a third sequence, and a fourth sequence, where the third sequence is complementary to the first sequence.
- a “5' sequence” refers to a sequence in the 5' half of the Splint, and a “3' sequence” refers to a sequence in the 3' half of the Splint (with each half being based on the number of nucleotides in the Splint).
- the 3' end subsequence of a Splint can specifically anneal to a DT region or a portion of the DT region.
- an LTA-top strand comprises a single-stranded DT region that does not bind to an LTA-bottom strand.
- an LTA-bottom strand 5' region binds to an LTA-top strand to form a DSE.
- an LTA-top strand comprises a single-stranded DT region that does not bind to an LTA-bottom strand and an LTA-bottom strand 5' region binds to an LTA-top strand to form a DSE.
- an LTA-bottom strand comprises a single-stranded DT region that does not bind to an LTA-top strand.
- an LTA-top strand 3' region binds to an LTA-bottom strand without any overhanging nucleotides to form a DSE.
- an LTA-bottom strand comprises a single-stranded DT region that does not bind to an LTA-top strand, and the LTA-top strand 3' region binds to an LTA-bottom strand without any overhanging nucleotides to form a DSE.
- both an LTA-top strand 3' region and an LTA-bottom strand 5' region comprise a single-stranded DT region that does not bind to each other, and where the LTA-top strand 5' region and the LTA-bottom strand 3' region bind with each other to form a DSE.
- an LTA-top strand comprises a single-stranded UMI between a DSE and a DT region. In an aspect, an LTA-top strand comprises an SB between a DSE and a DT region. In an aspect, an LTA-top strand comprises a single-stranded UMI between a DSE and a DT region and an SB between the DSE and the DT region. In an aspect, an LTA-bottom strand comprises a single-stranded UMI between a DSE and a DT region. In an aspect, an LTA-bottom strand comprises an SB between a DSE and a DT region. In an aspect, an LTA-bottom strand comprises a single-stranded UMI between a DSE and a DT region and an SB between the DSE and the DT region.
- both an LTA-top strand and an LTA-bottom strand comprise a single-stranded UMI between a DSE and a DT region. In an aspect, both an LTA-top strand and an LTA-bottom strand comprise an SB between a DSE and a DT region. In an aspect, both an LTA-top strand and an LTA-bottom strand comprise a single-stranded UMI between a DSE and a DT region and an SB between the DSE and the DT region.
- a “5' region” refers to the first 25% of the nucleotides of a given nucleic acid molecule when counting from 5' to 3' when starting from the 5'-most nucleotide.
- a 5' region of a first nucleic acid molecule binds to a second nucleic acid molecule, it will be appreciated that the entire 5' region of the first nucleic acid molecule does not have to bind to the second nucleic acid molecule (e.g., partial binding is specifically envisioned).
- a “3' region” refers to the last 25% of the nucleotides of a given nucleic acid molecule when counting from 5' to 3' when starting from the 5'-most nucleotide.
- a 3' region of a first nucleic acid molecule binds to a second nucleic acid molecule, it will be appreciated that the entire 3' region of the first nucleic acid molecule does not have to bind to the second nucleic acid molecule (e.g., partial binding is specifically envisioned).
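The 25% definitions of a “5' region” and a “3' region” given above can be illustrated with a minimal Python sketch. The function names are hypothetical, and rounding the 25% boundary down to a whole nucleotide is an assumption; the text does not say how a fractional count is handled.

```python
import math


def five_prime_region(seq: str, fraction: float = 0.25) -> str:
    """Return the first `fraction` of nucleotides, counting 5' to 3'.

    `seq` is assumed to be written 5' to 3'. Rounding down is an
    assumption made for this sketch only.
    """
    n = math.floor(len(seq) * fraction)
    return seq[:n]


def three_prime_region(seq: str, fraction: float = 0.25) -> str:
    """Return the last `fraction` of nucleotides, counting 5' to 3'."""
    n = math.floor(len(seq) * fraction)
    # Slice from an explicit index so that n == 0 yields an empty string.
    return seq[len(seq) - n:]


if __name__ == "__main__":
    molecule = "ATGC" * 5  # 20 nucleotides, so each region is 5 nucleotides
    print(five_prime_region(molecule))
    print(three_prime_region(molecule))
```

Because partial binding is specifically envisioned, a binding check against such a region would test for any annealed portion of it rather than the whole slice.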
- percent identity or “percent identical” as used herein in reference to two or more nucleotide or amino acid sequences is calculated by (i) comparing two optimally aligned sequences (nucleotide or amino acid) over a window of comparison (the “alignable” region or regions), (ii) determining the number of positions at which the identical nucleic acid base (for nucleotide sequences) or amino acid residue (for proteins and polypeptides) occurs in both sequences to yield the number of matched positions, (iii) dividing the number of matched positions by the total number of positions in the window of comparison, and then (iv) multiplying this quotient by 100% to yield the percent identity.
- the percent identity is being calculated in relation to a reference sequence without a particular comparison window being specified, then the percent identity is determined by dividing the number of matched positions over the region of alignment by the total length of the reference sequence. Accordingly, for purposes of the present application, when two sequences (query and subject) are optimally aligned (with allowance for gaps in their alignment), the “percent identity” for the query sequence is equal to the number of identical positions between the two sequences divided by the total number of positions in the query sequence over its length (or a comparison window), which is then multiplied by 100%.
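The percent identity calculation described above can be sketched in Python as follows. This is an illustration only: it assumes the two sequences have already been optimally aligned by some other tool, that gaps are written as "-", and that the comparison window is the full set of aligned columns. The function name and the `reference_length` parameter are hypothetical.

```python
from typing import Optional


def percent_identity(query_aln: str, subject_aln: str,
                     reference_length: Optional[int] = None) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    If `reference_length` is given, matched positions are divided by the
    total length of the reference sequence; otherwise they are divided by
    the number of positions in the comparison window (assumed here to be
    every aligned column).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both sequences carry the same residue (not a gap).
    matched = sum(1 for q, s in zip(query_aln, subject_aln)
                  if q == s and q != "-")
    denominator = reference_length if reference_length is not None else len(query_aln)
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * matched / denominator


if __name__ == "__main__":
    # Four matched positions over a six-column window.
    print(percent_identity("ATGC-A", "ATGGTA"))
    # The same four matches relative to a 5-nucleotide reference sequence.
    print(percent_identity("ATGC-A", "ATGGTA", reference_length=5))
```

The two calls differ only in the denominator, which mirrors the distinction the text draws between a specified comparison window and the total length of a reference sequence.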
- percent complementarity or “percent complementary” as used herein in reference to two nucleotide sequences refers to the percentage of nucleotides of a query sequence that optimally base-pair or hybridize to nucleotides of a subject sequence when the query and subject sequences are linearly arranged and optimally base paired without secondary folding structures, such as loops, stems or hairpins. Such a percent complementarity can be between two DNA strands, two RNA strands, or a DNA strand and a RNA strand.
- the “percent complementarity” can be calculated by (i) optimally base-pairing or hybridizing the two nucleotide sequences in a linear and fully extended arrangement (i.e.
- Optimal base pairing of two sequences can be determined based on the known pairings of complementary nucleotide bases, such as guanine (G)-cytosine (C), adenine (A)-thymine (T), and A-uracil (U), through hydrogen bonding.
- the percent complementarity is determined by dividing the number of complementary positions between the two linear sequences by the total length of the reference sequence.
- the “percent complementarity” for the query sequence is equal to the number of base-paired positions between the two sequences divided by the total number of positions in the query sequence over its length, which is then multiplied by 100%.
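As a rough illustration of the percent complementarity calculation, the Python sketch below pairs a query with a subject in a single fixed antiparallel register and counts G-C, A-T, and A-U pairs. It does not search for the optimal pairing described in the text, and it assumes ungapped sequences written 5' to 3'; the function name is hypothetical.

```python
# Base pairs treated as complementary, following the pairings named in the text.
_PAIRS = {("G", "C"), ("C", "G"), ("A", "T"), ("T", "A"), ("A", "U"), ("U", "A")}


def percent_complementarity(query: str, subject: str) -> float:
    """Percent of query positions that base-pair with the subject.

    Both sequences are written 5' to 3'. The subject is reversed so the
    strands are antiparallel, then compared position by position in one
    fixed register (a simplifying assumption of this sketch).
    """
    if not query:
        raise ValueError("query sequence must not be empty")
    paired = sum(1 for q, s in zip(query.upper(), reversed(subject.upper()))
                 if (q, s) in _PAIRS)
    # Divide by the total number of positions in the query sequence.
    return 100.0 * paired / len(query)


if __name__ == "__main__":
    print(percent_complementarity("ATGC", "GCAT"))  # fully paired duplex
    print(percent_complementarity("ATGC", "GCAA"))  # one A-A mismatch
```

A production implementation would need an alignment step to find the register that maximizes base pairing before counting.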
- a “portion” of a nucleic acid molecule refers to contiguous set of nucleotides comprised by that molecule. A portion can comprise all or only a subset of the nucleotides comprised by the molecule. A portion can be double-stranded or single-stranded.
- nucleotide mismatch refers to an alignment of two sequences that pairs two non-complementary nucleotides.
- mismatches include G-A, G-T, G-U, G-G, C-A, C-T, C-U, C-C, A-A, T-T, and T-U.
- matched alignments of nucleotides refer to complementary pairs such as G-C, A-T, and A-U.
- the complement of the sequence 5'-ATGC-3' is 3'-TACG-5'.
- the reverse complement of 5'-ATGC-3' is 5'-GCAT-3'.
- the complement and reverse complement sequences are identical to each other when viewed in the 5' to 3' direction.
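The 5'-ATGC-3' example above can be reproduced with a short Python sketch. The function names are hypothetical and only DNA bases are handled. Note that `complement` returns the complementary strand written in the same left-to-right order as the input (that is, 3' to 5'), while `reverse_complement` returns the same strand written 5' to 3'.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Complement of `seq`, position by position (reads 3' to 5')."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand of `seq`, written 5' to 3'."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG, i.e. 3'-TACG-5'
    print(reverse_complement("ATGC"))  # GCAT, i.e. 5'-GCAT-3'
```

Reading the output of `complement` from right to left gives the output of `reverse_complement`, which is the sense in which the two are identical when viewed in the 5' to 3' direction.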
- a “primer” refers to a chemically synthesized single-stranded oligonucleotide which is designed to anneal to a specific site on a template nucleic acid molecule.
- a primer is used in PCR to initiate DNA synthesis.
- a primer is a DNA molecule.
- a primer is an RNA molecule.
- a primer comprises between 6 nucleotides and 70 nucleotides.
- a primer comprises between 10 nucleotides and 50 nucleotides.
- a primer comprises between 15 nucleotides and 30 nucleotides.
- a primer comprises between 18 nucleotides and 25 nucleotides. In an aspect, a primer comprises at least 6 nucleotides. In an aspect, a primer comprises at least 10 nucleotides. In an aspect, a primer comprises at least 15 nucleotides. In an aspect, a primer comprises at least 20 nucleotides. In an aspect, a primer is a forward primer. In an aspect, a primer is a reverse primer. As used herein, a “forward primer” hybridizes to the anti-sense strand of dsDNA, and a “reverse primer” hybridizes to the sense strand of dsDNA. In an aspect, a forward primer comprises DNA. In an aspect, a reverse primer comprises DNA. In an aspect, a forward primer comprises RNA. In an aspect, a reverse primer comprises RNA.
- a “target-specific primer” is a primer that is designed to hybridize (or anneal) to a single, specific target sequence at a given annealing temperature. Without being limiting, a target-specific primer is used to amplify a single target nucleotide sequence. In an aspect, a target specific primer anneals to a known target nucleotide sequence at an annealing temperature between 60°C and 75°C. In an aspect, a target specific primer anneals to a known target nucleotide sequence at an annealing temperature between 61°C and 72°C.
- a target specific primer anneals to a known target nucleotide sequence at an annealing temperature between 62°C and 70°C. In an aspect, a target specific primer anneals to a known target nucleotide sequence at an annealing temperature between 65°C and 78°C.
- As used herein, when referring to two nucleic acid molecules, the terms “hybridize,” “bind,” and “anneal” are used interchangeably.
- a primer comprises a sequence at least 80% identical or complementary to a template nucleic acid molecule. In an aspect, a primer comprises a sequence at least 85% identical or complementary to a template nucleic acid molecule. In an aspect, a primer comprises a sequence at least 90% identical or complementary to a template nucleic acid molecule. In an aspect, a primer comprises a sequence at least 95% identical or complementary to a template nucleic acid molecule. In an aspect, a primer comprises a sequence at least 99% identical or complementary to a template nucleic acid molecule. In an aspect, a primer comprises a sequence 100% identical or complementary to a template nucleic acid molecule.
- “polymerase chain reaction” (PCR) refers to a molecular biology technique that allows one to generate multiple copies of a targeted region of DNA. Copies of DNA made via PCR are termed “amplicons,” “amplified product,” or “amplification product.” These amplicons contain copies of a portion of a particular target nucleic acid template strand and/or its complementary sequence, which correspond in nucleotide sequence to the template oligonucleotide sequence and/or its complementary sequence.
- An amplification product can further comprise sequence specific to the primers and which flanks sequence which is a portion of the target nucleic acid and/or its complement.
- an amplicon comprises a length of between 30 nucleotides and 1000 nucleotides. In an aspect, an amplicon comprises a length of between 30 nucleotides and 750 nucleotides. In an aspect, an amplicon comprises a length of between 30 nucleotides and 500 nucleotides. In an aspect, an amplicon comprises a length of between 30 nucleotides and 250 nucleotides. In an aspect, an amplicon comprises a length of between 30 nucleotides and 100 nucleotides.
- an amplicon comprises a length of at least 20 nucleotides. In an aspect, an amplicon comprises a length of at least 25 nucleotides. In an aspect, an amplicon comprises a length of at least 30 nucleotides. In an aspect, an amplicon comprises a length of at least 50 nucleotides. In an aspect, an amplicon comprises a length of at least 75 nucleotides. In an aspect, an amplicon comprises a length of at least 100 nucleotides. In an aspect, an amplicon comprises a length of at least 125 nucleotides. In an aspect, an amplicon comprises a length of at least 150 nucleotides.
- an amplicon comprises a length of at least 175 nucleotides. In an aspect, an amplicon comprises a length of at least 200 nucleotides. In an aspect, an amplicon comprises a length of at least 250 nucleotides. In an aspect, an amplicon comprises a length of at least 300 nucleotides. In an aspect, an amplicon comprises a length of at least 400 nucleotides. In an aspect, an amplicon comprises a length of at least 500 nucleotides. In an aspect, an amplicon comprises a length of at least 600 nucleotides. In an aspect, an amplicon comprises a length of at least 700 nucleotides.
- an amplicon comprises a length of at least 800 nucleotides. In an aspect, an amplicon comprises a length of at least 900 nucleotides. In an aspect, an amplicon comprises a length of at least 1000 nucleotides. In an aspect, an amplicon comprises a length of at least 1250 nucleotides. In an aspect, an amplicon comprises a length of at least 1500 nucleotides. In an aspect, an amplicon comprises a length of at least 1750 nucleotides. In an aspect, an amplicon comprises a length of at least 2000 nucleotides. In an aspect, an amplicon is amplified from a target DNA sequence region. In an aspect, an amplicon is amplified from a template nucleic acid molecule.
- a method comprises purifying at least one amplicon.
- a method comprises sequencing at least one amplicon.
- a method comprises re-amplification of at least one amplicon using fluorescent dideoxynucleotide triphosphates (ddNTPs).
- an amplicon is sequenced via Sanger sequencing. See Sanger and Coulson, J. Mol. Biol., 94:441-446 (1975).
- an amplicon is sequenced via next-generation sequencing.
- Non-limiting examples of next-generation sequencing include single-molecule real-time sequencing (e.g., Pacific Biosciences), Ion Torrent sequencing, sequencing-by-synthesis (e.g., Illumina), sequencing by ligation (SOLiD sequencing), nanopore sequencing, and GenapSys sequencing.
- an amplicon is sequenced via Oxford Nanopore sequencing.
- a method comprises high-throughput sequencing.
- a method comprises subjecting a plurality of amplicons to high-throughput sequencing.
- “high-throughput sequencing” refers to any sequencing method that is capable of sequencing multiple (e.g., tens, hundreds, thousands, millions, hundreds of millions) DNA molecules in parallel.
- Sanger sequencing is not high-throughput sequencing.
- high-throughput sequencing comprises the use of a sequencing-by-synthesis (SBS) flow cell.
- an SBS flow cell is selected from the group consisting of an Illumina SBS flow cell and a Pacific Biosciences (PacBio) SBS flow cell.
- high-throughput sequencing is performed via electrical current measurements in conjunction with an Oxford nanopore.
- PCR requires a mixture comprising a targeted region of DNA to be amplified, a set of oligonucleotide primers that flank the targeted region of DNA, a thermostable DNA polymerase, and nucleotides.
- the mixture is subjected to thermal cycling in order to amplify the targeted region of DNA.
- thermal cycling often includes a denaturation stage to separate double-stranded DNA (dsDNA) into single strands; an annealing stage, which allows the primers to hybridize with the targeted region of DNA; and an extension stage, which allows the DNA polymerase to extend DNA from the primers, generating new dsDNA.
- the annealing and extension stages can be combined into a single stage.
- a DNA polymerase is a thermostable DNA polymerase.
- a “thermostable DNA polymerase” refers to DNA polymerases that can function at high temperatures (e.g., greater than 65°C) and can survive higher temperatures (e.g., up to about 100°C). Thermostable DNA polymerases often have maximal catalytic activity at temperatures between 70°C and 80°C.
- a thermostable DNA polymerase is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, and KAPA High Fidelity DNA polymerase.
- a DNA polymerase is a non-thermostable DNA polymerase.
- a “non-thermostable DNA polymerase” refers to DNA polymerases that cannot function at high temperatures.
- a non-thermostable DNA polymerase is selected from the group consisting of phi29 DNA polymerase and Bst DNA polymerase.
- a DNA polymerase is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, KAPA High Fidelity DNA polymerase, phi29 DNA polymerase, Klenow fragment, Bst DNA polymerase, T4 DNA polymerase, Vent® DNA polymerase, LongAmp® Taq DNA polymerase, and OneTaq® DNA polymerase.
- a DNA polymerase used in step (b) of a method provided herein is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, KAPA High Fidelity DNA polymerase, phi29 DNA polymerase, Klenow fragment, Bst DNA polymerase, T4 DNA polymerase, Vent® DNA polymerase, LongAmp® Taq DNA polymerase, and OneTaq® DNA polymerase.
- a DNA polymerase used in step (c) of a method provided herein is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, and KAPA High Fidelity DNA polymerase
- the term “blocker” refers to an oligonucleotide that is designed to selectively bind to a DNA template molecule to retard the amplification of a target sequence.
- a blocker is a DNA molecule.
- a mixture comprises a plurality of blockers.
- a blocker is an RNA molecule.
- a blocker comprises at least one continuous strand of from about 12 to about 100 nucleotides in length which strand preferably anneals to a to-be-blocked allele sequence relative to a non-blocked allele sequence, and further comprising a functional group or nucleotide sequence at its 3’ end that prevents enzymatic extension during an amplification process such as polymerase chain reaction.
- a blocker comprises between 11 and 100 nucleotides.
- a blocker comprises between 11 and 90 nucleotides.
- a blocker comprises between 11 and 80 nucleotides.
- a blocker comprises between 11 and 70 nucleotides.
- a blocker comprises between 11 and 60 nucleotides. In an aspect, a blocker comprises between 11 and 50 nucleotides. In an aspect, a blocker comprises between 11 and 40 nucleotides. In an aspect, a blocker comprises between 11 and 30 nucleotides. In an aspect, a blocker comprises between 11 and 20 nucleotides.
- a blocker comprises at least 11 nucleotides. In an aspect, a blocker comprises at least 12 nucleotides. In an aspect, a blocker comprises at least 15 nucleotides. In an aspect, a blocker comprises at least 20 nucleotides. In an aspect, a blocker comprises at least 25 nucleotides. In an aspect, a blocker comprises at least 30 nucleotides. In an aspect, a blocker comprises at least 40 nucleotides. In an aspect, a blocker comprises at least 50 nucleotides. In an aspect, a blocker comprises at least 60 nucleotides. In an aspect, a blocker comprises at least 70 nucleotides. In an aspect, a blocker comprises at least 80 nucleotides. In an aspect, a blocker comprises at least 90 nucleotides. In an aspect, a blocker comprises at least 100 nucleotides.
- a blocker comprises a chemical functionalization that prevents DNA polymerase extension.
- a blocker comprises a chemical functionalization that prevents DNA polymerase extension on its 3' end.
- a chemical functionalization comprises a 3-carbon spacer.
- a chemical functionalization comprises an inverted nucleotide.
- a chemical functionalization comprises a minor groove binder.
- a chemical functionalization comprises a dideoxynucleotide.
- a chemical functionalization is selected from the group consisting of a 3-carbon spacer, an inverted nucleotide, and a minor groove binder.
- a blocker and a primer have partially overlapping sequences and thus they compete to bind to a given target site.
- the region of overlap between the blocker and primer sequences is referred to as an “overlapping subsequence.”
- An “overlapping subsequence” comprises a nucleotide sequence of at least 3 nucleotides of a primer sequence that is homologous with the blocker sequence.
- a blocker and a primer comprise an overlapping sequence.
- a blocker and a forward primer comprise an overlapping sequence.
- a blocker and a reverse primer comprise an overlapping sequence.
- an overlapping sequence is positioned on the 3' end of a primer and the 5' end of a blocker.
- the primer also has a “non-overlapping subsequence,” which refers to the portion of the primer sequence that does not overlap with the blocker sequence.
- a blocker is a wildtype-specific blocker.
- a mixture comprises a plurality of wildtype-specific blockers.
- a plurality of wildtype-specific blockers are complementary to a plurality of target DNA regions (e.g., without being limiting, if two wildtype-specific blockers are present, one wildtype-specific blocker hybridizes (or binds) to a first locus, and the second wildtype-specific blocker hybridizes (or binds) to a second locus).
- PCR amplification comprises one or more wildtype-specific blockers.
- a wildtype-specific blocker refers to a blocker that is complementary to a wildtype sequence that is capable of retarding amplification of the wildtype sequence. Without being limiting, wildtype-specific blockers allow the selective amplification of non-wildtype alleles of a given locus.
- a wildtype-specific blocker binds to a “wildtype-specific blocker binding site” on a template nucleic acid molecule or a target DNA sequence region of interest.
- a wildtype-specific blocker binding site and an IFP binding site overlap by between 2 nucleotides and 30 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 3 nucleotides and 5 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 5 nucleotides and 7 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 7 nucleotides and 10 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 10 nucleotides and 15 nucleotides.
- a wildtype-specific blocker binding site and an IFP binding site overlap by between 15 nucleotides and 20 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 20 nucleotides and 25 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 25 nucleotides and 30 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 3 nucleotides and 10 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 5 nucleotides and 15 nucleotides.
- a wildtype-specific blocker binding site and an IFP binding site overlap by between 10 nucleotides and 20 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 15 nucleotides and 25 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 20 nucleotides and 30 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 5 nucleotides and 25 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by between 15 nucleotides and 30 nucleotides.
- a wildtype-specific blocker binding site and an IFP binding site overlap by at least 3 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 5 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 7 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 10 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 15 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 20 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by at least 25 nucleotides.
- a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 3 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 5 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 7 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 10 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 15 nucleotides.
- a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 20 nucleotides. In an aspect, a wildtype-specific blocker binding site and an IFP binding site overlap by less than or equal to 25 nucleotides.
- an IFP can be divided into two regions when paired with a wildtype-specific blocker: a target-specific portion, which does not overlap in sequence with the wildtype-specific blocker; and an overlapping region, which does overlap in sequence with the wildtype-specific blocker.
- a wildtype-specific blocker, when paired with an IFP, can also be divided into two regions: a blocker-unique sequence, which does not overlap in sequence with the IFP; and the overlapping region, which does overlap in sequence with the IFP.
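The division of an IFP and a wildtype-specific blocker into their target-specific, overlapping, and blocker-unique parts can be sketched in Python as below. The sketch assumes both oligonucleotides are written 5' to 3' on the same strand and that the overlap sits at the 3' end of the primer and the 5' end of the blocker, as described earlier for overlapping sequences. The function name and the example sequences are hypothetical.

```python
from typing import Tuple


def split_primer_and_blocker(primer: str, blocker: str) -> Tuple[str, str, str]:
    """Return (target_specific_portion, overlapping_region, blocker_unique).

    The overlapping region is taken to be the longest suffix of the
    primer that is also a prefix of the blocker.
    """
    max_len = min(len(primer), len(blocker))
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(max_len, 0, -1):
        if primer[-k:] == blocker[:k]:
            return primer[:-k], primer[-k:], blocker[k:]
    # No shared suffix/prefix: nothing overlaps.
    return primer, "", blocker


if __name__ == "__main__":
    parts = split_primer_and_blocker("ACGTTGCA", "TGCAGGTC")
    print(parts)
```

Each of the three returned segments could then be passed to a thermodynamic model to estimate the standard free energy of binding for that segment; no such model is implemented here.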
- an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -1 kcal/mol and -3 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -2 kcal/mol and -4 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -3 kcal/mol and -5 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -4 kcal/mol and -6 kcal/mol.
- an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -5 kcal/mol and -7 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -6 kcal/mol and -8 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -7 kcal/mol and -9 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -8 kcal/mol and -10 kcal/mol.
- an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -2 kcal/mol and -8 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -2 kcal/mol and -10 kcal/mol. In an aspect, an overlapping region between a wildtype-specific blocker and an IFP comprises a standard free energy of binding between -5 kcal/mol and -10 kcal/mol.
- the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -1 kcal/mol and -5 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -2 kcal/mol and -6 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -3 kcal/mol and -7 kcal/mol.
- the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -4 kcal/mol and -8 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -5 kcal/mol and -9 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -6 kcal/mol and -10 kcal/mol.
- the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -7 kcal/mol and -11 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -8 kcal/mol and -12 kcal/mol. In an aspect, the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -9 kcal/mol and -13 kcal/mol.
- the target-specific portion of an IFP that does not overlap with a wildtype-specific blocker comprises a standard free energy of binding between -1 kcal/mol and -13 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -1 kcal/mol and -6 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -2 kcal/mol and -7 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -3 kcal/mol and -8 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -4 kcal/mol and -9 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -5 kcal/mol and -10 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -6 kcal/mol and -11 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -7 kcal/mol and -12 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -8 kcal/mol and -13 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -9 kcal/mol and -14 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -10 kcal/mol and -15 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -11 kcal/mol and -16 kcal/mol.
- the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -1 kcal/mol and -16 kcal/mol. In an aspect, the blocker-unique sequence of a wildtype-specific blocker that does not overlap with an IFP sequence comprises a standard free energy of binding between -7 kcal/mol and -16 kcal/mol.
- the standard free energy of binding is calculated based on an annealing temperature of 60°C, double-stranded DNA, and a Na+ concentration of 0.18 M.
- an IFP, OFP, or an MFP comprises a standard free energy of binding of between -11.5 kcal/mol and -12.5 kcal/mol in a standard PCR buffer.
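As an illustration of how a standard free energy of binding at the stated 60°C annealing temperature might be evaluated, the following minimal Python sketch applies the general thermodynamic relation ΔG° = ΔH° − TΔS°. The function names and the enthalpy and entropy values are hypothetical and are not taken from this disclosure; the inputs are assumed to already reflect the stated duplex type and Na+ concentration, and no salt correction is applied here.

```python
def standard_free_energy(delta_h_kcal, delta_s_cal_per_k, temp_celsius=60.0):
    """Return dG in kcal/mol from dH (kcal/mol) and dS (cal/(mol*K)).

    Uses dG = dH - T*dS with T in kelvin. The default temperature mirrors
    the 60 C annealing temperature named in the text.
    """
    temp_kelvin = temp_celsius + 273.15
    return delta_h_kcal - temp_kelvin * (delta_s_cal_per_k / 1000.0)


def in_range(delta_g, weaker_bound, stronger_bound):
    """Check a range written as 'between -7 kcal/mol and -11 kcal/mol'.

    weaker_bound is the less negative limit (e.g. -7) and stronger_bound
    is the more negative limit (e.g. -11).
    """
    return stronger_bound <= delta_g <= weaker_bound


# Hypothetical duplex parameters, chosen only for illustration.
dg = standard_free_energy(delta_h_kcal=-100.0, delta_s_cal_per_k=-270.0)
print(round(dg, 2), in_range(dg, -7.0, -11.0))
```

In practice the enthalpy and entropy terms would come from a nearest-neighbor model or a design tool; this sketch only shows the final temperature-dependent step and the range comparison.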
- a wildtype-specific blocker comprises a terminator to prevent 3' to 5' DNA polymerase exonuclease activity.
- a terminator is selected from the group consisting of a three-carbon (C3) spacer and DXXDM, where D is a match between the blocker sequence and the template nucleic acid molecule sequence or target DNA region, M is a C3 spacer, and X is a mismatch between the blocker sequence and the template nucleic acid sequence or target DNA region. Additional terminators known in the art are also suitable for use. A non-limiting example of an additional terminator is a dideoxynucleotide.
- a wildtype-specific blocker comprises a terminator comprising a DNA overhang comprising four nucleotides.
- nucleic acid refers to any molecule, preferably a polymeric molecule, incorporating units of ribonucleic acid (RNA), deoxyribonucleic acid (DNA), or an analog thereof.
- a nucleic acid can be either single-stranded or double-stranded.
- a single-stranded nucleic acid can be one strand of a denatured double-stranded DNA. Alternatively, it can be a single-stranded nucleic acid not derived from any double-stranded DNA.
- a template nucleic acid is DNA.
- a template is RNA.
- Suitable nucleic acid molecules are DNA, including genomic DNA or cDNA.
- Other suitable nucleic acid molecules are RNA, including mRNA.
- a nucleic acid molecule provided herein is a DNA molecule.
- a nucleic acid molecule provided herein is an RNA molecule.
- a nucleic acid molecule provided herein is a cDNA molecule.
- DNA refers to deoxyribonucleic acid. DNA can be either single-stranded or double-stranded. In an aspect, DNA comprises complementary DNA (cDNA). DNA typically comprises four nucleotides: cytosine (C), guanine (G), adenine (A), and thymine (T). In an aspect, the sequence of a DNA molecule provided herein comprises one or more degenerate nucleotides. As used herein, a “degenerate nucleotide” refers to a nucleotide that can perform the same function or yield the same output as a structurally different nucleotide.
- Non-limiting examples of degenerate nucleotides include a C, G, or T nucleotide (B); an A, G, or T nucleotide (D); an A, C, or T nucleotide (H); a G or T nucleotide (K); an A or C nucleotide (M); any nucleotide (N); an A or G nucleotide (R); a G or C nucleotide (S); an A, C, or G nucleotide (V); an A or T nucleotide (W), and a C or T nucleotide (Y).
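The degenerate nucleotide codes listed above can be captured in a lookup table. The following Python sketch is illustrative only; the names `DEGENERATE_CODES` and `expand_degenerate` are hypothetical and do not appear in this disclosure. It expands a sequence containing degenerate codes into every concrete DNA sequence it represents.

```python
from itertools import product

# Mapping of each code to the nucleotides it can stand for, following
# the list in the text (B, D, H, K, M, N, R, S, V, W, Y) plus A, C, G, T.
DEGENERATE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "K": "GT",
    "M": "AC", "N": "ACGT", "R": "AG", "S": "CG",
    "V": "ACG", "W": "AT", "Y": "CT",
}


def expand_degenerate(sequence):
    """Return all concrete sequences encoded by a degenerate sequence."""
    choices = [DEGENERATE_CODES[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand_degenerate("ARY"))  # R = A or G, Y = C or T
```

The number of expanded sequences grows multiplicatively with each degenerate position, so a sketch like this is only practical for short sequences with few degenerate codes.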
- primer specific for a target nucleic acid refers to a level of complementarity between the primer and the target such that there exists an annealing temperature at which the primer will anneal to and mediate amplification of the target nucleic acid and will not anneal to or mediate amplification of nontarget sequences present in a sample.
- this application provides a method of preparing a nucleic acid template into a DNA library for sequencing, the method comprising the steps of: (a) mixing a nucleic acid template, a Ligation Tail Adapter (LTA) molecule, a DNA ligase, and reagents for DNA ligase activity to form a first mixture; where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand which two strands form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (b) subjecting the first mixture to a suitable temperature to allow for DNA ligation to form a ligation product mixture; (c) introducing to the ligation product mixture from step (b) a target-specific Outer Forward Primer (OFP) and a Splint, a DNA polymerase, and reagents for DNA polymerase activity to form a second mixture; where the OFP comprises a nucleic acid sequence that can specifically anneal
- a ligation product is purified after step (a) of a method provided herein.
- purification comprises column purification.
- purification comprises beads purification.
- purification comprises the use of phenol and/or chloroform.
- purification comprises diluting the ligation product between 10-fold and 10,000-fold.
- purification comprises a method selected from the group consisting of column purification, beads purification, diluting the ligation product between 10-fold and 10,000-fold, and any combination thereof.
- purification comprises diluting the ligation product at least 20-fold. In an aspect, purification comprises diluting the ligation product at least 50-fold. In an aspect, purification comprises diluting the ligation product at least 100-fold. In an aspect, purification comprises diluting the ligation product at least 200-fold. In an aspect, purification comprises diluting the ligation product at least 500-fold. In an aspect, purification comprises diluting the ligation product at least 1000-fold. In an aspect, purification comprises diluting the ligation product at least 2000-fold. In an aspect, purification comprises diluting the ligation product at least 5000-fold.
- purification comprises diluting the ligation product at least 7500-fold. In an aspect, purification comprises diluting the ligation product at least 9000-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 20-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 50-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 100-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 200-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 500-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 1000-fold.
- purification comprises diluting the ligation product less than or equal to 2000-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 5000-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 7500-fold. In an aspect, purification comprises diluting the ligation product less than or equal to 9000-fold.
- this application provides a method for preparing a nucleic acid for sequencing, the method comprising: (i) ligating a Ligation Tail Adapter (LTA) molecule to a nucleic acid comprising a known target nucleotide sequence to produce a ligation product, where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand, and where the LTA-top strand and the LTA-bottom strand form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (ii) amplifying the ligation product using a first target-specific primer that specifically anneals to the known target nucleotide sequence and a Splint, where the Splint comprises a 5' end subsequence that is not complementary to a subsequence of the LTA, and a 3' end subsequence that is complementary to the DT region to produce an amplification product; and
- a “Universal Primer” or “UP” refers to a primer that is complementary to nucleotide sequences that are common in a particular set of DNA molecules. Without being limiting, universal primers are able to bind to a wide variety of DNA templates (e.g., they are not specific to a single locus).
- a UP comprises between 10 nucleotides and 70 nucleotides. In an aspect, a UP comprises between 10 nucleotides and 15 nucleotides. In an aspect, a UP comprises between 15 nucleotides and 20 nucleotides. In an aspect, a UP comprises between 20 nucleotides and 25 nucleotides. In an aspect, a UP comprises between 25 nucleotides and 30 nucleotides. In an aspect, a UP comprises between 30 nucleotides and 35 nucleotides. In an aspect, a UP comprises between 35 nucleotides and 40 nucleotides.
- a UP comprises between 40 nucleotides and 45 nucleotides. In an aspect, a UP comprises between 45 nucleotides and 50 nucleotides. In an aspect, a UP comprises between 50 nucleotides and 55 nucleotides. In an aspect, a UP comprises between 55 nucleotides and 60 nucleotides. In an aspect, a UP comprises between 60 nucleotides and 65 nucleotides. In an aspect, a UP comprises between 65 nucleotides and 70 nucleotides. In an aspect, a UP comprises between 10 nucleotides and 20 nucleotides. In an aspect, a UP comprises between 15 nucleotides and 25 nucleotides.
- a UP comprises between 20 nucleotides and 30 nucleotides. In an aspect, a UP comprises between 25 nucleotides and 35 nucleotides. In an aspect, a UP comprises between 35 nucleotides and 45 nucleotides. In an aspect, a UP comprises between 10 nucleotides and 25 nucleotides. In an aspect, a UP comprises between 15 nucleotides and 50 nucleotides. In an aspect, a UP comprises between 20 nucleotides and 55 nucleotides. In an aspect, a UP comprises between 35 nucleotides and 60 nucleotides. In an aspect, a UP comprises between 50 nucleotides and 70 nucleotides.
- a UP comprises at least 12 nucleotides. In an aspect, a UP comprises at least 15 nucleotides. In an aspect, a UP comprises at least 20 nucleotides. In an aspect, a UP comprises at least 25 nucleotides. In an aspect, a UP comprises at least 30 nucleotides. In an aspect, a UP comprises at least 35 nucleotides. In an aspect, a UP comprises at least 40 nucleotides. In an aspect, a UP comprises at least 45 nucleotides. In an aspect, a UP comprises at least 50 nucleotides. In an aspect, a UP comprises at least 55 nucleotides. In an aspect, a UP comprises at least 60 nucleotides.
- a UP comprises less than or equal to 12 nucleotides. In an aspect, a UP comprises less than or equal to 15 nucleotides. In an aspect, a UP comprises less than or equal to 20 nucleotides. In an aspect, a UP comprises less than or equal to 25 nucleotides. In an aspect, a UP comprises less than or equal to 30 nucleotides. In an aspect, a UP comprises less than or equal to 35 nucleotides. In an aspect, a UP comprises less than or equal to 40 nucleotides. In an aspect, a UP comprises less than or equal to 45 nucleotides. In an aspect, a UP comprises less than or equal to 50 nucleotides. In an aspect, a UP comprises less than or equal to 55 nucleotides. In an aspect, a UP comprises less than or equal to 60 nucleotides.
- a UP contains no sequence that is complementary to any 5-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 6-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 7-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 8-nucleotide (nt) or longer continuous subsequence of an LTA.
- a UP contains no sequence that is complementary to any 9-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 10-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 11-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 12-nucleotide (nt) or longer continuous subsequence of an LTA.
- a UP contains no sequence that is complementary to any 13-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 14-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 15-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 16-nucleotide (nt) or longer continuous subsequence of an LTA.
- a UP contains no sequence that is complementary to any 17-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 18-nucleotide (nt) or longer continuous subsequence of an LTA. In an aspect, a UP contains no sequence that is complementary to any 19-nucleotide (nt) or longer continuous subsequence of an LTA.
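The constraint that a UP contains no sequence complementary to any continuous LTA subsequence of a given length or longer can be screened computationally. The Python sketch below is one possible approach and rests on assumptions not stated in this disclosure: complementarity is taken to mean perfect antiparallel Watson-Crick pairing, sequences contain only A, C, G, and T, and the function names and example sequences are hypothetical. Because any complementary run longer than k contains a complementary run of exactly k, checking k-mers is sufficient.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def has_complementary_run(primer, adapter_strands, k):
    """True if the primer can pair perfectly with any k-nt (or longer)
    continuous stretch of any adapter strand."""
    primer_kmers = kmers(primer.upper(), k)
    for strand in adapter_strands:
        # A primer k-mer pairs with the strand if it appears in the
        # strand's reverse complement.
        if primer_kmers & kmers(reverse_complement(strand), k):
            return True
    return False


# Hypothetical sequences for illustration only.
print(has_complementary_run("GATTACAGG", ["CCTGTAATC"], 5))    # complementary pair
print(has_complementary_run("AAAAAAAAAA", ["AAAAAAAAAA"], 5))  # no pairing
```

Passing both the LTA-top and LTA-bottom strands in `adapter_strands` screens a candidate primer against the whole adapter in one call.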
- a primer is an Inner Forward Primer (IFP).
- an IFP binds (e.g., hybridizes) to an IFP binding site on a template nucleic acid molecule.
- a primer is an Outer Forward Primer (OFP).
- an OFP binds (e.g., hybridizes) to an OFP binding site on a template nucleic acid molecule.
- a primer is a Middle Forward Primer (MFP).
- an MFP binds (e.g., hybridizes) to an MFP binding site on a template nucleic acid molecule.
- an IFP binding site is positioned, at least partially, 3’ to an OFP binding site.
- an MFP binding site partially overlaps an OFP binding site, an IFP binding site, or both.
- an IFP binding site, an OFP binding site, or an MFP binding site is on the positive strand of a DNA or cDNA molecule.
- an IFP binding site, an OFP binding site, or an MFP binding site is on the negative strand of a DNA or cDNA molecule.
- a method comprises a plurality of OFPs. In an aspect, a method comprises a plurality of IFPs. In an aspect, a method comprises a plurality of MFPs. As used herein, a “plurality” refers to 2 or more. In an aspect, a plurality comprises 3 or more. In an aspect, a plurality comprises 4 or more. In an aspect, a plurality comprises 5 or more. In an aspect, a plurality comprises 10 or more. In an aspect, a plurality comprises 15 or more. In an aspect, a plurality comprises 20 or more. In an aspect, a plurality comprises 30 or more.
- an IFP and an OFP comprise overlapping sequences. In an aspect, an IFP and an OFP do not comprise overlapping sequences. In an aspect, an IFP and an OFP have a partially or completely overlapping binding site on a target DNA sequence region of interest.
- an OFP comprises between 10 nucleotides and 100 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 90 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 80 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 70 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 60 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 50 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 40 nucleotides. In an aspect, an OFP comprises between 10 nucleotides and 30 nucleotides.
- an OFP comprises between 10 nucleotides and 20 nucleotides. In an aspect, an OFP comprises between 15 nucleotides and 90 nucleotides. In an aspect, an OFP comprises between 20 nucleotides and 80 nucleotides. In an aspect, an OFP comprises between 25 nucleotides and 70 nucleotides. In an aspect, an OFP comprises between 30 nucleotides and 60 nucleotides. In an aspect, an OFP comprises between 35 nucleotides and 50 nucleotides. In an aspect, an OFP comprises between 15 nucleotides and 50 nucleotides. In an aspect, an OFP comprises between 20 nucleotides and 40 nucleotides. In an aspect, an OFP comprises between 25 nucleotides and 30 nucleotides. In an aspect, an OFP comprises between 20 nucleotides and 30 nucleotides.
- an OFP comprises a 5' overhang which does not bind to the reverse strand of the nucleic acid template.
- a 5' overhang comprises between 1 nucleotide and 100 nucleotides.
- a 5' overhang comprises between 2 nucleotides and 95 nucleotides.
- a 5' overhang comprises between 3 nucleotides and 90 nucleotides.
- a 5' overhang comprises between 4 nucleotides and 80 nucleotides.
- a 5' overhang comprises between 5 nucleotides and 70 nucleotides.
- a 5' overhang comprises between 6 nucleotides and 60 nucleotides.
- a 5' overhang comprises between 7 nucleotides and 50 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 90 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 80 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 70 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 60 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 50 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 40 nucleotides.
- a 5' overhang comprises between 5 nucleotides and 30 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 20 nucleotides. In an aspect, a 5' overhang comprises between 5 nucleotides and 10 nucleotides. In an aspect, a 5' overhang comprises between 10 nucleotides and 20 nucleotides. In an aspect, a 5' overhang comprises between 15 nucleotides and 25 nucleotides. In an aspect, a 5' overhang comprises between 20 nucleotides and 30 nucleotides. In an aspect, a 5' overhang comprises between 25 nucleotides and 35 nucleotides.
- a 5' overhang comprises between 30 nucleotides and 40 nucleotides. In an aspect, a 5' overhang comprises between 35 nucleotides and 45 nucleotides. In an aspect, a 5' overhang comprises between 40 nucleotides and 50 nucleotides. In an aspect, a 5' overhang comprises between 45 nucleotides and 55 nucleotides. In an aspect, a 5' overhang comprises between 50 nucleotides and 60 nucleotides. In an aspect, a 5' overhang comprises between 55 nucleotides and 65 nucleotides. In an aspect, a 5' overhang comprises between 60 nucleotides and 70 nucleotides.
- a 5' overhang comprises between 65 nucleotides and 75 nucleotides. In an aspect, a 5' overhang comprises between 70 nucleotides and 80 nucleotides. In an aspect, a 5' overhang comprises between 75 nucleotides and 85 nucleotides. In an aspect, a 5' overhang comprises between 10 nucleotides and 80 nucleotides. In an aspect, a 5' overhang comprises between 15 nucleotides and 70 nucleotides. In an aspect, a 5' overhang comprises between 20 nucleotides and 60 nucleotides. In an aspect, a 5' overhang comprises between 25 nucleotides and 50 nucleotides.
- an OFP anneals to a template nucleic acid at a temperature between 55°C and 72°C.
- an IFP anneals to a template nucleic acid at a temperature between 55°C and 72°C.
- an IFP comprises between 10 nucleotides and 70 nucleotides. In an aspect, an IFP comprises between 10 nucleotides and 15 nucleotides. In an aspect, an IFP comprises between 15 nucleotides and 20 nucleotides. In an aspect, an IFP comprises between 20 nucleotides and 25 nucleotides. In an aspect, an IFP comprises between 25 nucleotides and 30 nucleotides. In an aspect, an IFP comprises between 30 nucleotides and 35 nucleotides. In an aspect, an IFP comprises between 35 nucleotides and 40 nucleotides. In an aspect, an IFP comprises between 40 nucleotides and 45 nucleotides.
- an IFP comprises between 45 nucleotides and 50 nucleotides. In an aspect, an IFP comprises between 50 nucleotides and 55 nucleotides. In an aspect, an IFP comprises between 55 nucleotides and 60 nucleotides. In an aspect, an IFP comprises between 60 nucleotides and 65 nucleotides. In an aspect, an IFP comprises between 65 nucleotides and 70 nucleotides. In an aspect, an IFP comprises between 10 nucleotides and 20 nucleotides. In an aspect, an IFP comprises between 15 nucleotides and 25 nucleotides. In an aspect, an IFP comprises between 20 nucleotides and 30 nucleotides.
- an IFP comprises between 25 nucleotides and 35 nucleotides. In an aspect, an IFP comprises between 35 nucleotides and 45 nucleotides. In an aspect, an IFP comprises between 10 nucleotides and 25 nucleotides. In an aspect, an IFP comprises between 15 nucleotides and 50 nucleotides. In an aspect, an IFP comprises between 20 nucleotides and 55 nucleotides. In an aspect, an IFP comprises between 35 nucleotides and 60 nucleotides. In an aspect, an IFP comprises between 50 nucleotides and 70 nucleotides.
- an IFP comprises at least 12 nucleotides. In an aspect, an IFP comprises at least 15 nucleotides. In an aspect, an IFP comprises at least 20 nucleotides. In an aspect, an IFP comprises at least 25 nucleotides. In an aspect, an IFP comprises at least 30 nucleotides. In an aspect, an IFP comprises at least 35 nucleotides. In an aspect, an IFP comprises at least 40 nucleotides. In an aspect, an IFP comprises at least 45 nucleotides. In an aspect, an IFP comprises at least 50 nucleotides. In an aspect, an IFP comprises at least 55 nucleotides. In an aspect, an IFP comprises at least 60 nucleotides.
- an IFP comprises less than or equal to 12 nucleotides. In an aspect, an IFP comprises less than or equal to 15 nucleotides. In an aspect, an IFP comprises less than or equal to 20 nucleotides. In an aspect, an IFP comprises less than or equal to 25 nucleotides. In an aspect, an IFP comprises less than or equal to 30 nucleotides. In an aspect, an IFP comprises less than or equal to 35 nucleotides. In an aspect, an IFP comprises less than or equal to 40 nucleotides. In an aspect, an IFP comprises less than or equal to 45 nucleotides. In an aspect, an IFP comprises less than or equal to 50 nucleotides. In an aspect, an IFP comprises less than or equal to 55 nucleotides. In an aspect, an IFP comprises less than or equal to 60 nucleotides.
- an IFP binds a template nucleic acid 5' relative to an OFP, and the IFP and OFP do not overlap when bound to the template nucleic acid.
- an IFP binding site and an OFP binding site do not overlap.
- an IFP binding site is separated from a breakpoint of a known template region by a gap distance comprising between 5 nucleotides and 30 nucleotides.
- a gap distance comprises at least 7 nucleotides.
- a gap distance comprises at least 10 nucleotides.
- a gap distance comprises at least 15 nucleotides.
- a gap distance comprises at least 20 nucleotides.
- a gap distance comprises at least 25 nucleotides.
- a gap distance comprises less than or equal to 7 nucleotides.
- a gap distance comprises less than or equal to 10 nucleotides.
- a gap distance comprises less than or equal to 15 nucleotides. In an aspect, a gap distance comprises less than or equal to 20 nucleotides. In an aspect, a gap distance comprises less than or equal to 25 nucleotides. In an aspect, a gap distance comprises between 5 nucleotides and 7 nucleotides. In an aspect, a gap distance comprises between 7 nucleotides and 10 nucleotides. In an aspect, a gap distance comprises between 10 nucleotides and 15 nucleotides. In an aspect, a gap distance comprises between 15 nucleotides and 20 nucleotides. In an aspect, a gap distance comprises between 20 nucleotides and 25 nucleotides.
- a gap distance comprises between 25 nucleotides and 30 nucleotides. In an aspect, a gap distance comprises between 5 nucleotides and 15 nucleotides. In an aspect, a gap distance comprises between 10 nucleotides and 20 nucleotides. In an aspect, a gap distance comprises between 15 nucleotides and 25 nucleotides. In an aspect, a gap distance comprises between 20 nucleotides and 30 nucleotides.
- When two primer binding sites are positioned such that there are 0 nucleotides between them, and the two primer binding sites do not overlap, the primer binding sites are considered to be “adjacent.”
- an OFP binding site and an IFP binding site are adjacent.
- an IFP binding site and an OFP binding site overlap.
- when binding sites “overlap,” it means that the same sequence in a nucleic acid molecule (e.g., a nucleic acid template) is complementary to both an IFP (in part) and an OFP (in part).
- an “IFP binding site” refers to the location in a nucleic acid molecule to which an IFP is capable of binding. Typically, an IFP binding site will be completely, or partially, complementary to the IFP. Similarly, an “OFP binding site” refers to the location in a nucleic acid molecule to which an OFP is capable of binding. Typically, an OFP binding site will be completely, or partially, complementary to the OFP.
- any binding site provided herein e.g., an IFP binding site, an OFP binding site, a MFP binding site, a blocker binding site
- an overlapping region comprises between 1 nucleotide and 40 nucleotides. In an aspect, an overlapping region comprises between 3 nucleotides and 5 nucleotides. In an aspect, an overlapping region comprises between 5 nucleotides and 7 nucleotides. In an aspect, an overlapping region comprises between 7 nucleotides and 10 nucleotides. In an aspect, an overlapping region comprises between 10 nucleotides and 15 nucleotides. In an aspect, an overlapping region comprises between 15 nucleotides and 20 nucleotides. In an aspect, an overlapping region comprises between 20 nucleotides and 25 nucleotides.
- an overlapping region comprises between 25 nucleotides and 30 nucleotides. In an aspect, an overlapping region comprises between 30 nucleotides and 35 nucleotides. In an aspect, an overlapping region comprises between 35 nucleotides and 40 nucleotides. In an aspect, an overlapping region comprises between 3 nucleotides and 10 nucleotides. In an aspect, an overlapping region comprises between 5 nucleotides and 15 nucleotides. In an aspect, an overlapping region comprises between 10 nucleotides and 20 nucleotides. In an aspect, an overlapping region comprises between 15 nucleotides and 25 nucleotides. In an aspect, an overlapping region comprises between 20 nucleotides and 30 nucleotides.
- an overlapping region comprises between 25 nucleotides and 35 nucleotides. In an aspect, an overlapping region comprises between 5 nucleotides and 25 nucleotides. In an aspect, an overlapping region comprises between 15 nucleotides and 40 nucleotides. In an aspect, an overlapping region comprises between 20 nucleotides and 40 nucleotides.
- an overlapping region comprises at least 3 nucleotides. In an aspect, an overlapping region comprises at least 5 nucleotides. In an aspect, an overlapping region comprises at least 7 nucleotides. In an aspect, an overlapping region comprises at least 10 nucleotides. In an aspect, an overlapping region comprises at least 15 nucleotides. In an aspect, an overlapping region comprises at least 20 nucleotides. In an aspect, an overlapping region comprises at least 25 nucleotides. In an aspect, an overlapping region comprises at least 30 nucleotides. In an aspect, an overlapping region comprises at least 35 nucleotides.
- an overlapping region comprises less than or equal to 3 nucleotides. In an aspect, an overlapping region comprises less than or equal to 5 nucleotides. In an aspect, an overlapping region comprises less than or equal to 7 nucleotides. In an aspect, an overlapping region comprises less than or equal to 10 nucleotides. In an aspect, an overlapping region comprises less than or equal to 15 nucleotides. In an aspect, an overlapping region comprises less than or equal to 20 nucleotides. In an aspect, an overlapping region comprises less than or equal to 25 nucleotides. In an aspect, an overlapping region comprises less than or equal to 30 nucleotides. In an aspect, an overlapping region comprises less than or equal to 35 nucleotides.
- When two primer binding sites do not overlap and are not adjacent, they are considered to have a “gap” between them. In an aspect, there is a gap between an OFP binding site and an IFP binding site.
- a gap comprises between 1 nucleotide and 50 nucleotides. In an aspect, a gap comprises between 3 nucleotides and 5 nucleotides. In an aspect, a gap comprises between 5 nucleotides and 7 nucleotides. In an aspect, a gap comprises between 7 nucleotides and 10 nucleotides. In an aspect, a gap comprises between 10 nucleotides and 15 nucleotides. In an aspect, a gap comprises between 15 nucleotides and 20 nucleotides. In an aspect, a gap comprises between 20 nucleotides and 25 nucleotides. In an aspect, a gap comprises between 25 nucleotides and 30 nucleotides.
- a gap comprises between 30 nucleotides and 35 nucleotides. In an aspect, a gap comprises between 35 nucleotides and 40 nucleotides. In an aspect, a gap comprises between 3 nucleotides and 10 nucleotides. In an aspect, a gap comprises between 5 nucleotides and 15 nucleotides. In an aspect, a gap comprises between 10 nucleotides and 20 nucleotides. In an aspect, a gap comprises between 15 nucleotides and 25 nucleotides. In an aspect, a gap comprises between 20 nucleotides and 30 nucleotides. In an aspect, a gap comprises between 25 nucleotides and 35 nucleotides.
- a gap comprises between 5 nucleotides and 25 nucleotides. In an aspect, a gap comprises between 15 nucleotides and 40 nucleotides. In an aspect, a gap comprises between 20 nucleotides and 40 nucleotides.
- a gap comprises at least 3 nucleotides. In an aspect, a gap comprises at least 5 nucleotides. In an aspect, a gap comprises at least 7 nucleotides. In an aspect, a gap comprises at least 10 nucleotides. In an aspect, a gap comprises at least 15 nucleotides. In an aspect, a gap comprises at least 20 nucleotides. In an aspect, a gap comprises at least 25 nucleotides. In an aspect, a gap comprises at least 30 nucleotides. In an aspect, a gap comprises at least 35 nucleotides.
- a gap comprises less than or equal to 3 nucleotides.
- a gap comprises less than or equal to 5 nucleotides. In an aspect, a gap comprises less than or equal to 7 nucleotides. In an aspect, a gap comprises less than or equal to 10 nucleotides. In an aspect, a gap comprises less than or equal to 15 nucleotides. In an aspect, a gap comprises less than or equal to 20 nucleotides. In an aspect, a gap comprises less than or equal to 25 nucleotides. In an aspect, a gap comprises less than or equal to 30 nucleotides. In an aspect, a gap comprises less than or equal to 35 nucleotides.
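The “adjacent,” “overlap,” and “gap” relationships between two primer binding sites defined above can be expressed as a small classification routine. The Python sketch below is illustrative only and makes assumptions not found in this disclosure: each binding site is represented as a half-open `(start, end)` coordinate pair on the same strand of the template, each site is non-empty, and the function name and coordinates are hypothetical.

```python
def classify_sites(site_a, site_b):
    """Classify two binding sites as 'overlap', 'adjacent', or 'gap'.

    Each site is a half-open interval (start, end) in template
    coordinates. Returns the relationship and the number of nucleotides
    shared (for overlap) or separating the sites (for adjacent or gap).
    """
    # Order the sites so the first one starts no later than the second.
    (a_start, a_end), (b_start, b_end) = sorted([site_a, site_b])

    shared = min(a_end, b_end) - b_start
    if shared > 0:
        return ("overlap", shared)

    spacing = b_start - a_end
    if spacing == 0:
        # Zero nucleotides between the sites and no shared positions.
        return ("adjacent", 0)
    return ("gap", spacing)


# Hypothetical OFP and IFP binding-site coordinates.
print(classify_sites((100, 120), (115, 140)))  # sites share 5 nt
print(classify_sites((100, 120), (120, 140)))  # sites touch
print(classify_sites((100, 120), (130, 150)))  # 10 nt between sites
```

The returned count can then be compared against any of the overlapping-region or gap length ranges recited above.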
- a middle PCR comprises using an MFP that binds to an MFP binding site on at least one template nucleic acid molecule, where the MFP binding site partially overlaps with the OFP binding site, the IFP binding site, or both.
- a middle PCR comprises using an MFP that comprises a 5' region starting from the second nucleotide of the OFP 5' region to the second nucleotide of the OFP 3' region, and the MFP comprises a 3' region starting from the second nucleotide of the IFP 5' region to the second nucleotide of the IFP 3' region.
- an MFP comprises between 10 nucleotides and 70 nucleotides. In an aspect, an MFP comprises between 10 nucleotides and 15 nucleotides. In an aspect, an MFP comprises between 15 nucleotides and 20 nucleotides. In an aspect, an MFP comprises between 20 nucleotides and 25 nucleotides. In an aspect, an MFP comprises between 25 nucleotides and 30 nucleotides. In an aspect, an MFP comprises between 30 nucleotides and 35 nucleotides. In an aspect, an MFP comprises between 35 nucleotides and 40 nucleotides. In an aspect, an MFP comprises between 40 nucleotides and 45 nucleotides.
- an MFP comprises between 45 nucleotides and 50 nucleotides. In an aspect, an MFP comprises between 50 nucleotides and 55 nucleotides. In an aspect, an MFP comprises between 55 nucleotides and 60 nucleotides. In an aspect, an MFP comprises between 60 nucleotides and 65 nucleotides. In an aspect, an MFP comprises between 65 nucleotides and 70 nucleotides. In an aspect, an MFP comprises between 10 nucleotides and 20 nucleotides. In an aspect, an MFP comprises between 15 nucleotides and 25 nucleotides. In an aspect, an MFP comprises between 20 nucleotides and 30 nucleotides.
- an MFP comprises between 25 nucleotides and 35 nucleotides. In an aspect, an MFP comprises between 35 nucleotides and 45 nucleotides. In an aspect, an MFP comprises between 10 nucleotides and 25 nucleotides. In an aspect, an MFP comprises between 15 nucleotides and 50 nucleotides. In an aspect, an MFP comprises between 20 nucleotides and 55 nucleotides. In an aspect, an MFP comprises between 35 nucleotides and 60 nucleotides. In an aspect, an MFP comprises between 50 nucleotides and 70 nucleotides.
- an MFP comprises at least 12 nucleotides. In an aspect, an MFP comprises at least 15 nucleotides. In an aspect, an MFP comprises at least 20 nucleotides. In an aspect, an MFP comprises at least 25 nucleotides. In an aspect, an MFP comprises at least 30 nucleotides. In an aspect, an MFP comprises at least 35 nucleotides. In an aspect, an MFP comprises at least 40 nucleotides. In an aspect, an MFP comprises at least 45 nucleotides. In an aspect, an MFP comprises at least 50 nucleotides. In an aspect, an MFP comprises at least 55 nucleotides. In an aspect, an MFP comprises at least 60 nucleotides.
- an MFP comprises less than or equal to 12 nucleotides. In an aspect, an MFP comprises less than or equal to 15 nucleotides. In an aspect, an MFP comprises less than or equal to 20 nucleotides. In an aspect, an MFP comprises less than or equal to 25 nucleotides. In an aspect, an MFP comprises less than or equal to 30 nucleotides. In an aspect, an MFP comprises less than or equal to 35 nucleotides. In an aspect, an MFP comprises less than or equal to 40 nucleotides. In an aspect, an MFP comprises less than or equal to 45 nucleotides. In an aspect, an MFP comprises less than or equal to 50 nucleotides. In an aspect, an MFP comprises less than or equal to 55 nucleotides. In an aspect, an MFP comprises less than or equal to 60 nucleotides.
- an IFP comprises a 5'-end single-stranded nucleic acid sequence.
- an IFP comprises a 5'-end single-stranded nucleic acid sequence which does not bind to a nucleic acid template.
- a 5'-end single-stranded nucleic acid sequence comprises a sequencing adapter.
- a UP comprises a sequencing adapter.
- a sequencing adapter is an Illumina sequencing adapter.
- a sequencing adapter is a nanopore sequencing adapter.
- a sequencing adapter is an Ion Torrent sequencing adapter.
- a “sequencing adapter” refers to a sequencing platform-specific sequence used for fragment recognition by a sequencing instrument.
- a sequencing adapter is a full-length sequencing adapter.
- a sequencing adapter is a partial sequencing adapter.
- a UP comprises a partial sequencing adapter which is amplified with a sequencing index primer.
- a “forward primer” hybridizes to the anti-sense strand of a dsDNA molecule.
- a “reverse primer” hybridizes to the sense strand of the dsDNA molecule.
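The two primer definitions above can be illustrated with a minimal Python sketch. The sequence, the 12-nucleotide primer length, and the function names are hypothetical and chosen only for illustration; the check assumes perfect complementarity, which real primer binding does not require.

```python
# Illustrative sketch only: the sequence and helper names are assumptions,
# not taken from the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def hybridizes_to(primer: str, strand: str) -> bool:
    """True if the primer is perfectly complementary to a site on the strand.

    Both arguments are written 5'->3'.
    """
    return reverse_complement(primer) in strand.upper()

sense = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
antisense = reverse_complement(sense)

# A forward primer has the same sequence as part of the sense strand,
# so it anneals to the anti-sense strand.
forward_primer = sense[:12]
# A reverse primer is the reverse complement of part of the sense strand,
# so it anneals to the sense strand.
reverse_primer = reverse_complement(sense[-12:])

print(hybridizes_to(forward_primer, antisense))
print(hybridizes_to(reverse_primer, sense))
```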
- a “nucleic acid template” refers to a nucleic acid molecule that is to be amplified in whole or in part.
- a nucleic acid template comprises a target DNA sequence region of interest.
- a nucleic acid template comprises DNA.
- a nucleic acid template comprises cDNA.
- cDNA molecules can be generated through the reverse transcription of an RNA sample (particularly a messenger RNA sample).
- a nucleic acid template comprises double-stranded DNA.
- a nucleic acid template comprises RNA.
- a nucleic acid template comprises an amplicon DNA molecule generated by a DNA polymerase.
- a nucleic acid template is from a physically, chemically, or enzymatically treated product of a biological DNA or RNA sample.
- a nucleic acid template is from a product of a fragmentation process.
- Non-limiting examples of a fragmentation process include ultrasonication and enzymatic fragmentation.
- a nucleic acid template undergoes an end-repair process prior to the initiation of any method provided herein (e.g., before step (a) of a method).
- an end-repair process is performed using a T4 DNA ligase enzyme.
- a nucleic acid template is ligated using a blunt TA ligase enzyme.
- a nucleic acid template is a biological DNA derived from a sample of cells from biofluids such as blood, urine, saliva, feces, cerebrospinal fluid, interstitial fluid, and synovial fluid, or from a tissue such as a biopsy tissue or a surgically resected tissue.
- a nucleic acid template is a eukaryotic DNA molecule. In an aspect, a nucleic acid template is a prokaryotic DNA molecule. In an aspect, a nucleic acid template is a viral DNA molecule. In an aspect, a nucleic acid template is a viroid DNA molecule.
- a nucleic acid template is selected from an animal nucleic acid molecule, a plant nucleic acid molecule, a fungal nucleic acid molecule, and a protozoan nucleic acid molecule.
- a nucleic acid template is selected from a bacterial nucleic acid molecule and an archaeal nucleic acid molecule.
- a nucleic acid template molecule is selected from the group consisting of an Adenoviridae nucleic acid molecule, a Herpesviridae nucleic acid molecule, a Poxviridae nucleic acid molecule, a Papillomaviridae nucleic acid molecule, a Parvoviridae nucleic acid molecule, a Reoviridae nucleic acid molecule, a Coronaviridae nucleic acid molecule, a Picornaviridae nucleic acid molecule, a Togaviridae nucleic acid molecule, an Orthomyxoviridae nucleic acid molecule, a Rhabdoviridae nucleic acid molecule, a Retroviridae nucleic acid molecule, a Hepadnaviridae nucleic acid molecule, a Baculoviridae nucleic acid molecule, a Geminiviridae nucleic acid molecule, a Flavivirida
- a virus nucleic acid molecule is selected from the group consisting of a human orthopneumovirus (HRSV) nucleic acid molecule, an influenza virus nucleic acid molecule, a human immunodeficiency virus (HIV) nucleic acid molecule, a hepatitis B virus (HBV) nucleic acid molecule, and a human papillomavirus (HPV) nucleic acid molecule.
- a nucleic acid template molecule is selected from the group consisting of a Pospiviroidae nucleic acid molecule and an Avsunviroidae nucleic acid molecule.
- a nucleic acid template is a human DNA molecule.
- a nucleic acid template is an animal DNA molecule.
- a nucleic acid template is a rodent DNA molecule.
- a nucleic acid template is a plant DNA molecule.
- a nucleic acid template is a fungal DNA molecule.
- a nucleic acid template is an environmental specimen DNA molecule.
- a “target DNA sequence region of interest” refers to a region of a DNA molecule that is desired to be amplified from a nucleic acid template.
- a target DNA sequence region of interest comprises one or more exon sequences.
- a target DNA sequence region of interest comprises one or more intron sequences.
- a target DNA sequence region of interest comprises one or more exon sequences and one or more intron sequences.
- a target DNA sequence region of interest comprises an intergenic region.
- a target DNA sequence region of interest consists of one or more exon sequences.
- a target DNA sequence region of interest consists of one or more intron sequences.
- a target DNA sequence region of interest consists of one or more exon sequences and one or more intron sequences.
- a target DNA sequence region of interest consists of an intergenic region.
- a target DNA sequence region of interest comprises a cDNA molecule. In an aspect, a target DNA sequence region of interest comprises at least 20 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 50 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 100 nucleotides.
- a target DNA sequence region of interest comprises at least 250 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 500 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 1000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 2000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 3000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 4000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 5000 nucleotides.
- a target DNA sequence region of interest comprises at least 6000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 7000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 8000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 9000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 10,000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 11,000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 12,000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 13,000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 14,000 nucleotides. In an aspect, a target DNA sequence region of interest comprises at least 15,000 nucleotides.
- a target DNA sequence region of interest comprises between 10 nucleotides and 5000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 20 nucleotides and 5000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 50 nucleotides and 5000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 100 nucleotides and 5000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 500 nucleotides and 5000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 10 nucleotides and 1000 nucleotides.
- a target DNA sequence region of interest comprises between 20 nucleotides and 1000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 50 nucleotides and 1000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 100 nucleotides and 1000 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 500 nucleotides and 1000 nucleotides.
- a target DNA sequence region of interest comprises between 10 nucleotides and 500 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 20 nucleotides and 500 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 50 nucleotides and 500 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 100 nucleotides and 500 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 10 nucleotides and 100 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 20 nucleotides and 100 nucleotides. In an aspect, a target DNA sequence region of interest comprises between 50 nucleotides and 100 nucleotides.
- a target DNA sequence comprising at least 1000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 2000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 3000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 4000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 5000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 6000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 7000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 8000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 9000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 10,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 11,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 12,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 13,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 14,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- a target DNA sequence comprising at least 15,000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- tiling comprises the use of a plurality of OFPs and a plurality of IFPs. In an aspect, tiling comprises the use of a plurality of OFPs and a plurality of IFPs, where a gap is present between each OFP binding site and each IFP binding site.
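The tiling arrangement described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the primer lengths, the 5-nucleotide gap, the 150-nucleotide step between tiles, and the function name are hypothetical, and the disclosure does not prescribe a placement algorithm. The same function could be applied to the negative strand by treating that strand, read 5'->3', as the coordinate system.

```python
from typing import List, Tuple

def tile_primer_sites(target_len: int, ofp_len: int, ifp_len: int,
                      gap: int, step: int) -> List[Tuple[int, int]]:
    """Return (ofp_start, ifp_start) 0-based positions tiled along one strand.

    Each IFP binding site begins `gap` nucleotides after the end of its
    paired OFP binding site. All parameters are hypothetical.
    """
    sites = []
    ofp_start = 0
    # Keep adding tiles while the OFP site, gap, and IFP site still fit.
    while ofp_start + ofp_len + gap + ifp_len <= target_len:
        ifp_start = ofp_start + ofp_len + gap
        sites.append((ofp_start, ifp_start))
        ofp_start += step
    return sites

plus_sites = tile_primer_sites(target_len=1000, ofp_len=20, ifp_len=20,
                               gap=5, step=150)
for ofp_start, ifp_start in plus_sites:
    print(ofp_start, ifp_start)
```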
- a target DNA sequence region of interest is within a prokaryotic DNA molecule. In an aspect, a target DNA sequence region of interest is within a eukaryotic DNA molecule. In an aspect, a target DNA sequence region of interest is within a viral DNA molecule. In an aspect, a target DNA sequence region of interest is within a viroid DNA molecule.
- a eukaryotic DNA molecule is selected from the group consisting of an animal DNA molecule, a plant DNA molecule, and a fungal DNA molecule.
- an animal DNA molecule is a human DNA molecule.
- a prokaryotic DNA molecule is selected from the group consisting of a bacterial DNA molecule and an archaeal DNA molecule.
- a virus DNA molecule is selected from the group consisting of an Adenoviridae DNA molecule, a Herpesviridae DNA molecule, a Poxviridae DNA molecule, a Papillomaviridae DNA molecule, a Parvoviridae DNA molecule, a Reoviridae DNA molecule, a Coronaviridae DNA molecule, a Picornaviridae DNA molecule, a Togaviridae DNA molecule, an Orthomyxoviridae DNA molecule, a Rhabdoviridae DNA molecule, a Retroviridae DNA molecule, a Hepadnaviridae DNA molecule, a Baculoviridae DNA molecule, a Geminiviridae DNA molecule, a Flaviviridae DNA molecule, a Filoviridae DNA molecule, a Paramyxoviridae DNA molecule, and a Pneumoviridae DNA molecule.
- a virus DNA molecule is selected from the group consisting of a human orthopneumovirus (HRSV) cDNA molecule, an influenza virus cDNA molecule, a human immunodeficiency virus (HIV) cDNA molecule, a hepatitis B virus (HBV) cDNA molecule, and a human papillomavirus (HPV) cDNA molecule.
- a viroid DNA molecule is selected from the group consisting of a Pospiviroidae DNA molecule and an Avsunviroidae DNA molecule.
- a “Sample Barcode” (SB) refers to a unique sequence used to tag or track an individual sample, which allows sample multiplexing so that large numbers of libraries can be pooled and sequenced simultaneously in a single sequencing run.
- an LTA comprises an SB sequence.
- an SB is positioned in an LTA-top strand.
- an SB is positioned in an LTA-bottom strand.
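To illustrate how a sample barcode supports pooling, the following minimal Python sketch assigns pooled reads back to samples. It assumes, purely for illustration, that the SB sits at the 5' end of each read and is matched exactly; the barcode sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

def demultiplex(reads: List[str], barcodes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group reads by an exact-match sample barcode at the 5' end.

    The barcode is stripped from each assigned read. Reads with no
    matching barcode are collected under "unassigned".
    """
    by_sample: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for sb, sample in barcodes.items():
            if read.startswith(sb):
                by_sample[sample].append(read[len(sb):])
                break
        else:
            # No barcode matched this read.
            by_sample["unassigned"].append(read)
    return dict(by_sample)

# Hypothetical barcodes and pooled reads.
barcodes = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACTTTGGG", "GGGGGGAAAAAA"]
pooled = demultiplex(reads, barcodes)
print(pooled)
```

A practical pipeline would usually tolerate one or more barcode mismatches; exact matching is used here only to keep the sketch short.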
- an LTA comprises a Unique Molecular Identifier sequence.
- a “Unique Molecular Identifier” (UMI) sequence refers to a short molecular barcode sequence used to uniquely tag each molecule in a sample library. Through UMIs, each nucleic acid in a starting material is tagged with a unique molecular barcode. Sequencing with UMIs reduces the rate of false-positive variant calls and increases the sensitivity of variant detection and quantification.
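One common way UMIs reduce false-positive calls is by collapsing reads that share a UMI into a consensus, so that an error seen in only a minority of copies of one original molecule is discarded. The Python sketch below shows a simple majority vote; it is an assumption-laden illustration rather than the method of this disclosure, and the reads, UMIs, and function name are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

def consensus_by_umi(reads: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collapse (umi, sequence) pairs into one majority sequence per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    # For each UMI family, keep the most frequently observed sequence.
    return {umi: Counter(seqs).most_common(1)[0][0]
            for umi, seqs in groups.items()}

# Hypothetical reads: the third read carries a PCR or sequencing error.
reads = [
    ("AACGTTC", "ACGTACGT"),
    ("AACGTTC", "ACGTACGT"),
    ("AACGTTC", "ACTTACGT"),
    ("GGTCAAG", "ACGTACGT"),
]
consensus = consensus_by_umi(reads)
print(len(consensus), consensus)
```

Four reads collapse to two original molecules, and the single erroneous read does not appear in the consensus.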
- a UMI sequence comprises between 7 degenerate nucleotides and 30 degenerate nucleotides. In an aspect, a UMI sequence comprises between 5 degenerate nucleotides and 40 degenerate nucleotides. In an aspect, a UMI sequence comprises between 10 degenerate nucleotides and 20 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 5 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 7 degenerate nucleotides. In an aspect, a UMI sequence comprises at least 10 degenerate nucleotides.
- a UMI sequence comprises at least 15 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 50 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 40 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 30 degenerate nucleotides. In an aspect, a UMI sequence comprises fewer than 20 degenerate nucleotides.
- each degenerate nucleotide in a UMI sequence is selected from the group consisting of N, B, D, H, V, S, W, Y, R, M, and K.
- a UMI sequence comprises between 7 degenerate nucleotides and 30 degenerate nucleotides, where each degenerate nucleotide is selected from the group consisting of N, B, D, H, V, S, W, Y, R, M, and K.
- a UMI sequence comprises at least one degenerate nucleotide selected from the group consisting of R, Y, S, W, K, M, B, D, H, V, and N.
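The degenerate nucleotide symbols listed above follow the standard IUPAC nucleotide codes. As a minimal Python sketch, the table below maps each code to the bases it represents, and two helper functions (hypothetical names) count how many distinct sequences a degenerate UMI pattern can encode and draw one random instance of it.

```python
import random

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def umi_diversity(pattern: str) -> int:
    """Number of distinct sequences a degenerate pattern can encode."""
    total = 1
    for code in pattern:
        total *= len(IUPAC[code])
    return total

def random_umi(pattern: str, rng: random.Random) -> str:
    """Draw one concrete sequence matching the degenerate pattern."""
    return "".join(rng.choice(IUPAC[code]) for code in pattern)

print(umi_diversity("NNNNNNN"))   # 4**7 possible sequences
print(umi_diversity("HHHHHHH"))   # 3**7 possible sequences (no G)
print(random_umi("NNNNNNN", random.Random(0)))
```

Longer or less restricted patterns give more possible UMIs, which lowers the chance that two different starting molecules receive the same tag.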
- a UMI sequence comprises a mixture of between 10 and 100 defined DNA sequences with a minimum pairwise Hamming distance of between 2 and 5.
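The minimum pairwise Hamming distance of a set of defined UMI sequences can be checked with a short Python sketch. The four sequences below are a hypothetical toy set, smaller than the 10 to 100 sequences recited above, and the function names are illustrative.

```python
from itertools import combinations
from typing import Sequence

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_hamming(seqs: Sequence[str]) -> int:
    """Smallest Hamming distance over all pairs in the set."""
    return min(hamming(a, b) for a, b in combinations(seqs, 2))

# Hypothetical toy set of defined UMI sequences.
umis = ["AAAAAA", "AACCGG", "CCAACC", "GGGGAA"]
print(min_pairwise_hamming(umis))
```

A set with minimum pairwise Hamming distance d can detect up to d - 1 substitution errors in a UMI without confusing it with another member of the set.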
- a UMI sequence comprises a mixture of between 10 and 1000 defined DNA sequences with a minimum pairwise Levenshtein distance of between 2 and 5.
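Unlike Hamming distance, Levenshtein (edit) distance also counts insertions and deletions, which matters when a UMI can gain or lose a base. The following Python sketch uses the standard dynamic-programming recurrence; the example sequences and function name are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions
    needed to turn sequence a into sequence b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]

# A one-base shift differs at every position under Hamming distance,
# but needs only one deletion and one insertion under Levenshtein distance.
print(levenshtein("ACGTAC", "CGTACA"))
```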
- an IFP is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, an IFP is present in a mixture at a concentration between 1 nM and 500 nM. In an aspect, an IFP is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, an IFP is present in a mixture at a concentration between 1 nM and 100 nM. In an aspect, an IFP is present in a mixture at a concentration between 1 nM and 50 nM. In an aspect, an IFP is present in a mixture at a concentration between 1 nM and 10 nM.
- an IFP is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, an IFP is present in a mixture at a concentration between 50 nM and 500 nM. In an aspect, an IFP is present in a mixture at a concentration between 50 nM and 250 nM. In an aspect, an IFP is present in a mixture at a concentration between 50 nM and 100 nM. In an aspect, an IFP is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, an IFP is present in a mixture at a concentration between 100 nM and 500 nM.
- an IFP is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, an IFP is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, an IFP is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, an IFP is present in a mixture at a concentration between 500 nM and 1000 nM.
- an OFP is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, an OFP is present in a mixture at a concentration between 1 nM and 500 nM. In an aspect, an OFP is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, an OFP is present in a mixture at a concentration between 1 nM and 100 nM. In an aspect, an OFP is present in a mixture at a concentration between 1 nM and 50 nM. In an aspect, an OFP is present in a mixture at a concentration between 1 nM and 10 nM.
- an OFP is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, an OFP is present in a mixture at a concentration between 50 nM and 500 nM. In an aspect, an OFP is present in a mixture at a concentration between 50 nM and 250 nM. In an aspect, an OFP is present in a mixture at a concentration between 50 nM and 100 nM. In an aspect, an OFP is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, an OFP is present in a mixture at a concentration between 100 nM and 500 nM.
- an OFP is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, an OFP is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, an OFP is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, an OFP is present in a mixture at a concentration between 500 nM and 1000 nM.
- an MFP is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, an MFP is present in a mixture at a concentration between 1 nM and 500 nM. In an aspect, an MFP is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, an MFP is present in a mixture at a concentration between 1 nM and 100 nM. In an aspect, an MFP is present in a mixture at a concentration between 1 nM and 50 nM. In an aspect, an MFP is present in a mixture at a concentration between 1 nM and 10 nM.
- an MFP is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, an MFP is present in a mixture at a concentration between 50 nM and 500 nM. In an aspect, an MFP is present in a mixture at a concentration between 50 nM and 250 nM. In an aspect, an MFP is present in a mixture at a concentration between 50 nM and 100 nM. In an aspect, an MFP is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, an MFP is present in a mixture at a concentration between 100 nM and 500 nM.
- an MFP is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, an MFP is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, an MFP is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, an MFP is present in a mixture at a concentration between 500 nM and 1000 nM.
- a UP is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, a UP is present in a mixture at a concentration between 1 nM and 500 nM. In an aspect, a UP is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, a UP is present in a mixture at a concentration between 1 nM and 100 nM.
- a UP is present in a mixture at a concentration between 1 nM and 50 nM. In an aspect, a UP is present in a mixture at a concentration between 1 nM and 10 nM. In an aspect, a UP is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, a UP is present in a mixture at a concentration between 50 nM and 500 nM. In an aspect, a UP is present in a mixture at a concentration between 50 nM and 250 nM.
- a UP is present in a mixture at a concentration between 50 nM and 100 nM. In an aspect, a UP is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, a UP is present in a mixture at a concentration between 100 nM and 500 nM. In an aspect, a UP is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, a UP is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, a UP is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, a UP is present in a mixture at a concentration between 500 nM and 1000 nM.
- a blocker is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, a blocker is present in a mixture at a concentration between 1 nM and 500 nM.
- a blocker is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, a blocker is present in a mixture at a concentration between 1 nM and 100 nM. In an aspect, a blocker is present in a mixture at a concentration between 1 nM and 50 nM. In an aspect, a blocker is present in a mixture at a concentration between 1 nM and 10 nM. In an aspect, a blocker is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, a blocker is present in a mixture at a concentration between 50 nM and 500 nM.
- a blocker is present in a mixture at a concentration between 50 nM and 250 nM. In an aspect, a blocker is present in a mixture at a concentration between 50 nM and 100 nM. In an aspect, a blocker is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, a blocker is present in a mixture at a concentration between 100 nM and 500 nM. In an aspect, a blocker is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, a blocker is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, a blocker is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, a blocker is present in a mixture at a concentration between 500 nM and 1000 nM.
- a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 1000 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 500 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 250 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 100 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 50 nM.
- a wildtype-specific blocker is present in a mixture at a concentration between 1 nM and 10 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 50 nM and 1000 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 50 nM and 500 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 50 nM and 250 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 50 nM and 100 nM.
- a wildtype-specific blocker is present in a mixture at a concentration between 100 nM and 1000 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 100 nM and 500 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 100 nM and 250 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 250 nM and 1000 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 250 nM and 500 nM. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration between 500 nM and 1000 nM.
- a wildtype-specific blocker is present in a mixture at a concentration that is between 5 times and 20 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 5 times and 15 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 5 times and 10 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 5 times and 7.5 times higher than the concentration of an IFP in the mixture.
- a wildtype-specific blocker is present in a mixture at a concentration that is between 7.5 times and 20 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 7.5 times and 15 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 7.5 times and 10 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 10 times and 20 times higher than the concentration of an IFP in the mixture.
- a wildtype-specific blocker is present in a mixture at a concentration that is between 10 times and 15 times higher than the concentration of an IFP in the mixture. In an aspect, a wildtype-specific blocker is present in a mixture at a concentration that is between 15 times and 20 times higher than the concentration of an IFP in the mixture.
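The relative concentrations above reduce to simple arithmetic, sketched here in Python. The sketch reads "N times higher" as a blocker concentration equal to N times the IFP concentration, which is an interpretive assumption; the 50 nM IFP value and the function name are hypothetical.

```python
from typing import Tuple

def blocker_concentration_range(ifp_nM: float,
                                low_fold: float = 5.0,
                                high_fold: float = 20.0) -> Tuple[float, float]:
    """Return the (min, max) blocker concentration in nM for a given
    IFP concentration, assuming the fold values are simple multiples."""
    return (ifp_nM * low_fold, ifp_nM * high_fold)

# Hypothetical IFP concentration of 50 nM with a 5x to 20x blocker excess.
low, high = blocker_concentration_range(50.0)
print(low, high)
```

With these assumed inputs the blocker range is 250 nM to 1000 nM, which also falls within the absolute 1 nM to 1000 nM range recited for the wildtype-specific blocker.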
- this application provides a method for preparing a nucleic acid for sequencing, the method comprising: (i) ligating a Ligation Tail Adapter (LTA) molecule to a nucleic acid comprising a known target nucleotide sequence to produce a ligation product, where the LTA molecule comprises an LTA-top strand and an LTA-bottom strand, and where the LTA-top strand and the LTA-bottom strand form a Double-Stranded End (DSE) and a DNA Tail (DT) region; (ii) amplifying the ligation product using a first target-specific primer that specifically anneals to the known target nucleotide sequence and a Splint, where the Splint comprises a 5' end subsequence that is not complementary to a subsequence of the LTA, and a 3' end subsequence that is complementary to the DT region to produce an a
- the method further comprises mechanically shearing a nucleic acid molecule preparation to obtain a nucleic acid molecule prior to step (i). In an aspect, the method further comprises end-repairing a mechanically sheared nucleic acid molecule. In an aspect, the method further comprises phosphorylating a mechanically sheared nucleic acid molecule. In an aspect, the method further comprises subjecting an RNA molecule to a reverse transcriptase regimen to generate a DNA molecule (e.g., a cDNA molecule). As used herein, a “reverse transcriptase regimen” refers to any protocol known in the art that uses a reverse transcriptase to generate a cDNA molecule from an RNA molecule.
- the method further comprises adenylating a nucleic acid molecule to produce a 3'-adenosine overhang on the nucleic acid molecule, and where a DSE comprises a 3' thymine overhang prior to step (i).
- ligating comprises performing an overhang ligating reaction.
- ligating comprises the use of a T4 DNA ligase enzyme.
- ligating comprises the use of a T3 DNA ligase enzyme.
- ligating comprises the use of a T7 DNA ligase enzyme.
- ligating comprises performing a TA ligation reaction.
- nucleic acid molecule preparation refers to any substance or mixture that comprises a biological sample.
- biological sample refers to a material obtained or isolated from a fresh or preserved biological sample or synthetically created source that contains nucleic acids. Any nucleic acid molecule provided herein can be obtained from a biological sample.
- Biological samples include at least one cell, fetal cell, cell culture, tissue specimen, blood, serum, plasma, saliva, urine, tear, vaginal secretion, sweat, lymph fluid, cerebrospinal fluid, mucosa secretion, peritoneal fluid, ascites fluid, fecal matter, body exudates, umbilical cord blood, chorionic villi, amniotic fluid, embryonic tissue, multicellular embryo, lysate, extract, solution, or reaction mixture suspected of containing nucleic acids.
- a biological sample is obtained from the environment (e.g., from soil, water, or air).
- a biological sample is obtained directly from an organism.
- a biological sample is obtained from a living organism. In an aspect, a biological sample is obtained from a deceased organism. In an aspect, a biological sample is obtained from a cell line. A biological sample can comprise DNA, RNA, or both. In an aspect, a biological sample is from a healthy organism. In an aspect, a biological sample is from a diseased organism. In an aspect, a biological sample is from a mutagenized sample. In an aspect, the nucleic acids obtained from a biological sample can be converted to cDNA.
- a biological sample is a prokaryotic biological sample.
- a biological sample is a eukaryotic biological sample.
- a biological sample is an animal biological sample.
- a biological sample is a plant biological sample.
- a biological sample is a fungal biological sample.
- a biological sample is a protozoan biological sample.
- a biological sample is a mammalian biological sample.
- a biological sample is a primate biological sample.
- a biological sample is a human biological sample.
- a second target-specific primer can specifically anneal to a portion of a known target nucleotide sequence within an amplification product.
- a known target nucleotide sequence comprises a sequence associated with a gene rearrangement.
- this application provides a method of determining the nucleotide sequence contiguous to a known target nucleotide sequence, the method comprising: (a) ligating a target nucleic acid molecule comprising the known target nucleotide sequence with a universal Ligation Tail Adapter (LTA), where the universal LTA comprises a non-amplification strand and an amplification strand to produce a ligation product; (b) amplifying a portion of the target nucleic acid molecule and the amplification strand of the universal LTA with a Splint and a first target-specific primer from the ligation product to produce a first amplicon; (c) amplifying a portion of the first amplicon with a Universal Primer (UP) and a second target-specific primer to produce a second amplicon; and (d) sequencing the second amplicon using a first sequencing primer and a second sequencing primer; where the universal LTA comprises a ligatable Double-Stranded End (DSE), where the universal L
- this application provides a method for determining the nucleotide sequence contiguous to a known target nucleotide sequence of 10 or more nucleotides, the method comprising: (i) ligating a universal Ligation Tail Adapter (LTA) to a nucleic acid molecule comprising the known target nucleotide sequence to produce a ligation product; (ii) amplifying the ligation product via polymerase chain reaction using a Splint that specifically anneals to the universal LTA, and a first target-specific primer that specifically anneals to the known target nucleotide sequence to produce a first amplification product; (iii) amplifying the first amplification product via polymerase chain reaction using a Splint-specific primer and a second target-specific primer, where the second target-specific primer is nested relative to the first target-specific primer to produce a second amplification product; and (iv) sequencing the second amplification product using a first sequencing primer and
- this application provides a method of determining if a subject in need of treatment for cancer will be responsive to a given treatment, the method comprising: detecting, in a tumor sample obtained from the subject, the presence of an oncogene rearrangement according to any method provided herein; where the subject is determined to be responsive to a treatment targeting the oncogene rearrangement product if the presence of the oncogene rearrangement is detected.
- this disclosure provides a method of treating cancer, the method comprising: detecting, in a tumor sample obtained from a subject in need of treatment for cancer, the presence of an oncogene rearrangement according to any method provided herein; and administering a cancer treatment which is effective against tumors comprising the oncogene rearrangement.
- an oncogene rearrangement is an ALK oncogene rearrangement.
- an oncogene rearrangement is an ALK oncogene rearrangement, and where a subject will be responsive to a treatment selected from the group consisting of an ALK inhibitor; crizotinib (PF-02341066); AP26113; LDK378; 3-39; AF802; IPI-504; ASP3026; AP-26113; X-396; GSK-1838705A; CH5424802; and NVP-TAE684.
- an oncogene rearrangement is an ROS1 oncogene rearrangement.
- an oncogene rearrangement is an ROS1 oncogene rearrangement, and where a subject will be responsive to a treatment selected from the group consisting of an ALK inhibitor; crizotinib (PF-02341066); AP26113; LDK378; 3-39; AF802; IPI-504; ASP3026; AP-26113; X-396; GSK-1838705A; CH5424802; and NVP-TAE684.
- an oncogene rearrangement is an RET oncogene rearrangement.
- an oncogene rearrangement is an RET oncogene rearrangement, and where a subject will be responsive to a treatment selected from the group consisting of a RET inhibitor; DP-2490; DP-3636; SU5416; BAY 43-9006; BAY 73-4506 (regorafenib); ZD6474; NVP-AST487; sorafenib; RPI-1; XL184; vandetanib; sunitinib; imatinib; pazopanib; axitinib; motesanib; gefitinib; and withaferin A.
- a cancer is lung cancer.
- a cancer is selected from the group consisting of breast cancer, colorectal cancer, endometrial cancer, fallopian tube cancer, ovarian cancer, primary peritoneal cancer, gastric cancer, melanoma, pancreatic cancer, prostate cancer, sarcoma, carcinoma, leukemia, brain cancer, central nervous system cancer, adrenal cortex cancer, gallbladder cancer, urinary tract cancer, thyroid cancer, liver cancer, kidney cancer, eye cancer, and lymphoma.
- the LTA approach can be used for gene fusion detection.
- the coding sequence of a gene can be on either the (+) strand or the (-) strand of the human genome. As used herein, sequences are referenced based on the coding strand, with upstream defined as the sequence toward the 5' end of the transcribed RNA sequence and downstream defined as the sequence toward the 3' end of the transcribed RNA sequence.
- FIG. 1 shows a schematic illustration of an example of NGS library construction for targeted RNA and DNA sequencing using a ligation tail adapter (LTA).
- a DNA template molecule is first ligated to the LTA.
- the LTA comprises an LTA-top strand and an LTA-bottom strand. These two strands form a Double-Stranded End (DSE) for ligation with a nucleic acid template molecule, and a DNA Tail (DT) region.
- DSE Double-Stranded End
- DT region comprises or is a single-stranded DNA
- a DT region comprises or is a double-stranded DNA.
- an LTA comprises at least one single-stranded DT sequence in the LTA-top strand or the LTA-bottom strand, which cannot bind with each other.
- a DT is a double-stranded DNA that comprises a UMI and/or an SB single-stranded DNA between the DSE and the DT region.
- a DNA template contains genomic DNA.
- a DNA template contains sheared genomic DNA.
- a DNA template contains cDNA, which is produced from an RNA reverse transcription.
- An LTA comprises DSE and DT regions (Figure 2).
- a DSE has a length of 5 - 50 bp.
- a DT comprises a double-stranded DNA.
- a DT comprises a single-stranded DNA.
- a DT comprises a double-stranded DNA with a top strand that has over 80%, over 90%, or 100% identity to the reverse complement of the bottom strand.
- a DT region comprises a single-stranded Unique Molecular Identifier (UMI) or a Sample Barcode (SB) between the DSE and the DT region.
- UMI Unique Molecular Identifier
- SB Sample Barcode
- the single-stranded DNA is in the LTA-top strand. In some embodiments, the single-stranded DNA is in the LTA-bottom strand. In some embodiments, the single-stranded DNA is in the LTA-top and the LTA-bottom strands. In some embodiments, the single-stranded DNA length ranges from 5 nt to 100 nt. In some embodiments, an LTA-top strand comprises a UMI or an SB between the DSE and a single-stranded DNA. In some embodiments, the UMI and the SB are in an LTA-bottom strand.
- An outer PCR step occurs after an LTA is ligated to a DNA template, and comprises an OFP as a forward primer and a Splint as a reverse primer.
- the number of PCR cycles ranges from 1 cycle to 40 cycles.
- a Splint comprises the full-length reverse complement, or a part thereof, of the LTA-top strand and a unique 5' overhang that cannot bind to the LTA-top strand (Figure 4).
- the length of the unique single-stranded end ranges from 1 nt to 100 nt.
- a Splint binds to the DT region of an LTA-top strand from the second base of the DT 5' region to the 10th nt of the 3' region. In some embodiments, a Splint binds to the DT starting from the first base of the 5' region of the DT. In some embodiments, a Splint binds to the LTA-top strand starting from the second base of the 5' region of the LTA-top strand to the second base of the 3' region of the DSE. In some embodiments, a Splint binds to an LTA-top strand starting from the first base of the 5' end of the LTA-top strand. In some embodiments, a Splint comprises a hairpin structure at the 5' end and a single-stranded DNA at the 3' end.
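The Splint layout described above (a 3' portion that is the reverse complement of the DT region of the LTA-top strand, preceded by a unique 5' overhang) can be illustrated with a minimal Python sketch. The sequences and the helper names `reverse_complement` and `build_splint` are hypothetical and chosen only for illustration; the sketch does not verify that the overhang is unable to bind the LTA-top strand.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_splint(dt_top: str, overhang_5p: str) -> str:
    """Assemble a Splint (5'->3'): a unique 5' overhang followed by the
    reverse complement of the DT region of the LTA-top strand."""
    return overhang_5p + reverse_complement(dt_top)


# Hypothetical sequences, not taken from the disclosure.
dt_top = "ACGTTGCAAGGCTTAC"   # DT region of the LTA-top strand, 5'->3'
overhang = "GGATCCTA"         # unique 5' overhang of the Splint

splint = build_splint(dt_top, overhang)
print(splint)
```

The 3' portion of the resulting Splint anneals to the DT region, while the 5' overhang stays single-stranded and can later serve as a binding site for a Universal Primer.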
- an IFP binding site is immediately adjacent to an OFP binding site, which means the OFP and the IFP tile the same template without a gap or an overlap (Figure 5a).
- an IFP binding site and an OFP binding site have a gap of 1 to 50 nt (Figure 5b).
- an IFP binding site and an OFP binding site e.g., the IFP 5' end and the OFP 3' end
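The relationship between the OFP and IFP binding sites (adjacent, or separated by a gap of 1 to 50 nt) can be checked with a short Python sketch. The coordinate convention (0-based, half-open positions on the same template strand) and the function names are assumptions made for this illustration only.

```python
def ofp_ifp_gap(ofp_end: int, ifp_start: int) -> int:
    """Signed distance in nt between the OFP 3' end and the IFP 5' end.

    ofp_end is the position just past the last OFP base, and ifp_start is
    the position of the first IFP base (0-based, half-open; assumed here).
    """
    return ifp_start - ofp_end


def classify_layout(gap: int, max_gap: int = 50) -> str:
    """Label the primer layout based on the signed gap."""
    if gap < 0:
        return "overlap"
    if gap == 0:
        return "adjacent"
    if gap <= max_gap:
        return "gap"
    return "gap exceeds range"


# Hypothetical coordinates.
print(classify_layout(ofp_ifp_gap(120, 120)))
print(classify_layout(ofp_ifp_gap(120, 135)))
```

A signed gap keeps the three cases (overlap, adjacency, gap) in one number, which makes it simple to filter candidate primer pairs.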
- a UP sequence serves as a reverse primer that contains the same region of a Splint (Figure 6).
- a UP primer comprises a reverse complement of part of the outer PCR amplicon positive strand and a unique 5' overhang that cannot bind to the outer PCR amplicon (UP-1).
- a UP is the same as the 5' overhang of the Splint (UP-2).
- a UP comprises a reverse complement of a DT in the LTA-top strand and a 5' overhang region of a Splint (UP-3).
- the number of inner PCR cycles is between 1 cycle and 40 cycles. After the inner PCR, the amplicon is defined as the Nested PCR Amplicon.
- index primers can be added to both sides of the Nested PCR Amplicon for paired-end sequencing in an NGS platform (Figure 7).
- An adapter PCR comprises an inner AD-FP as a forward primer and a UP as a reverse primer.
- An inner AD-FP comprises adapter sequences in the 5' end and an IFP sequence in the 3' end.
- an index PCR comprises a sequencing index 1 as a forward primer and a sequencing index 2 as a reverse primer.
- the sequencing index 1 is an Illumina P5 index primer.
- the sequencing index 2 is an Illumina P7 index primer.
- the sequencing index only contains one side for single-end sequencing.
- the sequencing index can be a nanopore primer for nanopore sequencing.
- the sequencing index can be an adapter for Ion Torrent sequencing.
- a UMI or SB addition is prepared following the protocol in Figure 1 if the UMI or SB is in the LTA-top strand.
- a UMI or SB in the LTA-bottom strand is added following the workflow shown in Figure 14.
- a UMI-containing forward primer is added in a PCR reaction for 1 cycle of PCR to add the UMI.
- a library can then be prepared following the outer PCR, the inner PCR and the index PCR.
- the inner PCR can be combined with a Blocker Displacement Amplification (BDA) (Figure 13), and the overall workflow is named Adjacent Blocker Displacement Amplification (ABDA).
- BDA Blocker Displacement Amplification
- ABDA Adjacent Blocker Displacement Amplification
- An IFP and a Blocker are designed to have a certain degree of sequence overlap with several nucleotides at the 3' end.
- a 3' region of each IFP sequence is identical in sequence to a 5' region of the corresponding Blocker. In some embodiments, this region has a length between 5 nt and 10 nt. In other embodiments, this region has a length between 3 nt and 25 nt.
- the binding of the IFP or the Blocker to the template will be mutually exclusive: with high probability (without being bound by any theory), a three-stranded molecule comprising the template, the IFP, and the Blocker colocalized via DNA hybridization interactions will rapidly dissociate, releasing a single-stranded forward primer or single-stranded Blocker into the solution.
- the number of overlapping nucleotides between the forward primer and the Blocker is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20.
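Because the 3' region of the IFP is identical in sequence to the 5' region of the Blocker, the overlap length can be found as the longest suffix of the IFP that is also a prefix of the Blocker. The following Python sketch shows one way to compute it; the sequences and the function name `ifp_blocker_overlap` are hypothetical.

```python
def ifp_blocker_overlap(ifp: str, blocker: str) -> int:
    """Length of the longest IFP 3' suffix that equals a Blocker 5' prefix.

    Both sequences are written 5'->3' on the same strand.
    Returns 0 when the two oligonucleotides share no such region.
    """
    max_len = min(len(ifp), len(blocker))
    for n in range(max_len, 0, -1):
        if ifp[-n:] == blocker[:n]:
            return n
    return 0


# Hypothetical sequences sharing the 7-nt region "GGATCCA".
ifp = "ACGTGCTAGCTAGGATCCA"
blocker = "GGATCCATTGACCGTAAGC"

overlap = ifp_blocker_overlap(ifp, blocker)
print(overlap, 4 <= overlap <= 20)
```

The final check compares the computed overlap with the 4 to 20 nt values listed above.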
- the standard free energy of binding of the overlapping nucleotides to their reverse complementary sequences is -4 kcal/mol at approximately 50 °C, approximately 55 °C, approximately 60 °C, approximately 65 °C, or approximately 70 °C in a buffer suitable for PCR.
- the binding of a Blocker to its reverse complementary sequence has a computed melting temperature of approximately 55 °C, approximately 60 °C, approximately 65 °C, approximately 70 °C, approximately 75 °C, or approximately 80 °C in a buffer suitable for PCR, at Blocker concentrations of between 100 nM and 5 µM.
- the binding of a Blocker to its reverse complementary sequence has a computed standard free energy of binding (ΔG°) of approximately -14 kcal/mol at approximately 50 °C, approximately 55 °C, approximately 60 °C, approximately 65 °C, or approximately 70 °C in a buffer suitable for PCR.
- the standard free energy of a Blocker binding to its reverse complement is stronger than the standard free energy of a forward primer binding to its reverse complement (ΔG°fp) by between -1 kcal/mol and -4 kcal/mol at approximately 50 °C, approximately 55 °C, approximately 60 °C, approximately 65 °C, or approximately 70 °C in a buffer suitable for PCR.
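The comparison between the Blocker and forward primer binding energies can be sketched with the standard thermodynamic relation ΔG°(T) = ΔH° − T·ΔS°. The enthalpy and entropy values below are hypothetical placeholders, not values from the disclosure; in practice they would come from a nearest-neighbor model or a dedicated design tool.

```python
def delta_g(delta_h: float, delta_s: float, temp_c: float) -> float:
    """Standard free energy (kcal/mol) at temp_c degrees Celsius.

    delta_h is in kcal/mol and delta_s is in kcal/(mol*K).
    """
    temp_k = temp_c + 273.15
    return delta_h - temp_k * delta_s


# Hypothetical thermodynamic parameters (assumptions for illustration).
blocker_dg = delta_g(delta_h=-160.0, delta_s=-0.438, temp_c=60.0)
primer_dg = delta_g(delta_h=-140.0, delta_s=-0.385, temp_c=60.0)

# A negative difference means the Blocker binds more strongly.
difference = blocker_dg - primer_dg
within_window = -4.0 <= difference <= -1.0
print(round(blocker_dg, 2), round(primer_dg, 2), round(difference, 2), within_window)
```

With these placeholder values the Blocker binds roughly 2.3 kcal/mol more strongly than the forward primer at 60 °C, which falls inside the -1 to -4 kcal/mol window stated above.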
- a Blocker comprises a sequence at or near the 3' end that does not hybridize to the template and prevents DNA polymerase extension.
- a Blocker comprises a chemical modification at or near the 3' end that prevents DNA polymerase extension.
- the Blocker comprises a chemical modification at or near the 3' end that prevents 3'→5' exonuclease activity by error-correcting DNA polymerases.
- the chemical modification comprises inverted DNA nucleotides.
- the chemical modification comprises 3-carbon spacers.
- a template binding sequence comprises a length of between 5 nucleotides and 100 nucleotides. In an aspect, a template binding sequence comprises a length of between 6 nucleotides and 75 nucleotides. In an aspect, a template binding sequence comprises a length of between 6 nucleotides and 50 nucleotides. In an aspect, a template binding sequence comprises a length of between 6 nucleotides and 40 nucleotides. In an aspect, a template binding sequence comprises a length of between 6 nucleotides and 30 nucleotides. In an aspect, a template binding sequence comprises a length of between 6 nucleotides and 20 nucleotides.
- a template binding sequence comprises a length of at least 5 nucleotides. In an aspect, a template binding sequence comprises a length of at least 6 nucleotides. In an aspect, a template binding sequence comprises a length of at least 10 nucleotides. In an aspect, a template binding sequence comprises a length of at least 15 nucleotides. In an aspect, a template binding sequence comprises a length of at least 20 nucleotides. In an aspect, a template binding sequence comprises a length of at least 25 nucleotides. In an aspect, a template binding sequence comprises a length of at least 30 nucleotides.
- a template binding sequence comprises a length of at least 40 nucleotides. In an aspect, a template binding sequence comprises a length of at least 50 nucleotides. In an aspect, a template binding sequence comprises a length of at least 75 nucleotides. In an aspect, this disclosure provides a template nucleic acid molecule. As used herein, a “template nucleic acid molecule” or a “template nucleic acid” refers to a nucleic acid molecule that comprises a sequence that is desired to be detected and/or amplified using PCR-based techniques in conjunction with at least one Occlusion Primer and/or at least one Occlusion Probe.
- a system comprises at least one template nucleic acid molecule. Any nucleic acid molecule that is desired to be detected or amplified can serve as a suitable template nucleic acid molecule. Numerous potential template nucleic acid molecules can be found in publicly available databases such as GenBank®. See Nucleic Acids Research, 41:D36-42 (2013).
- a template nucleic acid molecule is a DNA molecule.
- a template nucleic acid molecule is an RNA molecule.
- a template nucleic acid molecule is a genomic DNA molecule.
- a template nucleic acid molecule is an organellar DNA molecule.
- an organellar DNA molecule is selected from the group consisting of a mitochondrial DNA molecule and a plastid DNA molecule.
- a template nucleic acid molecule is a complementary DNA (cDNA) molecule.
- a template nucleic acid molecule is a eukaryotic nucleic acid molecule.
- a eukaryotic nucleic acid molecule is selected from the group consisting of an animal nucleic acid molecule, a plant nucleic acid molecule, and a fungi nucleic acid molecule.
- an animal nucleic acid molecule is a human nucleic acid molecule.
- a template nucleic acid molecule is a prokaryotic nucleic acid molecule.
- a prokaryotic nucleic acid molecule is selected from the group consisting of a bacteria nucleic acid molecule and an archaea nucleic acid molecule.
- a template nucleic acid molecule is a virus nucleic acid molecule.
- a virus nucleic acid molecule is selected from the group consisting of an Adenoviridae nucleic acid molecule, a Herpesviridae nucleic acid molecule, a Poxviridae nucleic acid molecule, a Papillomaviridae nucleic acid molecule, a Parvoviridae nucleic acid molecule, a Reoviridae nucleic acid molecule, a Coronaviridae nucleic acid molecule, a Picornaviridae nucleic acid molecule, a Togaviridae nucleic acid molecule, an Orthomyxoviridae nucleic acid molecule, a Rhabdoviridae nucleic acid molecule, a Retroviridae nucleic acid molecule, a Hepadnaviridae nucleic acid molecule, a Baculoviridae nu
- a virus nucleic acid molecule is selected from the group consisting of a human orthopneumovirus (HRSV) nucleic acid molecule, an influenza virus nucleic acid molecule, a human immunodeficiency virus (HIV) nucleic acid molecule, a hepatitis B virus (HBV) nucleic acid molecule, and a human papillomavirus (HPV) nucleic acid molecule.
- HRSV human orthopneumovirus
- HIV human immunodeficiency virus
- HBV hepatitis B virus
- HPV human papillomavirus
- a template nucleic acid molecule is a viroid nucleic acid molecule.
- a viroid nucleic acid molecule is selected from the group consisting of a Pospiviroidae nucleic acid molecule and an Avsunviroidae nucleic acid molecule.
- a method of preparing a nucleic acid template into a DNA library for sequencing comprising the steps of:
- step (c) Introducing to the ligation product mixture from step (b) a target-specific Outer Forward Primer (OFP) and a Splint, a DNA polymerase, and reagents for DNA polymerase activity to form a second mixture;
- the OFP comprises a nucleic acid sequence that can specifically anneal to a portion of a target DNA sequence region of interest, wherein the Splint comprises a 5' end subsequence that is not complementary to a subsequence of the LTA, and a 3' end subsequence that is complementary to the DT region or portion thereof;
- step (e) Introducing to the DNA polymerase extension product mixture from step (d) a target-specific Inner Forward Primer (IFP), a Universal Primer (UP), a thermostable DNA polymerase, and reagents for DNA polymerase activity to form a third mixture, wherein the IFP comprises a nucleic acid sequence that can specifically anneal to a portion of the target DNA sequence region of interest; and wherein the UP comprises a sequence which can specifically anneal to the 5' end subsequence, or portion thereof, of the Splint; and
- UMI sequence is from a mixture of 10 to 1000 defined DNA sequences with a minimum pairwise Levenshtein distance of 2, 3, 4, or 5.
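The minimum pairwise Levenshtein distance of a defined UMI set can be verified with a short Python sketch using the standard dynamic-programming edit distance. The UMI sequences below are hypothetical, and the function names are chosen for this illustration.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion from a
                cur[j - 1] + 1,            # insertion into a
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def min_pairwise_distance(umis: list[str]) -> int:
    """Smallest Levenshtein distance over all pairs in the UMI set."""
    return min(levenshtein(a, b) for a, b in combinations(umis, 2))


# Hypothetical 4-member UMI set; a real set would hold 10 to 1000 sequences.
umis = ["AAAA", "CCCC", "GGGG", "TTTT"]
print(min_pairwise_distance(umis))
```

A larger minimum distance lets more sequencing or synthesis errors be tolerated before one UMI is mistaken for another.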
- the UMI sequence comprises at least one degenerate nucleotide selected from the group consisting of N, B, D, H, V, S, W, Y, R, M, and K.
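The degenerate nucleotides listed above follow the standard IUPAC ambiguity codes. As a minimal sketch, the Python snippet below expands a degenerate UMI pattern into its concrete sequences and counts how many distinct UMIs a pattern can encode; the example patterns and function names are hypothetical.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(pattern: str) -> list[str]:
    """List every concrete sequence matched by a degenerate pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]


def complexity(pattern: str) -> int:
    """Number of distinct sequences encoded, without enumerating them."""
    return prod(len(IUPAC[c]) for c in pattern)


print(expand("AR"))
print(complexity("NNNNNNNN"))
```

Counting with `complexity` avoids enumerating long patterns, whose number of sequences grows exponentially with the number of degenerate positions.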
- step (b) is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, KAPA High Fidelity DNA polymerase, phi29 DNA polymerase, Klenow fragment, Bst DNA polymerase, T4 DNA polymerase, Vent® DNA polymerase, LongAmp® Taq DNA polymerase, and OneTaq® DNA polymerase.
- step (c) is selected from the group consisting of Taq DNA polymerase, Phusion® DNA polymerase, Q5® DNA polymerase, and KAPA High Fidelity DNA polymerase.
- both the LTA-top strand 3' region and the LTA-bottom strand 5' region comprise a single-stranded DT region, and these regions do not bind with each other; and wherein the LTA-top strand 5' region and the LTA-bottom strand 3' region bind with each other to form a DSE.
- nucleic acid template is a double-stranded DNA.
- nucleic acid template is a biological DNA derived from a sample of cells from biofluids such as blood, urine, saliva, cerebrospinal fluid, interstitial fluid, and synovial fluid, or from a tissue such as a biopsy tissue or a surgically resected tissue.
- nucleic acid template is a cDNA molecule generated through the reverse transcription of an RNA sample.
- RNA sample is a biological RNA sample derived from a human, an animal, a plant, or an environmental specimen.
- nucleic acid template is an amplicon DNA molecule generated through a DNA polymerase.
- nucleic acid template is from a physically, chemically, or enzymatically treated product of a biological DNA or RNA sample.
- nucleic acid template is from a product of a fragmentation process.
- the end-repair process is performed using a T4 DNA ligase enzyme.
- the 5' overhang in the OFP comprises between 1 nucleotide and 100 nucleotides.
- the 5' end single-stranded nucleic acid sequence comprises a sequencing adapter.
- the sequencing adapter is or is a part of an Illumina sequencing adapter, a nanopore sequencing adapter, or an Ion Torrent adapter.
- the middle PCR comprises using an MFP that binds to an MFP binding site on at least one template nucleic acid molecule, wherein the MFP binding site partially overlaps with the OFP binding site.
- the middle PCR comprises using an MFP that binds to an MFP binding site on at least one template nucleic acid molecule, wherein the MFP binding site partially overlaps with the IFP binding site.
- the middle PCR comprises using an MFP that comprises a 5' region starting from the second nucleotide of the OFP 5' region to the second nucleotide of the OFP 3' region, and the MFP comprises a 3' region starting from the second nucleotide of the IFP 5' region to the second nucleotide of the IFP 3' region.
- the sequencing adapter comprises a sequencing adapter selected from the group consisting of an Illumina sequencing adapter, a nanopore sequencing adapter, and an Ion Torrent sequencing adapter.
- a target DNA sequence region comprising at least 1000 nucleotides is tiled from a first orientation based on the positive strand of the template nucleic acid molecule and a second orientation based on the negative strand of the template nucleic acid molecule.
- the gap distance comprises between 15 nucleotides and 25 nucleotides.
- the PCR amplification comprises a wildtype-specific Blocker.
- the wildtype-specific blocker binds to the target DNA sequence region of interest at a wildtype-specific binding site, wherein the IFP binds to the target DNA sequence region of interest at an IFP binding site, and wherein the wildtype-specific binding site and the IFP binding site overlap by at least 3 nucleotides.
- the wildtype-specific blocker binds to the target DNA sequence region of interest at a wildtype-specific binding site, wherein the IFP binds to the target DNA sequence region of interest at an IFP binding site, and wherein the wildtype-specific binding site and the IFP binding site overlap by less than or equal to 25 nucleotides.
- the wildtype-specific blocker binds to the target DNA sequence region of interest at a wildtype-specific binding site, wherein the IFP binds to the target DNA sequence region of interest at an IFP binding site, and wherein the wildtype-specific binding site and the IFP binding site overlap by between 3 nucleotides and 5 nucleotides.
- the wildtype-specific blocker binds to the target DNA sequence region of interest at a wildtype-specific binding site, wherein the IFP binds to the target DNA sequence region of interest at an IFP binding site, and wherein the wildtype-specific binding site and the IFP binding site overlap by between 5 nucleotides and 25 nucleotides.
- the IFP comprises a target-specific portion that does not overlap in sequence with the wildtype-specific blocker, wherein the wildtype-specific blocker comprises a blocker-unique sequence that does not overlap with the IFP, and where the IFP comprises an overlapping region that overlaps in sequence with the wildtype-specific blocker, and wherein:
- the overlapping region comprises a standard free energy of binding between -2 kcal/mol and -4 kcal/mol;
- the target-specific portion comprises a standard free energy of binding between -5 kcal/mol and -9 kcal/mol;
- the blocker-unique sequence has a standard free energy of binding between -7 kcal/mol and -12 kcal/mol.
- the IFP comprises a target-specific portion that does not overlap in sequence with the wildtype-specific blocker
- the wildtype-specific blocker comprises a blocker-unique sequence that does not overlap with the IFP
- the IFP comprises an overlapping region that overlaps in sequence with the wildtype-specific blocker
- the overlapping region comprises standard free energy of binding that ranges between -2 kcal/mol and -4 kcal/mol.
- the target-specific portion comprises a standard free energy of binding between -5 kcal/mol and -9 kcal/mol.
- the wildtype-specific blocker comprises a terminator to prevent 3' to 5' DNA polymerase exonuclease activity, wherein the terminator is selected from the group consisting of a three-carbon (C3) spacer and DXXDM, wherein D is a match between the wildtype-specific blocker sequence and the target DNA region, wherein M is a C3 spacer, and wherein X is a mismatch between the wildtype-specific blocker sequence and the target DNA region.
- the wildtype-specific blocker comprises a terminator comprising a DNA overhang comprising four nucleotides.
- the IFP is present at a concentration between 1 nM and 1000 nM;
- the UP primer is present at a concentration between 1 nM and 1000 nM; or,
- the wildtype-specific blocker is present at a total concentration that is between 5 times and 20 times higher than the concentration of the IFP;
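The concentration relationship above (IFP at 1 nM to 1000 nM, with the wildtype-specific blocker at 5 to 20 times the IFP concentration) reduces to simple arithmetic. The Python sketch below illustrates it; the function name and the example IFP concentration are assumptions made for illustration.

```python
def blocker_concentration_range(ifp_nM: float,
                                low_fold: float = 5,
                                high_fold: float = 20) -> tuple[float, float]:
    """Return the (minimum, maximum) blocker concentration in nM
    for a given IFP concentration in nM."""
    if not 1 <= ifp_nM <= 1000:
        raise ValueError("IFP concentration is expected between 1 nM and 1000 nM")
    return ifp_nM * low_fold, ifp_nM * high_fold


# Hypothetical IFP concentration of 100 nM.
low, high = blocker_concentration_range(100)
print(f"Blocker: {low} nM to {high} nM")
```

For a 100 nM IFP, this gives a blocker concentration between 500 nM and 2000 nM.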
- a method for preparing a nucleic acid for sequencing comprising:
- nucleic acid molecule is a deoxyribonucleic acid (DNA) molecule.
- a method of determining the nucleotide sequence contiguous to a known target nucleotide sequence comprising: (a) ligating a target nucleic acid molecule comprising the known target nucleotide sequence with a universal Ligation Tail Adapter (LTA), wherein the universal LTA comprises a non-amplification strand and an amplification strand to produce a ligation product;
- the universal LTA comprises a ligatable Double-Stranded End (DSE) and a DNA Tail (DT) region; wherein the non-amplification strand comprises a 5' duplex portion; wherein the amplification strand comprises an unpaired 5' portion, a 3' duplex portion, and a 3' thymine (T) overhang; wherein the duplex portion of the non-amplification strand and the duplex portion of the amplification strand are complementary and form the ligatable DSE comprising a 3' T overhang; wherein the duplex portion is of sufficient length to remain in duplex form at the ligation temperature; wherein the first target-specific primer comprises a nucleic acid sequence that can specifically anneal to the known target nucleotide sequence of the target nucleic acid molecule; wherein the second target-specific primer comprises a 3' portion comprising a nucleic acid sequence that can specifically anneal to a
- a method of determining if a subject in need of treatment for cancer will be responsive to a given treatment comprising: detecting, in a tumor sample obtained from the subject, the presence of an oncogene rearrangement according to the method of any one of embodiments 1-140; wherein the subject is determined to be responsive to a treatment targeting the oncogene rearrangement product if the presence of the oncogene rearrangement is detected.
- oncogene rearrangement is an ALK oncogene rearrangement
- the subject will be responsive to a treatment selected from the group consisting of an ALK inhibitor; crizotinib (PF-02341066); AP26113; LDK378; 3-39; AF802; IPI-504; ASP3026; AP-26113; X-396; GSK-1838705A; CH5424802; and NVP-TAE684.
- oncogene rearrangement is an ROS1 oncogene rearrangement
- the subject will be responsive to a treatment selected from the group consisting of an ALK inhibitor; crizotinib (PF-02341066); AP26113; LDK378; 3-39; AF802; IPI-504; ASP3026; AP-26113; X-396; GSK-1838705A; CH5424802; and NVP-TAE684.
- the oncogene rearrangement is an RET oncogene rearrangement
- the subject will be responsive to a treatment selected from the group consisting of a RET inhibitor; DP-2490; DP-3636; SU5416; BAY 43-9006; BAY 73-4506 (regorafenib); ZD6474; NVP-AST487; sorafenib; RPI-1; XL184; vandetanib; sunitinib; imatinib; pazopanib; axitinib; motesanib; gefitinib; and withaferin A.
- a method of treating cancer comprising: detecting, in a tumor sample obtained from a subject in need of treatment for cancer, the presence of an oncogene rearrangement according to the method of any one of embodiments 1-140; and administering a cancer treatment which is effective against tumors comprising the oncogene rearrangement.
- Example 1 Detection of FGFR2 gene RNA fusion with known upstream partners using gBlocks as the template.
- RNA fusion detection starts with a reverse transcription kit for the reverse transcription from RNA to cDNA.
- the DNA synthesis template gBlocks (which are double-stranded DNA fragments) containing the fusion exons are used.
- the FGFR2 gene fusion gBlocks comprise an FGFR2 exon19-AHCYL1 exon5, an FGFR2 exon19-BICC1 exon3, and an FGFR2 exon19-GAB2 exon2.
- the gBlocks are mixed with human genomic DNA and then sheared to 300 bp using a DNA fragmentase from Twist Bioscience. Then the fusion detection workflow is followed to detect the fusions.
- the RNA fusion detection is shown using synthetic DNA gBlocks that comprise an exon region.
- the OFP and IFP designs are based on cDNAs. Both designs comprise a forward design and a reverse design.
- the forward design is used for an unknown partner that is downstream of the known exon, and the reverse design is used for an unknown partner that is upstream of the known exon.
- the IFP 3' end has a distance of 5 - 20 nt to the breakpoint.
- the forward design uses an anti-sense strand as the binding template, and the reverse design uses a sense strand as the binding template.
- Exemplary primer sequences used for detecting FGFR2 gene fusions are shown in Table 3.
- Example 2 Detection of NTRK1 gene RNA fusion with a known downstream partner and its verification in RNA samples.
- the known exons are in the downstream fusion partner.
- An example is shown using the NTRK1 gene.
- IFPs are designed that can bind to a sense template of NTRK1 at different exons, from exon 8 to exon 14, for a total of 7 plexes.
- the IFPs had different gaps to the breakpoint with a length of 9 - 17 nt.
- the IFPs are designed approximately 5 - 10 nt from the breakpoint.
- Some primer designs are shifted to avoid primer dimers, and the gap is different consequently.
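The gap between an IFP 3' end and the breakpoint described above can be checked with a few lines of code. The following is a minimal sketch, not part of the disclosed workflow: the function names and the 0-based coordinate convention (indices counted along the direction of primer extension) are assumptions made for illustration.

```python
def gap_to_breakpoint(ifp_3prime_index: int, first_partner_index: int) -> int:
    """Count template bases lying strictly between the IFP 3' end and the breakpoint.

    Assumed convention (hypothetical): both arguments are 0-based indices along
    the direction of primer extension. `ifp_3prime_index` is the last base bound
    by the IFP; `first_partner_index` is the first base of the fusion partner.
    """
    gap = first_partner_index - ifp_3prime_index - 1
    if gap < 0:
        raise ValueError("IFP binding site extends past the breakpoint")
    return gap


def gap_in_range(gap: int, low: int, high: int) -> bool:
    """Return True when the gap falls inside the inclusive design window."""
    return low <= gap <= high


# Illustrative coordinates only: the IFP ends at index 119 and the partner
# sequence starts at index 130, leaving 10 nt in between.
example_gap = gap_to_breakpoint(119, 130)
print(example_gap, gap_in_range(example_gap, 5, 20), gap_in_range(example_gap, 9, 17))
```

A design that has been shifted to avoid primer dimers can be re-checked the same way against whichever window applies (5 - 20 nt, 9 - 17 nt, or 5 - 10 nt in the text above).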
- synthetic gBlocks are used.
- the NTRK1 gene WT gBlocks comprise 3 different templates covering exon 8 to part of exon 16.
- the NTRK1 fusion gBlocks comprise 4 different templates.
- All fusion gBlocks and WT gBlocks are pooled and mixed in equal ratio as one template pool. All WT gBlocks are pooled together as another template pool. Different libraries are constructed from each template pool using the LTA approach. The RNA template ordered from SeraCare is reverse-transcribed into cDNA and used as the template for library preparation. Exemplary primer sequences used for detecting NTRK1 gene fusions are shown in Table 4.
- the LTA approach is used for the DNA fusion detection.
- the breakpoint is usually in the intron region, so covering the intron region is necessary to detect unknown DNA fusions (Figure 11).
- when gene 1 exon 2 is used as the known fusion partner and the unknown exon is downstream of the known fusion partner, multiple IFPs are needed to tile the intron 2 sequence of gene 1, and the binding target sequence is on the reverse strand of intron 2.
- when gene 1 exon 2 is used as the known fusion partner and the unknown exon is upstream of the known fusion partner, multiple IFPs are needed to tile intron 1 of gene 1, and the binding target sequence is on the forward strand of intron 1.
- IFPs are designed for tiling the whole intron region.
- The 3' end of one IFP has a 0 - 100 nt gap to the 5' end of the next IFP's template binding site (excluding the adapter region).
- the tiling design allows NGS sequencing to cover as many potential breakpoints as possible using shorter amplicons, since the NGS read length is below 600 bp.
- Exemplary primer sequences used for detecting ALK gene fusions are shown in Table 5.
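The tiling rule described above (consecutive IFP binding sites separated by a gap of 0 - 100 nt across the whole intron) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the fixed primer length, and the fixed gap are hypothetical, strand orientation is ignored, and real designs would also screen each site for melting temperature and primer dimers.

```python
from typing import List, Tuple


def tile_intron(intron_length: int, primer_length: int = 20, gap: int = 50) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) binding sites that tile an intron.

    Assumptions (hypothetical, for illustration): every primer has the same
    binding-site length, and the same gap separates the 3' end of one binding
    site from the 5' end of the next. The adapter region is not counted.
    """
    if primer_length <= 0:
        raise ValueError("primer_length must be positive")
    if not 0 <= gap <= 100:
        raise ValueError("gap must be between 0 and 100 nt")
    sites = []
    start = 0
    # Keep placing sites while a full binding site still fits in the intron.
    while start + primer_length <= intron_length:
        sites.append((start, start + primer_length))
        start += primer_length + gap
    return sites


# Illustrative values only: a 200 nt region, 20 nt binding sites, 50 nt gaps.
for site in tile_intron(200, primer_length=20, gap=50):
    print(site)
```

A smaller gap yields more primers and shorter amplicons per breakpoint, which is the trade-off the text ties to the NGS read length.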
- the ALK gene has multiple fusion types, most of which appear in exon 20.
- the exon 20 of ALK is the known partner in the downstream of the fusion, such as the EML4 exon 6 - ALK exon 20 fusion.
- a set of 14-plex IFPs and OFPs is designed to tile the entire intron 19 region of the ALK gene to find the unknown upstream fusion partner.
- the ALK design is tested on the H2228 cell line DNA since it has an EML4-ALK fusion. The results are shown in Figure 12.
- the fusion reads contain the EML4 intron 6 sequence and the ALK intron 19 sequence, which correspond to EML4 exon 6 and ALK exon 20, respectively, in the fusion type.
- Example 4 Detection of EML4-ALK DNA fusion via ABDA enrichment.
- the EML4-ALK fusion breakpoint is in the intron 19 region of the ALK gene. After NGS sequencing using the LTA approach, the breakpoint is found to be close to inner primer 13, with a 10 nt gap between the 3' end of IFP13 and the breakpoint. Based on BDA design rules, a Blocker is designed to cover the breakpoint with a 6 nt overlap with IFP13. For the EML4-ALK fusion sample H2228, the Blocker has a 6 nt region that cannot bind to the fusion template, whereas the Blocker can bind to the WT template.
- NGS results are shown in Table 1.
- the fusion rate is calculated as the fusion reads divided by the sum of the fusion reads and the WT reads.
- the Fusion Variant Allele Frequency (VAF) is defined as the fusion rate of H2228 sample without BDA enrichment.
- the Fusion Variant Reads Frequency (VRF) is defined as the fusion rate of H2228 sample after BDA enrichment.
- the enrichment fold is calculated based on Table 1 equation (4). The enrichment fold for the H2228 EML4-ALK fusion is approximately 7-fold.
- Table 1 ABDA enrichment fold calculation for fusion variants.
- VAF: Variant Allele Frequency
- VRF: Variant Reads Frequency
- Fusion Rate = Fusion Reads / (Fusion Reads + WT Reads)
- VRF = Fusion rate of H2228 wB
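The fusion rate definition above translates directly into code. The sketch below is illustrative only: the read counts are hypothetical and are not results from this disclosure. Equation (4) of Table 1 is not reproduced in this section, so the fold helper uses a plain VRF / VAF ratio purely as an assumption; the actual equation may differ.

```python
def fusion_rate(fusion_reads: int, wt_reads: int) -> float:
    """Fusion Rate = Fusion Reads / (Fusion Reads + WT Reads)."""
    total = fusion_reads + wt_reads
    if total == 0:
        raise ValueError("no reads available to compute a fusion rate")
    return fusion_reads / total


def ratio_fold(vaf: float, vrf: float) -> float:
    """ASSUMPTION: a simple VRF / VAF ratio, used only for illustration.

    This is not necessarily equation (4) of Table 1, which is not shown here.
    """
    if vaf <= 0:
        raise ValueError("VAF must be positive")
    return vrf / vaf


# Hypothetical read counts (not data from the disclosure).
vaf = fusion_rate(fusion_reads=50, wt_reads=950)    # sample without Blocker
vrf = fusion_rate(fusion_reads=350, wt_reads=650)   # sample with Blocker
print(f"VAF={vaf:.3f} VRF={vrf:.3f} fold={ratio_fold(vaf, vrf):.2f}")
```

Keeping the rate and the fold in separate functions makes it easy to swap in the equation from Table 1 once it is available.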
- Table 2 Exemplary sequences for adapter.
- Two exemplary adapters are designed. One has a length of 13 bp in the DSE region and comprises the LTA-UMI-up and the LTA-UMI-bottom (FLTA-UMI-up and FLTA-UMI-bottom, respectively).
- The other has a longer length of 22 bp in the DSE region and comprises the LTA-UMI-long-up and the LTA-UMI-bottom (FLTA-UMI-long-up and FLTA-UMI-long-bottom, respectively).
- the UMI sequences are shown as "HHHHHHHHHHHHH" in the sequences.
- the three UP sequences are named FusTaiSeqAdaPrimer1, FusTaiSeqAdaPrimer2, and FusTaiSeqAdaPrimer3.
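In IUPAC nucleotide notation, H denotes A, C, or T (any base except G), so a run of H positions defines a degenerate UMI. The sketch below shows how such a pattern could be handled in software; it is an illustration under assumptions, and the helper names are hypothetical rather than part of the disclosed method.

```python
import random

# Subset of the IUPAC nucleotide codes needed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "H": "ACT", "N": "ACGT"}


def umi_diversity(pattern: str) -> int:
    """Number of distinct sequences a degenerate pattern can encode."""
    count = 1
    for code in pattern:
        count *= len(IUPAC[code])
    return count


def matches_pattern(pattern: str, sequence: str) -> bool:
    """True when every base of `sequence` is allowed at its position."""
    return len(pattern) == len(sequence) and all(
        base in IUPAC[code] for code, base in zip(pattern, sequence)
    )


def random_umi(pattern: str, rng: random.Random) -> str:
    """Draw one concrete UMI from the degenerate pattern."""
    return "".join(rng.choice(IUPAC[code]) for code in pattern)


pattern = "H" * 13  # the UMI placeholder shown in the adapter sequences
print(umi_diversity(pattern))
print(random_umi(pattern, random.Random(0)))
```

A read whose UMI contains a G at an H position fails `matches_pattern` and can be flagged as a likely synthesis or sequencing error.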
- Table 3 Exemplary primer sequences for FGFR2 fusion detection.
- Table 3 shows the primer sequences for FGFR2 gene fusions, which target all the exon regions and are designed with FGFR2 as the upstream fusion partner. This table includes outer primers, inner primers, inner primers with sequencing adapters, and Blocker sequences for the ABDA enrichment.
- Table 4 Exemplary primer sequences for NTRK1 fusion detection.
- Table 4 shows the primer sequences for NTRK1 gene fusions, designed with NTRK1 as the downstream fusion partner. This table includes 7 inner and outer primer designs and a primer with sequencing adapters for targeting the 7 exons from exon 8 to exon 14.
- Table 4 Examples of primer sequences for NTRK1 fusion detection
- Table 5 Exemplary primer sequences for ALK fusion detection.
- Table 5 shows the 14-plex primer sequences for targeting ALK intron 19 and one ABDA design based on the H2228 DNA fusion breakpoint.
- the ABDA design includes the OFP, the IFP and the IFP with sequencing adapter and Blocker sequences. Table 5. Examples of primer sequences for ALK fusion detection
Abstract
This invention relates to novel methods and compositions for selectively enriching potential rare genetic variants, including both DNA mutations and gene fusions with unknown fusion partners. Embodiments of the invention include procedures for integration with downstream next-generation sequencing (NGS) analysis.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063131747P | 2020-12-29 | 2020-12-29 | |
| US63/131,747 | 2020-12-29 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022146773A1 (fr) | 2022-07-07 |
Family
ID=79830991
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2021/064534 (Ceased), WO2022146773A1 (fr) | Methods and compositions for sequencing and fusion detection using ligation tail adapters (LTA) | 2020-12-29 | 2021-12-21 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2022146773A1 (fr) |
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9487828B2 (en) | 2012-05-10 | 2016-11-08 | The General Hospital Corporation | Methods for determining a nucleotide sequence contiguous to a known target nucleotide sequence |
| US10718009B2 (en) | 2012-05-10 | 2020-07-21 | The General Hospital Corporation | Methods for determining a nucleotide sequence contiguous to a known target nucleotide sequence |
| WO2015112948A2 (fr) * | 2014-01-27 | 2015-07-30 | Iafrate Anthony John | Methods for determining a nucleotide sequence |
| US20170067090A1 (en) | 2014-05-19 | 2017-03-09 | William Marsh Rice University | Allele-specific amplification using a composition of overlapping non-allele-specific primer and allele-specific blocker oligonucleotides |
| WO2018005983A1 (fr) * | 2016-07-01 | 2018-01-04 | Natera, Inc. | Compositions and methods for the detection of nucleic acid mutations |
| WO2018053362A1 (fr) * | 2016-09-15 | 2018-03-22 | ArcherDX, Inc. | Methods of nucleic acid sample preparation |
| WO2019023924A1 (fr) * | 2017-08-01 | 2019-02-07 | Helitec Limited | Methods of enriching and determining target nucleotide sequences |
| WO2019164885A1 (fr) | 2018-02-20 | 2019-08-29 | William Marsh Rice University | Systems and methods for allele enrichment using multiplexed blocker displacement amplification |
Non-Patent Citations (10)
| Title |
|---|
| "McGraw-Hill Dictionary of Scientific and Technical Terms", 2002, MCGRAW-HILL |
| "The American Heritage® Science Dictionary", 2011, HOUGHTON MIFFLIN HARCOURT |
| ALTSCHUL ET AL.: "Basic local alignment search tool", J. MOL. BIOL., vol. 215, 1990, pages 403 - 410, XP002949123, DOI: 10.1006/jmbi.1990.9999 |
| CHENNA ET AL.: "Multiple sequence alignment with the Clustal series of programs", NUCLEIC ACIDS RESEARCH, vol. 31, 2003, pages 3497 - 3500, XP002316493, DOI: 10.1093/nar/gkg500 |
| HAAS ET AL., GENOME BIOL, vol. 20, 2019, pages 213 |
| JIA ET AL., GENOME BIOL., vol. 14, 2013, pages R12 |
| LARKIN MA ET AL.: "Clustal W and Clustal X version 2.0", BIOINFORMATICS, vol. 23, 2007, pages 2947 - 48 |
| NUCLEIC ACIDS RESEARCH, vol. 41, 2013, pages D36 - 42 |
| SANGER; COULSON, J. MOL. BIOL., vol. 94, 1975, pages 441 - 446 |
| THOMPSON ET AL.: "Clustal W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice", NUCLEIC ACIDS RESEARCH, vol. 22, 1994, pages 4673 - 4680, XP002956304 |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2024108087A1 (fr) * | 2022-11-18 | 2024-05-23 | Takeda Vaccines, Inc. | Method for determining the proportion of a live, attenuated flavivirus having a nucleotide sequence comprising at least one attenuation locus in a formulation |
| EP4375381A1 (fr) * | 2022-11-18 | 2024-05-29 | Takeda Vaccines, Inc. | Method for determining the proportion of a live, attenuated flavivirus having a nucleotide sequence comprising at least one attenuation locus in a formulation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21847598; Country of ref document: EP; Kind code of ref document: A1 |
| | WA | Withdrawal of international application | |
| | NENP | Non-entry into the national phase | Ref country code: DE |