WO2025085618A1 - Methods for appending adaptors to polynucleotides - Google Patents

Methods for appending adaptors to polynucleotides

Info

Publication number
WO2025085618A1
Authority
WO
WIPO (PCT)
Prior art keywords
adaptor
polynucleotide
strand
polymerase
flap
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/051747
Other languages
English (en)
Inventor
Niall Gormley
Esther Musgrave-Brown
Andrew Slatter
Clifford Lee Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Illumina Inc
Original Assignee
Illumina Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Illumina Inc
Publication of WO2025085618A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Definitions

  • the disclosure relates to methods for appending adaptors to the 5’ and/or 3’ ends of polynucleotides.
  • Library preparation aims to build a collection of DNA fragments for next-generation sequencing (NGS).
  • NGS: next-generation sequencing
  • a high-quality DNA library guarantees uniform and consistent genome coverage, thus delivering comprehensive and reliable sequencing data.
  • the conversion of sample DNA to library DNA can be inefficient using standard ligation methodologies, however.
  • Next generation sequencing typically requires library preparation, where known adaptor DNA sequences are added to the target DNA to be sequenced. Traditionally, this requires that sample DNA is fragmented, end-repaired, and then ligated to the adaptor DNA. While ligation-mediated library prep can yield the highest quality genomes, the conversion of sample DNA to library DNA can be inefficient. In cases where the quantity of sample DNA is in short supply, this poor efficiency makes ligation-mediated library prep more challenging or even infeasible.
  • Traditional methods for target enrichment for NGS broadly fall into two categories: amplicon-based or probe-based hybridization/pulldown.
  • the former employs primer pairs and PCR to amplify targets from a sample; it is simple and fast but limited in its ability to multiplex very high numbers of targets due to PCR mispriming events. It is also restricted in the size of amplicon that can be produced due to the limits of current PCR technology.
  • Other disadvantages of PCR such as sequence bias or polymerase slippage can also impact the performance scope.
  • Hybridization approaches generally take longer in practice than amplicon methods but are virtually limitless in the number of targets that can be enriched. The poorer specificity that arises from hybridization of only a single probe is mitigated by additional rounds of pulldown and/or by increasing the probe length and Tm.
  • the disclosure provides methods to append adaptors to the 5’ and/or 3’ ends of polynucleotides.
  • the resulting adaptor-polynucleotide constructs can be then used in various applications, including NGS.
  • the disclosure provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides, comprising: fragmenting gDNA or cDNA into polynucleotides that are less than 1000 base pairs in length; end repairing and phosphorylating the polynucleotides; attaching adaptors to the 5’ and 3’ ends of the end-repaired polynucleotides using non-homologous end joining factors.
  • the gDNA or cDNA is fragmented by enzymatic digestion, chemical cleavage, sonication, nebulization, or hydroshearing.
  • the gDNA or cDNA is fragmented by sonication.
  • the DNA fragments are enzymatically end repaired and phosphorylated by using T4 DNA polymerase and T4 polynucleotide kinase.
  • a single ‘A’ deoxynucleotide is added to the end-repaired DNA fragments by use of Klenow enzyme which lacks exonuclease activity.
  • the adaptors comprise a 3' overhang of a ‘T’ deoxynucleotide.
  • the adaptors comprise a double stranded region of complementary sequence and a single stranded region of sequence mismatch.
  • the adaptors are Y-shaped or U-shaped.
  • the single stranded regions of the adaptors comprise one or more of the following sequences: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) and P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • oligonucleotides are added to the 3’ ends of the DNA fragments with terminal transferase.
  • the adaptors comprise an overhang of base pairs that are complementary to the oligonucleotides added to the 3’ ends of the DNA fragments.
  • the adaptors comprise a double stranded region of complementary sequence and a single stranded region of sequence mismatch.
  • the adaptors are Y-shaped or U-shaped.
  • the single stranded regions of the adaptors comprise one or more of the following sequences: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) and P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the non-homologous end joining factors are LigD and Ku, or an engineered variant thereof.
  • the LigD and Ku are from, or derived from, Mycobacterium.
  • a non-homologous end joining factor is encoded by a polypeptide that has a sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 1 to 20 and has LigD activity.
  • a non-homologous end joining factor is encoded by a polypeptide that has a sequence that is at least 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 21 to 30 and has Ku activity.
  • the engineered variant of LigD lacks exonuclease activity.
  • the engineered variant has the sequence of SEQ ID NO: 1 with the following substitution: H373A.
  • the disclosure also provides a method to append an adaptor to the 5’ end of a polynucleotide, comprising the steps of: (1) hybridizing a 5’ flap adaptor to a single stranded polynucleotide to form a hybridized product comprising a 5’ flap; (2) contacting the hybridized product with a structure-specific endonuclease that has 5’ flap cleavage activity, wherein the structure-specific endonuclease cleaves off the 5’ flap of the hybridized product to form a nicked hybridized product; and (3) contacting the nicked hybridized product with a ligase to form a ligated product comprising a 5’ flap adaptor appended to the 5’ end of the polynucleotide.
  • the method further comprises appending a second adaptor to the 3’ end of the polynucleotide, comprising the steps of: (4) hybridizing a 3’ flap adaptor to the polynucleotide of (3) to form a second hybridized product comprising a 3’ flap; (5) contacting the second hybridized product with a second structure-specific endonuclease that has 3’ flap cleavage activity, wherein the second structure-specific endonuclease cleaves off the 3’ flap to form a clipped hybridized product that has a 3’ overhang of base pairs from the 3’ flap adaptors; and (6) contacting the clipped hybridized product with a polymerase, wherein the polymerase fills in the 3’ overhang of the clipped hybridized product to form a polynucleotide comprising adaptors at the 5’ and 3’ ends.
  • the disclosure also provides a method of appending an adaptor to the 3’ end of a polynucleotide, comprising the steps of: (A) hybridizing a 3’ flap adaptor to a single stranded polynucleotide to form a hybridized product comprising a 3’ flap; (B) contacting the hybridized product with a second structure-specific endonuclease that has 3’ flap cleavage activity, wherein the second structure-specific endonuclease cleaves off the 3’ flap to form a clipped hybridized product that has a 3’ overhang of base pairs from the 3’ flap adaptor; and (C) contacting the clipped hybridized product with a polymerase, wherein the polymerase fills in the 3’ overhang to form a polynucleotide with an adaptor appended to the 3’ end.
  • the method further comprises appending a second adaptor to the 5’ end of the polynucleotide, comprising the steps of: (D) hybridizing a 5’ flap adaptor to the polynucleotide of (C) to form a second hybridized product comprising a 5’ flap; (E) contacting the second hybridized product with a structure-specific endonuclease that has 5’ flap cleavage activity, wherein the structure-specific endonuclease cleaves off the 5’ flap of the second hybridized product to form a nicked hybridized product; and (F) contacting the nicked hybridized product with a ligase to form a ligated product comprising a 5’ flap adaptor appended to the 5’ end of the polynucleotide.
  • the disclosure further provides a method of appending adaptors to the 5’ and 3’ ends of a polynucleotide, comprising: (i) hybridizing a 5’ flap adaptor and a 3’ flap adaptor to a single stranded polynucleotide to form a hybridized product comprising a 5’ flap and a 3’ flap; (ii) contacting the hybridized product with a structure-specific endonuclease that has 5’ flap cleavage activity, wherein the structure-specific endonuclease cleaves off the 5’ flap of the hybridized product to form a nicked hybridized product; (iii) contacting the nicked hybridized product with a ligase to form a ligated product comprising a 5’ flap adaptor appended to the 5’ end of the polynucleotide; (iv) contacting the ligated product with a second structure-specific endonuclease that has 3’ flap cle
  • the 5’ flap adaptor comprises a double stranded adaptor region and a single stranded probe region, wherein the single stranded probe region is complementary to a target sequence of the polynucleotide, and wherein the double stranded adaptor region comprises a universal sequence.
  • the base-pair of the double stranded adaptor region adjacent to the single stranded probe region also matches to the target sequence of the polynucleotide.
  • the universal sequence is a sequence that is commonly used to generate sequence reads using a next generation sequencing platform.
  • the structure-specific endonuclease that has 5’ flap cleavage activity is FEN1.
  • the ligase is selected from T4 DNA ligase, T7 DNA ligase, and Hi-T4 DNA ligase.
  • the 3’ flap adaptor comprises a single stranded adaptor region and a single stranded probe region, wherein the single stranded probe region is complementary to a target sequence of the polynucleotide, and wherein the single stranded adaptor region comprises a universal sequence.
  • the universal sequence is a sequence that is commonly used to generate sequence reads using a next generation sequencing platform.
  • the structure-specific endonuclease that has 3’ flap cleavage activity is XPF/MUS81.
  • the 5’ flap adaptor and/or the 3’ flap adaptor comprises a bar code sequence.
  • the polynucleotides comprising 3’ and/or 5’ adaptors come from different genetic or polynucleotide sources and the source of the polynucleotides can be identified based upon the bar code sequence.
  • the disclosure further provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising: an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that can be removed under denaturing conditions; (b) annealing a replacement oligonucleotide that comprises one or more locked nucleic acids (LNAs) to the polynucleotide comprising the 5' adaptor; (c) extending the polynucleotide comprising the 5' adaptor up to the replacement oligonucleotide using a non-strand displacing polymerase and dNTPs, wherein the extended product comprises a binding region
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the non-transferred strand can be denatured and removed using mild heat followed by a hot wash.
  • the replacement oligonucleotide has a higher Tm than the non-transferred strand.
  • the replacement oligonucleotide partially hybridizes to the same region as the non-transferred strand, resulting in the 5' portion of the polynucleotide being single strand upstream of the replacement oligonucleotide.
  • the non-strand displacing polymerase is selected from a T4-based polymerase, a T7-based polymerase, a Pfu-based polymerase, and a Taq-based polymerase.
  • the template switch oligonucleotide is blocked at its 3' end by having an -OH group, an ‘inverted T’ group, or a dideoxy version of a dNTP.
  • the Tn5 transposome is immobilized on a streptavidin paramagnetic bead.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure also provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, a replacement oligonucleotide that comprises locked nucleic acids (LNAs) and that remains hybridized to the adaptor under moderate denaturing conditions, and a non-transferred strand that can be removed under moderate denaturing conditions; (b) denaturing under moderate denaturing conditions to remove the non-transferred strand and extending the polynucleotide comprising the 5' adaptor up to the replacement oligonucleotide comprising LNAs using a non-strand displacing poly
  • LNAs: locked nucleic acids
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the non-transferred strand is from 15 bp to 20 bp in length.
  • the non-transferred strand can be denatured and removed using mild heat followed by a hot wash, wherein the replacement oligonucleotide remains hybridized to the adaptor under these conditions.
  • the non-strand displacing polymerase is selected from a T4-based polymerase, a T7-based polymerase, a Pfu-based polymerase, and a Taq-based polymerase.
  • the template switch oligonucleotide is blocked at its 3' end by having an -OH group, an ‘inverted T’ group, or a dideoxy version of a dNTP.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure also provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand which contains linkage(s) that resist exonuclease activity and which remains hybridized to the adaptor strand when the adaptor strand is appended to the 5' end of the polynucleotide; (b) extending the polynucleotide comprising the 5' adaptor with a polymerase with 5’ to 3’ exonuclease activity (“5' exo polymerase”), wherein the 5' exo polymerase digests the hybridized non-appending strand up
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the linkage(s) that resist exonuclease activity are phosphorothioate linkage(s), carbophosphonate linkage(s), pyridylphosphonate (PyrP) functionalized linkage(s), aminomethyl (AMP) or aminoethyl phosphonate (AEP) functionalized linkages, boranophosphate (BP) linkage(s), methylphosphonothioates (MPS) linkage(s), phosphorodithioates (SPS) linkage(s), thiophosphoramidates (NPS) linkage(s), boranomethylphosphonates (BMP) linkage(s), guanidine (GUA) linkage(s), morpholino phosphorodiamidate (PMO) linkage(s), and/or carbamate linkage(s).
  • the linkage(s) that resist exonuclease activity are phosphorothioate linkage(s).
  • the polymerase with 5’ to 3’ exonuclease activity is selected from a Taq-based polymerase and a Bst-based polymerase.
  • the template switch oligonucleotide is blocked at its 3' end by having an -OH group, an ‘inverted T’ group, or a dideoxy version of a dNTP.
  • the Tn5 transposome is immobilized on a streptavidin paramagnetic bead.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure also provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that remains hybridized to the adaptor when the adaptor is appended to the 5' end of the polynucleotide, and wherein the 5' adaptor strand and the non-transferred strand comprise a template switch oligonucleotide binding region, wherein a portion of the sequence of the template switch oligonucleotide binding region does not contain one of the four types of nucleobases; (b) extending the polynucleotide comprising the
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the non-strand displacing polymerase is selected from a T4-based polymerase, a T7-based polymerase, a Pfu-based polymerase, and a Taq-based polymerase.
  • the dNTPs and polymerase are removed prior to step (c).
  • the dNTPs and polymerase are removed by using SPRI beads, or by magnetic bead-based washing if the adaptors appended to the 5' end of the polynucleotide are attached to a bead.
  • the strand displacing polymerase is selected from a phi29-based polymerase and a Bst (large fragment)-based polymerase.
  • the template switch oligonucleotide is blocked at its 3' end by having an -OH group, an ‘inverted T’ group, or a dideoxy version of a dNTP.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure further provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that remains hybridized to the adaptor when the adaptor strand is appended to the 5' end of the polynucleotide, and wherein the 5' adaptor strand and the non-transferred strand comprise a template switch oligonucleotide binding region, wherein a portion of the sequence of the template switch oligonucleotide binding region does not contain one of the four types of nucleobases; (b) extending the polynucleotide comprising: (a) appending an adapt
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the non-strand displacing polymerase is selected from a T4-based polymerase, a T7-based polymerase, a Pfu-based polymerase, and a Taq-based polymerase.
  • the dNTPs and polymerase are removed prior to step (e).
  • the dNTPs and polymerase are removed by using SPRI beads, or by magnetic bead-based washing if the adaptors appended to the 5' end of the polynucleotide are attached to a bead.
  • the non-transferred strand is removed using moderate heat to selectively denature the strand, or by application of a lambda exonuclease that selectively digests oligonucleotides containing 5’ phosphorylated ends.
  • the strand displacing polymerase is selected from a phi29-based polymerase and a Bst (large fragment)-based polymerase.
  • the template switch oligonucleotide is blocked at its 3' end by having an -OH group, an ‘inverted T’ group, or a dideoxy version of a dNTP.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the polynucleotides are used as templates for sequencing.
  • the disclosure provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that can be removed under denaturing conditions, wherein the adaptor comprises a complementary template switch binding domain; (b) extending to the ends of the polynucleotide comprising the 5' adaptor with a strand displacing polymerase to form a polynucleotide comprising the 5' adaptor and a complementary 5' adaptor region comprising a template switch binding domain on the 3' end; (c) denaturing and annealing to the template switch binding domain
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the strand displacing polymerase is selected from a phi29-based polymerase and a Bst (large fragment)-based polymerase.
  • the polymerase that has 3' to 5' exonuclease activity is selected from a Pfu-based polymerase, a phi29-based polymerase, and E. coli DNA polymerase II.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure also provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that can be removed under denaturing conditions, wherein the adaptor comprises a complementary template switch binding domain; (b) extending to the ends of the polynucleotide comprising the 5' adaptor with a strand displacing polymerase to form a polynucleotide comprising the 5' adaptor and a complementary 5' adaptor region comprising a template switch binding domain on the 3' end; (c) denaturing and annealing to the template switch binding
  • the adaptor strand may or may not have nucleotide modifications.
  • the adaptor strand comprises one or more LNAs.
  • the strand displacing polymerase is selected from a phi29-based polymerase and a Bst (large fragment)-based polymerase.
  • the structure-specific endonuclease is XPF/MUS81.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • the disclosure further provides a method to append adaptors to the 5’ and 3’ ends of polynucleotides to form adaptor-polynucleotide constructs, comprising: (a) appending an adaptor to the 5' end of a polynucleotide by tagmenting the polynucleotide with a Tn5 transposome comprising an adaptor strand that is transferred to the 5' end of the polynucleotide, and a non-transferred strand that comprises a single stranded 3’ end that has a sequence complementary to an oligonucleotide comprising a 3’ complementary adaptor sequence; (b) annealing an oligonucleotide comprising the 3’ complementary adaptor sequence to the non-transferred strand to form a polynucleotide comprising the 5’ adaptor, the non-transferred strand, and the oligonucleotide comprising the 3’ complementary
  • the non-strand displacing polymerase is selected from a T4-based polymerase, a T7-based polymerase, a Pfu-based polymerase, and a Taq-based polymerase.
  • the Tn5 transposome is immobilized on a solid substrate by the adaptor strand hybridizing to an anchor oligonucleotide attached to the solid substrate.
  • the 5' adaptor and the 3' adaptor have different sequences selected from either P5 or P7: P5: 5' AAT GAT ACG GCG ACC ACC GA 3' (SEQ ID NO: 32) or P7: 5' CAA GCA GAA GAC GGC ATA CGA GAT 3' (SEQ ID NO: 33).
  • the adaptor-polynucleotide constructs are used as templates for sequencing.
  • Figure 1 provides an overview of steps typically used for ligation-based library preparation that does not use PCR.
  • Figure 2 demonstrates the process in prokaryotes of Ku bridging DNA fragments and recruiting LigD for DNA repair.
  • Figure 3 provides an exemplary embodiment of the disclosure showing steps of an improved ligation-mediated library prep of the disclosure, where prokaryotic NHEJ factors Ku and LigD replace T4 ligase.
  • FIG. 4 provides an exemplary embodiment of the disclosure wherein terminal transferase (TdT) is used to generate poly-nucleotide overhangs that could be joined, trimmed, and ligated by LigD.
  • TdT: terminal transferase
  • Figure 5 provides a sampling of prokaryotic wild-type sequences for LigD (SEQ ID NOs: 1 to 20).
  • Figure 6 provides a sampling of prokaryotic wild-type sequences for Ku (SEQ ID NOs: 21 to 30).
  • FIG. 8 demonstrates how FEN1 plays a central role in DNA replication both in eukaryotes and prokaryotes.
  • FEN1 functions to remove single stranded 5’ flaps of DNA from Okazaki fragments that are generated on the lagging strand of the DNA replication fork. These flaps form when a primase generates an RNA primer that serves as a primer to extend a new DNA strand; multiple Okazaki fragments are generated, and when extension from the 3’ end of one fragment abuts the 5’ end of another, it displaces that end to form a flap structure.
  • Figure 9 demonstrates how FEN1 binds to a 5’ flap and cleaves it at its base to leave a nick in the DNA.
  • Figure 10 shows how a ligase seals the nick in DNA, thus generating a contiguous new long strand from the initial Okazaki fragments.
  • Figure 11 presents the preferred substrate for FEN1 which is a structure having a double flap where both the 5’ end of one strand and the 3’ end of the other abutting strand overlap and both ends form flaps and moreover, the 3’ flap is a single nucleotide long.
  • Figure 12 shows an embodiment of an adaptor that has a specific structure which is designed to work with FEN1.
  • the adaptor comprises two oligonucleotides that when annealed together form a partially double stranded molecule.
  • a ‘probe’ portion of this adaptor is single stranded and complementary to a target of interest in a genome.
  • the double stranded portion of the adaptor comprises a universal sequence that can be, for instance, the sequences of adaptors for a DNA sequence platform. The last base-pair next to the single stranded probe portion may also match the target in the genome DNA.
  • Figure 13 shows that when the adaptor of FIG. 12 is hybridized to the target DNA molecule that has been previously made single stranded, a flap structure forms.
  • Figure 14 demonstrates that the structure of FIG. 13 comprising a 5’ flap from the target DNA and a single nucleotide 3’ flap from the adaptor is a substrate for FEN1, which can cleave, leaving a nick that can be subsequently joined by a ligase. The result is an addition of an adaptor to the target DNA.
  • FIGS. 15A-15B provide illustrations of structures utilized by flap endonucleases in the XPF/MUS81 family of proteins. These flap endonucleases play a role in the repair of damage caused by UV light or DNA cross-linking.
  • FIG. 15A Target structures for the XPF Flap endonuclease which comprise a fork in a DNA structure where the two branches of the fork comprise noncomplementary sequences.
  • FIG. 15B The branches can be partially or fully double stranded or contiguous as in the example of a hairpin loop.
  • Figure 16 demonstrates that cleavage by XPF/MUS81 flap endonuclease occurs within a few bases of the commencement of the 3’ flap, generating a nick.
  • Figure 17 provides an embodiment of an exemplary adaptor having a specific structure that can be used with an XPF/MUS81 3’ flap endonuclease.
  • the adaptor comprises, at a minimum, a single oligonucleotide comprising a 3’ sequence complementary to a target of interest in a genome and a 5’ universal sequence that can be, for instance, used with massively parallel sequencing platforms.
  • FIG. 18 demonstrates that when the adaptor of FIG. 17 is hybridized to the target DNA molecule that has been previously made single stranded, a flap structure forms that is a substrate for a XPF/MUS81 3’ flap endonuclease.
  • Figure 19 shows that the flap structure formed in FIG. 18 can be cleaved by a XPF/MUS81 3’ flap endonuclease, leaving a nick that when extended with a polymerase copies the universal adaptor sequence to the target DNA.
  • Figure 20 demonstrates embodiments where, when an adaptor for FEN1 endonuclease that is complementary to a target #1 and an adaptor for XPF/MUS81 endonuclease that is complementary to a target #2, as shown in FIG. 13 and FIG. 18, are hybridized to DNA, for example genomic DNA, a structure forms that contains a 5’ flap and a 3’ flap.
  • Figure 21 demonstrates that when FEN1 and a ligase are added to the adaptors of FIG. 20, adaptor #1 will be appended to the 5’ end of the DNA of target #1.
  • Figure 22 demonstrates that when XPF/MUS81 endonuclease and a polymerase are added to the ligated adaptor/target of FIG. 21, the adaptor target will be copied and adaptor #2 will be appended to the 3’ end of the DNA of target #2.
  • Figure 23 demonstrates the possibility that achieving a double-flap structure may require sequential annealing of the individual oligos that comprise the FEN1 structure, such that a longer oligo that contains the probe #1 sequence anneals first to the target #1 followed by annealing of the shorter oligo complementary to the universal sequence of the adaptor.
  • Such differential hybridization can be achieved by methods known to those skilled in the art, for example through design of the probe sequence and the universal sequences with different Tm values.
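  • As a rough illustration of designing oligos with different Tm values, the sketch below estimates melting temperatures from base composition alone, using the Wallace rule for short oligos and a basic GC-content formula for longer ones. Both sequences and the function name `estimate_tm` are hypothetical and are not taken from the disclosure; real designs would normally rely on nearest-neighbor thermodynamics and the actual buffer conditions.

```python
def estimate_tm(oligo: str) -> float:
    """Crude melting-temperature estimate (deg C) from base composition only."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    n = at + gc
    if n == 0:
        raise ValueError("oligo contains no A/C/G/T bases")
    if n < 14:
        # Wallace rule, intended for short oligos
        return 2.0 * at + 4.0 * gc
    # Basic GC-content formula, intended for longer oligos
    return 64.9 + 41.0 * (gc - 16.4) / n


# Hypothetical sequences, chosen only to illustrate a Tm gap.
probe = "GCTAGCCTGACGGATCCGTAGCTAGGCATC"  # 30 nt target-specific probe
universal_short = "ACGTTGCATGCA"          # 12 nt oligo for the universal region

tm_probe = estimate_tm(probe)
tm_short = estimate_tm(universal_short)

# The longer probe oligo is predicted to anneal at a higher temperature,
# so it could be hybridized first as the reaction cools.
print(f"probe Tm ~ {tm_probe:.1f} C, short oligo Tm ~ {tm_short:.1f} C")
```

  • Under these assumptions, a slow cool from above the probe Tm would let the longer oligo anneal before the shorter one.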
  • Figure 24 demonstrates that the methods of the disclosure which utilize Flap endonucleases can be multiplexed to include many targets.
  • Figure 25 demonstrates a tagmentation process to append adaptors to 5' ends of polynucleotide fragments.
  • the transposase enzyme Tn5 fragments polynucleotides and simultaneously appends adaptor sequences to the 5’ ends of the resulting polynucleotide fragments.
  • Figure 26 demonstrates a process to append adaptors to the 3' ends of tagmented polynucleotides.
  • the free 3’ end of polynucleotide fragments can be extended in the presence of a polymerase and dNTPs. Either heat (e.g., > 68°C), or use of a polymerase with strand displacement activity can be used to remove the ‘non-transferred’ strand of the fragment.
  • the complement of the 5’ adaptor polynucleotide fragment is copied, and finally a PCR reaction with two distinct primers (e.g., P5-i5-A14 and P7-i7-B15) can be used to enrich for those dsDNA PCR products that have a 5' based adaptor on one end and a 3' based adaptor on the other end.
  • Figure 27 demonstrates an alternate process to append adaptors to the 3' ends of tagmented polynucleotides.
  • a single double-stranded ‘forked’ adaptor is employed in the transposome, and a non-displacing polymerase is used at a temperature below the Tm of the non-transferred strand (e.g., ≤ 55°C) to extend the free 3’ end of the fragment until it reaches the 5’ end of the ‘non-transferred’ adaptor strand; a ligase then covalently connects the non-transferred strand to the fragment.
  • FIG. 28A DNA is tagmented with a transposome that may or may not have modifications to the transferred strand.
  • the tagmented library is treated to de-anneal and remove the non-transferred strand of the transposome.
  • a replacement oligo that has a higher Tm than the non-transferred strand is hybridized back to the appended adaptors. The replacement oligo does not hybridize in place of all the non-transferred strand, but instead does so partially.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein). If the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • FIG. 28B A transposome is employed that already has the ‘replacement oligo’ annealed next to the nontransferred strand.
  • the non-transferred strand can be shorter than its standard 19 base Tn5 recognition sequence (for example 16 bases). Following tagmentation and removal of the Tn5, moderate heat can be used to denature the non-transferred strand, leaving the ‘LNA containing replacement oligo’ still annealed.
  • a non-displacing polymerase and polymerase reagents can be used in the foregoing denaturing step, or alternatively in a separate step, to extend from the 3’ ends of the insert, filling in the ends of the insert and extending over the non-transferred strand but stopping when it reaches the hybridized replacement oligo.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein).
  • FIG. 29 presents an example of data generated using the methods embodied in FIG. 28.
  • a transposome was constructed that contained LNA modifications in its transferred strand and was used to tagment genomic DNA.
  • the transposome was immobilized on a streptavidin paramagnetic bead via a 3’ biotin group on an ‘anchor’ oligo.
  • the tagmentation was conducted in the presence of a ligase enzyme and an Illumina™ Indexing primer P5-i5-A14.
  • This primer hybridized 5’ of the transferred strand and was ligated to it by virtue of the 5’ end of the transferred strand bearing a phosphate moiety by design.
  • the Tn5 protein was then removed by denaturing it with a solution comprising the anionic detergent sodium dodecyl sulfate (SDS). Different SDS concentrations (%) were tested to effect complete removal of the Tn5 protein.
  • a mixture of a non-displacing polymerase (tTaq608), dNTPs, Q5 polymerase and a template switching oligo was added and incubated at 47 °C to de-anneal the non-transferred strand and extend with tTaq608 pol as far as the anchor oligo.
  • the temperature was then raised further (60-70 °C) to the point where the templates were rendered single stranded and no longer attached to the beads.
  • the temperature was then lowered to 42 °C for 1 min to allow the template switch oligo containing the P7-i7-B15 sequences to hybridize.
  • Figure 30 demonstrates additional embodiments of the disclosure to append adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo.
  • a transposome is employed that contains one or more internal modifications in the non-transferred strand that prevents exonuclease digestion, for example a phosphorothioate linkage in the phosphodiester backbone of the oligo.
  • a polymerase with a 5’ to 3’ exonuclease activity is employed to extend from the free 3’ end of the insert.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein).
  • Figures 31A-31B demonstrate additional embodiments of the disclosure to append adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo.
  • a non-displacing polymerase and all four dNTPs (dATP, dCTP, dGTP, dTTP) are used to extend from the free 3’ end of the fragment up to the 5’ end of the non-transferred strand.
  • the polymerase and dNTPs are then removed (e.g., purified on SPRI beads, or by magnetic bead-based washing if the adaptors are attached to a bead).
  • a fresh aliquot of a strand displacing polymerase is added with just three of the four dNTPs (dCTP, dGTP, dTTP); dATP is absent.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end through the extension step described herein and comprising a sequence (e.g., 5’CTGTCTCTT3’ (SEQ ID NO:32)).
  • If the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
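  • The effect of withholding dATP can be sketched as a simple string model: a polymerase copying a template stalls at the first template position that calls for the missing nucleotide. The template used below is the 19-base Tn5 mosaic end sequence as commonly published for the transferred strand; that sequence is an assumption brought in for illustration and is not stated in this section, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed transferred-strand mosaic end (5'->3'); not quoted from this section.
ME_TRANSFERRED = "AGATGTGTATAAGAGACAG"


def extend_with_available_dntps(template_5to3: str, available: str) -> str:
    """Copy a template until a nucleotide outside `available` is required.

    The polymerase reads the template 3'->5' and synthesizes 5'->3', so the
    template is walked in reverse. Returns the newly synthesized bases (5'->3').
    """
    product = []
    for base in reversed(template_5to3):
        incoming = COMPLEMENT[base]
        if incoming not in available:
            break  # extension stalls: the required dNTP is absent
        product.append(incoming)
    return "".join(product)


# With dATP withheld, extension stops at the first template T.
print(extend_with_available_dntps(ME_TRANSFERRED, "CGT"))   # CTGTCTCTT
# With all four dNTPs, the whole mosaic end is copied.
print(extend_with_available_dntps(ME_TRANSFERRED, "ACGT"))  # CTGTCTCTTATACACATCT
```

  • In this model the partial 3’ adaptor produced without dATP matches the 5’CTGTCTCTT3’ example given for the extension step.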
  • a non-displacing polymerase and all four dNTPs (dATP, dCTP, dGTP, dTTP) are used to extend from the free 3’ end of the fragment up to the 5’ end of the non-transferred strand.
  • the polymerase and dNTPs are then removed (e.g., purified on SPRI beads, or by magnetic bead-based washing if the adaptors are attached to a bead).
  • the non-transferred strand is then removed, e.g., by moderate heat to selectively denature the strand, or by application of a lambda exonuclease that selectively digests oligos containing 5’ phosphorylated ends (as is the case with non-transferred strands), or by other means known to those skilled in the art.
  • a fresh aliquot of a polymerase is added with just three of the four dNTPs (dCTP, dGTP, dTTP); dATP is absent.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end through the extension step described herein and comprising a sequence (e.g., 5’CTGTCTCTT3’ (SEQ ID NO:32)).
  • If the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • Figure 32 demonstrates an embodiment of the disclosure demonstrating how an adaptor can be appended to the 3’ end following extension by hybridizing an oligo that is partially complementary to the 5’ adaptor but contains further sequences that are unique and not present in the 5’ adaptor.
  • a ligation reaction covalently joins this oligo to the 3’ end of the insert and forms a ‘Y’ shaped adaptor construct.
  • FIG. 33A demonstrates additional embodiments of the disclosure to append adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo, and data generated therefrom.
  • a single transposome type comprising a P5 transferred strand and a short non-transferred strand is used to tagment DNA.
  • a strand displacing polymerase is added to extend from the free 3’ end, create the complement of the P5 transferred strand (i.e., creates P5’), and displace the short non-transferred strand.
  • the temperature is then elevated to make the fragments single stranded.
  • a P7 template switch oligo hybridizes forming a forked structure that is partially double-stranded. Then in the presence of a polymerase with 3’ exo activity, the single stranded 3’ end is degraded by this activity until there is no longer any single stranded 3’ end. The remaining 3’ end of the fragment then forms a primer template that extends and creates the complement of the P7 template switch oligo.
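  • The ‘fork’-modulated switch from exonuclease to extension activity can be sketched as a string model: the unpaired 3’ tail of the fragment is trimmed back to the duplex, and the recessed 3’ end is then extended along the 5’ overhang of the template switch oligo. All sequences below are short hypothetical placeholders rather than real P5, P7 or mosaic end sequences, and the function names are illustrative only.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def fork_switch(fragment: str, tso: str, anneal: str) -> str:
    """Model 3' exo trimming followed by extension on a template switch oligo.

    fragment: strand to be modified, 5'->3'
    tso:      template switch oligo, 5'->3', ending with `anneal`
    anneal:   the TSO portion that base-pairs with the fragment
    """
    site = revcomp(anneal)               # how the paired region reads on the fragment
    start = fragment.rfind(site)
    if start < 0:
        raise ValueError("TSO does not anneal to this fragment")
    # 3' exonuclease activity removes the single-stranded tail past the duplex
    trimmed = fragment[: start + len(site)]
    # the polymerase then copies the 5' overhang of the TSO
    overhang = tso[: tso.index(anneal)]
    return trimmed + revcomp(overhang)


# Hypothetical toy sequences
insert = "GGCC"
shared = "ACGTAC"        # region shared by both adaptors (placeholder)
old_tail = "TTTT"        # stands in for the P5'-specific 3' tail
new_adaptor = "AAGG"     # stands in for the P7-specific part of the TSO

fragment = insert + revcomp(shared) + old_tail
tso = new_adaptor + shared

print(fork_switch(fragment, tso, shared))  # GGCCGTACGTCCTT
```

  • In this toy model the old 3’ tail is lost and the fragment ends with the complement of the full template switch oligo.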
  • FIG. 33B Experiments that demonstrate a ‘proof of concept’ of the ‘fork’-modulated switch in activity from exonuclease to extension activity. Simple P5 transposomes were immobilized on a streptavidin bead by hybridization to an ‘anchor’ oligo and used to tagment DNA.
  • the template switch oligo comprised either a free extendable -OH group at its 3’ end or a non-extendable blocked dideoxyC group at its 3’ end, or a non-extendable ‘inverted T’ blocking group at its 3’ end.
  • the P5’ end of the template is digested and replaced with the P7’.
  • FIG. 33C Images of gel electrophoresis showing that neither of the two 3’ blocked template switch oligos was consumed, which indicates that the block is effective in preventing extension from the 3’ end of the template switch oligo, an extension that would otherwise create a copy of the entire template.
  • the unblocked template switch oligo produced 1.5x as much product as a result of two mechanisms: (i) ‘fork’-modulated switch in activity from exonuclease to extension activity to append the P7’ adaptor to the 3’ end of the template, and (ii) extension from the 3’ end of the template switch oligo to append a copy of the template to the P7 template switch oligo.
  • FIG. 33E A simple P5 transposome was immobilized on a streptavidin bead by hybridization to an ‘anchor’ oligo and used to tagment DNA.
  • Figure 34 demonstrates additional embodiments of the disclosure to append adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo.
  • a structure specific endonuclease (e.g., XPF/MUS81) can be employed as an alternative to exonuclease degradation of the 3’ single stranded end of the forked structure following hybridization of the template switch oligo.
  • the endonuclease nicks the double stranded region of the duplex creating a free 3’ end that a polymerase extends and creates the complement of the P7 template switch oligo.
  • the term “library” merely refers to a collection or plurality of template molecules, which at their 5' and 3' ends typically comprise added on adaptor sequences.
  • Use of the term “library” to refer to a collection or plurality of template molecules should not be taken to imply that the templates making up the library are derived from a particular source, or that the “library” has a particular composition.
  • use of the term “library” should not be taken to imply that the individual templates within the library must be of different nucleotide sequence or that the templates be related in terms of sequence and/or source.
  • the disclosure encompasses formation of so-called “monotemplate” libraries, which comprise multiple copies of a single type of template molecule, each having added on adaptor sequences at their 5' ends and their 3' ends, as well as “complex” libraries wherein many, if not all, of the individual template molecules comprise different target sequences (as defined below), where each template molecule has added on adaptor sequences at their 5' ends and their 3' ends.
  • complex template libraries may be prepared using the method of the disclosure starting from a complex mixture of target polynucleotides such as (but not limited to) random genomic DNA fragments, cDNA libraries etc.
  • the disclosure also extends to “complex” libraries formed by mixing together several individual “monotemplate” libraries, each of which has been prepared separately using the method of the disclosure starting from a single type of target molecule (i.e., a monotemplate).
  • more than 50%, or more than 60%, or more than 70%, or more than 80%, or more than 90%, or more than 95% of the individual polynucleotide templates in a complex library may comprise different target sequences.
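  • One way to read these thresholds is as the share of templates in a library that carry distinct target sequences. The sketch below computes that share for a list of target sequences; the function name and the toy sequences are hypothetical, and counting exact-match distinct sequences is only one possible interpretation.

```python
def distinct_target_fraction(targets: list[str]) -> float:
    """Fraction of templates whose target sequence is distinct (exact match)."""
    if not targets:
        raise ValueError("library is empty")
    return len(set(targets)) / len(targets)


# Hypothetical toy libraries
complex_library = ["ACGT", "ACGT", "GGCA", "TTAG"]
monotemplate_library = ["ACGT"] * 4

print(distinct_target_fraction(complex_library))       # 0.75
print(distinct_target_fraction(monotemplate_library))  # 0.25
```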
  • Use of the term “template” to refer to individual polynucleotide molecules in the library merely indicates that one or both strands of the polynucleotides in the library are capable of acting as templates for template-dependent nucleic-acid polymerization catalyzed by a polymerase. Use of this term should not be taken as limiting the scope of the invention to libraries of polynucleotides which are actually used as templates in a subsequent enzyme-catalyzed polymerization reaction.
  • the term “unmatched region” refers to a region of the adaptor wherein the sequences of the two polynucleotide strands forming the adaptor exhibit a degree of noncomplementarity such that the two strands are not capable of annealing to each other under standard annealing conditions for a primer extension or PCR reaction.
  • the two strands in the unmatched region may exhibit some degree of annealing under standard reaction conditions for an enzyme-catalyzed ligation reaction, provided that the two strands revert to single stranded form under annealing conditions.
  • a DNA sequencing library is generally formed by ligating adaptor polynucleotide molecules to the 5' and 3' ends of one or more target polynucleotide duplexes (which may be of known, partially known or unknown sequence) to form adaptor-target constructs and then carrying out an initial primer extension reaction in which extension products complementary to both strands of each individual adaptor-target construct are formed.
  • the resulting primer extension products, and optionally amplified copies thereof, collectively provide a library of template polynucleotides.
  • the library of template polynucleotides can then be sequenced using next generation sequencing. To save resources, multiple libraries can be pooled together and sequenced in the same run — a process known as multiplexing.
  • unique index sequences, or “barcodes,” can be added to each library. These barcodes are used to distinguish between the libraries during data analysis.
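  • A minimal sketch of how index sequences can separate pooled libraries during data analysis is shown below. The index sequences, library names and the `demultiplex` function are hypothetical, and exact matching is assumed; production pipelines typically also tolerate sequencing errors in the index read.

```python
from collections import defaultdict


def demultiplex(reads, index_to_library):
    """Group reads by library using an exact match on the index sequence.

    reads:            iterable of (index_sequence, insert_sequence) pairs
    index_to_library: dict mapping index sequence -> library name
    """
    bins = defaultdict(list)
    for index_seq, insert_seq in reads:
        library = index_to_library.get(index_seq, "undetermined")
        bins[library].append(insert_seq)
    return dict(bins)


# Hypothetical indexes and reads
index_to_library = {"ACGTAC": "library_1", "TGCATG": "library_2"}
reads = [
    ("ACGTAC", "GGGTTTAAA"),
    ("TGCATG", "CCCAAATTT"),
    ("ACGTAC", "GATTACAGA"),
    ("NNNNNN", "TTTTTTTTT"),  # unreadable index
]

for library, inserts in demultiplex(reads, index_to_library).items():
    print(library, inserts)
```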
  • the ends of the amplification products may differ somewhat from the products of the initial primer extension reaction, since the former will be determined in part by the sequence of the PCR primer used to prime synthesis of a polynucleotide strand complementary to the initial primer extension product, whereas the latter will be determined solely by copying of the adaptor sequences at the 3' ends of the adaptor-template constructs in the initial primer extension.
  • the disclosure provides methods that utilize non-homologous end joining factors (nhEJF) to append adaptors to polynucleotides.
  • the nhEJF adaptors added onto the double stranded polynucleotides typically comprise a double stranded region of complementary sequence and a single stranded region of sequence mismatch.
  • the nhEJF adaptors have a Y-shape, where the region of sequence mismatch causes the arms of the adaptor to separate from each other.
  • the “double-stranded region” of the nhEJF adaptor is a short double-stranded region, typically comprising 5 or more consecutive base pairs, formed by annealing of the two partially complementary oligonucleotide strands. This term simply refers to a double-stranded region of nucleic acid in which the two strands are annealed and does not imply any particular structural conformation.
  • the nhEJF adaptors, instead of having a Y-shaped structure, are U-shaped, such that once the nhEJF adaptors are added to the ends of polynucleotides using nhEJFs in methods of the disclosure, they form a continuous loop at the 5’ and 3’ ends of the templates.
  • the resulting polynucleotides comprising the 5’ and 3’ adaptors can be amplified using rolling circle amplification.
  • it is advantageous for the double-stranded region of the nhEJF adaptors to be as short as possible without loss of function.
  • By “function” in this context is meant that the double-stranded region forms a stable duplex under reaction conditions for the nhEJFs described herein, such that the two strands forming the nhEJF adaptor remain partially annealed during ligation of the nhEJF adaptor to a polynucleotide. It is not absolutely necessary for the double-stranded region to be stable under the conditions typically used in the annealing steps of primer extension or PCR reactions.
  • nhEJF adaptors are added to both ends of each polynucleotide.
  • the resulting polynucleotides will be flanked by complementary sequences derived from the double-stranded region of the nhEJF adaptors.
  • the longer the double-stranded region (i.e., the complementary sequences of the adaptor-polynucleotide constructs), the greater the possibility that the adaptor-polynucleotide construct is able to fold back and base-pair to itself in these regions of internal self-complementarity when annealed for primer extension and/or PCR.
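  • The fold-back risk can be illustrated with a toy construct in which the insert is flanked by the adaptor's double-stranded region on one side and its reverse complement on the other. The sketch below counts how many bases could pair when the strand folds back on itself; the sequences and function names are hypothetical, and the model ignores thermodynamics entirely.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def foldback_stem_length(construct: str, insert_start: int, insert_end: int) -> int:
    """Count consecutive base pairs formed outward from the insert boundaries.

    insert_start/insert_end are slice indexes of the insert in `construct`.
    """
    paired = 0
    left, right = insert_start - 1, insert_end
    while (left >= 0 and right < len(construct)
           and COMPLEMENT[construct[left]] == construct[right]):
        paired += 1
        left -= 1
        right += 1
    return paired


# Hypothetical single strand of an adaptor-polynucleotide construct
arm_5, arm_3 = "AAAA", "AAAA"   # unmatched arms (cannot pair with each other)
ds_region = "GATTACA"           # adaptor double-stranded region
insert = "CCCCCC"

construct = arm_5 + ds_region + insert + revcomp(ds_region) + arm_3
start = len(arm_5) + len(ds_region)
end = start + len(insert)

print(foldback_stem_length(construct, start, end))  # 7
```

  • In this model the potential stem is exactly as long as the double-stranded region, which is why a shorter region reduces the opportunity for self-annealing.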
  • the double-stranded region of the nhEJF adaptors comprises 5 base pairs (bps), 6 bps, 7 bps, 8 bps, 9 bps, 10 bps, 11 bps, 12 bps, 13 bps, 14 bps, 15 bps, 16 bps, 17 bps, 18 bps, 19 bps, 20 bps, or a range that includes or is between any two of the foregoing bps.
  • the stability of the double-stranded region of the nhEJF adaptor may be increased, and hence its length potentially reduced, by the inclusion of non-natural nucleotides which exhibit stronger base-pairing than standard Watson-Crick base pairs.
  • two strands of a nhEJF adaptor comprise base pairs that are 100% complementary to a sequence of the polynucleotide. It will be appreciated, however, that one or more nucleotide mismatches may be tolerated within the double-stranded region of the nhEJF adaptor, provided that the two strands are capable of forming a stable duplex under standard ligation conditions.
  • the nhEJF adaptors added onto the double stranded templates using the non-homologous end joining factors in methods of the disclosure comprise double stranded complementary sequences.
  • the resulting adaptor/template molecules can then be amplified by PCR to form the DNA library templates.
  • a splint oligonucleotide can be used to join the ends of polynucleotides comprising adaptors to form a circle.
  • An exonuclease is added to remove all remaining linear single-stranded and double-stranded DNA products. The result is a completed circular DNA template.
  • nhEJF adaptors for use in the methods disclosed herein will generally include a double-stranded region adjacent to the “ligatable” end of the nhEJF adaptor, i.e., the end that is joined to a target polynucleotide using the non-homologous end joining factors in methods of the disclosure.
  • the ligatable end of a nhEJF adaptor may be blunt or, in other embodiments, short 5’ or 3' overhangs of one or more nucleotides may be present to facilitate/promote ligation.
  • the 5' terminal nucleotide at the ligatable end of the nhEJF adaptor should be phosphorylated to enable phosphodiester linkage to a 3' hydroxyl group on the target polynucleotide.
  • Different annealing conditions may be used for a single primer extension reaction not forming part of a PCR reaction (again see Sambrook et al., 2001, Molecular Cloning, A Laboratory Manual, 3rd Ed, Cold Spring Harbor Laboratory Press, NY; Current Protocols, eds Ausubel et al.).
  • Conditions for primer annealing in a single primer extension include, for example, exposure to a temperature in the range of from 30 to 37° C. in standard primer extension buffer. It will be appreciated that different enzymes, and hence different reaction buffers, may be used for a single primer extension reaction as opposed to a PCR reaction. There is no requirement to use a thermostable polymerase for a single primer extension reaction.
  • the nhEJF adaptors comprise a double stranded region and an unmatched region.
  • the lower limit on the length of the unmatched region will typically be determined by function, for example the need to provide a suitable sequence for binding of a primer for primer extension, PCR and/or sequencing.
  • the length of unmatched region in each strand should be 20 nucleotides (nts), 25 nts, 30 nts, 35 nts, 40 nts, 45 nts, 50 nts in length, or have a range of lengths that includes or is between any two of the foregoing nucleotide lengths.
  • the overall length of the two strands forming a nhEJF adaptor will typically be 25 nts, 30 nts, 35 nts, 40 nts, 45 nts, 50 nts, 55 nts, 60 nts, 65 nts, 70 nts, 75 nts, 80 nts, 85 nts, 90 nts, 95 nts, 100 nts, 105 nts, 110 nts, 115 nts, 120 nts, 125 nts, 130 nts, 135 nts, 140 nts, 145 nts, 150 nts, or a range that is between or includes any two of foregoing nucleotide lengths.
  • the portions of the two strands forming the unmatched region of a nhEJF adaptor should preferably be of similar length, although this is not absolutely essential, provided that the length of each portion is sufficient to fulfil its desired function (e.g., primer binding). It has been shown by experiment that the portions of the two strands forming the unmatched region of a nhEJF adaptor may differ in length by up to 25 nucleotides without unduly affecting adaptor function.
  • portions of the two polynucleotide strands forming an unmatched region of a nhEJF adaptor will be completely mismatched, or 100% noncomplementary.
  • some sequence “matches”, i.e., a lesser degree of noncomplementarity, may be tolerated in this region without affecting function to a material extent.
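  • The length guidance given for nhEJF adaptors can be collected into a simple design check. The sketch below treats the listed ranges (5 to 20 bp for the double-stranded region, 20 to 50 nt for each unmatched arm, and an arm length difference of at most 25 nt) as hard limits, which is a simplifying assumption made for illustration; the function name is hypothetical.

```python
def check_adaptor_geometry(ds_bp: int, arm1_nt: int, arm2_nt: int) -> list[str]:
    """Return a list of design warnings; an empty list means no issue found."""
    problems = []
    if not 5 <= ds_bp <= 20:
        problems.append(f"double-stranded region of {ds_bp} bp is outside 5-20 bp")
    for name, arm in (("arm 1", arm1_nt), ("arm 2", arm2_nt)):
        if not 20 <= arm <= 50:
            problems.append(f"{name} of {arm} nt is outside 20-50 nt")
    if abs(arm1_nt - arm2_nt) > 25:
        problems.append("unmatched arms differ by more than 25 nt")
    return problems


print(check_adaptor_geometry(12, 30, 35))  # []
print(check_adaptor_geometry(12, 20, 50))  # arms differ by 30 nt
print(check_adaptor_geometry(3, 30, 30))   # duplex region too short
```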
  • the extent of sequence mismatching or non-complementarity is such that the two strands in the unmatched region remain in single-stranded form under annealing conditions as defined above.
  • the precise nucleotide sequence of the nhEJF adaptors is generally not material to the disclosure and may be selected by the user such that the desired sequence elements are ultimately included in the common sequences of the library of polynucleotides comprising the adaptors, e.g., to provide binding sites for particular sets of universal amplification primers and/or sequencing primers (e.g., P7 or P5 primers). Additional sequence elements may be included, for example to provide binding sites for sequencing primers which will ultimately be used in sequencing of template molecules in the library, or products derived from amplification of the template library, for example on a solid support.
  • the nhEJF adaptors may further include “bar code” sequences, which can be used to bar code polynucleotides derived from a particular source.
  • the sequences of the individual strands in the unmatched region should be such that neither individual strand exhibits any internal self-complementarity which could lead to self-annealing, formation of hairpin structures, etc., under standard annealing conditions. Self-annealing of a strand in the unmatched region is to be avoided as it may prevent or reduce specific binding of an amplification primer to this strand.
  • nhEJF adaptors are preferably formed from two strands of DNA, but may include mixtures of natural and non-natural nucleotides (e.g., one or more ribonucleotides) linked by a mixture of phosphodiester and non-phosphodiester backbone linkages.
  • Other non-nucleotide modifications may be included such as, for example, biotin moieties, blocking groups and capture moieties for attachment to a solid surface, as discussed in further detail below.
  • polynucleotides to which the adaptors are appended may be polynucleotides that can be used with additional methodologies, including amplification by solid-phase PCR, next generation sequencing, subcloning, etc.
  • Polynucleotides to which nhEJF adaptors are appended may originate in double-stranded DNA form (e.g., genomic DNA fragments) or may have originated in single-stranded form, as DNA or RNA, and been converted to dsDNA form prior to ligation.
  • mRNA molecules may be copied into double-stranded cDNAs suitable for use with nhEJF adaptors disclosed herein.
  • The precise sequence of the polynucleotides is generally not material to the disclosure, and may be known or unknown. Modified polynucleotides, including polynucleotides comprising non-natural nucleotides and/or non-natural backbone linkages, could also be utilized in the methods of the disclosure, provided that the modifications do not preclude adding on nhEJF adaptors and/or copying in a primer extension reaction.
  • the non-homologous end joining factors and methods of the disclosure can be used with a single polynucleotide, or can be used with a mixture or plurality of polynucleotides.
  • the non-homologous end joining factors in the methods of the disclosure may be used with multiple copies of the same polynucleotides (i.e., monotemplates) or with mixtures of different polynucleotides.
  • the polynucleotides may differ from each other with respect to nucleotide sequence over the full length of the polynucleotide or only a part of the polynucleotide.
  • a nhEJF-based method disclosed herein may be applied to a plurality of polynucleotides derived from a common source, for example a library of genomic DNA fragments derived from a particular individual.
  • the target polynucleotides will comprise random fragments of human genomic DNA.
  • the fragments may be derived from a whole genome or from part of a genome (e.g., a single chromosome or sub-fraction thereof), and from one individual or several individuals.
  • the polynucleotides may be treated chemically or enzymatically either prior to, or subsequent to the ligation of the nhEJF adaptor sequences. Techniques for fragmentation of genomic DNA include, for example, enzymatic digestion or mechanical shearing.
  • “Ligation” of nhEJF adaptors to 5' and 3' ends of each polynucleotide involves joining of the two polynucleotide strands of the nhEJF adaptor to a double-stranded target polynucleotide such that covalent linkages are formed between both strands of the two double-stranded molecules.
  • joining means covalent linkage of two polynucleotide strands which were not previously covalently linked.
  • such “joining” will take place by formation of a phosphodiester linkage between the two polynucleotide strands but other means of covalent linkage (e.g., non-phosphodiester backbone linkages) may be used.
  • covalent linkages formed in the ligation reactions should allow for read-through of a polymerase, such that the resultant construct can be copied in a primer extension reaction using primers which bind to sequences in the regions of the adaptor-target construct that are derived from the nhEJF adaptor molecules.
  • the ligation reactions will typically be enzyme-catalyzed.
  • the ligation reactions will be by the non-homologous end joining factors of the disclosure.
  • Non-enzymatic ligation techniques (e.g., chemical ligation) may also be used, provided that the non-enzymatic ligation leads to the formation of a covalent linkage which allows read-through of a polymerase, such that the resultant construct can be copied in a primer extension reaction.
  • the desired products of the ligation reaction are adaptor-target constructs in which nhEJF adaptors are ligated at both ends of each target polynucleotide, giving the structure adaptor-polynucleotide-adaptor.
  • Conditions of the ligation reaction should therefore be optimized to maximize the formation of this product, in preference to targets having an adaptor at one end only.
  • the products of the ligation reaction may be subjected to purification steps in order to remove unbound nhEJF adaptor molecules before the adaptor-polynucleotide constructs are processed further. Any suitable technique may be used to remove excess unbound nhEJF adaptors, examples of which will be described in further detail below.
  • Adaptor-polynucleotide constructs formed in the ligation reaction as discussed above are then subjected to an initial primer extension reaction in which a primer oligonucleotide is annealed to an adaptor portion of each of the adaptor-polynucleotide constructs and extended by sequential addition of nucleotides to the free 3' hydroxyl end of the primer to form extension products complementary to at least one strand of each of the adaptor-target constructs.
  • the term “initial” primer extension reaction refers to a primer extension reaction in which primers are annealed directly to the adaptor-polynucleotide constructs, as opposed to either complementary strands formed by primer extension using the adaptor-polynucleotide construct as a template or amplified copies of the adaptor-polynucleotide construct.
  • the initial primer extension reaction is carried out using a “universal” primer which binds specifically to a cognate sequence within an adaptor portion of the adaptor-polynucleotide construct, and is not carried out using a target-specific primer or a mixture of random primers.
  • the use of an adaptor-specific primer for the initial primer extension reaction is key to formation of a library of polynucleotides which have common sequence at the 5' end and common sequence at the 3' end.
  • the primers used for the initial primer extension reaction will be capable of annealing to each individual strand of adaptor-polynucleotide constructs having adaptors ligated at both ends, and can be extended so as to obtain two separate primer extension products, one complementary to each strand of the construct.
  • the initial primer extension reaction will result in formation of primer extension products complementary to each strand of each adaptor-target construct.
  • the primer used in the initial primer extension reaction will anneal to a primer-binding sequence (in one strand) in the unmatched region of the adaptor.
  • annealing refers to sequence-specific binding/hybridization of the primer to a primer-binding sequence in an adaptor region of the adaptor-target construct under the conditions to be used for the primer annealing step of the initial primer extension reaction.
  • the products of the primer extension reaction may be subjected to standard denaturing conditions in order to separate the extension products from strands of the adaptor-polynucleotide constructs.
  • the strands of the adaptor-polynucleotide constructs may be removed at this stage.
  • the extension products (with or without the original strands of the adaptor-target constructs) collectively form a library of template polynucleotides which can be used, e.g., as templates for solid-phase PCR.
  • the initial primer extension reaction may be repeated one or more times, through rounds of primer annealing, extension and denaturation, in order to form multiple copies of the same extension products complementary to the adaptor-target constructs.
  • the initial extension products may be amplified by conventional solution-phase PCR, as described in further detail below.
  • the products of such further PCR amplification may be collected to form a library of templates comprising “amplification products derived from” the initial primer extension products.
  • both primers used for further PCR amplification will anneal to different primer-binding sequences on opposite strands in the unmatched region of the adaptor.
  • Other embodiments may, however, be based on the use of a single type of amplification primer which anneals to a primer-binding sequence in the double-stranded region of the adaptor.
  • the “initial” primer extension reaction occurs in the first cycle of PCR.
  • inclusion of an initial primer extension step (and optionally further rounds of PCR amplification) to form complementary copies of the adaptor-target constructs is advantageous for several reasons. Firstly, inclusion of the primer extension step, and subsequent PCR amplification, acts as an enrichment step to select for adaptor-target constructs with adaptors ligated at both ends. Only target constructs with adaptors ligated at both ends provide effective templates for whole genome or solid-phase PCR using common or universal primers specific for primer-binding sequences in the adaptors; hence it is advantageous to produce a template library comprising only double-ligated targets prior to solid-phase or whole genome amplification.
  • the method disclosed herein to make a template library is PCR-free.
  • By being PCR-free, the method reduces the library bias and gaps that arise from preferential enrichment of certain adaptor/template constructs over others. The result is high data quality and optimal variant detection across the genome.
  • inclusion of the initial primer extension step, and subsequent PCR amplification, permits the length of the common sequences at the 5' and 3' ends of the target to be increased prior to solid-phase PCR or sequencing.
  • Inclusion of the primer extension (and subsequent amplification) steps means that the length of the common sequences at one (or both) ends of the polynucleotides in the template library can be increased after ligation by inclusion of additional sequence at the 5' ends of the primers used for primer extension (and subsequent amplification).
  • the use of such “tailed” primers is described in further detail below.
  • FIG. 1 illustrates a process standardly used to generate a template library for sequencing.
  • Next generation sequencing typically requires library preparation, where known adaptor DNA sequences are added to the target DNA to be sequenced. Traditionally, this requires that sample DNA is fragmented, end-repaired, and then ligated to the adaptor DNA (e.g., see FIG. 1).
  • This library preparation is common to all major sequencing platforms, including those from Illumina™, Pacific Biosciences™, and Oxford Nanopore™.
  • ligation-mediated library prep is currently the only option for library preparation.
  • the starting DNA is fragmented, and the fragments purified.
  • An end repair reaction is then performed with T4 Polynucleotide Kinase, rATP, and T4 DNA polymerase, dNTP, to form blunt ended double stranded templates.
  • an A-tailing reaction is performed with Klenow exo-, dNTP.
  • the adaptor is formed by annealing two single-stranded oligonucleotides prepared by conventional automated oligonucleotide synthesis.
  • the oligonucleotides are partially complementary such that the 3' end of a first oligonucleotide is complementary to the 5' end of a second oligonucleotide.
  • the 5' end of the first oligonucleotide and the 3' end of second oligonucleotide are not complementary to each other.
  • the resulting structure is double stranded at one end (the double-stranded region) and single stranded at the other end (the unmatched region) and is referred to herein as a “Y-shaped adaptor” (see FIG. 1).
  • the double-stranded region of the Y-shaped adaptor may be blunt-ended (see FIG. 1) or it may have an overhang. In the latter case, the overhang may be a 3' overhang or a 5' overhang, and may comprise a single nucleotide or more than one nucleotide.
  • the Y-shaped adaptor is phosphorylated at its 5' end and the double-stranded portion of the duplex contains a single base 3' overhang comprising a ‘T’ deoxynucleotide (see FIG. 1).
  • the adaptors are then ligated using T4 Ligase, rATP, to the ends of double stranded template molecules containing a single base 3’ overhang of an ‘A’ nucleotide (see FIG. 1).
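  • As an illustration of the Y-shaped adaptor geometry described above, the following minimal Python sketch counts how many bases at the 3' end of a first oligonucleotide pair with the 5' end of a second oligonucleotide, after setting aside a single-base 3' overhang. The function names and all sequences are hypothetical and chosen only for illustration; they are not adaptor sequences of the disclosure.

```python
# Illustrative sketch only: names and sequences are assumptions, not from the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_length(first_oligo: str, second_oligo: str, overhang: int = 0) -> int:
    """Longest stretch where the 3' end of first_oligo (ignoring `overhang`
    unpaired 3' bases) pairs with the 5' end of second_oligo."""
    paired_part = first_oligo[: len(first_oligo) - overhang]
    best = 0
    for n in range(1, min(len(paired_part), len(second_oligo)) + 1):
        # Antiparallel pairing: the 3' suffix of one strand must equal the
        # reverse complement of the 5' prefix of the other strand.
        if paired_part[-n:] == reverse_complement(second_oligo[:n]):
            best = n
    return best


# Made-up example: 10 nt unmatched arm + 12 nt stem + single 3' 'T' overhang.
stem = "GCTCTTCCGATC"
first_oligo = "ACACTTTCCC" + stem + "T"
second_oligo = reverse_complement(stem) + "TTCAGCAGGA"

stem_bp = duplex_length(first_oligo, second_oligo, overhang=1)
first_arm_nt = len(first_oligo) - 1 - stem_bp   # unpaired 5' arm of the first oligo
second_arm_nt = len(second_oligo) - stem_bp     # unpaired 3' arm of the second oligo
print(stem_bp, first_arm_nt, second_arm_nt)
```

  • A non-zero stem length together with non-zero arm lengths corresponds to the double-stranded region and the unmatched region, respectively, of a Y-shaped adaptor.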
  • FIG. 2 illustrates how in prokaryotes, e.g., Mycobacterium, the Ku protein bridges DNA fragments and recruits an ATP dependent DNA ligase (LigD) for DNA repair.
  • Non-homologous end joining (NHEJ) is a pathway that repairs double-strand breaks in DNA.
  • NHEJ is referred to as "non-homologous" because the break ends are directly ligated without the need for a homologous template, in contrast to homology directed repair, which requires a homologous sequence to guide repair.
  • the non-homologous end joining (NHEJ) pathway requires only two factors: Ku and LigD.
  • Ku recognizes and binds free ends of double-stranded DNA (e.g., two fragments of DNA) and joins the two ends to form a DNA bridge complex (e.g., see FIG. 2).
  • LigD uses its polymerase and exonuclease domains to repair the ends of the DNA. As it also possesses an ATP-dependent ligase domain, LigD then ligates the DNA.
  • With Ku actively bringing the ends of the DNA together and recruiting LigD, the ligation conversion is boosted by as much as 30-fold compared to reactions with LigD alone.
  • FIG. 3 provides an embodiment of a method of the disclosure which utilizes non-homologous end joining factors to ligate adaptors to double stranded template DNA.
  • Although ligation-mediated library prep can yield the highest quality genomes, the conversion of sample DNA to library DNA can be inefficient. In cases where sample DNA is in short supply, this poor efficiency makes ligation-mediated library prep more challenging or even infeasible.
  • Typical ligation-mediated library prep methods employ ligases that in nature serve to ligate nicked DNA. That is, their intended purpose is not to join and ligate two strands or ends of DNA, as is required by the library prep method.
  • the ligation-mediated methods disclosed herein employ the use of prokaryotic end joining and repair factors for the ligation of two ends of DNA.
  • the in vitro end-repair and A-tailing steps of traditional library prep are employed.
  • one uses Ku and LigD (e.g., see FIG. 3).
  • LigD’s wild-type nuclease activity is unneeded, so a nuclease-deficient mutant can be used (e.g., Mycobacterium tuberculosis LigD H373A).
  • DNA (e.g., gDNA or cDNA) is fragmented into small molecules, typically less than 1000 base pairs in length. Fragmentation of DNA may be achieved by a number of methods including: enzymatic digestion, chemical cleavage, sonication, nebulization, or hydroshearing. Fragmented DNA may be made blunt-ended by a number of methods known to those skilled in the art. As shown in FIG. 3, the ends of the fragmented DNA are end repaired and phosphorylated using T4 DNA polymerase, dNTP, and T4 polynucleotide kinase, rATP.
  • a single ‘A’ deoxynucleotide is then added to both 3' ends of the DNA molecules using Klenow exo- enzyme, dATP, producing a one-base 3' overhang that is complementary to the one-base 3' ‘T’ overhang on the double-stranded end of the Y-shaped nhEJF adaptor.
  • a ligation reaction between the Y-shaped nhEJF adaptor and the DNA fragments is then performed using Ku exo- (lacking exonuclease activity) and LigD, ATP, which joins two copies of the adaptor to each DNA fragment, one at either end, to form adaptor-polynucleotide constructs.
  • the products of this reaction can be purified from unligated nhEJF adaptor by a number of means, including size-exclusion chromatography.
  • After the excess nhEJF adaptor has been removed, unligated target DNA remains in addition to ligated adaptor-polynucleotide constructs; it can be removed by selectively capturing only those target DNA molecules that have adaptor(s) attached.
  • the presence of a biotin group on the 5' end of the adaptors enables any target DNA ligated to the adaptor to be captured on a surface coated with streptavidin, a protein that selectively and tightly binds biotin.
  • Streptavidin can be coated onto a surface using well developed chemistries.
  • Commercially available magnetic beads (e.g., Dynabeads™) that are coated in streptavidin can be used to capture ligated adaptor-target constructs.
  • the application of a magnet to the side of a vessel containing these beads immobilizes them such that they can be washed free of the unligated target DNA molecules.
  • FIG. 4 provides another embodiment of a method of the disclosure which utilizes non-homologous end joining factors to ligate nhEJF adaptors to double stranded template DNA.
  • DNA (e.g., gDNA or cDNA) is fragmented into small molecules, typically less than 1000 base pairs in length. Fragmentation of DNA may be achieved by a number of methods including: enzymatic digestion, chemical cleavage, sonication, nebulization, or hydroshearing. Fragmented DNA may be made blunt-ended by a number of methods known to those skilled in the art. As shown in FIG.
  • T4 DNA polymerase, dNTP; T4 polynucleotide kinase, rATP; terminal transferase (TdT); Ku and LigD.
  • unligated target DNA remains in addition to ligated adaptor-target constructs and this can be removed by selectively capturing only those target DNA molecules that have adaptor attached.
  • a biotin group on the 5' end of the adaptors enables any target DNA ligated to the adaptor to be captured on a surface coated with streptavidin, a protein that selectively and tightly binds biotin.
  • Streptavidin can be coated onto a surface using well developed chemistries.
  • Commercially available magnetic beads (e.g., Dynabeads™) that are coated in streptavidin can be used to capture ligated adaptor-target constructs. The application of a magnet to the side of a vessel containing these beads immobilizes them such that they can be washed free of the unligated target DNA molecules.
  • Non-homologous end joining factors (like Ku and LigD) can be used to add nhEJF adaptors to double stranded template DNA for library preparation.
  • the disclosure further provides for engineered variants of Ku and/or LigD including, but not limited to, variants engineered to increase enzyme stability, to suppress exonuclease activity, or to increase enzymatic activity.
  • the LigD ligase domain can be replaced with another ligase (e.g., T4 ligase), forming a fusion of LigD’s polymerase and nuclease domains with the chosen ligase. This allows the fusion ligase to be recruited by Ku to DNA ends.
  • a LigD exonuclease deficient mutant can be used when this nuclease activity is not desired.
  • the disclosure provides for polypeptides that exhibit non-homologous end joining factor activity.
  • the polypeptide may encode a wild-type enzyme, a homolog thereof or encode an engineered variant of the wild-type enzyme.
  • FIG. 5 provides a sampling of wild-type sequences for LigD (see SEQ ID NO: 1 to 20).
  • FIG. 6 provides a sampling of wild-type sequences for Ku (see SEQ ID NO:21 to 30).
  • the disclosure provides for a polypeptide that has a sequence that is at least 80%, 85%, 90%, 95%, 98%, or 100% identical to any one of SEQ ID NO:1 to 20.
  • the disclosure provides for a polypeptide that has a sequence that is at least 80%, 85%, 90%, 95%, 98%, or 100% identical to any one of SEQ ID NO:21 to 30.
  • the polypeptides can encode a LigD whose exonuclease activity is suppressed by one or more appropriate substitutions in the exonuclease domain of LigD.
  • An example of such a substitution is H373A of SEQ ID NO:1.
  • Other substitutions are contemplated and can be quickly determined by in silico methods.
  • The term “homologs,” used with respect to an original enzyme or gene of a first family or species, refers to distinct enzymes or genes of a second family or species which are determined by functional, structural or genomic analyses to correspond to the original enzyme or gene of the first family or species. Most often, homologs will have functional, structural or genomic similarities. Techniques are known by which homologs of an enzyme or gene can readily be cloned using genetic probes and PCR. Identity of cloned sequences as homologs can be confirmed using functional assays and/or by genomic mapping of the genes.
  • a protein has “homology” or is “homologous” to a second protein if the nucleic acid sequence that encodes the protein has a similar sequence to the nucleic acid sequence that encodes the second protein.
  • a protein has homology to a second protein if the two proteins have “similar” amino acid sequences. (Thus, the term “homologous proteins” is defined to mean that the two proteins have similar amino acid sequences.)
  • two proteins are substantially homologous when the amino acid sequences have at least about 30%, 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity.
  • the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
  • amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”.
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
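  • The dependence of percent identity on identical positions and on gaps can be sketched as follows. This minimal Python example assumes the two sequences have already been aligned (gaps written as '-') and adopts one common convention, counting identical residues over all aligned columns including gap columns; alignment programs may use other denominators. The function name and the peptide strings are hypothetical.

```python
# Illustrative sketch only: the convention (matches / aligned columns) is an assumption.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap; they hold no residues.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Made-up aligned peptides: one gap column and one mismatch in nine columns.
seq_a = "MKT-AYIAK"
seq_b = "MKTQAYLAK"
print(round(percent_identity(seq_a, seq_b), 1))
```

  • Because every gap column enlarges the denominator without adding a match, both the number of gaps and the length of each gap lower the reported identity under this convention.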
  • a “conservative amino acid substitution” is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity).
  • a conservative amino acid substitution will not substantially change the functional properties of a protein.
  • the percent sequence identity or degree of homology may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art (see, e.g., Pearson et al., 1994, hereby incorporated herein by reference).
  • a “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain.
  • Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
  • the following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
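  • The six groups listed above can be expressed as a simple lookup. The following minimal Python sketch classifies a substitution written in the usual one-letter form (original residue, position, replacement residue) as conservative or not under those six groups only; the function names are hypothetical, and other grouping schemes would give different answers.

```python
# Illustrative sketch only: classification uses just the six groups listed in the text.
import re

CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic acid, Glutamic acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'H373A' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall in the same listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


for label in ("I120V", "D45E", "H373A"):
    original, position, replacement = parse_substitution(label)
    print(label, position, is_conservative(original, replacement))
```

  • Under these six groups, a substitution such as H373A is classified as non-conservative, since histidine is not listed in any of the groups.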
  • Sequence homology for polypeptides is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using measures of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as “Gap” and “Bestfit” which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild type protein and a mutein thereof.
  • BLAST (Altschul, 1990; Gish, 1993; Madden, 1996; Altschul, 1997; Zhang, 1997), especially blastp or tblastn (Altschul, 1997).
  • Typical parameters for BLASTp are: Expectation value: 10 (default); Filter: seg (default); Cost to open a gap: 11 (default); Cost to extend a gap: 1 (default); Max. alignments: 100 (default); Word size: 11 (default); No. of descriptions: 100 (default); Penalty Matrix: BLOSUM62.
  • polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1.
  • FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences (Pearson, 1990, hereby incorporated herein by reference).
  • percent sequence identity between amino acid sequences can be determined using FASTA with its default parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, hereby incorporated herein by reference.
  • the disclosure further provides methods using structure specific endonucleases (SSEs) for appending flap adaptors to polynucleotides.
  • These SSE-based methods can be used to selectively add flap adaptors in a sequence specific manner, thereby providing for enrichment of targeted polynucleotides that have a specific sequence (i.e., target enrichment).
  • Traditional methods for target enrichment broadly fall into two categories: amplicon based or probe-based hybridization/pulldown.
  • the former employs primer pairs and PCR to amplify targets from a sample; it is simple and fast but limited in its ability to multiplex very high numbers of targets due to PCR mispriming events. It is also restricted in the size of amplicon that can be produced due to the limits of current PCR technology.
  • Other disadvantages of PCR such as sequence bias or polymerase slippage can also impact the performance scope.
  • Hybridization methods involve making NGS library templates first, then using a probe to pull out library templates that cover the target region of a genome of interest. This approach generally takes longer in practice than the amplicon methods but is virtually limitless in the number of targets that can be enriched. Poorer specificity arising from hybridization of only a single probe is mitigated by additional rounds of pulldown and/or increasing the probe length and Tm.
  • In general, amplicon workflows are used for small panels of targets whereas hybridization workflows can be used for exome enrichment.
  • The disclosure provides for the creation of a target-enriched polynucleotide library by methods using structure-specific nucleases. Accordingly, the disclosure provides an alternative to the methods known in the art for creating target-enriched libraries.
  • the methods of the disclosure provide increased specificity over conventional probe-based hybridization/pulldown methods, by employing two probes flanking the target instead of one. Unlike amplicon-based methods, the methods of the disclosure have limitless multiplexity.
  • a pre-generated library is not required and a target of any size and sequence can be enriched. While it is similar to CRISPR/Cas9 approaches in that it employs two cleavage events on either side of a target sequence, it does, unlike CRISPR/Cas, append adaptors on either side of the target sequence.
  • the resulting product comprising adaptors can be used to seed a flow cell directly or it can be further amplified by PCR if required. Amplification proceeds through primers that bind to the flanking adaptor sequences and thus is advantageous over multiplex PCR where amplification utilizes gene specific primers and efficiency varies between target amplicons.
  • Structure-specific nucleases are a class of DNA binding/modifying enzymes that target structures in nucleic acids in vivo rather than sequences. These structures comprise deviations from the contiguous double helix structure that usually arise, for example, during DNA replication or in the process of damage repair. Structures such as Holliday junctions, replication forks, or single-stranded flaps require enzymes to resolve their topology to ultimately restore the canonical structure of the genome. Examples of structure-specific nucleases include, but are not limited to, Holliday junction resolvases and flap endonucleases.
  • the disclosure provides for the creation of a target-enriched template library by the use of structure-specific nucleases, wherein the structure-specific nucleases comprise flap endonucleases.
  • Flap endonuclease enzymes target junctions in DNA where a single-stranded stretch of DNA protrudes from the double-helix.
  • a flap may be described as a 3’ flap or a 5’ flap depending on the polarity of the sequence (e.g., see FIG. 7).
  • FEN1 is an example of a flap endonuclease that targets and modifies a 5’ flap.
  • the XPF/MUS81 family of proteins are examples of flap endonucleases that target and modify 3’ flap structures.
  • FEN1 plays a central role in DNA replication both in eukaryotes and prokaryotes. It functions to remove single stranded 5’ flaps of DNA from Okazaki fragments that are generated on the lagging strand of the DNA replication fork. These flaps form when a primase generates an RNA primer that serves as a primer to extend a new DNA strand; multiple Okazaki fragments are generated and when the extending 3’ end of one abuts the 5’ end of another, it displaces it to form a flap structure (e.g., see FIG. 8). FEN1 binds to the 5’ flap and cleaves it at its base to leave a nick in the DNA (e.g., see FIG. 9).
  • FEN1 can cleave single-stranded flaps of up to 200 nucleotides in length. It does not cleave single stranded DNA alone, such as the regions of single strands of the parental template strands at the replication fork; it only cleaves ssDNA strands in the structure of a flap.
  • Its preferred substrate structure is a double flap, in which the 5’ end of one strand and the 3’ end of the other, abutting strand overlap so that both ends form flaps, and in which the 3’ flap is a single nucleotide long (e.g., see FIG. 11).
  • 5’ flaps that contain double stranded regions are inhibitory for flap cleavage, even if the double stranded region is distant from the base of the flap.
  • the disclosure provides compositions, methods, and kits directed to the use of FEN1 with a 5’ flap adaptor having a specific structure that is recognized by FEN1 (e.g., see FIG. 12).
  • the 5’ flap adaptor comprises two oligonucleotides that when annealed together form a partially double stranded molecule.
  • a ‘probe’ portion of this 5’ flap adaptor is single stranded and complementary to a targeted sequence.
  • the double stranded portion of the 5’ flap adaptor comprises a universal sequence that can be, for instance, the sequences of adaptors for NGS. The last base-pair next to the single stranded probe portion may also match a targeted sequence.
  • a flap structure forms (e.g., see FIG. 13).
  • the structure comprising a 5’ flap from the target DNA and a single nucleotide 3’ flap from the adaptor is a substrate for FEN1, which can cleave, leaving a nick that can be subsequently joined by a ligase.
  • the result is an addition of an adaptor to the 5’ end of a polynucleotide (e.g., see FIG. 14).
  • Flap endonucleases in the XPF/MUS81 family of proteins play a role in the repair of damage caused by UV light or DNA cross-linking. Their target is illustrated in FIG.
  • the disclosure provides embodiments directed to the utilization of XPF/MUS81 3’ flap endonuclease activity in conjunction with a 3’ flap adaptor having a specific structure (e.g., see FIG. 17).
  • the 3’ flap adaptor comprises, at a minimum, a single oligonucleotide comprising a 3’ sequence complementary to a target of interest in a genome and a 5’ universal sequence that can be, for instance, the sequences of adaptors for NGS.
  • the 5’ universal sequence may be double-stranded.
  • the XPF/MUS81 3’ flap endonuclease then cleaves the DNA, leaving a nick that when extended with a polymerase copies the universal adaptor sequence to the target DNA (e.g., see FIG. 19).
  • When the 5’ flap adaptor, complementary to a target #1, and the 3’ flap adaptor, complementary to a target #2, as outlined above, are hybridized to DNA, for example genomic DNA, a structure forms that contains a 5’ flap and a 3’ flap (e.g., see FIG. 20).
  • the genomic DNA is fragmented to a suitable size such that the 5’ flap is less than 200 nucleotides long.
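  • The flap-length constraint stated above can be checked computationally for a given fragment. The following minimal Python sketch treats the nucleotides lying 5’ of the hybridized target site as the 5’ flap and compares their number with a 200-nucleotide limit; the function names, the strict less-than comparison, and the sequences are assumptions made for illustration.

```python
# Illustrative sketch only: the 200 nt limit is taken from the text, all else is assumed.
MAX_FLAP_NT = 200


def five_prime_flap_length(fragment: str, target_site: str) -> int:
    """Number of nucleotides 5' of the first occurrence of target_site."""
    index = fragment.find(target_site)
    if index < 0:
        raise ValueError("target site not found in fragment")
    return index


def flap_is_short_enough(fragment: str, target_site: str,
                         max_flap: int = MAX_FLAP_NT) -> bool:
    """True if the 5' flap would be shorter than max_flap nucleotides."""
    return five_prime_flap_length(fragment, target_site) < max_flap


# Made-up target site embedded in two made-up fragments.
site = "GATTACAGATTACAGATTAC"
short_fragment = "A" * 150 + site + "C" * 50
long_fragment = "A" * 250 + site + "C" * 50

print(five_prime_flap_length(short_fragment, site),
      flap_is_short_enough(short_fragment, site))
print(five_prime_flap_length(long_fragment, site),
      flap_is_short_enough(long_fragment, site))
```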
  • the sample can be applied directly to a sequencer for sequencing of the DNA intervening the target sequences.
  • the adaptor sequences can be used to append additional sequences, such as step-out primers.
  • the 5’ flap adaptor illustrated in FIG. 12 is shown hybridized already in a double-flap structure in FIG. 13. In practice, achieving this double-flap structure may require sequential annealing of the individual oligos that comprise the FEN1 structure, such that the longer oligo that contains the probe #1 sequence anneals first to the target #1 followed by annealing of the shorter oligo complementary to the universal sequence of the adaptor.
  • differential hybridization can be achieved by methods known to those skilled in the art, for example through design of the probe sequence and the universal sequences with different Tm (e.g., see FIG. 23).
  • the probe can have a higher Tm than the universal adaptor sequence such that at a particular temperature the probe, but not the universal adaptor sequence, anneals first; lowering the temperature then enables the shorter universal adaptor oligo to anneal, forming the structure illustrated in FIG. 13.
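  • When designing regions with different Tm, a quick estimate of each region's melting temperature can help confirm that the two are separated by a useful margin. The following minimal Python sketch uses the Wallace rule (2 °C per A or T plus 4 °C per G or C), a rough approximation intended for short oligonucleotides, purely as a stand-in for whatever Tm model a designer prefers. The function names and the sequences are hypothetical.

```python
# Illustrative sketch only: the Wallace rule is a coarse estimate, not the disclosure's method.
def wallace_tm(seq: str) -> int:
    """Estimate Tm in degrees Celsius as 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    if at_count + gc_count != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at_count + 4 * gc_count


def tm_gap(region_a: str, region_b: str) -> int:
    """Difference in estimated Tm (region_a minus region_b), in degrees Celsius."""
    return wallace_tm(region_a) - wallace_tm(region_b)


# Made-up probe and universal regions of different length.
probe_region = "GATTACAGGCCTAGGCATCGATCCA"
universal_region = "AATGATACGGCG"

print(wallace_tm(probe_region), wallace_tm(universal_region),
      tm_gap(probe_region, universal_region))
```

  • A larger gap between the two estimates gives a wider temperature window in which only one of the two regions is expected to be hybridized.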
  • the disclosure provides methods that utilize a structure-specific endonuclease that has 5’ flap cleavage activity and a 5’ flap adaptor in order to append an adaptor to the 5’ end of a polynucleotide.
  • the 5’ flap adaptor is hybridized to a complementary sequence of a single stranded polynucleotide.
  • the 5’ flap adaptor comprises a first oligonucleotide that has a single stranded region that can hybridize to a target sequence of a polynucleotide.
  • the length of the single stranded region can vary but should be of sufficient length to bind with high fidelity to a targeted sequence.
  • the single stranded region of the 5’ flap adaptor that can hybridize to a targeted sequence can comprise 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, or a range of nucleotide lengths between or including any two of the foregoing nucleotide lengths.
  • the first oligonucleotide of the 5’ flap adaptor further comprises a single stranded region that codes for a universal sequence; the universal sequence is not complementary (i.e., cannot hybridize) to the sequence of the polynucleotide.
  • An example of a universal sequence includes, but is not limited to, a sequence commonly used for NGS applications, such as a P5 or P7 sequence.
  • the single stranded region that codes for a universal sequence can comprise 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, or a range of nucleotide lengths between or including any two of the foregoing nucleotide lengths.
  • the single stranded region that codes for a universal sequence can further comprise one or more barcode sequences, allowing for identification of the source of the polynucleotide if multiple sources of polynucleotides are being multiplexed in the same reaction.
  • the first oligonucleotide of the 5’ flap adaptor can be hybridized directly to the target sequence of the polynucleotide and then a second oligonucleotide of the 5’ flap adaptor can be hybridized to the universal sequence of the first oligonucleotide of the 5’ flap adaptor.
  • the second oligonucleotide of the 5’ flap adaptor comprises a sequence that is complementary to the universal sequence of the first oligonucleotide of the 5’ flap adaptor.
  • the second oligonucleotide of the 5’ flap adaptor may further comprise a base on the 3’ end that is complementary to a base of the single stranded region of the first oligonucleotide that hybridizes to the target sequence of the polynucleotide.
  • the second oligonucleotide of the 5’ flap adaptor may be hybridized to the first oligonucleotide of the 5’ flap adaptor so that there is a double stranded region comprising base pairs for the universal sequence, and a single stranded region from the first oligonucleotide that can hybridize with a target sequence from a polynucleotide.
  • When the 5’ flap adaptor is bound to the target sequence of the polynucleotide, a 5’ flap is generated in the polynucleotide.
  • a 1-nucleotide 3’ flap may also be generated if the second oligonucleotide comprises a base on the 3’ end that is complementary to a base of the single stranded region of the first oligonucleotide that hybridizes to a target sequence of the polynucleotide.
  • The 5’ flap, or the 5’ flap and 1-nucleotide 3’ flap, of the polynucleotide-adaptor hybridized construct is then recognized by a structure-specific endonuclease that has 5’ flap cleavage activity.
  • the structure-specific endonuclease binds the polynucleotide-adaptor construct and cleaves off the 5’ flap structure and forms a nick in the polynucleotide-adaptor hybridized construct. This nick may then be closed by use of a ligase.
  • the end result is the 5’ flap adaptor being appended to the polynucleotide, such that the polynucleotide now contains a sequence for the universal adaptor.
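  • The arrangement of the two oligonucleotides of a 5’ flap adaptor described above can be sketched in code. The following minimal Python example assumes, based on antiparallel base pairing, that the probe region sits at the 5’ end of the first oligonucleotide so that the 3’ end of the second oligonucleotide abuts the hybridized target. It also treats the length ranges given above (15 to 35 nt for the probe region, 6 to 25 nt for the universal region) as hard limits, which is a simplification. All names and sequences are hypothetical.

```python
# Illustrative sketch only: orientation and length limits are stated assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_5p_flap_adaptor(target_site: str, universal_region: str,
                           add_3p_flap_base: bool = True) -> tuple[str, str]:
    """Return (first_oligo, second_oligo), both written 5'->3'."""
    if not 15 <= len(target_site) <= 35:
        raise ValueError("probe region should be 15 to 35 nt")
    if not 6 <= len(universal_region) <= 25:
        raise ValueError("universal region should be 6 to 25 nt")
    # Probe region: single stranded, complementary to the target site.
    probe = reverse_complement(target_site)
    # First oligo: probe at the 5' end, universal region at the 3' end.
    first_oligo = probe + universal_region
    # Second oligo: pairs with the universal region of the first oligo.
    second_oligo = reverse_complement(universal_region)
    if add_3p_flap_base:
        # Optional extra 3' base matching the first base of the target site,
        # i.e. complementary to the adjacent base of the probe region.
        second_oligo += target_site[0]
    return first_oligo, second_oligo


# Made-up target site and universal region.
site = "GATTACAGATTACAGATTAC"
universal = "AATGATACGGCG"
first, second = design_5p_flap_adaptor(site, universal)
print(first)
print(second)
```

  • In this sketch the second oligonucleotide is the strand whose 3’ end would be joined to the cleaved 5’ end of the polynucleotide once the nick is closed by a ligase.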
  • the disclosure provides methods that utilize a structure-specific endonuclease that has 3’ flap cleavage activity and a 3’ flap adaptor in order to append an adaptor to the 3’ end of a polynucleotide.
  • the 3’ flap adaptor is hybridized to a complementary sequence of a single stranded polynucleotide.
  • the 3’ flap adaptor comprises an oligonucleotide that has a single stranded region that can hybridize to a target sequence of a polynucleotide. The length of the single stranded region can vary but should be of sufficient length to bind with high fidelity to a targeted sequence.
  • the single stranded region of the 3’ flap adaptor that can hybridize to a targeted sequence can comprise 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, or a range of nucleotide lengths between or including any two of the foregoing nucleotide lengths.
  • the oligonucleotide of the 3’ flap adaptor further comprises a single stranded region that codes for a universal sequence; the universal sequence is not complementary (i.e., cannot hybridize) to the sequence of the polynucleotide.
  • a universal sequence includes, but is not limited to, a sequence commonly used for NGS applications, such as a P5 or P7 sequence.
  • the universal sequence of the 3’ flap adaptor may be the same as the universal sequence of the 5’ flap adaptor, or alternatively be different.
  • the single stranded region that codes for a universal sequence can comprise 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, or a range of nucleotide lengths between or including any two of the foregoing nucleotide lengths.
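The length ranges recited for the two single stranded regions of the 3’ flap adaptor can be expressed as a simple design check. The sketch below is illustrative: the function name is hypothetical, and the test for non-complementarity is a crude exact-match proxy rather than a real hybridization prediction.

```python
from typing import List, Optional

HYBRIDIZING_NT = range(15, 36)  # 15-35 nt, as recited for the target-binding region
UNIVERSAL_NT = range(6, 26)     # 6-25 nt, as recited for the universal region

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def check_3prime_flap_adaptor(
    hybridizing: str, universal: str, polynucleotide: Optional[str] = None
) -> List[str]:
    """Return a list of design problems; an empty list means none were found."""
    problems = []
    if len(hybridizing) not in HYBRIDIZING_NT:
        problems.append("hybridizing region is outside 15-35 nt")
    if len(universal) not in UNIVERSAL_NT:
        problems.append("universal region is outside 6-25 nt")
    # Crude proxy (an assumption of this sketch): flag only a perfect,
    # full-length complement of the universal region in the polynucleotide.
    if polynucleotide is not None and revcomp(universal) in polynucleotide:
        problems.append("universal region is fully complementary to the polynucleotide")
    return problems
```

A real design tool would also score partial complementarity and secondary structure, which this check does not attempt.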
  • the single stranded region that codes for a universal sequence can further comprise one or more barcode sequences, allowing for identification of the source of the polynucleotide if multiple sources of polynucleotides are being multiplexed in the same reaction.
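As an illustration of how barcode sequences identify the source of multiplexed polynucleotides, the following minimal sketch groups reads by barcode. It assumes, purely for illustration, that the barcode sits at a fixed position in each read and is matched exactly; the barcodes and sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_source, barcode_start, barcode_len):
    """Group reads by the source assigned to their barcode."""
    by_source = defaultdict(list)
    for read in reads:
        barcode = read[barcode_start:barcode_start + barcode_len]
        source = barcode_to_source.get(barcode, "unassigned")
        by_source[source].append(read)
    return dict(by_source)


# Hypothetical barcodes and reads
barcodes = {"ACGT": "sample_1", "TTAG": "sample_2"}
reads = ["ACGTGGGCCC", "TTAGAAATTT", "CCCCGGGGAA"]
print(demultiplex(reads, barcodes, barcode_start=0, barcode_len=4))
```

Reads whose barcode is not in the lookup table are collected under "unassigned" rather than discarded, so they remain available for inspection.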
  • the oligonucleotide of the 3’ flap adaptor is hybridized directly to a complementary target sequence of the polynucleotide.
  • the 3’ flap adaptor and the 5’ flap adaptor hybridize to the same polynucleotide.
  • the 5’ flap adaptor and the 3’ flap adaptor can be hybridized to the polynucleotide in a concurrent or sequential manner.
  • a 3’ flap is generated in the polynucleotide.
  • the generation of 3’ flap of polynucleotide-adaptor hybridized construct is then recognized by a structure-specific endonuclease that has 3’ flap cleavage activity.
  • the structure-specific endonuclease binds the polynucleotide-adaptor hybridized construct and cleaves off the 3’ flap structure forming a 3’ overhang that comprises the universal sequence region of the 3’ adaptor.
  • the 3’ overhang may be filled with a complementary sequence to the universal sequence of the 3’ adaptor.
  • the disclosure provides methods that utilize tagmentation and template switch oligonucleotides to append adaptors to polynucleotides.
  • Tagmentation is an established workflow for making templates for polynucleotide applications. The process relies on the transposase enzyme Tn5 fragmenting and simultaneously appending adaptor sequences to the 5’ ends of polynucleotide fragments (see FIG. 25).
  • a second, Tn5 independent, step is used to further process the ‘adapted’ fragments to append a similar or different adaptor to the 3’ ends of the fragments thus completing the library template in a form ready for polynucleotide applications, like sequencing.
  • the 3’ end adaptors are added in one of two ways. One way for appending adaptors to 3' ends of Tn5 tagmented products is shown in FIG. 26.
  • the free 3’ end of the fragment can be extended in the presence of a polymerase (e.g., a strand displacing polymerase), dNTPs and heat (e.g., >68 °C) to remove the ‘non-transferred’ strand of the transposome ds adaptor.
  • the complement of the 5’ adaptor is copied, and finally a PCR reaction with two distinct primers (e.g., P5-i5-A14 and P7-i7-B15) can be used to enrich for those primary tagmentation molecules that have a P5 based adaptor on one end and a P7 based adaptor on the other end.
  • The other way for appending adaptors to 3' ends of Tn5 tagmented products is shown in FIG. 27.
  • a single double-stranded ‘forked’ adaptor is employed in the transposome (see FIG. 27).
  • a non-displacing polymerase is used at a temperature below the Tm of the non-transferred strand (e.g., <55 °C), to extend the free 3’ end of the fragment until it reaches the 5’ end of the ‘non-transferred’ adaptor strand and then a ligase covalently connects the non-transferred strand to the fragment.
  • the disclosure provides new and innovative methods for appending an adaptor oligo sequence to the 3’ end of a tagmented fragment that utilize partial extension from the free 3’ end of the tagmented fragment to generate a known sequence capable of hybridizing to, and extending off, a template switch oligo (see FIGs. 28-34).
  • the disclosure additionally provides methods for processing a fully extended 3’ sequence to alter its base composition.
  • An embodiment for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo is shown in FIG. 28.
  • DNA is tagmented with a transposome that may or may not have modifications to the transferred strand.
  • the tagmented library is treated to de-anneal and remove the non-transferred strand of the transposome. For example, if the transposome is attached to a solid surface (e.g., a bead), mild heat followed by a hot wash will remove the strand. Next a replacement oligo that has a higher Tm than the non-transferred strand is hybridized back to the appended adaptors.
  • the oligo can be longer than the non-transferred strand it replaces, or it may contain modifications such as ‘Locked Nucleic Acids’ (LNAs). The modifications may also or only be present in the transferred strand.
  • the replacement oligo does not hybridize in place of all the non-transferred strand, but instead does so partially. This results in a portion of the adaptor, 5’ of the replacement oligo, remaining single stranded.
  • a non-displacing polymerase is used to extend from the 3’ ends of the insert filling in the ends of the insert and extending over the adaptor sequence but stopping when it reaches the hybridized replacement oligo.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein). If the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • Additional embodiments for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo are shown in FIG. 28A-B. As shown in FIG.
  • a transposome is employed that already has the ‘replacement oligo’ annealed next to the non-transferred strand.
  • the non-transferred strand can be shorter than its standard 19 base Tn5 recognition sequence (for example 16 bases).
  • moderate heat can be used to denature the non-transferred strand leaving the ‘LNA containing replacement oligo’ still annealed.
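To see why a shortened non-transferred strand denatures under milder heat, a rough melting-temperature estimate can help. The sketch below uses the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), which is only a coarse approximation for short oligos and ignores salt, strand concentration, and LNA modifications. It assumes the non-transferred strand is the exact reverse complement of the 19-base Tn5 recognition sequence given in this disclosure (SEQ ID NO: 31); which three bases a 16-base version would lack is not stated, so the truncation shown is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(oligo: str) -> int:
    """Wallace rule estimate in degrees C: 2 per A/T plus 4 per G/C."""
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


TN5_RECOGNITION = "AGATGTGTATAAGAGACAG"        # SEQ ID NO: 31
non_transferred_19 = revcomp(TN5_RECOGNITION)  # assumed full-length strand
non_transferred_16 = non_transferred_19[:16]   # hypothetical 16-base truncation

print(non_transferred_19, wallace_tm(non_transferred_19))  # CTGTCTCTTATACACATCT 52
print(non_transferred_16, wallace_tm(non_transferred_16))  # CTGTCTCTTATACACA 44
```

Under these assumptions the shorter strand has the lower estimate, which is consistent with using moderate heat to release it while a longer or LNA-containing replacement oligo stays annealed.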
  • a non-displacing polymerase is used to extend from the 3’ ends of the insert filling in the ends of the insert and extending over the non-transferred strand but stopping when it reaches the hybridized replacement oligo.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein).
  • the strands can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • a transposome is employed that already has the ‘replacement oligo’ annealed next to the non-transferred strand.
  • the non-transferred strand can be shorter than its standard 19 base Tn5 recognition sequence (for example 16 bases).
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein).
  • if the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • Additional embodiments for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo are shown in FIG. 30.
  • a transposome is employed that contains one or more internal modifications in the non-transferred strand that prevents exonuclease digestion, e.g., a phosphorothioate linkage in the phosphodiester backbone of the oligo.
  • a polymerase with a 5’ to 3’ exonuclease activity is employed to extend from the free 3’ end of the insert.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein). If the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • Additional embodiments for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo are shown in FIGs. 31A-31B.
  • a non-displacing polymerase and all four dNTPs (dATP, dCTP, dGTP, dTTP) are used to extend from the free 3’ end of the fragment up to the 5’ end of the non-transferred strand.
  • the polymerase and dNTPs are then removed (e.g., purified on SPRI beads, or by magnetic bead-based washing if the adaptors are attached to a bead).
  • a fresh aliquot of a strand displacing polymerase is added and just three out of the four dNTPs (dCTP, dGTP, dTTP); dATP is absent.
  • This mix will continue to extend the 3’ end of the fragment across the Tn5 adaptor which for the first 19 bases is a known Tn5 recognition sequence (5’AGATGTGTATAAGAGACAG3’) (SEQ ID NO: 31) but will stop incorporating bases once it reaches a ‘T’ base in the template, due to the absence of dATP molecules.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein and comprising the sequence: 5’CTGTCTCTT3’).
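The stop position of the dATP-free extension can be checked with a short simulation. This is a minimal sketch that treats the polymerase as a perfect copier reading the template 3’ to 5’ and halting at the first base whose complementary dNTP is missing; the function name is introduced here for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend(template_5to3: str, available_bases: set) -> str:
    """Copy a template until a required nucleotide is unavailable.

    The template is written 5'->3' and read 3'->5'; the returned
    product is written 5'->3'.
    """
    product = []
    for template_base in reversed(template_5to3):
        incoming = COMPLEMENT[template_base]
        if incoming not in available_bases:
            break  # e.g. a 'T' in the template needs dATP
        product.append(incoming)
    return "".join(product)


TN5_RECOGNITION = "AGATGTGTATAAGAGACAG"  # SEQ ID NO: 31

print(extend(TN5_RECOGNITION, {"C", "G", "T"}))       # CTGTCTCTT
print(extend(TN5_RECOGNITION, {"A", "C", "G", "T"}))  # CTGTCTCTTATACACATCT
```

With dATP absent, the copy halts after nine bases, at the first ‘T’ met in the template, which reproduces the partial adaptor sequence 5’CTGTCTCTT3’ stated above.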
  • if the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment. As shown in FIG.
  • a non-displacing polymerase and all four dNTPs (dATP, dCTP, dGTP, dTTP) are used to extend from the free 3’ end of the fragment up to the 5’ end of the non-transferred strand.
  • the polymerase and dNTPs are then removed (e.g., purified on SPRI beads, or by magnetic bead-based washing if the adaptors are attached to a bead).
  • the non-transferred strand is then removed, e.g., by moderate heat to selectively denature the strand, or application of a lambda exonuclease that selectively digests oligos containing 5’ phosphorylated ends (as is the case with non-transferred strands), or other means known to those skilled in the art.
  • a fresh aliquot of a polymerase is added and just three out of the four dNTPs (dCTP, dGTP, dTTP); dATP is absent.
  • each strand of the insert now has an adaptor appended at its 5’ end (through tagmentation) and a partial adaptor at its 3’ end (through the extension step described herein and comprising the sequence: 5’CTGTCTCTT3’).
  • if the strands are denatured, they can then participate in a template switch reaction where a 3’ blocked template switch oligo is annealed and serves as a template to further extend the 3’ end of the insert to add new sequences to the 3’ end of the fragment.
  • the workflow described in the embodiments above may take place where the transposomes are attached to a surface such as a bead, or the transposomes may be free in solution.
  • the 3’ end can also be completed following extension by hybridizing an oligo that is partially complementary to the 5’ adaptor but contains further sequences that are unique and not present in the 5’ adaptor.
  • a ligation reaction covalently joins this oligo to the 3’ end of the insert and forms a ‘Y’ shaped adaptor construct (see FIG. 32).
  • Additional embodiments for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo are shown in FIG. 33A.
  • a single transposome type comprising a P5 transferred-strand and a short non-transferred-strand is used to tagment DNA.
  • a strand displacing polymerase is added to extend from the free 3’ end, create the complement of the P5 transferred-strand (i.e., creates P5’), and displace the short non-transferred strand. The temperature is then elevated to make the fragments single stranded.
  • a P7 template switch oligo hybridizes forming a forked structure that is partially double-stranded. Then in the presence of a polymerase with 3’ exo activity, the single stranded 3’ end is degraded by this activity until there is no longer any single stranded 3’ end. The remaining 3’ end of the fragment then forms a primer template that extends and creates the complement of the P7 template switch oligo.
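The trim-then-extend behaviour described for the forked structure can be sketched with strings. The model below rests on assumptions not stated in the text: the 3’ end of the template switch oligo is taken to pair with the fragment through a shared anchor region, pairing is an exact match, and all sequences other than SEQ ID NO: 31 are placeholders.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def template_switch(fragment: str, tso: str, anchor_len: int) -> str:
    """Trim the unpaired 3' tail of the fragment, then copy the oligo.

    Both inputs are written 5'->3'. The last anchor_len bases of the
    template switch oligo (tso) are assumed to pair with the fragment.
    """
    pairing_site = revcomp(tso[-anchor_len:])
    start = fragment.rfind(pairing_site)
    if start < 0:
        raise ValueError("template switch oligo cannot anneal to fragment")
    # 3' exonuclease activity removes the single stranded 3' tail.
    trimmed = fragment[:start + anchor_len]
    # The recessed 3' end is then extended, copying the rest of the oligo.
    return trimmed + revcomp(tso[:-anchor_len])


ANCHOR = "AGATGTGTATAAGAGACAG"  # SEQ ID NO: 31, used here as the assumed anchor
insert = "ACGTTGCAAC"           # placeholder insert
tail = "GGGGCCCC"               # placeholder for the P5' portion that is digested
tso = "TTTTAAAACC" + ANCHOR     # placeholder P7 region followed by the anchor

product = template_switch(insert + revcomp(ANCHOR) + tail, tso, len(ANCHOR))
print(product)  # ACGTTGCAACCTGTCTCTTATACACATCTGGTTTTAAAA
```

In this toy case the product is the insert followed by the full reverse complement of the template switch oligo, and the placeholder tail no longer appears.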
  • Additional embodiments for appending adaptors to the ends of tagmented polynucleotides capable of hybridizing to, and extending off, a template switch oligo are shown in FIG. 34.
  • a structure specific endonuclease e.g., XPF/MUS81
  • the endonuclease nicks the double stranded region of the duplex creating a free 3’ end that a polymerase extends and creates the complement of the P7 template switch oligo.
  • the adaptor-polynucleotide constructs prepared according to the methods disclosed herein can be used in any method of nucleic acid analysis, e.g., sequencing of the templates or amplification products thereof.
  • Exemplary uses of the template libraries include, but are not limited to, providing templates for whole genome amplification, sequencing, subcloning, and PCR amplification (of either monotemplate or complex template libraries).
  • Template libraries prepared according to a method of the disclosure from a complex mixture of genomic DNA fragments representing a whole or substantially whole genome provide suitable templates for so-called “whole-genome” amplification.
  • the term “whole-genome amplification” refers to a nucleic acid amplification reaction (e.g., PCR) in which the template to be amplified comprises a complex mixture of nucleic acid fragments representative of a whole (or substantially whole) genome.
  • the term “solid-phase amplification” refers to any nucleic acid amplification reaction carried out on or in association with a solid support such that all or a portion of the amplified products are immobilized on the solid support as they are formed.
  • the term “solid-phase polymerase chain reaction” (solid-phase PCR) refers to a reaction analogous to standard solution-phase PCR, except that one or both of the forward and reverse amplification primers is/are immobilized on the solid support.
  • one amplification primer may be immobilized (the other primer usually being present in free solution).
  • both the forward and the reverse primers may be immobilized.
  • References herein to forward and reverse primers are to be interpreted accordingly as encompassing a “plurality” of such primers unless the context indicates otherwise.
  • Amplification primers for solid-phase PCR are preferably immobilized by covalent attachment to the solid support at or near the 5' end of the primer, leaving the template-specific portion of the primer free for annealing to its cognate template and the 3' hydroxyl group free for primer extension.
  • attachment chemistry will depend on the nature of the solid support, and any derivatization or functionalization applied to it.
  • the primer itself may include a moiety, which may be a non-nucleotide chemical modification, to facilitate attachment.
  • the terms “cluster” and “colony” are used interchangeably herein to refer to a discrete site on a solid support comprised of a plurality of identical immobilized nucleic acid strands and a plurality of identical immobilized complementary nucleic acid strands.
  • the term “clustered array” refers to an array formed from such clusters or colonies. In this context the term “array” is not to be understood as requiring an ordered arrangement of clusters.
  • the disclosure further provides methods of sequencing amplified nucleic acids generated by whole genome or solid-phase amplification.
  • the disclosure provides a method of nucleic acid sequencing comprising amplifying a library of nucleic acid templates using whole genome or solid-phase amplification as described above and carrying out a nucleic acid sequencing reaction to determine the sequence of the whole or a part of at least one amplified nucleic acid strand produced in the whole genome or solid-phase amplification reaction.
  • Sequencing can be carried out using any suitable “sequencing-by-synthesis” technique, wherein nucleotides are added successively to a free 3' hydroxyl group, resulting in synthesis of a polynucleotide chain in the 5' to 3' direction.
  • the nature of the nucleotide added is preferably determined after each nucleotide addition.
  • the initiation point for the sequencing reaction may be provided by annealing of a sequencing primer to a product of the whole genome or solid-phase amplification reaction.
  • one or both of the adaptors added during formation of the template library may include a nucleotide sequence which permits annealing of a sequencing primer to amplified products derived by whole genome or solid-phase amplification of the template library.
  • bridged structures formed by annealing of pairs of immobilized polynucleotide strands and immobilized complementary strands, both strands being attached to the solid support (e.g., a flowcell) at the 5' end.
  • Arrays comprised of such bridged structures provide inefficient templates for nucleic acid sequencing, since hybridization of a conventional sequencing primer to one of the immobilized strands is not favored compared to annealing of this strand to its immobilized complementary strand under standard conditions for hybridization.
  • Bridged template structures may be linearized by cleavage of one or both strands with a restriction endonuclease or by cleavage of one strand with a nicking endonuclease.
  • Other methods of cleavage can be used as an alternative to restriction enzymes or nicking enzymes, including inter alia chemical cleavage (e.g., cleavage of a diol linkage with periodate), cleavage of abasic sites by cleavage with endonuclease or by exposure to heat or alkali, cleavage of ribonucleotides incorporated into amplification products otherwise comprised of deoxyribonucleotides, photochemical cleavage, or cleavage of a peptide linker.
  • a linearization step may not be essential if the solid-phase amplification reaction is performed with only one primer covalently immobilized and the other in free solution.
  • the product of the cleavage reaction may be subjected to denaturing conditions in order to remove the portion(s) of the cleaved strand(s) that are not attached to the solid support.
  • denaturing conditions will be apparent to the skilled reader with reference to standard molecular biology protocols (Sambrook et al., 2001, Molecular Cloning, A Laboratory Manual, 3rd Ed, Cold Spring Harbor Laboratory Press, NY; Current Protocols, eds. Ausubel et al.).
  • Denaturation results in the production of a sequencing template which is partially or substantially single-stranded.
  • a sequencing reaction may then be initiated by hybridization of a sequencing primer to the single-stranded portion of the template.
  • the nucleic acid sequencing reaction may comprise hybridizing a sequencing primer to a single-stranded region of a linearized amplification product, sequentially incorporating one or more nucleotides into a polynucleotide strand complementary to the region of amplified template strand to be sequenced, identifying the base present in one or more of the incorporated nucleotide(s) and thereby determining the sequence of a region of the template strand.
  • One sequencing method which can be used in accordance with the disclosure relies on the use of modified nucleotides that can act as chain terminators. Once the modified nucleotide has been incorporated into the growing polynucleotide chain complementary to the region of the template being sequenced there is no free 3'-OH group available to direct further sequence extension and therefore the polymerase cannot add further nucleotides. Once the nature of the base incorporated into the growing chain has been determined, the 3' block may be removed to allow addition of the next successive nucleotide. By ordering the products derived using these modified nucleotides it is possible to deduce the DNA sequence of the DNA template.
  • Such reactions can be done in a single experiment if each of the modified nucleotides has attached a different label, known to correspond to the particular base, to facilitate discrimination between the bases added at each incorporation step.
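The cycle of incorporation, detection, and unblocking described above can be illustrated with a small simulation. This is a sketch under simplifying assumptions: incorporation is error-free, exactly one nucleotide is added per cycle, and the dye names are hypothetical stand-ins for the different labels attached to each modified nucleotide.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
LABELS = {"A": "dye_1", "C": "dye_2", "G": "dye_3", "T": "dye_4"}  # hypothetical


def sequence_by_synthesis(template_5to3: str, primer_len: int, cycles: int):
    """Return one (base, label) call per cycle.

    The sequencing primer is assumed to anneal to the last primer_len
    bases at the 3' end of the template, so synthesis proceeds toward
    the template's 5' end.
    """
    unread = template_5to3[:len(template_5to3) - primer_len][::-1]
    calls = []
    for template_base in unread[:cycles]:
        base = COMPLEMENT[template_base]    # one 3'-blocked nucleotide is incorporated
        calls.append((base, LABELS[base]))  # its label identifies the base
        # the 3' block would be removed here before the next cycle
    return calls


calls = sequence_by_synthesis("AAACCCGGGT", primer_len=4, cycles=4)
print("".join(base for base, _ in calls))  # GGGT
```

Because each nucleotide type carries its own label, all four can be supplied in a single reaction and the base is read from the label detected in each cycle.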
  • a separate reaction may be carried out containing each of the modified nucleotides separately.
  • the modified nucleotides may carry a label to facilitate their detection.
  • this is a fluorescent label.
  • Each nucleotide type may carry a different fluorescent label.
  • the detectable label need not be a fluorescent label. Any label can be used which allows the detection of an incorporated nucleotide.
  • One method for detecting fluorescently labelled nucleotides comprises using laser light of a wavelength specific for the labelled nucleotides, or the use of other suitable sources of illumination.
  • the fluorescence from the label on the nucleotide may be detected by a CCD camera or other suitable detection means.
  • the disclosure is not intended to be limited to use of the sequencing method outlined above, as essentially any sequencing methodology which relies on successive incorporation of nucleotides into a polynucleotide chain can be used. Suitable alternative techniques include, for example, Pyrosequencing™, FISSEQ (fluorescent in situ sequencing), MPSS (massively parallel signature sequencing) and sequencing by ligation-based methods.
  • the target polynucleotide to be sequenced using the method of the disclosure may be any polynucleotide that it is desired to sequence.
  • Using the template library preparation method described in detail herein it is possible to prepare template libraries starting from essentially any double or single-stranded target polynucleotide of known, unknown or partially known sequence. With the use of clustered arrays prepared by solid-phase amplification it is possible to sequence multiple targets of the same or different sequence in parallel.
  • kits comprising the non-homologous end joining factors disclosed herein.
  • the kits can be tailored for use in particular applications.
  • the kits can be directed to the use of non-homologous end joining factors, or use of structure specific endonucleases in preparing libraries of adaptor-polynucleotide constructs using the methods of the disclosure.
  • Such kits can comprise at least a supply of adaptors as defined.
  • the kits can further comprise enzymes (e.g., structure specific endonucleases or non-homologous end joining factors), and/or amplification primers.
  • the structure and properties of amplification primers will be well known to those skilled in the art.
  • Adaptors included in the kit can be readily prepared using standard automated nucleic acid synthesis equipment and reagents in routine use in the art.
  • the adaptors may be supplied in the kits ready for use, or more preferably as concentrates requiring dilution before use, or even in a lyophilized or dried form requiring reconstitution prior to use.
  • the kits may further include a supply of a suitable diluent for dilution or reconstitution of the adaptors.
  • the kits may further comprise supplies of reagents, buffers, and enzymes for use in carrying out the methods disclosed herein.
  • Further components which may optionally be supplied in the kit include “universal” sequencing primers suitable for sequencing templates prepared using the adaptors and primers.
  • a transposome was constructed that contained LNA modifications in its transferred strand and was used to tagment genomic DNA.
  • the transposome was immobilized on a streptavidin paramagnetic bead via a 3’ biotin group on an ‘anchor’ oligo hybridized to the transferred strand.
  • the tagmentation was conducted in the presence of a ligase enzyme and an Illumina™ Indexing primer P5-i5-A14. This primer hybridized 5’ of the transferred strand and was ligated to it by virtue of the 5’ end of the transferred strand bearing a phosphate moiety by design.
  • the Tn5 protein was then removed by denaturing it with a solution of SDS. Different SDS concentrations (%) were tested to effect complete removal of the Tn5 protein.
  • a mixture of a non-displacing polymerase (tTaq608), dNTPs, Q5 polymerase and a template switching oligo was added and incubated at 47 °C to deanneal the non-transferred strand and extend with tTaq608 pol as far as the anchor oligo. The temperature was then raised further (60-70°C) to the point where the templates were rendered single stranded and no longer attached to the beads.
  • the temperature was then lowered to 42 °C for 1 min to allow the template switch oligo containing the P7-i7-B15 sequences to hybridize.
  • Q5 polymerase then extended from the free 3’ end of the fragment to copy the P7-i7-B15, thus completing the template construct.
  • a qPCR reaction was performed to quantify how much completed template was present. The graph indicates that up to 2000 pM of correct product was formed under SDS concentrations that removed all of the Tn5 from the tagmented product complex (see FIG. 29).
  • the template switch oligo comprised either a free extendable -OH group at its 3’ end or a non-extendable blocked dideoxyC group at its 3’ end, or a non-extendable ‘inverted T’ blocking group at its 3’ end.
  • the P5’ end of the template is digested and replaced with the P7’.
  • the products of the reaction were subjected to analysis by ‘gel size exclusion’ electrophoresis and by a qPCR reaction which only amplifies and quantifies templates that have a P5 adaptor at their 5’ end and a P7’ adaptor at their 3’ end.
  • FIG. 33C presents the image of the gel electrophoresis and indicates that neither of the two 3’ blocked template switch oligos was consumed, indicating that the block is effective in preventing extension from the 3’ end of the template switch oligo, which would result in creating a copy of the entire template.
  • extension and copying of the entire template occurred as evident from the reduction in fluorescence intensity of the template switch oligo band and the appearance of higher molecular weight product labelled with FAM.
  • FIG. 33D presents the results of the qPCR analysis that only detects product that is correctly appended with P5 on the 5’ end and P7’ on the 3’ end, but not product appended with P5 on both ends.
  • Both blocked template switch oligos yielded the correct product at approximately 1,700 pM.
  • the unblocked template switch oligo produced 1.5x as much product as a result of two mechanisms: (i) ‘fork’ -modulated switch in activity from exonuclease to extension activity to append the P7’ adaptor to the 3’ end of the template, and (ii) extension from the 3’ end of the template switch oligo to append a copy of the template to the P7 template switch oligo.
  • the P5’ end of the template is digested and replaced with the P7’ copied from the template switch oligo (see FIG. 33E).
  • a control experiment was also performed using a transposome comprising a P5 transferred-strand hybridized to a bead via an anchor oligo and a non-transferred strand comprising a single stranded 3’ end that is complementary to a P7 oligo (see FIG. 33F).
  • FIG. 33G indicates that the template-switch workflow of FIG. 33E produced a greater yield of library than the control workflow of FIG. 33F.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The disclosure relates to methods for appending adaptors to the 5' and/or 3' ends of polynucleotides.
PCT/US2024/051747 2023-10-20 2024-10-17 Methods for appending adaptors to polynucleotides Pending WO2025085618A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363592016P 2023-10-20 2023-10-20
US63/592,016 2023-10-20

Publications (1)

Publication Number Publication Date
WO2025085618A1 true WO2025085618A1 (fr) 2025-04-24

Family

ID=95449084

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/051747 Pending WO2025085618A1 (fr) 2023-10-20 2024-10-17 Methods for appending adaptors to polynucleotides

Country Status (1)

Country Link
WO (1) WO2025085618A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5888737A (en) * 1997-04-15 1999-03-30 Lynx Therapeutics, Inc. Adaptor-based sequence analysis
US20100167954A1 (en) * 2006-07-31 2010-07-01 Solexa Limited Method of library preparation avoiding the formation of adaptor dimers
US20210363596A1 (en) * 2015-06-09 2021-11-25 Life Technologies Corporation Methods, Systems, Compositions, Kits, Apparatus and Computer-Readable Media for Molecular Tagging

Similar Documents

Publication Publication Date Title
US12071711B2 (en) Method of preparing libraries of template polynucleotides
US12385085B2 (en) Preparation of templates for methylation analysis
US9328378B2 (en) Method of library preparation avoiding the formation of adaptor dimers
US9012184B2 (en) End modification to prevent over-representation of fragments
AU2003223730B2 (en) Amplification of DNA to produce single-stranded product of defined sequence and length
EP1546355A2 (fr) Procedes d'utilisation de ligases d'arn thermostables
US20240191288A1 (en) Blocking oligonucleotides for the selective depletion of non-desirable fragments from amplified libraries
WO2025085618A1 (fr) Methods for appending adaptors to polynucleotides
HK40014831B (en) Method of preparing libraries of template polynucleotides
HK40014831A (en) Method of preparing libraries of template polynucleotides

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24880569

Country of ref document: EP

Kind code of ref document: A1