WO2025224232A1 - Pcr leaping and applications thereof - Google Patents
- Publication number
- WO2025224232A1 (PCT/EP2025/061194)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pcr
- primers
- leaping
- sequence
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/686—Polymerase chain reaction [PCR]
Definitions
- the present invention relates to the field of PCR amplification and uses thereof, and specifically relates to leaping PCR reactions, where only a fraction of a target nucleic acid is amplified in the PCR reactions, wherein the PCR reaction produces an amplification product which comprises the terminal parts of a target nucleic acid but is devoid of a central part of the target nucleic acid.
- PCR: polymerase chain reaction
- a fundamental requirement for a PCR reaction is that the product faithfully represents the original DNA sequence. Nonspecific amplification can cause false judgement in PCR-based diagnosis and artifacts in PCR-based sequencing, and the spurious products are unusable for cloning or genetic engineering. Traditionally, nonspecific amplification has mainly been attributed to two scenarios. Mis-binding of one or both primers to unintended template sites results in mis-priming products. Mis-binding of two primers to each other results in primer dimers. Because primer annealing and DNA polymerases can be promiscuous, mis-binding structures can form and be extended, resulting in unwanted amplification products. Shorter error products are amplified faster than the desired product, so once formed they can easily dominate the exponential PCR reaction.
- the present disclosure describes a new type of PCR amplification, which reduces the number of reagents used and the time spent on each individual sample, thereby providing an improved PCR reaction exceptionally suitable for large-scale DNA library screening. Furthermore, the improved PCR reaction of the present disclosure provides a more robust identification of target sequences in genomic libraries.
- the present disclosure relates to methods using leaping PCR, wherein the amplification products produced comprise the distal portions of the target nucleic acid sequences but lack a central part of the target sequence.
- the disclosure accordingly provides PCR methods which are optimized for leaping PCR. Such methods are suitable for high-throughput identification of target sequences in genomic libraries.
- a first aspect of the disclosure relates to a method for performing a leaping PCR amplification on a target nucleotide sequence, said method comprising a) providing at least one pair of PCR primers, said primers each comprising i. a tag nucleotide portion (i), such as a barcode sequence, which does not anneal to the target nucleotide, and
- ii. a nucleotide portion (ii) which specifically targets the 5’ end or the 3’ end regions of the target nucleotide sequence, and b) performing a PCR amplification using a polymerase on the target nucleotide sequence using said PCR primers, wherein the extension step of said PCR amplification is performed for less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence, wherein the amplification product produced by said PCR amplification comprises a 5’ end portion and a 3’ end portion of said target nucleotide sequence, characterized in that a central part of more than 10 kb, such as more than 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb or such as more than 150 kb, of the target nucleotide sequence is absent from the amplification product.
- the polymerase has a replication efficiency of at least 500b/min (bases/minute) during the extension phase of the PCR amplification, such as at least 1 kb/min or such as between 500b-3 kb/min.
- the amplification product is 100 b to 5000 b long, preferably, 150-2000 b long.
- the leaping PCR amplification comprises a) denaturation of a target nucleotide sequence; b) annealing of primers to the target nucleotide sequence, and c) extension of the primers to produce an amplification product using a polymerase, wherein c) is conducted for 5 minutes (min) or less, such as less than 4 min, 2 min, 1 min, 45 s, 30 s, 15 s, 10 s, 8 s, 7 s, 6 s, or such as 5 s or less.
- the leaping PCR amplification may be repeated for 10-100 cycles.
- the primers are specific to regions flanking the target nucleotide sequence.
- the annealing temperature may be greater than, lower than, or equal to the T m of the primers.
- the amplification is an intramolecular amplification reaction.
- Example 4 and Figures 9 and 10 clearly show that the leaping PCR reactions have a different mechanism (Figure 10c) than the previously reported “polymerase slippage”, “polymerase jumping”, “jumping PCR” or “bridging PCR”.
- the two nucleotide strands of the amplification product anneal to each other.
- the target sequence does not comprise a hairpin structure within 10-500 b from the primer target region.
- the primers are 20-50 b long.
- the tag portion of the primer is 2-10 b long.
- the nucleotide portion (ii) which specifically targets the 5’ end or the 3’end regions of the target nucleotide sequence are about 10-50 b long, such as 15-30 b long, preferably 18-28 b long.
- a second aspect of the invention relates to a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprised in a genomic library, such as in a vector e.g., a plasmid, said method comprises a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) targeting the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments, c) performing a leaping PCR method as defined herein, on the vectorized DNA fragments using said primers, d) sequencing said amplification product, e) identifying the presence of the one or more target sequences based on the sequence of the amplification product.
- the primers encompass the sequence of the genomic library cloning site. In additional embodiments, the primers are placed within about 1000 base pairs (b), such as between 0-1000 b from the cloning site of the genomic library cloning site. In additional embodiments, the primer(s) comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-167, or a nucleotide sequence which is at least 80% identical thereto. The methods of the present disclosure are e.g., suited for identification of target sequences in large genomic libraries, which are faster and more reliable than previously known methods.
- FIGURE 1 Schematic of leaping PCR. a, During the nonspecific amplification, two fragments from the termini of the PCR target were joined together. The middle part of the PCR target was leaped over. b, The sequence of the joining area.
- FIGURE 2 Product gel photo for PCR reactions from plasmid p4-1 E. Details of each reaction are described in example 1. Unspecific products below 1.5 kb were generated. The leaping PCR products identified from the Sanger sequencing results are marked by arrows.
- FIGURE 3 Product gel photo of leaping PCR reactions and one-primer PCR reactions with two different annealing temperatures. Details of each reaction are described in example 1. One-primer PCR did not generate a band at the higher annealing temperature.
- FIGURE 4 Product gel photo of leaping PCR reactions using DreamTaq polymerase and Q5 polymerase. Details of each reaction are described in example 2. Both polymerases generated leaping PCR products below 1.5 kb.
- FIGURE 5 Product gel photo of leaping PCR reactions using primers with and without hairpin structures. Details of each reaction are described in example 2. Both types of primers generated leaping PCR products below 1.5 kb.
- FIGURE 6 Product gel photo of leaping PCR reactions using annealing temperature of 66°C. Details of each reaction are described in example 2. Leaping PCR products below 3 kb were generated.
- FIGURE 7 Product gel photo of leaping PCR reactions from template with GC content of 45%. Details of each reaction are described in example 2. Leaping PCR products below 1.5 kb were generated.
- FIGURE 9 Product gel photo of PCR reactions from fragment CP or a mixture of C and P as templates. Different template concentrations were used. Details of each reaction are described in example 4.
- FIGURE 10 a, Comparison of single-piece template and double-piece template; fragment P-G or a mixture of fragments P and G was used as PCR template. All ten sense primers target sequence P; all antisense primers target sequence G. b, Summary of the PCR results. Error bars display the ± SD of three replicates. c, Mechanisms of PCR leaping and other error products.
- In jumping PCR, the premature extension product anneals to a wrong place. It relies on long mis-binding of the 3’ end of the premature extension product. In mispriming, the primer anneals to a wrong place.
- Because the primer concentration is much higher than that of the premature extension product, mispriming requires only relatively short homology of the 3’ end of the primer with the mis-priming site on the template.
- In leaping PCR, the 3’ end of the premature extension product has dissociated from the template through DNA breathing or as a result of incomplete renaturation, or the whole premature extension product has denatured from the template but has not yet drifted away or is still intertwined with the template strand.
- the 3’ end of the premature extension product then gets mis-extended at a wrong place on the same template. This relies on temporary or partial binding of the premature extension product to the same template. It requires short or no homology between the 3’ end of the premature extension product and the mis-extension site on the template.
- FIGURE 11 Product gel photo of plasmids isolated from randomly picked colonies from a BAC library of Kutzneria sp. CA-103260. The insertion sizes are between 150 kb and 250 kb.
- FIGURE 12 a, Leaping PCR screening on a BAC library. Library colonies were transferred from LB agar to a multi-well PCR plate. All the PCR products were pooled together and submitted to NGS sequencing. b, BAC plasmid, primer sites and leaping PCR product structure. c, Mapping of the leaping PCR product on the genome.
- FIGURE 13 Product gel photo of plasmids isolated from randomly picked colonies from DNA library of Vibrio natriegens. The insertion sizes are above 20 kb.
- FIGURE 14 Product gel photo for Leaping PCR reactions from fast growing colonies from DNA library of Vibrio natriegens.
- the present disclosure relates to a method for performing a leaping PCR amplification on a target nucleotide sequence.
- Leaping PCR is a method for producing amplification products useful for sequencing and identification, wherein the amplification products only contain a portion of the terminal sequences of the target nucleic acids.
- In regular PCR amplification, a single product is desired.
- the PCR amplification product stretches from the 5’end primer to the 3’end primer, and consists of two antiparallel nucleotide strands, generally known as the amplicon.
- the entire region between the 5’end and 3’end primers is extended and amplified.
- extension products containing the terminal parts of the target nucleotide sequence, but not the central part of the target nucleic acid sequence could be obtained, thus eliminating the need for extension of the entire target region.
- Leaping PCR is thus exceptionally suitable for application where the knowledge of the identity of the entire sequence of the target sequence is not required from the sequencing, but where only part of the sequence is needed to identify regions of interest from a library. Such applications are e.g., identification of particular DNA inserts from a library, where only part of the sequence is required to identify the presence of the target nucleic acid sequence.
- such libraries include genomic libraries, in which entire genomes or parts of genomes are digested and inserted into cloning vectors, which are subsequently introduced into a host cell, from which the cloning vector may subsequently be extracted and amplified using PCR, whereafter the PCR product/amplicon may be sequenced and the region of interest may be identified in the particular host cell.
- the host cell is a bacterial cell.
- the host cell is a eukaryotic cell. Simultaneously sampling both termini of a DNA fragment has many applications and is traditionally done by a complex process involving circularizing the DNA with a vector or adaptor, random shearing, and recircularization before sequencing.
- the methods presented herein provide an alternative method for identifying the presence of particular regions of interest in a library, using the sequence of the terminal ends of the target region, which can be retrieved by analysis of the produced amplicons.
- the term “host cell” refers to a cell suitable for being genetically modified to replicate a heterologous genetic element, such as a plasmid carrying one or more heterologous genes or expression elements.
- Leaping PCR is not restricted to producing a single amplicon from an amplification reaction, since a target sequence may result in several leaps being performed by the polymerase. Because the target region is often larger than what can practically be amplified in a single amplification reaction, it is beneficial to obtain incomplete extension fragments. This is, however, only possible where both a forward and a reverse strand capable of annealing are produced. Accordingly, in embodiments, at least one set of the nucleotide strands of the amplification product can anneal. In embodiments, the two nucleotide strands of the amplification product can anneal.
- Incomplete extension fragments may in leaping PCR be produced by the extension leaping from one site in the target nucleic acid sequence onto a new site further up/down the target nucleic acid sequence, thus completely omitting a central part of the target nucleic acid in the amplification.
- An illustration of the principle of leaping PCR can be seen in Figure 1.
- the omitted central part of the strand is preferably more than 5 kb (kilobases), such as more than 10 kb, 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb, 150 kb, 200 kb, 300 kb or such as more than 500 kb.
- the target nucleic acid contains one leaping point, thus giving rise to a single amplification product. In other embodiments, the target nucleic acid contains more than one leaping point, thus giving rise to more than one amplification product. In other embodiments, the target nucleic acid contains several leaping points, thus giving rise to several amplification products. Where one or more leaping points are present (a single leap is sufficient), deep sequencing may be used to identify the terminal regions of the target nucleic acids, such as e.g., about the first 100 and the last 100 nucleotides in the target nucleic acid sequence.
- the primers used for the leaping PCR amplification are specific to regions flanking the target nucleotide sequence, such as regions within about 1000 b, such as between 0-1000 b, of the cloning site.
- the amplification product is preferably an intramolecular PCR product, i.e., a product produced from the same target nucleic acid sequence.
- the amplification product is preferably not an intermolecular PCR product, i.e., a product produced from different target nucleic acid sequences.
- “Polymerase Jumping” is a phenomenon occurring from stem-loop structures in the target nucleic acid sequence, where the polymerase “jumps” over the stem-loop/transposon, and a gap in the sequence is obtained (Viswanathan et al., “Template Secondary Structure Promotes Polymerase Jumping During PCR Amplification”, (1999) BioTechniques).
- “Polymerase Jumping” is not “leaping PCR”.
- the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s).
- the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s) between the primer binding region(s). In embodiments, the target nucleic acid/template sequence does not comprise inverted repeats between the primer binding region(s).
- target nucleic acid and template sequence are used interchangeably.
- the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s) within 10-500 b, such as about 10b, 25b, 50b, 75b, 100b, 125b, 150b, 175b, 200b, 250b, 300b, 350b, 400b, 450b or about 500b, from the primer target region(s).
- the target nucleic acid/template sequence does not comprise inverted repeats within 10-500 b, such as about 10b, 25b, 50b, 75b, 100b, 125b, 150b, 175b, 200b, 250b, 300b, 350b, 400b, 450b or about 500b, from the primer target region(s).
- the PCR amplification is a technique for making many copies of a specific template DNA sequence.
- the basic principle of PCR amplification is well known to the skilled person.
- One set of primers complementary to a template/target DNA are designed, and a region flanked by the primers is amplified by polymerase (e.g., DNA polymerase) in a reaction including multiple amplification cycles.
- Each amplification reaction includes an initial denaturation, followed by up to 100 cycles of annealing, strand elongation/extension and strand separation (denaturation).
- the DNA sequence between the primers is copied.
- Primers can bind to the copied DNA as well as the original template sequence, so the total number of copies increases exponentially with time.
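The exponential growth described above can be illustrated with a short Python sketch. It uses the idealized textbook model for conventional PCR, in which each cycle multiplies the copy number by (1 + efficiency); the function name, the `efficiency` parameter and the example values are assumptions for illustration and are not taken from the disclosure.

```python
def ideal_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumes every cycle multiplies the copy number by (1 + efficiency),
    where efficiency = 1.0 means perfect doubling. Real reactions plateau
    as reagents are depleted, which this sketch ignores.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Hypothetical example: one template molecule, 10 cycles, perfect doubling.
    print(ideal_copies(1, 10))
```

With perfect doubling, ten cycles starting from a single template give 2^10 = 1024 copies under this model.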
- Various modified PCR methods are available and well known in the art.
- Such modifications include the “RT-PCR” method, in which DNA is synthesized from RNA using a reverse transcriptase before PCR is performed.
- hot start PCR conditions may be used to reduce mis-priming and primer-dimer formation, improve yield, and/or ensure high PCR specificity and sensitivity.
- hot start conditions may be achieved using hot start DNA polymerases, e.g., hot start DNA polymerases with aptamer-based inhibitors or with mutations that limit activity at lower temperatures,
- or using hot start dNTPs, e.g., CLEANAMP™ dNTPs, TriLink Biotechnologies.
- a PCR amplification may include from about 20 cycles to about 100 cycles or more (e.g., about 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 cycles).
- the sequences amplified in this manner form an amplified target nucleic acid (also referred to herein as an amplicon).
- the amplicon of the present disclosure generally consists of only the terminal parts of the target nucleic acid and lacks the central part of the target nucleic acid/template. Primers and probes can be readily designed by those skilled in the art to target a specific template nucleic acid sequence. In certain preferred embodiments, resulting amplicons are short to allow for rapid cycling and generation of copies.
- the size of the amplicon can vary as needed, for example, to provide the ability to discriminate target nucleic acids from non-target nucleic acids. For example, amplicons can be less than about 1,000 nucleotides in length.
- the amplicons are from 100 to 500 nucleotides in length (e.g., 100 to 200, 150 to 250, 300 to 400, 350 to 450, or 400 to 500 nucleotides in length). In other embodiments, the amplicons are greater than about 1,000 nucleotides in length, e.g., about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, or more nucleotides in length.
- the amplicon lacks a central part of more than 10kb, such as more than 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb or such as more than 150 kb, of the target nucleotide sequence.
- the amplicon contains from 100 to 500 nucleotides of the target sequence (e.g., 100 to 200, 150 to 250, 300 to 400, 350 to 450, or 400 to 500 nucleotides in length), wherein the target sequence has a length of more than 10 kb.
- the amplicons contain more than about 1,000 nucleotides of the target sequence, e.g., about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, or more nucleotides of the target nucleic acid, wherein the target sequence has a length of more than 10 kb.
- more than one, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10, target nucleic acids may be amplified in one reaction.
- Amplification primers can be readily designed to target a specific template nucleic acid sequence.
- the primers may be designed such that they comprise a target portion which targets a specific target/template nucleic acid sequence. Additionally, the primers may also comprise a portion which does not target the target/template nucleic acid sequence, but which serves as a barcode sequence and/or sequencing tag, i.e., a non-target portion. In embodiments, the tag nucleotide portion is a barcode sequence.
- the non-target portion may comprise a sequencing portion and/or a barcode portion.
- primers each comprise a tag nucleotide portion (i) which does not anneal to the target nucleotide and a nucleotide portion (ii) which specifically targets the 5’ end or the 3’end regions of the target nucleotide sequence.
- the primers comprise a tag nucleotide portion (i) which does not anneal to the target nucleotide and a nucleotide portion (ii) which specifically targets a region flanking the 5’ end or the 3’end regions of the target nucleotide sequence, such as e.g., regions flanking an insertion site in a backbone.
- sequences are provided in Table 1.
- each PCR primer used in the present disclosure may comprise a barcode primer region and a target-specific binding region complementary to a sequence in a target nucleic acid.
- the target-specific binding region is a region flanking an insertion site in a cloning vector.
- each PCR primer used in the present disclosure may comprise a barcode primer region, a sequencing primer region, and a target-specific binding region complementary to a sequence in a target nucleic acid. Illumina and nanopore sequencing do not require a sequencing primer, whereas Sanger sequencing does; here the PCR primer itself serves as the sequencing primer, so the sequencing primer region overlaps with the other regions.
- Each region of the primer oligonucleotide may include 2-30 nucleotides.
- the barcode primer regions may include 2-20 nucleotides; the sequencing primer regions may include 12-30 nucleotides; and the target-specific binding region may include 5-30 nucleotides.
- the overall sequence of the primers is chosen to be non-naturally occurring, when comprising a target specific and a non-target specific portion.
- the primers may include RNA, DNA, or a combination thereof.
- the oligonucleotides may also contain modified nucleotides, e.g., modified bases, sugars, or phosphates.
- uracil is substituted for positions where thymine appears in the primers, which allows removal of trace amounts of synthetic oligonucleotide and carryover PCR products by pre-treatment with uracil-DNA glycosylase (UDG).
- the target-specific binding regions may flank a sequencing assay region in the target nucleic acid and allow for amplification thereof.
- the primers used in the method of the present disclosure preferably comprise a barcoding portion.
- Each region of the barcoding portion may include 2-20 nucleotides.
- the barcode sequences may have 4-18 nucleotides and the primer regions may have 7-30 nucleotides.
- the barcoding portion may include RNA, DNA, or a combination thereof.
- the barcoding portion may also contain modified nucleotides, e.g., modified bases, sugars, or phosphates.
- By “barcode” is meant a unique oligonucleotide sequence that may allow the corresponding oligonucleotide to be identified.
- the nucleic acid sequence may be located at a specific position in a longer nucleic acid sequence.
- each barcode may be different from every other barcode by at least a minimum Hamming Distance, wherein the minimum Hamming Distance may be a number greater than or equal to 1, such as 2.
- By “Hamming Distance” is meant a relationship between two nucleic acid sequences of equal length, wherein the number corresponding to the Hamming Distance is the number of bases by which the two sequences differ.
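The Hamming Distance definition above, and the check that a barcode set respects a minimum pairwise distance, can be sketched in Python as follows. The function names and the three example barcodes are hypothetical and chosen only for illustration; they are not sequences from Table 1.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is defined only for sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    if len(barcodes) < 2:
        raise ValueError("at least two barcodes are required")
    return min(hamming_distance(a, b) for a, b in combinations(barcodes, 2))


if __name__ == "__main__":
    # Hypothetical 4-base barcodes, written in lower case like the tag portions.
    barcodes = ["acgt", "aagc", "ttga"]
    print(min_pairwise_distance(barcodes))  # 2
```

A set with a minimum distance of 2 means that a single base error in a barcode read cannot turn one valid barcode into another.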
- Non-limiting examples of barcoding sequences and target specific sequences may e.g., be found in Table 1.
- Non-limiting examples of further primer sequences and target specific sequences may e.g., be found in Table 2.
- Lower case letters indicate barcode/non-target sequences; uppercase letters indicate target or overhang sequences.
- the primer(s) comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-167, or a nucleotide sequence which is at least 80%, such as at least 85%, 90%, 95%, 98%, 99% identical thereto.
- sequence identity describes the relatedness between two nucleotide sequences, i.e., a candidate sequence (e.g., a sequence of the disclosure) and a reference sequence (such as a prior art sequence) based on their pairwise alignment.
- sequence identity between two nucleotide sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, supra) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, Trends Genet. 16: 276-277), preferably version 5.0.0 or later.
- the parameters used are gap open penalty of 10, gap extension penalty of 0.5, -endopen 10.0, -endextend 0.5 and the DNAFULL (EMBOSS version of NCBI NUC4.4) substitution matrix.
- the output of Needle labelled " identity" (obtained using the -nobrief option) is used as the percent identity.
- sequence identity may be calculated as follows: (Identical nucleotide residues x 100)/(aligned region).
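The identity formula above can be sketched in Python for two sequences that have already been aligned (for example by the Needle program). This sketch does not perform the Needleman-Wunsch alignment itself. It also assumes that the "aligned region" is the full length of the pairwise alignment including gap columns, and that gaps are written as `-`; both are assumptions, as is the function name.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """(identical nucleotide residues x 100) / (aligned region).

    Both inputs must be rows of the same pairwise alignment, so they have
    equal length. A column counts as identical only if both rows carry the
    same nucleotide; a gap never counts as a match.
    Assumption: the aligned region is the total number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    identical = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return identical * 100 / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 10-column alignment with a single mismatch.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```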
- the method presented herein reduces the time spent on the extension step, since the method does not require extension of the entire target nucleic acid sequence, but only of a fraction of the target nucleic acid sequence sufficient to identify the terminal ends of the target nucleic acid sequence.
- Examples 1 and 3 show that leaping PCR products are present under both normal and optimized conditions, and clearly show that leaping PCR products are enriched in PCR reactions where the extension time is optimized for leaping PCR, i.e., shortened (by at least 50%) compared to normal PCR reactions, where the extension time is set to approx. 1 min/kb.
- the extension time may be defined from the length of the target nucleotide sequence and the efficiency of the polymerase, also known as the polymerization rate or replication efficiency, during the extension phase.
- extension time is not limited but is preferably shorter than the extension time needed to synthesize the entire target nucleic acid.
- the extension time is less than 75%, such as less than 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2% or such as less than 1% of the time required by the polymerase to synthesize the entire target nucleotide sequence. More preferably the extension time is less than 50%, such as less than 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2% or such as less than 1% of the time required by the polymerase to synthesize the entire target nucleotide sequence.
- the time required for the extension step may be determined by dividing the length of the target nucleic acid sequence by the efficiency of the polymerase.
- the time may be calculated as the length of the target nucleic acid sequence divided by a polymerase efficiency of e.g., 1 kb/min.
- the polymerase efficiency of commonly used polymerases is well known, and is generally in the range of 1-2kb/min. In examples, the polymerase efficiency of the Taq polymerase is about 1 kb/min at about 72°C.
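The extension-time arithmetic described above (target length divided by polymerase efficiency) can be sketched in Python. The function names and the example values (a 150 kb insert and a 30 s extension step) are assumptions chosen for illustration; the 1 kb/min rate follows the figure quoted above for Taq polymerase.

```python
def full_extension_time_s(target_length_b: int, rate_b_per_min: float) -> float:
    """Seconds the polymerase would need to synthesize the entire target."""
    if target_length_b <= 0 or rate_b_per_min <= 0:
        raise ValueError("length and rate must be positive")
    return target_length_b / rate_b_per_min * 60.0


def extension_fraction(extension_s: float, target_length_b: int,
                       rate_b_per_min: float) -> float:
    """Chosen extension time as a fraction of the full-length extension time."""
    return extension_s / full_extension_time_s(target_length_b, rate_b_per_min)


if __name__ == "__main__":
    # Hypothetical case: 150 kb insert, 1 kb/min polymerase, 30 s extension step.
    full = full_extension_time_s(150_000, 1000)
    frac = extension_fraction(30, 150_000, 1000)
    print(f"full-length extension: {full / 60:.0f} min")
    print(f"30 s is {frac:.2%} of that; below 50%: {frac < 0.5}")
```

In this hypothetical case the full-length extension would take 150 minutes, so a 30 s extension step is far below the 50% threshold.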
- the specific polymerase elongation efficiency also depends on the GC content of the target nucleic acid sequence and the specific extension and/or annealing temperature. This is also evident from the results provided in Example 3, which shows that the annealing temperature can be used to further optimize the leaping PCR reaction to specifically enrich the leaping PCR products over other PCR products.
- Example 2 shows that leaping PCR also occurs using different polymerases, from different quality of template DNA, from DNA with different GC content, with different kinds of primers and at different annealing temperatures.
- Example 3 further illustrates that leaping PCR products may be obtainable from normal PCR reactions, but in much lower amounts than is possible using the optimized leaping PCR conditions provided in examples 1 and 2, thus clearly showing that the optimized PCR reaction in the previous examples is superior in generating leaping PCR products.
- The terms “polymerase efficiency” and “polymerase elongation efficiency” are used interchangeably and are defined by the polymerization rate of a polymerase under given conditions, i.e., the number of nucleotides synthesized in a given time under particular conditions.
- the “polymerase efficiency” or “polymerase elongation efficiency” is usually in the range of 1-2 kb/min but is largely dependent on the specific temperature under which the extension is performed.
- the leaping PCR amplification comprises denaturation of a target nucleotide sequence; annealing of primers to the target nucleotide sequence; and extension of the primers to produce an amplification product using a polymerase.
- the annealing temperature is greater than the T m of the primers, such as 1°C, 2°C, 3°C, 4°C, 5°C or such as 6°C higher than the T m of the primers. In embodiments, the annealing temperature is lower than the T m of the primers such as 1°C, 2°C, 3°C, 4°C, 5°C or such as 6°C lower than the T m of the primers. In embodiments, the annealing temperature is equal to, or substantially equal to the T m of the primers.
- the extension phase is conducted for 5 minutes (min) or less, such as less than 4 min, 2 min, 1 min, 45 s, 30 s, 15 s, 10 s, 8 s, 7 s, 6 s, or such as 5 s or less.
- the extension phase is conducted for a time less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence, wherein the target sequence has a size of 10 kb - 500 kb.
- the extension phase may be conducted for less than 5 min, 4 min, 2 min, 1 min, 45 s, 30 s, 15 s, 10 s, 8 s, 7 s, 6 s, or such as 5 s or less for a target region of more than or equal to 1 kb, 2 kb, 3 kb, 4 kb, 5 kb, 10 kb, 25 kb, 50 kb, 100 kb, 200 kb, 300 kb, or more than or equal to 500 kb.
- the extension time is between 5s and 2 min.
- the amplification product(s) obtained from the leaping PCR reaction is about 75 b to 5000 b long, preferably 150-2200 b long, such as about 75 b, 100 b, 150 b, 200 b, 300 b, 400 b, 500 b, 600 b, 700 b, 800 b, 900 b, 1000 b, 1100 b, 1200 b, 1300 b, 1400 b, 1500 b, 1600 b, 1700 b, 1800 b, 1900 b, 2000 b, 2100 b, or such as about 2200 b, more preferably about 1-2 kb.
- the polymerase has a replication efficiency of at least 500b/min during the extension phase of the PCR amplification, such as at least 1 kb/min, 1.5kb/min, 2kb/min, 2.5kb/min or such as at least 3kb/min, or such as between 500b-3 kb/min, such as 1kb-2.5 kb/min, or such as about 1 kb/min.
- Example 3 underlines that different polymerases may be used to produce the leaping PCR products; specifically, Example 3 shows that the proofreading Q5® Hot Start High-Fidelity polymerase and the non-proofreading DreamTaq™ Hot Start DNA Polymerase both produce leaping PCR products.
- the replication efficiency of a polymerase may also be referred to as mean synthesis speed
- suitable polymerase synthesis speeds are e.g., in the range of 15s/kb to 120s/kb, such as 15s/kb, 30s/kb, 40s/kb, 45s/kb, 50s/kb, 60s/kb, 75s/kb or such as 120s/kb.
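The text expresses polymerase speed both as a rate (kb/min) and as a synthesis speed (s/kb). A minimal conversion sketch, with hypothetical function names, shows how the two ranges relate:

```python
def s_per_kb_to_kb_per_min(s_per_kb: float) -> float:
    """Convert a synthesis speed in seconds per kb to a rate in kb per minute."""
    return 60.0 / s_per_kb


def kb_per_min_to_s_per_kb(kb_per_min: float) -> float:
    """Convert a rate in kb per minute to a synthesis speed in seconds per kb."""
    return 60.0 / kb_per_min


# End points of the 15 s/kb to 120 s/kb range quoted in the text.
fastest = s_per_kb_to_kb_per_min(15)    # 4.0 kb/min
slowest = s_per_kb_to_kb_per_min(120)   # 0.5 kb/min, i.e. 500 b/min
```

The slow end of the range, 120 s/kb, corresponds to the 500 b/min lower bound given for the replication efficiency.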
- Non-limiting examples of polymerases are e.g., Taq polymerase and variants thereof, pfu polymerase and variants thereof, and Fi polymerase and variants thereof.
- the PCR products/amplicons may be purified before sequencing. In other embodiments, the PCR products/amplicons are sequenced without purification. Methods for purification of amplification products from a PCR reaction are well known in the art.
- the detecting includes sequencing.
- the sequencing includes massively parallel sequencing, Sanger sequencing, or single-molecule sequencing.
- the massively parallel sequencing includes sequencing by synthesis or sequencing by ligation.
- the massively parallel sequencing includes sequencing by synthesis.
- the sequencing by synthesis includes ILLUMINA™ dye sequencing, ion semiconductor sequencing, or pyrosequencing.
- the sequencing by synthesis includes ILLUMINA™ dye sequencing.
- the sequencing by ligation includes sequencing by oligonucleotide ligation and detection (SOLiD™) sequencing or polony-based sequencing.
- the single-molecule sequencing is nanopore sequencing, single-molecule real-time (SMRT™) sequencing, or Helicos™ sequencing.
- Examples 1, 5 and 7 of the present disclosure describe the use of leaping PCR on different genomic libraries such as a bacterial artificial chromosome (BAC) library.
- BAC bacterial artificial chromosome
- the genome of a target organism is digested using a restriction enzyme, whereafter the DNA fragments are cloned into a BAC vector, carrying the site recognized by the restriction enzyme.
- DNA fragments are also referred to herein as vectorized DNA fragments.
- the BAC constructs are then transformed into a suitable host cell, in the case of example 1 an E. coli cell, which is then grown, and each individual colony is mapped and paired with a primer set, comprising a pair of barcode tags, unique for the specific colony.
- Genomic libraries may be prepared in numerous ways, known to the skilled person.
- bases are abbreviated as “b”, nucleotides as “nt”, and kilobases as “kb”.
- Examples 5 and 7 illustrate how leaping PCR can be used to improve the identification of biosynthetic gene clusters in large genomic libraries by, firstly, improving the speed of the assay by reducing the extension time and, secondly, providing more identifications of the gene clusters than is possible with normal PCR reaction methods and sequencing methods such as Sanger sequencing or Illumina sequencing.
- the DNA sequences for biosynthetic gene clusters (BGCs) amount to around 10 % of the genome of several species.
- high throughput cloning method independent of BGC types has been missing. This problem is solved by the use of leaping PCR.
- Examples 5 and 6 illustrate a high throughput BGC cloning platform that combines traditional BAC library construction and leaping PCR based end-paired library analysis.
- the method provided herein is for identifying gene clusters. Accordingly, in embodiments, the method provided herein is for identifying biosynthetic gene clusters.
- biosynthetic gene clusters (BGCs) are sets of microbial genes that synthesize a wide variety of biosynthetic compounds with diverse functions, such as siderophores and antibiotics; such biosynthetic compounds are often bioactive compounds.
- gene clusters are generally clusters of functionally related genes, i.e., they often include genes within the same pathway, genes encoding interacting proteins, or genes that affect the same phenotype.
- Such gene clusters may e.g., be identified using a genomic library produced using an exonuclease such as e.g., Exonuclease I or Xrn1, or an endonuclease, such as e.g., BamHI, EcoRV, EcoRI, or HindIII.
- Exonuclease/endonuclease-generated genomic libraries often entail providing genetic material, e.g., a chromosome of a bacterium or fungus, which is cleaved into fragments using specific nucleases to provide a fragmented genome.
- the fragments may be inserted into vectors, that are designed to receive the fragments with the specific restriction sites. Accordingly, such fragments inserted into the genomic library can be of varying sizes.
- the methods provided herein are for identifying large genomic inserts.
- large genomic inserts are >10 kb, such as >15 kb, >20 kb, >30 kb, >40 kb, >50 kb, >75 kb, >100 kb, >130 kb or such as >150 kb.
- leaping PCR is used prior to sequencing.
- by “library” is meant the amplification product of multiple nucleic acids, wherein the multiple nucleic acids may have the same or different sequences.
- the present disclosure also provides a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprised in a genomic library.
- the method for performing the identification of genomic libraries is as such not limited to the particulars of the methods disclosed in the examples but may be conducted in a plethora of ways known to the skilled person.
- such a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprises the steps of: a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) as disclosed herein, c) performing a leaping PCR method on the vectorized DNA fragments using said primers as defined herein, d) sequencing said amplification product, e) identifying the presence of the one or more target sequences based on the sequence of the amplification product or part of the amplification product.
- the present disclosure relates to a method for identifying the presence of one or more biosynthetic gene clusters from a genomic library, said method comprising the steps of: a) providing a genomic library comprising one or more vectorized DNA fragments comprising a backbone and an insert, wherein the insert is suspected of containing one or more biosynthetic gene clusters, b) providing one or more PCR primer pair(s) targeting the regions flanking the insertion site of the backbone, c) performing a leaping PCR protocol on the vectorized DNA fragments using said primers, d) sequencing said leaping PCR amplification product, e) identifying the presence of the one or more biosynthetic gene clusters based on the sequence of the leaping PCR amplification product or part of the amplification product.
- the present disclosure relates to a method for identifying the sequences in one or more vectorized DNA fragments comprised in a genomic library, comprising a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) targeting the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments, c) performing a leaping PCR method as defined herein, on the vectorized DNA fragments using said primers, d) sequencing said amplification product, e) identifying the sequences based on the sequence of the amplification product.
- the identification of the presence of the one or more biosynthetic gene clusters is based on a part of the amplification product, such as the first and/or last 10-100 b, preferably 25-50 b of the amplification product. In embodiments, the identification of the presence of the one or more biosynthetic gene clusters is based on the first and/or last IQ-
- the identification of the presence of the one or more target genes is based on the first and/or last 10-100 b of the amplification product.
- the identification of the presence of the one or more target genes is based on the first and/or last 10-100 b of the amplification product, not including the primer and cloning site.
- the identification of the presence of the one or more target genes is based on the first and last 25-50 b of the amplification product.
- the identification of the presence of the one or more target genes is based on the first and/or last 10-50 b of the amplification product, not including the primer and cloning site. In embodiments, the identification of the presence of the one or more target genes is based on the first and/or last 20-30 b, such as approx. 20 b, 21 b, 22 b, 23 b, 24 b, 25 b, 26 b, 27 b, 28 b, 29 b, or approx. 30 b of the amplification product, not including the primer and cloning site. In further embodiments the identification of the presence of the one or more target genes is based on the complete amplified sequence.
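The end-based identification described above can be sketched as a small helper that skips the primer and cloning-site bases at each end of a sequenced amplification product and returns the first and last few bases of what remains. The function name, the skip lengths and the toy sequence are hypothetical; in practice the skip lengths would follow from the primer design actually used.

```python
def end_tags(amplicon: str, left_skip: int, right_skip: int, tag_len: int = 25):
    """Return (first, last) tag_len bases of the amplicon after removing
    left_skip and right_skip bases (primer plus cloning site) from the ends."""
    if tag_len < 1:
        raise ValueError("tag_len must be at least 1")
    core = amplicon[left_skip:len(amplicon) - right_skip]
    if len(core) < tag_len:
        raise ValueError("amplicon too short for the requested tag length")
    return core[:tag_len], core[-tag_len:]


# Toy example: 4 b of primer/site on each side, 5 b tags.
toy = "AAAA" + "CCCCCGGGGG" + "TTTT"
first_tag, last_tag = end_tags(toy, left_skip=4, right_skip=4, tag_len=5)
```

The two tags can then be searched for in a reference genome to locate the two ends of the insert.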
- the leaping PCR protocol is performed on the vectorized DNA fragments using said primers with an extension time of about 1s-30s. In other embodiments, the leaping PCR protocol is performed on the vectorized DNA fragments using said primers with an extension time of less than 50%, such as less than 40%, 30%, 20%, 10%, 5%, 2%, 1 %, 0.1 % or such as less than 0.01 % of the time required by the polymerase to synthesize the entire vectorized DNA fragment.
- the present disclosure also relates to a method for determining only a part of the nucleotide sequence of a nucleic acid of interest, comprising: providing n samples, wherein n is an integer, and n > 1 ; dividing the n samples into m groups, wherein m is an integer, and n > m > 1 ; performing leaping PCR amplification, as defined herein, on the m groups of samples under conditions suitable for amplifying the nucleic acid of interest when templates from the samples are available, wherein a pair or multiple pairs of index primers are used for each sample, wherein each pair of index primers consists of a forward index primer and a reverse index primer and primer indexes used for different samples are different; obtaining the leaping PCR products, comprising a sequence covering only the proximal and distal parts of the nucleic acid of interest, thus lacking a central part of the nucleic acid of interest of at least 10 kb; subjecting the recovered leaping PCR products to sequencing to obtain
- the length of the complete DNA target sequence of the nucleic acid of interest exceeds a maximum read length of a sequencer used for said sequencing.
- the alignment covers less than 50%, such as less than 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, 1%, 0.1% or such as less than 0.01 % of the nucleic acid of interest.
- the alignment covers less than 2000 b, such as less than 1000 b, 500 b, 300 b, 200 b, 150 b, 100 b, 75 b or such as less than 50 b of the nucleic acid of interest.
- Examples 1-7 show that different vectors may be used with the methods described herein; the vectors shown in the examples are to be considered as non-limiting examples, and it is to be understood that other vectors could readily be considered by the skilled person.
- any vector carrying a cloning site, or a set of cloning sites, compatible with a suitable restriction enzyme or restriction endonuclease for producing the library is considered suitable to carry out the methods of the disclosure.
- suitable restriction enzymes are e.g., BamHI, EcoRI and HindIII.
- the vector plasmid can be amplified by PCR before being used for the cloning. Furthermore, new cloning sites can be added during the PCR reaction by including the cloning sites in the PCR primers.
- suitable vectors which may be used as vectors in the methods of the disclosure.
- the vector may e.g., be a plasmid, such as pUC19 or pJC720, a BAC (bacterial artificial chromosome) vector, such as pBACe3.6, a PAC (P1-derived artificial chromosome) vector, such as pPAC4, a YAC (yeast artificial chromosome) vector, such as pYAC4, a cosmid, or a fosmid.
- a cosmid is a type of hybrid plasmid that contains a Lambda phage cos sequence.
- a fosmid is a type of hybrid plasmid that contains a bacterial F-plasmid.
- the plasmid comprises one or more restriction sites which allows for linearization of the plasmid, and integration of one or more genomic inserts of interest.
- sequence library other than it should be suitable for amplification of the target nucleic acids.
- the amplicons produced in the PCR reaction are not too long, i.e., not >5 kb, preferably not >2 kb, since this would mean an unfavourably long amplification reaction and a lower chance of successful amplification of the desired portion of the insert. Accordingly, in order to ensure that the amplification reaction is sufficiently short, it is preferred that as little as possible of the vector is amplified.
- the primers are placed within about 1000 b, or such as between 0-1000 b, 0-750 b, 0-500 b, 0-250 b, or such as 0-10 b, or such as between 10-1000 b, 50-1000 b, 100-1000 b, 250-1000 b, or such as 500-1000 b from the genomic library cloning site.
- the primers are placed immediately before or within the cloning site.
- the primers encompass the sequence of the genomic library cloning site.
- the primers do not encompass the sequence of the genomic library cloning site.
- the extension step of said PCR amplification as described herein is performed for less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence.
- the present disclosure discriminates between the target nucleotide sequence, which refers to the nucleic acid sequence inserted into e.g., the vectors as described above, and the target region targeted by the primers.
- the region targeted by the primers is not part of the inserted nucleic acid sequence, which is the unknown sequence, but is a known region, preferably a region of the vector, which may be shared by all constructs of the library.
- the primers preferably target a known target sequence, and extension of the primers results in amplification of the known sequence forming part of the vector and an unknown sequence forming part of the target nucleic acid sequence.
- the primers preferably target the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments.
- the aim of the present example is to establish the presence of Leaping PCR products in different vectors.
- the present example further aims to demonstrate the difference between one primer mis-priming and PCR leaping using the same template and conditions.
- Plasmid p4-1E and E. coli colony 4-1E are used as PCR templates.
- p4-1E is based on the pESAC13 PAC vector with an insertion of 222 kb.
- E. coli colony 4-1E harbours the plasmid p4-1E.
- Plasmid pXJ149.12 is used as PCR template.
- pXJ149.12 is based on the pBeloBAC11 BAC vector with a 6.3 kb insertion.
- Primers Xj527.1 and XJ528.1 are designed around the cloning site of the vector pESAC13.
- Xj462 is a primer that has no binding site on the template p4-1E, and is used here as a negative control.
- Primers XI11 and XI12 are designed around the insertion of pXJ149.12.
- Bio-Rad S1000TM Thermal Cycler is used for the PCR programs.
- the PCR reactions were prepared with the following components:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- the PCR reactions were prepared with the following components:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- RESULTS As a proof of concept, we first tested a PCR reaction with a DNA fragment of 222 kb carried on a BAC plasmid.
- the 222 kb sequence was cloned from the genome of Kutzneria sp. CA-103260. It has a GC content of 72%.
- the plasmid was named p4-1E.
- the primer sites flank the cloning site and are six b from the insertion.
- the target size is beyond what PCR can normally amplify, so no full-size product can be observed. Instead, a nonspecific product band below 1 kb can be seen on agarose gel (Figure 2). Sequencing reveals that this product represents a new type of PCR artifact. It is a recombinant molecule composed of the sequences from both ends of the PCR target (Figure 1a and b). The resulting band at approx. 1 kb is the PCR leaping product, where the amplification has leaped over the middle part of the target. In the negative controls where only one of the specific primers was used, no such leaping products were generated. The unspecific bands were confirmed to be mis-priming products by sequencing.
- Mis-priming is the classic explanation for shortened product formation and mis-priming of one primer has been deliberately used to amplify one terminal of a sequence in technologies like single primer PCR and semi-random PCR.
- one-primer mis-priming and PCR leaping were compared using the same template and conditions. At the higher annealing temperature of 55°C, PCR leaping products were generated when a pair of primers was used. When only one primer was used at doubled concentration, no nonspecific bands could be observed. Upon further reduction of the annealing temperature to 45°C, nonspecific bands were formed in the one-primer reactions and were confirmed to be mis-priming products by sequencing (Figure 3). This shows that leaping PCR products are produced more readily than mis-priming products.
- the present example showed that PCR reactions optimized for leaping PCR produce leaping products that are enriched over full-length products when the targets are longer than what the polymerase can elongate in the relatively short extension time used.
- Leaping PCR amplifications, and PCR in general, are affected by many factors.
- the aim of the present example is to demonstrate the effect of different polymerases and different annealing temperatures on leaping PCR.
- Xj527.9 and XJ528.9 have additional sequences on the 5’ ends that can form hairpin structures to protect against mis-priming.
- Bio-Rad S1000TM Thermal Cycler was used for the PCR programmes.
- PCR reactions were prepared with different polymerases as in the following table: PCR reactions with Q5 polymerase were carried out with the following program: PCR reactions with DreamTaq polymerase were carried out with the following program:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- PCR reactions were prepared with different primer pairs. In one set of reactions, primers Xj527.1 and Xj528.1 were used. In the other set of reactions, primers Xj527.9 and Xj528.9 were used. Xj527.1 and Xj528.1 do not have a hairpin structure. Xj527.9 and XJ528.9 have additional sequences on the 5’ ends that can form hairpin structures to protect against mis-priming.
- the PCR reactions were prepared with the following components:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- PCR reactions were prepared with Q5 polymerases and different templates and primers as follows:
- PCR reaction mixtures were prepared as in the following table.
- the PCR products were checked by electrophoresis and Sanger sequencing. A DNA target with a low GC content of 45% was tested as follows.
- the PCR reactions were prepared for seven colonies from the DNA library Lib.C74.
- the library insertions have an average GC content of 45%.
- Primers Xj595 and Xj596 were used in one set of reactions.
- Primers 597 and 598 were used in the other set of reactions.
- Primers 597 and 598 have additional sequences on their 5’ ends forming hairpin structures against mis-priming.
- the PCR reaction mixtures were prepared as follows: The PCR reactions were carried out using the following program
- the PCR products were checked by electrophoresis and Sanger sequencing.
- PCR results are known to be affected by many factors. For example, purified DNA is in general a better template than DNA in complex matrices. Sequences with higher GC content tend to form secondary structures and cause primer mis-binding. DNA polymerases with proofreading activity can repair 3’ end mismatches and reinitiate extension or priming. Primers with hairpin structures can increase binding specificity. Lower annealing temperatures promote primer mis-binding.
- Example 1 already shows that leaping PCR occurs from different qualities of templates, for example from purified plasmid DNA, boiled E. coli colony or E. coli colony directly from LB plate. In boiled E. coli colony or E. coli colony directly from LB plate the target DNA is in complex matrices of proteins, RNA and genomic DNA of the E. coli host.
- Example 1 also shows that leaping PCR occurs from DNA template with high GC content of 72%.
- Example 1 also shows that leaping PCR occurs when low annealing temperature of 40 °C was used.
- the present example shows that leaping PCR also occurs using different polymerases, from different quality of template DNA, from DNA with different GC content, with different kinds of primers and at different annealing temperatures.
- the aim of this example is to demonstrate that leaping PCR products may also be obtainable from normal PCR reactions.
- Plasmid pUC19 and lambda DNA purchased from NEB were used as templates.
- PCR reaction mixtures were prepared as follows:
- the present example illustrates that leaping PCR products may be obtainable from normal PCR reactions, but at a much lower amount than what is possible using the optimized leaping PCR conditions provided in the above examples.
- the aim of the present example is to illustrate that leaping PCR is a result of intramolecular extension rather than intermolecular extension.
- Fragment CP or a mixture of fragment C and fragment P were used as PCR templates.
- Fragment C was amplified from the genome of Corynebacterium variabile DSM 44702 with primers xj872 and xj875.
- Fragment P was amplified from the genome of Pseudomonas putida KT2440 with primers xj868 and xj871.
- Fragment CP was amplified from a recombinant plasmid pXJ199 with primers xj892 and xj895.
- testing PCR reactions and primers used here are as follows:
- PCR reaction mixtures were prepared with different templates at different concentrations as follows:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- when a PCR reaction contains a mixture of homologous sequences as template, for example the 16S rRNA genes in environmental samples, hybrid PCR products with the 5’ end from one sequence and the 3’ end from another may be generated. This is called a “jumping PCR” or “bridging PCR” artifact. It happens when premature extension products from earlier PCR cycles anneal to a different template in a following cycle and get extended.
- Premature extension products are generated because DNA polymerases can only synthesize a short sequence of 3 to 104 b on average in each binding event and frequently dissociate from and reassociate with the DNA extension intermediates.
- the aim of this example is to illustrate the use of Leaping PCR in high throughput sequencing from large genomic libraries.
- E. coli DH10B competent cells were constructed by a commercial supplier.
- the library was constructed by partially digesting Kutzneria sp. CA-103260 genomic DNA; the resulting DNA fragments were collected and ligated into a suitable backbone vector by DNA ligase. The ligation products were electroporated into E. coli DH10B competent cells.
- the E. coli cells were spread on agar plates containing apramycin. Single colonies were picked into 384 well plates.
- the BAC library of Kutzneria sp. CA-103260 was constructed.
- PCR reaction mixtures were prepared with the following components:
- E. coli colonies from the BAC library were printed into the PCR reaction array by a plastic pin replicator.
- the PCR products were pooled together and subjected to Illumina sequencing.
- BGCs biosynthetic gene clusters
- the DNA sequences for BGCs amount to around 10% of the genome. Recently, several methods have been invented to clone those BGCs of large DNA size, based on degenerate primers targeting conserved sequences in PKS and NRPS.
- a high throughput cloning method independent of BGC types is still missing.
- the BAC library was built with a shuttle vector pESAC13-Apramycin to facilitate following transfer and expression of the BGCs in Streptomyces.
- the BamHI cloning site of the pESAC13-Apramycin vector was used for the insertions.
- Kutzneria sp. CA-103260 genomic DNA was randomly and partially digested by BamHI, inserted into the vector backbone and transferred into the E. coli host. Ten colonies were randomly picked, and their plasmids were checked by pulsed-field gel electrophoresis.
- Average insertion size was found to be around 140 kb (Figure 11), so one 384-well plate of the library represents about 5.4-fold coverage of a 10 Mbp genome.
- Primers were designed to flank the BamHI cloning site, 6 bp away from the insertions.
- Barcodes of 3 or 4 b were added to the 5’ ends of the primers. Any two barcodes differ from each other by at least one or two bases.
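The coverage figure quoted above follows from simple arithmetic: number of clones times average insert size, divided by genome size. A minimal sketch, with a hypothetical function name and the values taken from the text:

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_mb: float) -> float:
    """Fold coverage of a genome by a clone library (sizes in kb and Mb)."""
    return n_clones * mean_insert_kb / (genome_mb * 1000.0)


# One 384-well plate, about 140 kb average insert, 10 Mb genome.
coverage = library_coverage(384, 140, 10)
```

With these inputs the result is 5.376, which rounds to the 5.4-fold coverage stated in the text.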
- 24 sense primers and 16 antisense primers make up an array of 384 PCR reactions.
- 384 colonies from a library microplate are cultured on an LB agar plate and then printed into the 384 PCR reactions with a plastic pin replicator. The PCR products were pooled together and subjected to Illumina sequencing (Figure 12a).
- insertion sequence was obtained by mapping the reads against the genome, as leaping PCR product can indicate the terminals of the insertions.
- the locations of the corresponding colonies were interpreted from the primer barcode pairs ( Figure 12b).
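Because 24 sense primers combined with 16 antisense primers give 384 unique pairs, each barcode pair can be mapped back to one well of a 384-well plate. The sketch below assumes, purely for illustration, that sense primers index the 24 columns and antisense primers index the 16 rows (A to P); the actual layout used in the example is not stated in the text.

```python
import string

ROWS = string.ascii_uppercase[:16]  # rows A..P of a 384-well plate
N_COLS = 24                         # columns 1..24


def well_for(sense_index: int, antisense_index: int) -> str:
    """Map 0-based sense/antisense primer indexes to a well label.

    Assumed layout: sense primer -> column, antisense primer -> row.
    """
    if not 0 <= sense_index < N_COLS:
        raise ValueError("sense_index out of range")
    if not 0 <= antisense_index < len(ROWS):
        raise ValueError("antisense_index out of range")
    return f"{ROWS[antisense_index]}{sense_index + 1}"


# Every primer pair should land in a distinct well.
all_wells = {well_for(s, a) for s in range(N_COLS) for a in range(len(ROWS))}
```

Decoding a read then amounts to recognising its two barcodes, looking up their indexes, and calling `well_for` on the pair.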
- By comparing the sequence ranges of the BAC insertions with those of the BGCs on the Kutzneria sp. CA-103260 genome, E. coli colonies carrying the BGCs were identified from the BAC library.
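The range comparison described above can be sketched as an interval test: a colony is reported when its insert range, inferred from the two mapped ends of the leaping PCR product, fully contains the range of a gene cluster. The function names, well labels and coordinates below are hypothetical, and a real analysis would also have to handle multiple replicons and partial overlaps.

```python
def contains(insert_range, bgc_range) -> bool:
    """True if the insert (start, end) fully contains the BGC (start, end)."""
    ins_start, ins_end = insert_range
    bgc_start, bgc_end = bgc_range
    return ins_start <= bgc_start and bgc_end <= ins_end


def clones_carrying(bgc_range, inserts) -> list:
    """Return the sorted well labels whose insert range contains the BGC."""
    return sorted(well for well, rng in inserts.items() if contains(rng, bgc_range))


# Hypothetical insert ranges (genome coordinates in bp) keyed by well label.
inserts = {
    "A1": (100_000, 240_000),
    "B7": (200_000, 330_000),
    "C3": (900_000, 1_050_000),
}
hits = clones_carrying((210_000, 235_000), inserts)
```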
- the aim of this example is to illustrate the use of sequences obtained from sequencing of pooled Leaping PCR products to identify the presence of particular biosynthetic gene clusters (BGCs) in specific colonies in the BAC plate.
- Illumina paired-end sequencing results of unsheared leaping PCR products from an exemplary BAC library.
- Example GFF3 results from one of the 384 Leaping PCR barcodes:
- EXAMPLE 7 Rapid identification of insertions for colonies from activity-based screening of DNA library
- the aim of this example is to illustrate the use of Leaping PCR in high throughput identification of particular inserts from large genomic libraries, using a different vector than presented in Example 5.
- Plasmid pXJ166 is composed of a p15a-ori plasmid replication origin, a chloramphenicol acetyltransferase gene as selective marker, and a HindIII restriction site as cloning site.
- Vibrio natriegens DSM 759 was cultured and embedded in agarose gel. The cells within the gel were lysed and washed. The genomic DNA was partially digested with HindIII and then separated by pulsed-field gel electrophoresis. DNA fragments of desired sizes were collected and ligated with the pXJ166 backbone by DNA ligase. The ligation products were electroporated into E. coli competent cells. The E. coli cells were spread on agar plates containing chloramphenicol. Single colonies were picked into 384 well plates. Thus, the library of Vibrio natriegens DSM 759 was constructed. Colonies that formed big colonies within 15 hours were picked as the fast-growing strains. The PCR reactions were prepared for the fast-growing strains. Primers 595 and 596 flanking the HindIII cloning site of the pXJ166 vector were used.
- the PCR reaction mixtures were prepared as follows:
- the PCR products were checked by electrophoresis and Sanger sequencing.
- the sequencing results were compared to the Vibrio natriegens DSM 759 genome sequence (NCBI accession numbers CP009977.1 and CP009978.1) to identify the insertions.
- pXJ166 has a p15a replication origin, a cmr gene as selective marker and a HindIII cloning site.
- Vibrio genomic DNA has an average GC content of 45%.
- Vibrio genomic DNA was randomly and partially digested by HindIII, inserted into the vector backbone and transferred into the E. coli host. Eight colonies were randomly picked, and their plasmids were checked by agarose gel electrophoresis. Average insertion size was found to be around 50 kb (Figure 13). All colonies were screened for their growth speed. Fast growth is desired for E. coli used in industrial fermentation, as it can greatly reduce the overall fermentation cost. Seven colonies that formed big colonies within 15 hours were selected as fast-growing strains.
- the present example clearly shows that leaping PCR can be applied to libraries with different vectors and different DNA sources with different GC content.
- the present example also shows that leaping PCR can be used together with different sequencing technologies, for example Sanger sequencing or Illumina sequencing. It also shows that leaping PCR can be used to quickly identify insertions of colonies from activity-based screening of a DNA library.
Abstract
The present invention relates to the field of PCR amplification and uses thereof, and specifically relates to leaping PCR reactions, where only a fraction of a target nucleic acid is amplified in the PCR reactions, wherein the PCR reaction produces an amplification product which comprises the terminal parts of a target nucleic acid but is devoid of a central part of the target nucleic acid.
Description
PCR LEAPING AND APPLICATIONS THEREOF
FIELD
The present invention relates to the field of PCR amplification and uses thereof, and specifically relates to leaping PCR reactions, where only a fraction of a target nucleic acid is amplified in the PCR reactions, wherein the PCR reaction produces an amplification product which comprises the terminal parts of a target nucleic acid but is devoid of a central part of the target nucleic acid.
BACKGROUND
Polymerase chain reaction (PCR) uses DNA polymerase and a pair of primers to amplify a chosen DNA sequence. DNA copies are produced in an exponential manner, generating enough materials for subsequent detection, sequencing, or cloning. Since its invention in 1984, it has become a widely used and often indispensable technique in many fields, ranging from genetic engineering, DNA sequencing, genetic disorder and cancer diagnostics, infection diagnostics, food safety, environmental biology, criminal forensics, archaeology, and DNA-based data storage.
A fundamental requirement for PCR reaction is to have the product loyally representing the original DNA sequence. Nonspecific amplification can cause false judgement in PCR based diagnosis and artifacts in PCR based sequencing, and the spurious products are unusable for cloning or genetic engineering. Traditionally, the nonspecific amplifications have mainly been attributed to two scenarios. Mis-binding of one or both primers to unintended template sites results in mis-priming products. Mis-binding of two primers to each other results in primer dimers. Because primer annealing and the DNA polymerases can be promiscuous, mis-binding structures can form and be extended resulting in unwanted amplification products. Shorter error products are amplified faster than the desired product so once formed they can easily dominate the exponential PCR reaction. Based on this model, many methods have been developed to improve PCR reaction, for example by optimized primer sequence, protected primer structures, hot-start PCR, touchdown PCR, nested PCR, chemical PCR additives, DNA mismatch-recognizing protein. Among them hot-start and touchdown are often used as they are most effective and convenient.
In high-throughput screening and DNA sequencing, the amount of reagents and the time consumed per analysis are critical factors that are often highly optimized to enhance the efficiency of the screen and reduce the cost. Due to the high number of repetitions, even a minor reduction in the time or reagents spent per analysis amounts to significant cost savings.
Accordingly, methods that optimize the efficiency of high-throughput screenings are highly advantageous, especially in the field of DNA screening and sequencing.
To this end, the present disclosure describes a new type of PCR amplification, which reduces the amount of reagents used and the time spent on each individual sample, thereby providing an improved PCR reaction exceptionally suitable for large-scale DNA library screening. Furthermore, the improved PCR reaction of the present disclosure also provides a more robust identification of target sequences in genomic libraries.
SUMMARY
The present disclosure relates to methods using leaping PCR, wherein the amplification products produced comprise the distal portions of the target nucleic acid sequences but lack a central part of the target sequence. The disclosure accordingly provides PCR methods which are optimized for leaping PCR. Such methods are suitable for high-throughput identification of target sequences in genomic libraries.
A first aspect of the disclosure relates to a method for performing a leaping PCR amplification on a target nucleotide sequence, said method comprising a) providing at least one pair of PCR primers, said primers each comprising i. a tag nucleotide portion (i), such as a barcode sequence, which does not anneal to the target nucleotide, and
ii. a nucleotide portion (ii) which specifically targets the 5’ end or the 3’ end regions of the target nucleotide sequence, b) performing a PCR amplification using a polymerase on the target nucleotide sequence using said PCR primers, wherein the extension step of said PCR amplification is performed for less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence, wherein the amplification product produced by said PCR amplification comprises a 5’ end portion and a 3’ end portion of said target nucleotide sequence, characterized in that a central part of more than 10 kb, such as more than 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb or such as more than 150 kb, of the target nucleotide sequence is absent from the amplification product. In embodiments, the polymerase has a replication efficiency of at least 500 b/min (bases/minute) during the extension phase of the PCR amplification, such as at least 1 kb/min or such as between 500 b/min and 3 kb/min. In embodiments, the amplification product is 100 b to 5000 b long, preferably 150-2000 b long. In additional embodiments, the leaping PCR amplification comprises a) denaturation of a target nucleotide sequence; b) annealing of primers to the target nucleotide sequence; and c) extension of the primers to produce an amplification product using a polymerase, wherein c) is conducted for 5 minutes (min) or less, such as less than 4 min, 2 min, 1 min, 45 s, 30 s, 15 s, 10 s, 8 s, 7 s, 6 s, or such as 5 s or less. The leaping PCR amplification may be repeated for 10-100 cycles. In additional embodiments, the primers are specific to regions flanking the target nucleotide sequence. In said PCR reaction the annealing temperature may be greater than, lower than, or equal to the Tm of the primers. Preferably the amplification is an intramolecular amplification reaction. Example 4 and Figures 9 and 10 clearly show that the leaping PCR reactions have a different mechanism (Figure 10c) than the previously reported “polymerase slippage”, “polymerase jumping”, “jumping PCR” or “bridging PCR”.
In additional embodiments, the two nucleotide strands of the amplification product anneal to each other. In embodiments, the target sequence does not comprise a hairpin structure within 10-500 b from the primer target region. In embodiments, the primers are 20-50 b long. In embodiments, the tag portion of the primer is 2-10 b long. In embodiments, the nucleotide portion (ii) which specifically targets the 5’ end or the 3’ end regions of the target nucleotide sequence is about 10-50 b long, such as 15-30 b long, preferably 18-28 b long.
A second aspect of the invention relates to a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprised in a genomic library, such as in a vector e.g., a plasmid, said method comprises a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) targeting the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments, c) performing a leaping PCR method as defined herein, on the vectorized DNA fragments using said primers, d) sequencing said amplification product, e) identifying the presence of the one or more target sequences based on the sequence of the amplification product.
In embodiments, the primers encompass the sequence of the genomic library cloning site. In additional embodiments, the primers are placed within about 1000 bases (b), such as between 0-1000 b, from the genomic library cloning site. In additional embodiments, the primer(s) comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-167, or a nucleotide sequence which is at least 80% identical thereto.
The methods of the present disclosure are, e.g., suited for identification of target sequences in large genomic libraries and are faster and more reliable than previously known methods.
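As an illustration of how the tag (barcode) portion of the primers could be used when amplification products from many library clones are sequenced together, the following minimal Python sketch assigns reads to their source wells. It is not part of the disclosed method: the function name `demultiplex`, the barcode-to-well table, the assumption that the barcode sits at the very start of each read, and the requirement of an exact match are all hypothetical simplifications.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_well):
    """Group reads by the barcode found at the start of each read.

    Assumptions (not from the source): all barcodes have the same length,
    the barcode is the first bases of the read, and only exact matches count.
    """
    table = {barcode.upper(): well for barcode, well in barcode_to_well.items()}
    lengths = {len(barcode) for barcode in table}
    if len(lengths) != 1:
        raise ValueError("this sketch requires barcodes of one common length")
    n = lengths.pop()

    by_well = defaultdict(list)
    for read in reads:
        well = table.get(read[:n].upper())
        if well is not None:
            # Store the read with the barcode removed.
            by_well[well].append(read[n:])
    return dict(by_well)


# Hypothetical barcodes and reads, for illustration only.
wells = {"ACGT": "A1", "TGCA": "A2"}
reads = ["ACGTGGGCCC", "TGCATTTAAA", "NNNNGGGG", "acgtAAAT"]
print(demultiplex(reads, wells))
```

Reads whose leading bases match no barcode are simply dropped here; a practical pipeline would likely tolerate sequencing errors, for instance by using barcodes separated by a minimum Hamming Distance.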
BRIEF DESCRIPTION OF FIGURES
FIGURE 1 - Schematic of leaping PCR. a, During the nonspecific amplification, two fragments from the termini of the PCR target were joined together, and the middle part of the PCR target was leaped over. b, The sequence of the joining area.
FIGURE 2 - Product gel photo for PCR reactions from plasmid p4-1 E. Details of each reaction are described in example 1. Unspecific products below 1.5 kb were generated. The leaping PCR products identified from the Sanger sequencing results are marked by arrows.
FIGURE 3 - Product gel photo of leaping PCR reactions and one-primer PCR reactions with two different annealing temperatures. Details of each reaction are described in example 1. One-primer PCR did not generate a band at the higher annealing temperature.
FIGURE 4 - Product gel photo of leaping PCR reactions using DreamTaq polymerase and Q5 polymerase. Details of each reaction are described in example 2. Both polymerases generated leaping PCR products below 1.5 kb.
FIGURE 5 - Product gel photo of leaping PCR reactions using primers with and without hairpin structures. Details of each reaction are described in example 2. Both types of primers generated leaping PCR products below 1.5 kb.
FIGURE 6 - Product gel photo of leaping PCR reactions using annealing temperature of 66°C. Details of each reaction are described in example 2. Leaping PCR products below 3 kb were generated.
FIGURE 7 - Product gel photo of leaping PCR reactions from template with GC content of 45%. Details of each reaction are described in example 2. Leaping PCR products below 1.5 kb were generated.
FIGURE 8 - A) Results of PCR reactions (10 different primer sets for each) from pUC19 and lambda DNA on agarose gel. Recommended annealing temperature and extension time were used. Full-length products are around 2 kb, corresponding to the bright single bands. The nonspecific products (indicated by brackets) smaller than the full-size products were recovered from the gel and sequenced on an Illumina NGS sequencer. B) Summary of the sequencing result. Each sequencing read was compared to the template sequence to check whether the two primers bound correctly and whether there was a PCR leaping event. C) Three examples of the PCR leaping results. The two leaping points on the template and the final joint point on the product were identified by sequence comparison between the product and the template. The microhomology between two leaping points is marked by grey shading. D) Leaping points of all the unique PCR leaping events. The sequencing reads were deduplicated and analysed by sequence comparison as in C). Zero or 0 indicates no microhomology found. Mismatched bases in the microhomology are represented using IUB codes (R for A or G, Y for C or T, S for G or C, W for A or T, K for G or T, and M for A or C). Deletions are indicated by underlining. The pie chart illustrates the distribution of microhomology lengths.
FIGURE 9 - Product gel photo of PCR reactions from fragment CP or a mixture of C and P as templates. Different template concentrations were used. Details of each reaction are described in example 4.
FIGURE 10 - a, Comparison of single-piece template and double-piece template. Fragment P-G or a mixture of fragments P and G was used as PCR template. All ten sense primers target sequence P; all antisense primers target sequence G. b, Summary of the PCR results. Error bars display the ±SD of three replicates. c, Mechanisms of PCR leaping and other error products. In jumping PCR, the premature extension product anneals to a wrong place. It relies on long mis-binding of the 3’ end of the premature extension product. In mispriming, the primer anneals to a wrong place. As the primer concentration is much higher than that of the premature extension product, it requires relatively short homology of the 3’ end of the primer with the mis-priming site on the template. In primer leaping, the 3’ end of the premature extension product has dissociated from the template through DNA breathing or as a result of incomplete renaturation, or the whole premature extension product has denatured from the template but has not yet drifted away or is still intertwined with the template strand. The 3’ end of the premature extension product is then mis-extended at a wrong place on the same template. It relies on temporary or partial binding of the premature extension product to the same template. It requires short or no homology between the 3’ end of the premature extension product and the mis-extension site on the template.
FIGURE 11 - Product gel photo of plasmids isolated from randomly picked colonies from a BAC library of Kutzneria sp. CA-103260. The insertion sizes are between 150 kb and 250 kb.
FIGURE 12 - a, Leaping PCR screening on a BAC library. Library colonies were transferred from LB agar to a multi-well PCR plate. All the PCR products were pooled together and submitted to NGS sequencing. b, BAC plasmid, primer sites and leaping PCR product structure. c, Mapping the leaping PCR product on the genome.
FIGURE 13 - Product gel photo of plasmids isolated from randomly picked colonies from DNA library of Vibrio natriegens. The insertion sizes are above 20 kb.
FIGURE 14 - Product gel photo for Leaping PCR reactions from fast growing colonies from DNA library of Vibrio natriegens.
DETAILED DESCRIPTION
LEAPING PCR
The present disclosure relates to a method for performing a leaping PCR amplification on a target nucleotide sequence.
Leaping PCR is a method for producing amplification products useful for sequencing and identification, wherein the amplification products only contain a portion of the terminal sequences of the target nucleic acids. In regular PCR amplification a single product is desired. The PCR amplification product stretches from the 5’ end primer to the 3’ end primer, and consists of two antiparallel nucleotide strands, generally known as the amplicon. In a regular PCR reaction, the entire region between the 5’ end and 3’ end primers is extended and amplified. The inventors surprisingly found that extension products containing the terminal parts of the target nucleotide sequence, but not the central part of the target nucleic acid sequence, could be obtained, thus eliminating the need for extension of the entire target region. In addition, entire regions above 5 kb cannot be directly subjected to NGS sequencing due to their large size. In contrast, the leaping PCR products are smaller and can thus be sequenced by Illumina sequencing. Leaping PCR is thus exceptionally suitable for applications where the entire sequence of the target is not required from the sequencing, but where only part of the sequence is needed to identify regions of interest from a library. Such applications are, e.g., identification of particular DNA inserts from a library, where only part of the sequence is required to identify the presence of the target nucleic acid sequence. Examples of such are genomic libraries, where entire genomes or parts of genomes are digested and inserted into cloning vectors, which are subsequently inserted into a host cell, from which the cloning vector may subsequently be extracted and amplified using PCR, whereafter the PCR product/amplicon may be sequenced, and the region of interest may be identified in the particular host cell. In embodiments, the host cell is a bacterial cell. In other embodiments, the host cell is a eukaryotic cell. Simultaneously sampling both termini of a DNA fragment has many applications and is traditionally done by a complex process involving circularizing the DNA with a vector or adaptor, random shearing, and recircularization before sequencing. The methods presented herein provide an alternative method for identifying the presence of particular regions of interest in a library, using the sequence of the terminal ends of the target region, which can be retrieved by analysis of the produced amplicons. In the present disclosure the term “host cell” refers to a cell suitable for being genetically modified to replicate a heterologous genetic element, such as a plasmid carrying one or more heterologous genes or expression elements.
Leaping PCR is not restricted to producing a single amplicon from an amplification reaction, since a target sequence may give rise to several leaps by the polymerase. Because the target region is often larger than what is practically obtainable from a single amplification reaction, it is beneficial to obtain incomplete extension fragments. This is, however, only possible where both a forward and a reverse strand capable of annealing to each other are produced. Accordingly, in embodiments, at least one set of the nucleotide strands of the amplification product can anneal. In embodiments, the two nucleotide strands of the amplification product can anneal.
Incomplete extension fragments may, in leaping PCR, be produced by the extension leaping from one site in the target nucleic acid sequence onto a new site further up/down the target nucleic acid sequence, thus completely omitting a central part of the target nucleic acid from the amplification. An illustration of the principle of leaping PCR can be seen in Figure 1. The omitted central part of the strand is preferably more than 5 kb (kilobases), such as more than 10 kb, 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb, 150 kb, 200 kb, 300 kb or such as more than 500 kb. In embodiments, the target nucleic acid contains one leaping point, thus giving rise to a single amplification product. In other embodiments, the target nucleic acid contains more than one leaping point, thus giving rise to more than one amplification product. In other embodiments, the target nucleic acid contains several leaping points, thus giving rise to several amplification products. In the case where leaping points are present (a single leap is sufficient for deep sequencing), deep sequencing may be used to identify the terminal regions of the target nucleic acids, such as, e.g., about the first 100 and the last 100 nucleotides in the target nucleic acid sequence. Alternatively, about the 100, 50, 40, 30, 25, 20, 15 or 10 most proximal and/or the 100, 50, 40, 30, 25, 20, 15 or 10 most distal nucleotides in the target nucleic acid sequence, relative to the start of the insert as defined by the cloning site, may be identified.
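To make the notion of a leaping point concrete, the sketch below compares a sequenced product with its template and reports where the leap occurred. It is a simplified illustration, not the analysis pipeline used in the examples. It assumes that the template string starts at the forward primer site and ends at the reverse primer site, that the product is given on the same strand, and that matches are exact (mismatches and deletions within the microhomology are not handled). The names `Leap` and `find_leap` and the toy sequences are hypothetical.

```python
from typing import NamedTuple, Optional


class Leap(NamedTuple):
    upstream_end: int      # product starts with template[:upstream_end]
    downstream_start: int  # product ends with template[downstream_start:]
    microhomology: str     # junction bases that match both leaping points
    omitted_length: int    # number of template bases absent from the product


def _common_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def find_leap(template: str, product: str) -> Optional[Leap]:
    """Return the leap that explains `product`, or None if there is none."""
    template, product = template.upper(), product.upper()
    if len(product) >= len(template):
        return None  # nothing was omitted
    left = _common_prefix_len(product, template)
    right = _common_prefix_len(product[::-1], template[::-1])
    if left == len(product) or right == len(product):
        return None  # a plain truncation, not a join of both termini
    overlap = left + right - len(product)
    if overlap < 0:
        return None  # some product bases match neither terminus
    return Leap(
        upstream_end=left,
        downstream_start=len(template) - (len(product) - left),
        microhomology=product[len(product) - right:left],
        omitted_length=len(template) - len(product),
    )


# Toy template with a 4-base repeat (GGAT) at both leaping points.
template = "ACGTAC" + "GGAT" + "TTCTCTCTAA" + "GGAT" + "CAGCTA"
product = template[:10] + template[24:]
print(find_leap(template, product))
```

Because the bases in the microhomology could have been copied from either leaping point, the junction position is ambiguous by that many bases; the sketch reports the rightmost possible junction.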
Accordingly, it is preferred that the primers used for the leaping PCR amplification are specific to regions flanking the target nucleotide sequence, such as within about 1000 b, such as between 0-1000 b, from the cloning site.
In the present disclosure, the amplification product is preferably an intramolecular PCR product, i.e., a product produced from the same target nucleic acid sequence. In the present disclosure, the amplification product is preferably not an intermolecular PCR product, i.e., a product produced from different target nucleic acid sequences.
An alternative to leaping PCR is “Polymerase Jumping” which is a phenomenon occurring from stem-loop structures in the target nucleic acid sequence, where the polymerase “jumps” over the stem-loop/transposon, and a gap in the sequence is obtained (Viswanathan et al., “Template Secondary Structure Promotes Polymerase Jumping During PCR Amplification”, (1999) BioTechniques). In the present disclosure, “Polymerase Jumping” is not “leaping PCR”. In embodiments, the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s). In embodiments, the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s) between the primer binding region(s). In embodiments, the target nucleic acid/template sequence does not comprise inverted repeats between the primer binding region(s). In the present disclosure “target nucleic acid” and “template sequence” are used interchangeably.
In embodiments, the target nucleic acid/template sequence does not comprise a hairpin and/or stem-loop structure(s) within 10-500 b, such as about 10b, 25b, 50b, 75b, 100b, 125b, 150b, 175b, 200b, 250b, 300b, 350b, 400b, 450b or about 500b, from the primer target region(s). In embodiments, the target nucleic acid/template sequence does not comprise inverted repeats within 10-500 b, such as about 10b, 25b, 50b, 75b, 100b, 125b, 150b, 175b, 200b, 250b, 300b, 350b, 400b, 450b or about 500b, from the primer target region(s).
PCR AMPLIFICATION
The PCR amplification is a technique for making many copies of a specific template DNA sequence. The basic principle of PCR amplification is well known to the skilled person. One set of primers complementary to a template/target DNA are designed, and a region flanked by the primers is amplified by polymerase (e.g., DNA polymerase) in a reaction including multiple amplification cycles.
Each amplification reaction includes an initial denaturation and up to 100 cycles of annealing, strand elongation/extension and strand separation (denaturation). In each cycle of the reaction, the DNA sequence between the primers is copied. Primers can bind to the copied DNA as well as the original template sequence, so the total number of copies increases exponentially with time. Various modified PCR methods are available and well known in the art, such as the “RT-PCR” method, in which DNA is synthesized from RNA using a reverse transcriptase before performing PCR.
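The exponential growth described above can be illustrated with a short calculation. The sketch below uses the textbook idealization in which each cycle multiplies the number of copies by (1 + efficiency); the function name `ideal_copies` and the `efficiency` parameter are assumptions introduced for illustration and do not appear in the disclosure.

```python
def ideal_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` PCR cycles.

    `efficiency` is the fraction of templates copied per cycle
    (1.0 means perfect doubling); real reactions plateau and fall short.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Perfect doubling from a single template molecule.
for n in (10, 20, 30):
    print(n, "cycles:", ideal_copies(1, n))
```

With perfect doubling, 10 cycles yield 1024 copies of a single starting molecule, which is why even short cycling programs generate enough material for sequencing.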
In some embodiments, hot start PCR conditions may be used to reduce mis-priming and primer-dimer formation, improve yield, and/or ensure high PCR specificity and sensitivity. A variety of approaches may be employed to achieve hot start PCR conditions, including hot start DNA polymerases (e.g., hot start DNA polymerases with aptamer-based inhibitors or with mutations that limit activity at lower temperatures) as well as hot start dNTPs (e.g., CLEANAMP™ dNTPs, TriLink Biotechnologies).
In some embodiments, a PCR amplification may include from about 20 cycles to about 100 cycles or more (e.g., about 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 cycles).
The sequences amplified in this manner form an amplified target nucleic acid (also referred to herein as an amplicon). The amplicon of the present disclosure generally consists of only the terminal parts of the target nucleic acid and lacks the central part of the target nucleic acid/template. Primers and probes can be readily designed by those skilled in the art to target a specific template nucleic acid sequence. In certain preferred embodiments, resulting amplicons are short to allow for rapid cycling and generation of copies. The size of the amplicon can vary as needed, for example, to provide the ability to discriminate target nucleic acids from non-target nucleic acids. For example, amplicons can be less than about 1,000 nucleotides in length. In some embodiments, the amplicons are from 100 to 500 nucleotides in length (e.g., 100 to 200, 150 to 250, 300 to 400, 350 to 450, or 400 to 500 nucleotides in length). In other embodiments, the amplicons are greater than about 1,000 nucleotides in length, e.g., about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, or more nucleotides in length. In embodiments, the amplicon lacks a central part of more than 10 kb, such as more than 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb or such as more than 150 kb, of the target nucleotide sequence. In embodiments, the amplicon contains from 100 to 500 nucleotides of the target sequence (e.g., 100 to 200, 150 to 250, 300 to 400, 350 to 450, or 400 to 500 nucleotides in length), wherein the target sequence has a length of more than 10 kb. In other embodiments, the amplicon contains more than about 1,000 nucleotides of the target sequence, e.g., about 1,000, about 2,000, about 3,000, about 4,000, about 5,000, or more nucleotides of the target nucleic acid, wherein the target sequence has a length of more than 10 kb. In some embodiments, more than one (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10) target nucleic acids may be amplified in one reaction.
Primers
Amplification primers can be readily designed to target a specific template nucleic acid sequence.
Generally, the primers may be designed such that they comprise a target portion which targets a specific target/template nucleic acid sequence. Additionally, the primers may also comprise a portion which does not target the target/template nucleic acid sequence, but which serves as a barcode sequence and/or sequencing tag, i.e., a non-target portion. In embodiments, the tag nucleotide portion is a barcode sequence.
The non-target portion may comprise a sequencing portion and/or a barcode portion.
Accordingly, provided are pair(s) of PCR primers, said primers each comprising a tag nucleotide portion (i) which does not anneal to the target nucleotide and a nucleotide portion (ii) which specifically targets the 5’ end or the 3’ end regions of the target nucleotide sequence. Alternatively, the primers comprise a tag nucleotide portion (i) which does not anneal to the target nucleotide and a nucleotide portion (ii) which specifically targets a region flanking the 5’ end or the 3’ end regions of the target nucleotide sequence, such as, e.g., regions flanking an insertion site in a backbone. Non-limiting examples of such sequences are provided in Table 1.
Accordingly, each PCR primer used in the present disclosure may comprise a barcode primer region and a target-specific binding region complementary to a sequence in a target nucleic acid. Preferably the target-specific binding region is a region flanking an insertion site in a cloning vector.
Accordingly, each PCR primer used in the present disclosure may comprise a barcode primer region, a sequencing primer region, and a target-specific binding region complementary to a sequence in a target nucleic acid. Illumina and nanopore sequencing do not require a sequencing primer, whereas Sanger sequencing does. In the present disclosure, the sequencing primer used for Sanger sequencing is the PCR primer itself, so the sequencing primer region overlaps with the other regions of the primer.
Each region of the primer oligonucleotide may include 2-30 nucleotides. For example, the barcode primer regions may include 2-20 nucleotides; the sequencing primer regions may include 12-30 nucleotides; and the target-specific binding region may include 5-30 nucleotides. The overall sequence of the primers is chosen to be non-naturally occurring, when comprising a target specific and a non-target specific portion. In some embodiments, the primers may include RNA, DNA, or a combination thereof. The oligonucleotides may also contain modified nucleotides, e.g., modified bases, sugars, or phosphates. In one embodiment, uracil is substituted for positions where thymine appears in the primers, which allows removal of trace amounts of synthetic oligonucleotide and carryover PCR products by pre-treatment with uracil-DNA glycosylase (UDG).
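The primer layout and the length ranges given above can be expressed as a small helper. This is a sketch under stated assumptions: the function name `build_primer` is hypothetical, the tag is placed at the 5’ end of the target-specific region (an assumption chosen so that the 3’ end of the primer can anneal and be extended), and the example sequences are placeholders rather than sequences from Table 1 or Table 2. Writing the tag in lower case and the target-specific region in upper case follows the convention of the Table 2 legend.

```python
# Length ranges (in nucleotides) quoted in the text for each primer region.
BARCODE_RANGE = (2, 20)
TARGET_RANGE = (5, 30)


def build_primer(barcode: str, target_region: str) -> str:
    """Join a barcode tag and a target-specific region into one primer.

    Assumption: the tag is 5' of the target-specific region.
    The tag is returned in lower case, the target region in upper case.
    """
    regions = (
        ("barcode", barcode, BARCODE_RANGE),
        ("target-specific region", target_region, TARGET_RANGE),
    )
    for name, seq, (shortest, longest) in regions:
        if not shortest <= len(seq) <= longest:
            raise ValueError(f"{name} must be {shortest}-{longest} nt, got {len(seq)}")
        if set(seq.upper()) - set("ACGTU"):
            raise ValueError(f"{name} contains characters other than A, C, G, T, U")
    return barcode.lower() + target_region.upper()


# Placeholder sequences, for illustration only.
print(build_primer("ACGT", "GGATCCTCTAGAGTCGAC"))
```

Giving every well of a screening plate a different tag while keeping the target-specific region constant is what allows pooled amplification products to be traced back to individual clones.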
The target-specific binding regions may flank a sequencing assay region in the target nucleic acid and allow for amplification thereof.
The primers used in the method of the present disclosure preferably comprise a barcoding portion. Each region of the barcoding portion may include 2-20 nucleotides. For example, the barcode sequences may have 4-18 nucleotides and the primer regions may have 7-30 nucleotides. In some embodiments, the barcoding portion may include RNA, DNA, or a combination thereof. The barcoding portion may also contain modified nucleotides, e.g., modified bases, sugars, or phosphates.
By “barcode” is meant a unique oligonucleotide sequence that may allow the corresponding oligonucleotide to be identified. In some embodiments, the nucleic acid sequence may be located at a specific position in a longer nucleic acid sequence. In some embodiments, each barcode may be different from every other barcode by at least a minimum Hamming Distance, wherein the minimum Hamming Distance may be a number greater than or equal to 1, such as 2. By “Hamming Distance” is meant a relationship between two nucleic acid sequences of equal length, wherein the number corresponding to the Hamming Distance is the number of positions at which the two sequences differ.
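The Hamming Distance defined above, and the check that a set of barcodes respects a minimum pairwise distance, can be sketched as follows. The function names and the example barcodes are hypothetical and serve only to illustrate the definition.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def min_pairwise_distance(barcodes) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming_distance(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical barcode set, for illustration only.
barcodes = ["acgt", "aggt", "tcga"]
print(min_pairwise_distance(barcodes))
```

In this set, "acgt" and "aggt" differ at a single position, so the set has a minimum distance of 1 and would not satisfy a requirement of 2; a single sequencing error could turn one barcode into the other.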
Non-limiting examples of barcoding sequences and target specific sequences may, e.g., be found in Table 1.
Table 1
sequences.
Non-limiting examples of further primer sequences and target specific sequences may e.g., be found in Table 2.
Table 2
Lower case letters indicate barcode/non-target sequences; uppercase letters indicate target or overhang sequences.
In embodiments, the primer(s) comprises or consists of a nucleotide sequence selected from the group consisting of SEQ ID NO: 1-167, or a nucleotide sequence which is at least 80%, such as at least 85%, 90%, 95%, 98%, 99% identical thereto.
The term "sequence identity" as used herein describes the relatedness between two nucleotide sequences, i.e., a candidate sequence (e.g., a sequence of the disclosure) and a reference sequence (such as a prior art sequence) based on their pairwise alignment.
For purposes disclosed herein, the sequence identity between two nucleotide sequences is determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, supra) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., 2000, Trends Genet. 16: 276-277), preferably version 5.0.0 or later. The parameters used are a gap open penalty of 10, a gap extension penalty of 0.5, -endopen 10.0, -endextend 0.5 and the DNAFULL (EMBOSS version of NCBI NUC4.4) substitution matrix. The output of Needle labelled "identity" (obtained using the -nobrief option) is used as the percent identity. Generally, sequence identity may be calculated as follows: (Identical nucleotide residues x 100)/(aligned region).
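The general formula above can be applied to two sequences that have already been aligned, for example by Needle. The sketch below does not perform the alignment itself. It assumes that the two input strings are the aligned rows, padded with "-" for gaps, and that the "aligned region" is the full alignment length including gap columns; the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """(Identical nucleotide residues x 100) / (length of the aligned region).

    Both inputs must be rows of the same pairwise alignment, so they must
    have equal length. Gap columns count towards the aligned region but
    never as identical residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment: 4 identical columns out of 6 aligned columns.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

A primer that differs from a 20-nucleotide reference at four aligned positions would score 80% under this formula, the lowest identity level mentioned above.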
Extension time
As a general rule, the recommended extension time for normal PCR reactions is about one minute per 1000 base pairs (e.g., 2 minutes for a 2 kb amplification product). Accordingly, the method presented herein reduces the time spent on the extension step, since the method does not require extension of the entire target nucleic acid sequence, but only of a fraction of the target nucleic acid sequence sufficient to identify the terminal ends of the target nucleic acid sequence.
Examples 1 and 3 show that leaping PCR products are present under normal and optimized conditions, and clearly show that leaping PCR products are enriched in PCR reactions where the extension time is optimized for leaping PCR, i.e., shortened (by at least 50%) compared to normal PCR reactions, where the extension time is set to approx. 1 min/kb.
The extension time may be defined from the length of the target nucleotide sequence and the efficiency of the polymerase, also known as the polymerization rate or replication efficiency, during the extension phase.
As such the extension time is not limited but is preferably shorter than the extension time needed to synthesize the entire target nucleic acid.
Preferably, the extension time is less than 75%, such as less than 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2% or such as less than 1% of the time required by the polymerase to synthesize the entire target nucleotide sequence. More preferably the extension time is less than 50%, such as less than 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2% or such as less than 1% of the time required by the polymerase to synthesize the entire target nucleotide sequence.
As such the time required for the extension step may be determined by dividing the length of the target nucleic acid sequence by the efficiency of the polymerase. Thus, the time may be calculated as the length of the target nucleic acid sequence divided by a polymerase efficiency of, e.g., 1 kb/min.
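The calculation described above can be sketched in a few lines. The function names and the worked numbers (a 150 kb target and a 30 s extension step) are illustrative assumptions; the default rate of 1000 b/min corresponds to the 1 kb/min example given in the text.

```python
def full_extension_time_s(target_length_b: float, rate_b_per_min: float = 1000.0) -> float:
    """Seconds needed to synthesize the entire target at the given rate."""
    if target_length_b <= 0 or rate_b_per_min <= 0:
        raise ValueError("length and rate must be positive")
    return 60.0 * target_length_b / rate_b_per_min


def fraction_of_full_extension(
    extension_s: float, target_length_b: float, rate_b_per_min: float = 1000.0
) -> float:
    """Extension step as a fraction of the full-length synthesis time."""
    return extension_s / full_extension_time_s(target_length_b, rate_b_per_min)


# Illustrative values: 150 kb target, 1 kb/min polymerase, 30 s extension step.
full = full_extension_time_s(150_000)
fraction = fraction_of_full_extension(30, 150_000)
print(f"full-length synthesis: {full / 60:.0f} min")
print(f"30 s extension is {fraction:.2%} of that time")
```

Under these assumptions full-length synthesis would take 150 minutes per cycle, so a 30 s extension step is well below the 50% threshold, and below the 1% level listed among the preferred ranges.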
The polymerase efficiency of commonly used polymerases is well known, and is generally in the range of 1-2 kb/min. In examples, the polymerase efficiency of the Taq polymerase is about 1 kb/min at about 72°C. However, the specific polymerase elongation efficiency also depends on the GC content of the target nucleic acid sequence and the specific extension and/or annealing temperature. This is also evident from the results provided in Example 3, which shows that the annealing temperature can be used to further optimize the leaping PCR reaction to specifically enrich the leaping PCR products over other PCR products. Examples of such different parameters, and their influence on the obtained results, can, e.g., be seen in Example 2, which shows that leaping PCR also occurs using different polymerases, from template DNA of different quality, from DNA with different GC content, with different kinds of primers and at different annealing temperatures. Example 3 further illustrates that leaping PCR products may be obtainable from normal PCR reactions, but in much lower amounts than what is possible using the optimized leaping PCR conditions provided in Examples 1 and 2, thus clearly showing that the optimized PCR reaction in the previous examples is superior in generating leaping PCR products.
Ways to further optimize the parameters of a PCR reaction are well known to the skilled person.
The terms “polymerase efficiency” and “polymerase elongation efficiency” are used interchangeably and are defined by the polymerization rate of a polymerase under given conditions, i.e., the number of nucleotides in a sequence synthesized in a given time under particular conditions. The “polymerase efficiency” or “polymerase elongation efficiency” is usually in the range of 1-2 kb/min but is largely dependent on the specific temperature under which the extension is performed.
In embodiments, the leaping PCR amplification comprises denaturation of a target nucleotide sequence; annealing of primers to the target nucleotide sequence; and extension of the primers to produce an amplification product using a polymerase.
In embodiments, the annealing temperature is greater than the Tm of the primers, such as 1°C, 2°C, 3°C, 4°C, 5°C or such as 6°C higher than the Tm of the primers. In embodiments, the annealing temperature is lower than the Tm of the primers such as 1°C, 2°C, 3°C, 4°C, 5°C or such as 6°C lower than the Tm of the primers. In embodiments, the annealing temperature is equal to, or substantially equal to the Tm of the primers.
In embodiments, the extension phase is conducted for 5 minutes (min) or less, such as less than 4min, 2min, 1min, 45s, 30s, 15s, 10s, 8s, 7s, 6s, or such as 5s or less. Preferably the extension phase is conducted for a time less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence, wherein the target sequence has a size of 10 kb - 500 kb. As such the extension phase may be conducted for less than 5 min, 4min, 2min, 1 min, 45s, 30s, 15s, 10s, 8s, 7s, 6s, or such as 5s or less for a target region of more than or equal to 1 kb, 2 kb, 3 kb, 4 kb, 5 kb, 10 kb, 25 kb, 50 kb, 100 kb, 200 kb, 300 kb, or more than or equal to 500 kb. In particular embodiments the extension time is between 5s and 2 min.
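The 50% criterion above can be illustrated with a short worked calculation. The sketch below assumes a constant elongation rate, which is a simplification since the rate depends on temperature and GC content; the function names are hypothetical, and the 222 kb target at 1 kb/min is used only as an example input.

```python
def full_synthesis_time_s(target_b: int, rate_b_per_min: float) -> float:
    """Seconds needed to copy the entire target at a constant elongation rate."""
    return target_b / rate_b_per_min * 60.0


def is_leaping_extension(extension_s: float, target_b: int,
                         rate_b_per_min: float, max_fraction: float = 0.5) -> bool:
    """True if the extension time is below max_fraction of the full synthesis time."""
    return extension_s < max_fraction * full_synthesis_time_s(target_b, rate_b_per_min)


if __name__ == "__main__":
    # 222 kb target, polymerase at about 1 kb/min, 30 s extension step
    full = full_synthesis_time_s(222_000, 1_000)
    print(full, 30 / full, is_leaping_extension(30, 222_000, 1_000))
```

With these inputs the full synthesis time is 13320 s, so a 30 s extension is well below 1% of the time needed to copy the whole target.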
Accordingly, in the present disclosure it is preferred that the amplification product(s) obtained from the leaping PCR reaction is about 75 b to 5000 b long, preferably 150-2200 b long, such as about 75b, 100b, 150 b, 200 b, 300b, 400b, 500b, 600b, 700b, 800b, 900b, 1000 b, 1100b, 1200b, 1300b, 1400b, 1500b, 1600b, 1700b, 1800 b, 1900 b, 2000 b, 2100 b, or such as about 2200 b, more preferably about 1-2 kb.
In particular embodiments, the polymerase has a replication efficiency of at least 500b/min during the extension phase of the PCR amplification, such as at least 1 kb/min, 1.5kb/min, 2kb/min, 2.5kb/min or such as at least 3kb/min, or such as between 500b-3 kb/min, such as 1kb-2.5 kb/min, or such as about 1 kb/min.
Example 3 underlines that different polymerases may be used in order to produce the leaping PCR products; specifically, Example 3 shows that the proofreading Q5® Hot Start High-Fidelity polymerase and the nonproofreading DreamTaq™ Hot Start DNA Polymerase both produce leaping PCR products.
The replication efficiency of a polymerase may also be referred to as mean synthesis speed. Examples of suitable polymerase synthesis speeds are e.g., in the range of 15s/kb to 120s/kb, such as 15s/kb, 30s/kb, 40s/kb, 45s/kb, 50s/kb, 60s/kb, 75s/kb or such as 120s/kb.
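Since the disclosure expresses polymerase speed both as kb/min and as s/kb, the conversion between the two units may be sketched as follows. This is plain unit arithmetic; the function names are hypothetical.

```python
def s_per_kb_to_kb_per_min(s_per_kb: float) -> float:
    """Convert a synthesis speed given in seconds per kb to kb per minute."""
    return 60.0 / s_per_kb


def kb_per_min_to_s_per_kb(kb_per_min: float) -> float:
    """Convert a replication efficiency given in kb per minute to seconds per kb."""
    return 60.0 / kb_per_min


if __name__ == "__main__":
    for speed in (15, 30, 60, 120):
        print(f"{speed} s/kb = {s_per_kb_to_kb_per_min(speed)} kb/min")
```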
Non-limiting examples of polymerases are e.g., Taq polymerase and variants thereof, pfu polymerase and variants thereof, and Fi polymerase and variants thereof.
Sequencing
In embodiments the PCR products/amplicons may be purified before sequencing. In other embodiments the PCR products/amplicons are sequenced without purification. Methods for purification of amplification products from a PCR reaction are well known in the art.
In some embodiments of any of the methods described herein, the detecting includes sequencing. In some embodiments, the sequencing includes massively parallel sequencing, Sanger sequencing, or single-molecule sequencing. In some embodiments, the massively parallel sequencing includes sequencing by synthesis or sequencing by ligation. In some embodiments, the massively parallel sequencing includes sequencing by synthesis. In some embodiments, the sequencing by synthesis includes ILLUMINA™ dye sequencing, ion semiconductor sequencing, or pyrosequencing. In some embodiments, the sequencing by synthesis includes ILLUMINA™ dye sequencing. In some embodiments, the sequencing by ligation includes sequencing by oligonucleotide ligation and detection (SOLiD™) sequencing or polony-based sequencing. In some embodiments, the single-molecule sequencing is nanopore sequencing, single-molecule real-time (SMRT™) sequencing, or Helicos™ sequencing.
GENOMIC DNA LIBRARIES
Examples 1, 5 and 7 of the present disclosure describe the use of leaping PCR on different genomic libraries such as a bacterial artificial chromosome (BAC) library. In Example 1 the genome of a target organism is digested using a restriction enzyme, whereafter the DNA fragments are cloned into a BAC vector carrying the site recognized by the restriction enzyme. Such DNA fragments are also referred to herein as vectorized DNA fragments. The BAC constructs are then transformed into a suitable host cell, in the case of Example 1 an E. coli cell, which is then grown, and each individual colony is mapped and paired with a primer set comprising a pair of barcode tags unique for the specific colony. The DNA fragment inserted into the BAC construct is amplified using leaping PCR, using primers which target a region flanking the insertion site via a target-specific nucleic acid sequence, as shown in Figure 12 b. Genomic libraries may be prepared in numerous ways known to the skilled person. In the present disclosure, the terms “bases” (b) and “nucleotides” (nt) are used interchangeably, and kilobases is abbreviated as “kb”.
Examples 5 and 7 illustrate how leaping PCR can be used to improve the identification of biosynthetic gene clusters in large genomic libraries by, firstly, improving the speed of the assay by reducing the extension time and, secondly, providing more identifications of the gene clusters than what is possible by normal PCR reaction methods and sequencing methods such as for example Sanger sequencing or Illumina sequencing. The DNA sequences for biosynthetic gene clusters (BGCs) amount to around 10 % of the genome of several species. However, a high throughput cloning method independent of BGC type has been missing. This problem is solved by the use of leaping PCR. Examples 5 and 6 illustrate a high throughput BGC cloning platform that combines traditional BAC library construction and leaping PCR based end-paired library analysis. Due to the size of BGCs, which is generally above 10 kb and often above 100 kb, normal PCR is not suited for extension of these products. Accordingly, leaping PCR solves this problem by only requiring extension of a fraction of the target BGC. This results in a highly optimized pipeline for identification of gene regions such as BGCs.
Accordingly, in embodiments, the method provided herein is for identifying gene clusters. Accordingly, in embodiments, the method provided herein is for identifying biosynthetic gene clusters. As used herein, “biosynthetic gene clusters” (BGCs) are sets of microbial genes that synthesize a wide plurality of biosynthetic compounds with diverse functions, such as siderophores and antibiotics; often such biosynthetic compounds are bioactive compounds. As such, gene clusters are generally clusters of functionally related genes, i.e., they often include genes within the same pathway, genes encoding interacting proteins, or genes that affect the same phenotype. Such gene clusters may e.g., be identified using a genomic library produced using an exonuclease, such as e.g., Exonuclease I or Xrn1, or an endonuclease, such as e.g., BamHI, EcoRV, EcoRI, or HindIII. Exonuclease/endonuclease-generated genomic libraries often entail providing genetic material, e.g., a chromosome of a bacterium or fungus, which is cleaved into fragments using specific nucleases to provide a fragmented genome. Due to the specific restriction sites resulting from the cleavage using the exonucleases/endonucleases, the fragments may be inserted into vectors that are designed to receive the fragments with the specific restriction sites. Accordingly, such fragments inserted into the genomic library can be of varying sizes.
In embodiments, the methods provided herein are for identifying large genomic inserts. In embodiments, large genomic inserts are >10 kb, such as >15 kb, >20 kb, >30 kb, >40 kb, >50 kb, >75 kb, >100 kb, >130 kb or such as >150 kb. In embodiments, leaping PCR is used prior to sequencing.
By “library” is meant the amplification product of multiple nucleic acids, wherein the multiple nucleic acids may have the same or different sequences.
The present disclosure also provides a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprised in a genomic library. The method for performing the identification of genomic libraries is as such not limited to the particulars of the methods disclosed in the examples but may be conducted in a plethora of ways known to the skilled person.
In essence, such a method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprises the steps of:
a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) as disclosed herein, c) performing a leaping PCR method on the vectorized DNA fragments using said primers as defined herein, d) sequencing said amplification product, e) identifying the presence of the one or more target sequences based on the sequence of the amplification product or part of the amplification product.
In additional embodiments, the present disclosure relates to a method for identifying the presence of one or more biosynthetic gene clusters from a genomic library, said method comprising the steps of: a) providing a genomic library comprising one or more vectorized DNA fragments comprising a backbone and an insert, wherein the insert is suspected of containing one or more biosynthetic gene clusters, b) providing one or more PCR primer pair(s) targeting the regions flanking the insertion site of the backbone, c) performing a leaping PCR protocol on the vectorized DNA fragments using said primers, d) sequencing said leaping PCR amplification product, e) identifying the presence of the one or more biosynthetic gene clusters based on the sequence of the leaping PCR amplification product or part of the amplification product.
In additional embodiments, the present disclosure relates to a method for identifying the sequences in one or more vectorized DNA fragments comprised in a genomic library, comprising a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) targeting the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments, c) performing a leaping PCR method as defined herein, on the vectorized DNA fragments using said primers, d) sequencing said amplification product, e) identifying the sequences based on the sequence of the amplification product.
In embodiments, the identification of the presence of the one or more biosynthetic gene clusters is based on a part of the amplification product, such as the first and/or last 10-100 b, preferably 25-50 b of the amplification product. In embodiments, the identification of the presence of the one or more biosynthetic gene clusters is based on the first and/or last 10-100 b, such as about 15 b, 20 b, 25 b, 30 b, 35 b, 40 b, 45 b, 50 b, 55 b, 60 b, 65 b, 70 b, 75 b, 80 b, 85 b, 90 b, 95 b or such as about 100 b of the amplification product. In embodiments, the identification of the presence of the one or more target genes is based on the first and/or last 10-100 b of the amplification product.
In embodiments, the identification of the presence of the one or more target genes is based on the first and/or last 10-100 b of the amplification product, not including the primer and cloning site.
In embodiments, the identification of the presence of the one or more target genes is based on the first and last 25-50 b of the amplification product.
In embodiments, the identification of the presence of the one or more target genes is based on the first and/or last 10-50 b of the amplification product, not including the primer and cloning site. In embodiments, the identification of the presence of the one or more target genes is based on the first and/or last 20-30 b, such as approx. 20 b, 21 b, 22 b, 23 b, 24 b, 25 b, 26 b, 27 b, 28 b, 29 b, or approx. 30 b of the amplification product, not including the primer and cloning site. In further embodiments the identification of the presence of the one or more target genes is based on the complete amplified sequence.
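A minimal sketch of how the first and last bases of the amplification product, not including the primer and cloning site, might be extracted is given below. It assumes the amplicon is given on the forward strand, begins with the forward primer and ends with the reverse complement of the reverse primer, and that the numbers of vector/cloning-site bases between each primer and the insert (`left_gap`, `right_gap`) are known. All names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def insert_end_tags(amplicon: str, fwd_primer: str, rev_primer: str,
                    left_gap: int = 0, right_gap: int = 0, tag_len: int = 25):
    """Return (first_tag, last_tag) of the insert-derived part of an amplicon.

    Primer sequences and the gap bases next to them are skipped, so the tags
    contain only sequence originating from the cloned insert.
    """
    rev_site = reverse_complement(rev_primer)
    if not amplicon.startswith(fwd_primer) or not amplicon.endswith(rev_site):
        raise ValueError("amplicon is not flanked by the expected primers")
    start = len(fwd_primer) + left_gap
    end = len(amplicon) - len(rev_site) - right_gap
    if end - start < tag_len:
        raise ValueError("insert-derived region is shorter than tag_len")
    return amplicon[start:start + tag_len], amplicon[end - tag_len:end]


if __name__ == "__main__":
    amplicon = "ACGTAC" + "GG" + "AAAAACCCCC" + "TTTTTGGGGG" + "CC" + "GGCCAA"
    print(insert_end_tags(amplicon, "ACGTAC", "TTGGCC", 2, 2, tag_len=5))
```

The two tags can then be compared against a reference sequence to decide whether a target of interest lies between them.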
In embodiments, the leaping PCR protocol is performed on the vectorized DNA fragments using said primers with an extension time of about 1s-30s. In other embodiments, the leaping PCR protocol is performed on the vectorized DNA fragments using said primers with an extension time of less than 50%, such as less than 40%, 30%, 20%, 10%, 5%, 2%, 1 %, 0.1 % or such as less than 0.01 % of the time required by the polymerase to synthesize the entire vectorized DNA fragment.
In embodiments, the present disclosure also relates to a method for determining only a part of the nucleotide sequence of a nucleic acid of interest, comprising: providing n samples, wherein n is an integer, and n > 1 ; dividing the n samples into m groups, wherein m is an integer, and n > m > 1 ; performing leaping PCR amplification, as defined herein, on the m groups of samples under conditions suitable for amplifying the nucleic acid of interest when templates from the samples are available, wherein a pair or multiple pairs of index primers are used for each sample, wherein each pair of index primers consists of a forward index primer and a reverse index primer and primer indexes used for different samples are different;
obtaining the leaping PCR products, comprising a sequence covering only the proximal and distal parts of the nucleic acid of interest, thus lacking a central part of the nucleic acid of interest of at least 10 kb; subjecting the recovered leaping PCR products to sequencing to obtain sequences of the leaping PCR products; and matching obtained sequencing data to corresponding samples based on a unique primer index for each sample, aligning obtained sequences of the leaping PCR products against DNA reference sequences corresponding to the nucleic acid of interest, and assembling only a partial sequence of the nucleic acid of interest from the obtained sequences of the leaping PCR products based on sequence overlap. In embodiments the length of the complete DNA target sequence of the nucleic acid of interest exceeds a maximum read length of a sequencer used for said sequencing. In additional embodiments, the alignment covers less than 50%, such as less than 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, 1%, 0.1% or such as less than 0.01 % of the nucleic acid of interest. In additional embodiments, the alignment covers less than 2000 b, such as less than 1000 b, 500 b, 300 b, 200 b, 150 b, 100 b, 75 b or such as less than 50 b of the nucleic acid of interest.
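The step of matching sequencing data to samples based on a unique pair of primer indexes may be sketched as follows. The sketch assumes exact index matches and reads oriented so that the forward index is at the start of the read and the reverse index, written as it appears on the read strand, is at the end; real data would typically need mismatch tolerance and orientation handling. The function name, sample names and index sequences are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_pairs):
    """Assign reads to samples by their (forward, reverse) index pair.

    index_pairs maps a sample name to (forward_index, reverse_index_on_read).
    Matching reads are stored with both indexes trimmed off; reads matching
    no pair are collected under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for sample, (fwd_idx, rev_idx) in index_pairs.items():
            long_enough = len(read) >= len(fwd_idx) + len(rev_idx)
            if long_enough and read.startswith(fwd_idx) and read.endswith(rev_idx):
                bins[sample].append(read[len(fwd_idx):len(read) - len(rev_idx)])
                break
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    pairs = {"sample1": ("AAAA", "CCCC"), "sample2": ("AAAA", "GGGG")}
    reads = ["AAAATTTTCCCC", "AAAATATAGGGG", "TTTTTTTT"]
    print(demultiplex(reads, pairs))
```

Because the two samples in the usage example share a forward index and differ only in the reverse index, the example also shows why a pair of indexes, rather than a single index, identifies each sample.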
Plasmids
Examples 1-7 show that different vectors may be used with the methods described herein. The vectors shown in the examples are to be considered as non-limiting examples, and it is to be understood that other vectors could readily be considered by the skilled person.
As such, any vector carrying a cloning site, or a set of cloning sites, compatible with a suitable restriction enzyme or restriction endonuclease for producing the library is considered suitable to carry out the methods of the disclosure. Non-limiting examples of suitable restriction enzymes are e.g., BamHI, EcoRI and HindIII.
Additionally, the vector plasmid can be amplified by PCR before being used for the cloning. Furthermore, new cloning sites can be added during the PCR reaction by including the cloning sites in the PCR primers.
Provided herein are non-limiting examples of suitable vectors which may be used as vectors in the methods of the disclosure.
In embodiments, the vector may e.g., be a plasmid, such as pUC19 or pJC720, a BAC (bacterial artificial chromosome) vector, such as pBACe3.6, a PAC (P1-derived artificial chromosome) vector, such as pPAC4, a YAC (yeast artificial chromosome) vector, such as pYAC4, a cosmid, or a fosmid.
A cosmid is a type of hybrid plasmid that contains a Lambda phage cos sequence. A fosmid is a type of hybrid plasmid that contains a bacterial F-plasmid.
Accordingly, it is preferred that the plasmid comprises one or more restriction sites which allows for linearization of the plasmid, and integration of one or more genomic inserts of interest.
As such there is no limitation to the construction of the sequence library other than it should be suitable for amplification of the target nucleic acids.
In the method of the present disclosure, it is preferable that the amplicons produced in the PCR reaction are not too long, i.e., not longer than 5 kb, preferably not longer than 2 kb, since longer amplicons would mean an unfavourably long amplification reaction and a lower chance of successful amplification of the desired portion of the insert. Accordingly, in order to ensure that the amplification reaction is sufficiently short, it is preferred that as little as possible of the vector is amplified.
Thus, it is preferable that the primers are placed within about 1000 b, or such as between 0-1000 b, 0-750 b, 0-500 b, 0-250 b, or such as 0-10 b, or such as between 10-1000 b, 50-1000 b, 100-1000 b, 250-1000 b, or such as 500-1000 b from the genomic library cloning site. Preferably, the primers are placed immediately before or within the cloning site. In some embodiments, the primers encompass the sequence of the genomic library cloning site. In other embodiments, the primers do not encompass the sequence of the genomic library cloning site. As mentioned in the present disclosure, the extension step of said PCR amplification as described herein is performed for less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence.
As such, the present disclosure discriminates between the target nucleotide sequence, which refers to the nucleic acid sequence inserted into e.g., the vectors as described above, and the target region targeted by the primers. As such, the region targeted by the primers is not part of the inserted nucleic acid sequence, which is the unknown sequence, but is a known region, preferably a region of the vector, which may be shared by all constructs of the library. Accordingly, the primers preferably target a known target sequence, and extension of the primers results in amplification of the known primer sequence forming part of the vector and an unknown sequence forming part of the target nucleic acid sequence. Accordingly, the primers preferably target the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments.
It should be understood that any feature and/or aspect discussed above in connection with the primers, compounds etc. according to the invention applies by analogy to the methods described herein.
The following figures and examples are provided below to illustrate the present invention. They are intended to be illustrative and are not to be construed as limiting in any way.
SEQUENCES
Primer sequences can be found in Table 1 and Table 2 and are referred to as SEQ ID NO: 1- 167 in the sequence listing. Sequences with SEQ ID NOs: 168-170 are sequences presented in Figure 1.
EXAMPLES
EXAMPLE 1 - LEAPING PCR AMPLIFICATION
AIM
The aim of the present example is to establish the presence of Leaping PCR products in different vectors. The present example further aims to demonstrate the difference between one primer mis-priming and PCR leaping using the same template and conditions.
MATERIALS
OneTaq® Quick-Load® 2X Master Mix with Standard Buffer (M0486L) is used for the PCR reactions.
Plasmid p4-1 E and E. coli colony 4-1 E are used as PCR templates. p4-1 E is based on the pESAC13 PAC vector with an insertion of 222 kb. E. coli colony 4-1 E harbours the plasmid p4-1 E.
Plasmid pXJ149.12 is used as PCR template. pXJ149.12 is based on the pBeloBAC11 BAC vector with a 6.3 kb insertion.
The primers used are depicted in the below table:
Primers Xj527.1 and XJ528.1 are designed around the cloning site of the vector pESAC13. Xj462 is a primer that has no binding site on the template p4-1 E and is used here as a negative control. Primers XI11 and XI12 are designed around the insertion of pXJ149.12.
Bio-Rad S1000™ Thermal Cycler is used for the PCR programs.
METHODS
9 PCR samples were prepared with different primers and templates as shown in the following table:
The PCR reactions were prepared with the following components:
A touchdown programme was used for the amplifications, as shown in the following table:
The PCR products were checked by electrophoresis and Sanger sequencing.
Six samples were prepared with different primers as shown in the following table:
The PCR reactions were prepared with the following components:
Two different programmes with different annealing temperatures were used for the amplifications.
Final annealing at 55°C
Final annealing at 45°C
The PCR products were checked by electrophoresis and Sanger sequencing.
RESULTS
As a proof of concept, we first tested a PCR reaction with a DNA fragment of 222 kb carried on a BAC plasmid. The 222 kb sequence was cloned from the genome of Kutzneria sp. CA-103260. It has a GC content of 72%. The plasmid was named p4-1 E. The primer sites flank the cloning site and are six b from the insertion.
The target size is beyond what PCR can normally amplify, so no full-size product can be observed. Instead, a nonspecific product band below 1 kb can be seen on agarose gel (Figure 2). Sequencing reveals that this product represents a new type of PCR artifact. It is a recombinant molecule composed of the sequences from both ends of the PCR target (Figure 1a and b). The resulting band at approx. 1 kb is the PCR leaping product, where the amplification has leaped over the middle part of the target. In the negative controls, where only one of the specific primers was used, no such leaping products were generated. The unspecific bands were confirmed to be mis-priming products by sequencing.
Mis-priming is the classic explanation for shortened product formation and mis-priming of one primer has been deliberately used to amplify one terminal of a sequence in technologies like single primer PCR and semi-random PCR.
One-primer mis-priming and PCR leaping were compared using the same template and conditions. At the higher annealing temperature of 55°C, PCR leaping products were generated when a pair of primers was used. When only one primer was used at doubled concentration, no nonspecific bands could be observed. Upon reduction of the annealing temperature to 45°C, nonspecific bands were formed in the one-primer reactions and were confirmed to be mis-priming products by sequencing (Figure 3). This shows that leaping PCR products are more frequently produced than mis-priming products.
The present example shows that PCR reactions optimized for leaping PCR produce leaping products that are enriched over full-length products when the targets used are longer than what can be elongated by the polymerase in the relatively short extension time.
EXAMPLE 2 - EFFECT OF POLYMERASE AND ANNEALING TEMPERATURE ON LEAPING PCR
AIM
Leaping PCR amplifications, and PCR in general, are affected by many factors. The aim of the present example is to demonstrate the effect of different polymerases and different annealing temperatures on leaping PCR.
MATERIALS
Q5® Hot Start High-Fidelity 2X Master Mix from NEB and DreamTaq™ Hot Start DNA Polymerases 2X Master Mix from Thermo Scientific are tested for the PCR reactions. Overnight culture of E. coli colony 4-1 E was used as PCR template. DNA fragments 9.3 and 12.3 were from genomic DNA of Streptomyces coelicolor A3(2) and were used as PCR templates. The following primers were used in the PCR reactions.
Xj527.9 and XJ528.9 have additional sequences on the 5’ ends that can form hairpin structures to protect against mis-priming.
Bio-Rad S1000™ Thermal Cycler was used for the PCR programmes.
METHODS
To test polymerase with or without proofreading activity:
PCR reactions were prepared with different polymerases as in the following table:
PCR reactions with Q5 polymerase were carried out with the following program:
PCR reactions with DreamTaq polymerase were carried out with the following program:
The PCR products were checked by electrophoresis and Sanger sequencing.
To test primers with or without hairpin protection structures:
PCR reactions were prepared with different primer pairs. In one set of reactions, primers Xj527.1 and Xj528.1 were used. In the other set of reactions, primers Xj527.9 and Xj528.9 were used. Xj527.1 and Xj528.1 do not have hairpin structures. Xj527.9 and XJ528.9 have additional sequences on the 5’ ends that can form hairpin structures to protect against mis-priming.
The PCR reactions were prepared with the following components:
The PCR reactions were carried out with the following program:
The PCR products were checked by electrophoresis and Sanger sequencing.
To test high annealing temperatures at 66°C.
The PCR reactions were prepared with Q5 polymerases and different templates and primers as follows:
PCR reaction mixtures were prepared as in the following table.
The PCR reactions were carried out using the following program:
The PCR products were checked by electrophoresis and Sanger sequencing.
To test DNA target with low GC content of 45%:
The PCR reactions were prepared for seven colonies from the DNA library Lib.C74. The library insertions have an average GC content of 45%.
Primers Xj595 and Xj596 were used in one set of reactions. Primers 597+598 were used in the other set of reactions. Primers 597+598 have additional sequences on their 5’ ends forming hairpin structures against mis-priming.
The PCR reaction mixtures were prepared as follows:
The PCR reactions were carried out using the following program
The PCR products were checked by electrophoresis and Sanger sequencing.
RESULTS
PCR results are known to be affected by many factors. For example, purified DNA is in general better than DNA in complex matrices. Sequences with higher GC content tend to form secondary structures and cause primer mis-binding. DNA polymerases with proofreading activity can repair 3’ end mismatches and reinitiate extension or priming. Primers with hairpin structures can increase binding specificity. Lower annealing temperatures promote primer mis-binding. Example 1 already shows that leaping PCR occurs from different qualities of templates, for example from purified plasmid DNA, boiled E. coli colony or E. coli colony directly from LB plate. In boiled E. coli colony or E. coli colony directly from LB plate, the target DNA is in complex matrices of proteins, RNA and genomic DNA of the E. coli host.
Example 1 also shows that leaping PCR occurs from DNA template with high GC content of 72%.
Example 1 also shows that leaping PCR occurs when low annealing temperature of 40 °C was used.
To check the dependence of PCR leaping on different factors, additional conditions were tested. Proofreading DNA polymerase Q5 and nonproofreading polymerase Dream Taq were compared with the same template and primers. From both polymerases unspecific bands smaller than the full-length target were generated and most of them could be confirmed to be PCR leaping products (Figure 4).
Primers with or without hairpin structures were tested with the same polymerase and templates. From both unspecific bands smaller than the full-length target were generated and most of them could be confirmed to be PCR leaping products (Figure 5).
Higher annealing temperature of 66°C was tested with the same polymerase. Unspecific bands smaller than the full-length target were generated and most of them could be confirmed to be PCR leaping products (Figure 6).
DNA template with low GC content of 45% was tested with the same polymerase and primer designs. Unspecific bands smaller than the full-length target were generated and most of them could be confirmed to be PCR leaping products (Figure 7).
Accordingly, the present example shows that leaping PCR also occurs using different polymerases, from different quality of template DNA, from DNA with different GC content, with different kinds of primers and at different annealing temperatures.
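The GC content values referred to in this example (e.g., 72% and 45%) are simple base-composition fractions. A minimal sketch of the calculation is given below; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are counted in the length but not as G or C.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    if not seq:
        raise ValueError("empty sequence")
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


if __name__ == "__main__":
    print(f"{gc_content('GGCCAT'):.1%}")
```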
EXAMPLE 3 - LEAPING PCR PRODUCTS UNDER NORMAL PCR CONDITIONS
AIM
The aim of this example is to demonstrate that leaping PCR products may also be obtainable from normal PCR reactions.
MATERIALS
Plasmid pUC19 and lambda DNA purchased from NEB were used as templates.
Primers used for pUC19 are as follows:
Primers used for lambda DNA are as follows:
METHODS
The PCR reaction mixtures were prepared as follows:
The PCR reactions were carried out using the following program
The products were pooled together and subjected to Illumina sequencing.
RESULTS
Twenty reactions with different primer pairs were carried out on circular pUC19 DNA and linear lambda DNA. The recommended annealing temperature and extension time were used. As shown in the gel photo (Figure 8a), full-length products around 2 kb were successfully produced, indicated by the bright single bands. In the low size area, only weak smears can be seen. Those smears were recovered from the gel and subjected to Illumina sequencing. They were found to be composed of unfinished extension products and nonspecific amplification products. Among the nonspecific amplification products, PCR leaping products are the most abundant, with the rest being one-primer mis-priming and double-primer mis-priming products (Figure 8b). The leaping products were compared with the template sequence to identify the leaping points (Figure 8c). No or only very short microhomologies were found between the leaping points (Figure 8d), suggesting that the leaping can occur at random positions.
Accordingly, the present example illustrates that leaping PCR products may be obtainable from normal PCR reactions, but at a much lower amount than what is possible using the optimized leaping PCR conditions provided in the above examples.
Thus, the present example clearly shows that the optimized PCR reaction in the previous examples is superior in generating leaping PCR products.
EXAMPLE 4 - PCR LEAPING IS INTRAMOLECULAR EXTENSION
AIM
The aim of the present example is to illustrate that leaping PCR is a result of intramolecular extension rather than intermolecular extension.
MATERIALS
Fragment CP or a mixture of fragment C and fragment P were used as PCR templates.
Fragment C was amplified from the genome of Corynebacterium variabile DSM 44702 with primers xj872 and xj875.
Fragment P was amplified from the genome of Pseudomonas putida KT2440 with primers xj868 and xj871.
Fragment CP was amplified from a recombinant plasmid pXJ199 with primers xj892 and xj895.
The sequences of those primers are as follows:
The testing PCR reactions and primers used here are as follows:
METHODS
PCR reaction mixtures were prepared with different templates at different concentrations as follows:
The PCR reactions were carried out using the following program:
The PCR products were checked by electrophoresis and Sanger sequencing.
RESULTS
To investigate the mechanism of PCR leaping, we first ruled out DNA “polymerase slippage” (Viguera E., et al., “In vitro replication slippage by DNA polymerases from thermophilic organisms”. J Mol Biol 312, 323-333 (2001)) and “polymerase jumping” (Viswanathan VK, et al., “Template secondary structure promotes polymerase jumping during PCR amplification”. Biotechniques. 1999 Sep;27(3):508-11. doi: 10.2144/99273st04. PMID: 10489610.), as those require long inverted repeats above 200 bp on the template sequence next to the jumping point. Long inverted repeats of this kind do not exist in the leaping PCR templates (Figure 1 b).
Previous studies showed that when a PCR reaction contains a mixture of homologous sequences as template, for example the 16S rRNA genes in environmental samples, hybrid PCR products with the 5’ end from one sequence and the 3’ end from another may be generated. This is called the “jumping PCR” or “bridging PCR” artifact. It happens when premature extension products from earlier PCR cycles anneal to a different template in a following cycle and get extended. Premature extension products are generated because DNA polymerases can only synthesize a short sequence of 3 to 104 b on average in each binding event and frequently dissociate from and re-associate with the DNA extension intermediates. To check whether PCR leaping is a result of “jumping PCR” occurring between two sections of the same template, we compared reactions with a single piece of DNA as template and with two separate pieces as template. Two DNA fragments, C and P, were amplified from the Corynebacterium variabile DSM 44702 and Pseudomonas putida KT2440 genomes, respectively. A combined fragment CP was amplified from a recombinant plasmid with the sequences of C and P cloned together. PCR reactions with 10 different primer pairs were carried out with either fragment CP or a mixture of C and P as templates. All ten sense primers target the sequence of C, and all ten antisense primers target the sequence of P (Figure 10a). Different template concentrations from 5E-7 ng/ul up to 0.005 ng/ul were tested. A relatively short extension time was used to enrich nonspecific amplifications. As shown in the gel photo (Figure 9), the single-piece template generated nonspecific bands at a concentration of 5E-7 ng/ul, while the double-piece template required higher concentrations to generate nonspecific bands. Sequencing shows that the single-piece template mainly yields PCR leaping products, while the double-piece template yields only one-primer mis-priming products (Figure 10b).
Thus, the present example clearly shows that the leaping PCR reactions have a different mechanism (Figure 10c) than the previously reported “polymerase slippage”, “polymerase jumping”, “jumping PCR” or “bridging PCR”.
EXAMPLE 5 - HIGH THROUGHPUT BGC CLONING BY BAC LIBRARY AND LEAPING PCR BASED SCREENING
AIM
The aim of this example is to illustrate the use of Leaping PCR in high throughput sequencing from large genomic libraries.
MATERIALS
Commercially available Kutzneria sp. CA-103260 library
E. coli strain DH10B
DreamTaq Hot Start 2X Master Mix
Primers:
METHODS
The BAC library was constructed by a commercial supplier. In brief, Kutzneria sp. CA-103260 genomic DNA was partially digested, and the resulting fragments were collected and ligated into a suitable backbone vector by DNA ligase. The ligation products were electroporated into E. coli DH10B competent cells. The E. coli cells were spread on agar plates containing apramycin. Single colonies were picked into 384 well plates. Thus, the BAC library of Kutzneria sp. CA-103260 was constructed.
384 PCR reactions in an array were prepared with different sense primers and antisense primers as follows:
Forward primers used for each reaction in the 384 array:
Reverse primers used for each reaction in the 384 array:
PCR reaction mixtures were prepared with the following components:
E. coli colonies from the BAC library were printed into the PCR reaction array by a plastic pin replicator.
PCR reactions were carried out using the following program:
The PCR products were pooled together and subjected to Illumina sequencing.
The sequencing reads were checked for their barcodes and mapped to the Kutzneria sp. CA-103260 genome to find the insertions and the corresponding colonies.
RESULTS
Microbial natural products have contributed most life-saving antibiotics as well as many other bioactive compounds. Genome analysis confirms that there is still a vast reservoir of biosynthetic gene clusters (BGCs) for novel compounds. For example, software tools like antiSMASH find around 30 to 60 BGCs on average per actinobacterial genome, and the majority of them can encode new chemicals.
The DNA sequences of BGCs account for around 10 % of the genome. Recently, several methods have been developed to clone these large BGCs, including methods based on degenerate primers targeting conserved sequences in PKS and NRPS.
A high throughput cloning method independent of BGC type is still missing. Here we established a high throughput BGC cloning platform by combining a traditional BAC library with leaping PCR based end-paired library analysis. The BAC library was built with the shuttle vector pESAC13-Apramycin to facilitate subsequent transfer and expression of the BGCs in Streptomyces. The BamHI cloning site of the pESAC13-Apramycin vector was used for the insertions. Kutzneria sp. CA-103260 genomic DNA was randomly and partially digested with BamHI, inserted into the vector backbone and transferred into an E. coli host. Ten colonies were randomly picked, and their plasmids were checked by pulsed-field gel electrophoresis. The average insertion size was found to be around 140 kb (Figure 11), so one 384 well plate of the library represents about 5.4-fold coverage of a 10 Mbp genome. Primers were designed to flank the BamHI cloning site, 6 bp away from the insertions. Barcodes of 3 or 4 b were added to the 5’ end of the primers. Any two barcodes differ from each other by at least one or two bases. 24 sense primers and 16 antisense primers make up an array of 384 PCR reactions. 384 colonies from a library microplate were cultured on an LB agar plate and then printed into the 384 PCR reactions with a plastic pin replicator. The PCR products were pooled together and subjected to Illumina sequencing (Figure 12a).
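The coverage and array-size figures quoted above can be checked with a few lines of arithmetic. The following is only a sketch: the function and constant names are illustrative and not part of the described method, and the genome size is the 10 Mbp value used in the text.

```python
# Figures quoted in the text; names are illustrative assumptions.
WELLS_PER_PLATE = 384        # colonies on one 384 well plate
AVG_INSERT_BP = 140_000      # average insertion size, about 140 kb (Figure 11)
GENOME_BP = 10_000_000       # genome size used in the text, 10 Mbp


def fold_coverage(n_clones: int, insert_bp: int, genome_bp: int) -> float:
    """Expected fold coverage = total cloned bases / genome size."""
    return n_clones * insert_bp / genome_bp


def array_size(n_sense: int, n_antisense: int) -> int:
    """Number of distinct sense/antisense primer pairs, one per PCR reaction."""
    return n_sense * n_antisense


if __name__ == "__main__":
    cov = fold_coverage(WELLS_PER_PLATE, AVG_INSERT_BP, GENOME_BP)
    print(f"fold coverage: {cov:.3f}")           # 5.376, i.e. about 5.4
    print(f"reactions: {array_size(24, 16)}")    # 384
```

Because each reaction uses a unique combination of one sense and one antisense primer, 24 + 16 = 40 barcoded primers are enough to label all 384 wells.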
From the sequencing results, the insertion sequences were obtained by mapping the reads against the genome, as the leaping PCR products indicate the termini of the insertions. The locations of the corresponding colonies were inferred from the primer barcode pairs (Figure 12b). By comparing the sequence ranges of the BAC insertions with those of the BGCs on the Kutzneria sp. CA-103260 genome, E. coli colonies carrying the BGCs were identified from the BAC library.
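How a barcode pair can be translated back to a plate position may be sketched as follows. This is a hypothetical illustration: the plate layout (sense primers indexing the 24 columns, antisense primers indexing the 16 rows) and the barcode sequences used in the usage lines are assumptions, since the actual primer tables are not reproduced here.

```python
import string
from typing import Dict, Optional

ROWS = string.ascii_uppercase[:16]   # rows A..P of a 384 well plate
N_COLS = 24                          # columns 1..24


def well_from_primer_pair(sense_idx: int, antisense_idx: int) -> str:
    """Map 0-based primer indices to a well name.

    Assumed layout (hypothetical): sense primer -> column, antisense primer -> row.
    """
    if not 0 <= sense_idx < N_COLS:
        raise ValueError("sense primer index out of range")
    if not 0 <= antisense_idx < len(ROWS):
        raise ValueError("antisense primer index out of range")
    return f"{ROWS[antisense_idx]}{sense_idx + 1}"


def match_barcode(read: str, barcodes: Dict[str, int]) -> Optional[int]:
    """Return the primer index whose barcode prefixes the read, if exactly one does."""
    hits = [idx for bc, idx in barcodes.items() if read.startswith(bc)]
    return hits[0] if len(hits) == 1 else None


def assign_well(read_f: str, read_r: str,
                sense: Dict[str, int], antisense: Dict[str, int]) -> Optional[str]:
    """Assign a read pair to a well, or None if either barcode is unrecognised."""
    s, a = match_barcode(read_f, sense), match_barcode(read_r, antisense)
    if s is None or a is None:
        return None
    return well_from_primer_pair(s, a)


if __name__ == "__main__":
    # Placeholder barcodes, not the ones used in the example.
    print(assign_well("ACGTTTGG", "TGACCCAA", {"ACG": 13}, {"TGAC": 9}))  # J14
```

Requiring a unique prefix match is a conservative choice for barcodes of mixed length (3 or 4 b), where a shorter barcode could otherwise be a prefix of a longer one.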
Thus, the present example clearly shows that BAC library and leaping PCR based screening can clone BGCs in a high throughput manner more easily than previously existing methods.
EXAMPLE 6 - IDENTIFICATION OF BACTERIAL BGC USING DEEP SEQUENCING FROM LEAPING PCR
AIM
The aim of this example is to illustrate the use of sequences obtained from sequencing of pooled Leaping PCR products to identify the presence of particular biosynthetic gene clusters (BGCs) in specific colonies in the BAC plate.
MATERIALS
Illumina paired end sequencing results of unsheared leaping PCR products from an exemplary BAC library.
Computer running a Unix operating system, for example WSL, with basic bioinformatics programs installed.
METHODS
This example is performed by software (a shell script) that carries out the following steps:
1. Quality-trim raw Illumina data using AdapterRemoval v.2.3.1.
2. Demultiplex Illumina data based on the 384 position-based leaping PCR indices using AdapterRemoval v.2.3.1.
3. Strip the reads to the first 25 nt after GGATCC using the command-line utility ‘grep’.
4. Map the shortened reads from each of the 384 barcodes to the base genome using Bowtie2 v.2.3.4.1.
5. Extract the most common mapping start position using command-line utilities and programs (sort, cut, uniq and head).
6. Build GFF3 using Unix command-line utilities and programs (cut, echo, paste, awk). Besides the required fields (reference, source (always "leapingPCR"), type (always "BAC"), start, stop, orientation (orientation on the BAC backbone)), this GFF3 file also has an ID field consisting of reference-leapingPCR position (e.g. "contig_1-j14", indicating position j14 in the BAC library) and information on the mapping in the Note field (e.g. “ID=contig_1-j14;Note=supportF-16622, supportR-16583”), where the support is the number of reads mapping at the start (F) and end (R) position of the GFF3 entry. In the example given under RESULTS, 16622 read mappings started exactly at position 6262275.
7. Remove GFF3 entries which represent BACs <10 kb or >500 kb.
8. Output the finished GFF3 annotation format file.
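Steps 3 and 5 above can be illustrated in Python. This is a minimal sketch, not the shell script used in the example: it assumes that "the first 25 nt after GGATCC" means the 25 bases immediately following the first occurrence of the site in a read, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

SITE = "GGATCC"   # BamHI recognition site named in step 3
KEEP = 25         # number of bases kept after the site

# The site followed by exactly KEEP captured bases.
_PATTERN = re.compile(re.escape(SITE) + r"([ACGTN]{%d})" % KEEP)


def insert_tag(read: str) -> Optional[str]:
    """Step 3 (assumed reading): the 25 nt right after the first GGATCC, else None."""
    m = _PATTERN.search(read.upper())
    return m.group(1) if m else None


def most_common_start(starts: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Step 5: the most frequent mapping start position and its read support."""
    counts = Counter(starts)
    if not counts:
        return None
    return counts.most_common(1)[0]


if __name__ == "__main__":
    read = "ACGT" + SITE + "A" * 25 + "TTT"
    print(insert_tag(read))                    # 25 A's
    print(most_common_start([100, 100, 250]))  # (100, 2)
```

Reads in which the site is missing, or followed by fewer than 25 bases, return None and would simply be dropped before mapping.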
RESULTS
The result from this software is a GFF3 annotation file with information connecting the leaping PCR well position with the coordinates of the BAC sequence in the base genome. Example GFF3 results from one of the 384 Leaping PCR barcodes:
• contig_1 leapingPCR BAC 6262275 6347865 . 1 . ID=contig_1-j14;Note=supportF-16622, supportR-16583
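For illustration, the record above and the size filter of step 7 can be reproduced with the sketch below. The function names are hypothetical, the tab-separated nine-column layout follows the general GFF3 convention, and the orientation value is passed through unchanged as shown in the example line (standard GFF3 parsers normally expect "+" or "-" in that column).

```python
def gff3_line(contig: str, well: str, start: int, end: int, strand: str,
              support_f: int, support_r: int) -> str:
    """Build one tab-separated GFF3 record for a BAC insert (illustrative helper)."""
    attrs = (f"ID={contig}-{well};"
             f"Note=supportF-{support_f}, supportR-{support_r}")
    cols = [contig, "leapingPCR", "BAC", str(start), str(end),
            ".", strand, ".", attrs]
    return "\t".join(cols)


def plausible_bac(start: int, end: int,
                  min_bp: int = 10_000, max_bp: int = 500_000) -> bool:
    """Step 7: keep entries whose size is neither <10 kb nor >500 kb."""
    size = end - start + 1          # GFF3 coordinates are 1-based and inclusive
    return min_bp <= size <= max_bp


if __name__ == "__main__":
    line = gff3_line("contig_1", "j14", 6262275, 6347865, "1", 16622, 16583)
    print(line)
    print(6347865 - 6262275 + 1)              # insert size in bp: 85591
    print(plausible_bac(6262275, 6347865))    # True
```

The example insert spans 85591 bp, so it passes the 10 kb to 500 kb filter.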
EXAMPLE 7 - Rapid identification of insertions for colonies from activity-based screening of DNA library
AIM
The aim of this example is to illustrate the use of Leaping PCR in high throughput identification of particular inserts from large genomic libraries, using a different vector than presented in Example 5.
MATERIALS
Bacterial strain Vibrio natriegens DSM 759
HindIII restriction nuclease
Plasmid pXJ166. pXJ166 is composed of a p15a-ori plasmid replication origin, a chloramphenicol acetyltransferase gene as selective marker, and a HindIII restriction site as cloning site.
DreamTaq™ Hot Start DNA Polymerases 2X Master Mix
Primers:
METHODS
Vibrio natriegens DSM 759 was cultured and embedded in agarose gel. The cells within the gel were lysed and washed. The genomic DNA was partially digested with HindIII and then separated by pulsed-field gel electrophoresis. DNA fragments of the desired sizes were collected and ligated with the pXJ166 backbone by DNA ligase. The ligation products were electroporated into E. coli competent cells. The E. coli cells were spread on agar plates containing chloramphenicol. Single colonies were picked into 384 well plates. Thus, the library of Vibrio natriegens DSM 759 was constructed. Clones that formed large colonies within 15 hours were picked as the fast-growing strains.
The PCR reactions were prepared for the fast-growing strains. Primers 595 and 596, flanking the HindIII cloning site of the pXJ166 vector, were used.
The PCR reaction mixtures were prepared as follows:
The PCR reactions were carried out using the following program:
The PCR products were checked by electrophoresis and Sanger sequencing.
The sequencing results were compared to the Vibrio natriegens DSM 759 genome sequence (NCBI accession numbers CP009977.1 and CP009978.1) to identify the insertions.
RESULTS
To obtain new enzymes, new bioactive compounds or other new biofunctions, activity-based screening of DNA libraries is often used. Usually, multiple colonies showing the desired activity are obtained from the library screening. Identifying the DNA insertions carried by those colonies is required for further studies and applications. The insertions can be identified by isolating and sequencing the plasmids from the colonies, but this is expensive and time consuming. Here we applied leaping PCR and Sanger sequencing to quickly identify the insertions of colonies from activity-based screening of a DNA library.
The library was constructed from the genomic DNA of Vibrio natriegens. Vibrio natriegens is one of the fastest-growing bacteria discovered. The library was built with the plasmid vector pXJ166. pXJ166 has a p15a replication origin, a cmr gene as selective marker and a HindIII cloning site. Vibrio genomic DNA has an average GC content of 45%. Vibrio genomic DNA was randomly and partially digested with HindIII, inserted into the vector backbone and transferred into an E. coli host. Eight colonies were randomly picked, and their plasmids were checked by agarose gel electrophoresis. The average insertion size was found to be around 50 kb (Figure 13). All colonies were screened for their growth speed. Fast growth is desired for E. coli used in industrial fermentation, as it can greatly reduce the overall fermentation cost. Seven clones that formed large colonies within 15 hours were selected as fast-growing strains. They were subjected to leaping PCR to identify the insertions. Primers were designed to flank the HindIII cloning site. The leaping PCR products were within the size range of 200 b to 3 kb (Figure 14). The PCR products were subjected to Sanger sequencing. By comparing the sequencing results to the Vibrio natriegens genome sequences, the insertions of the seven colonies were identified to be:
Thus, the present example clearly shows that leaping PCR can be applied to libraries built with different vectors and from different DNA sources with different GC content. The present example also shows that leaping PCR can be used together with different sequencing technologies, for example Sanger sequencing or Illumina sequencing. It also shows that leaping PCR can be used to quickly identify the insertions of colonies from activity-based screening of a DNA library.
Claims
1. A method for performing a leaping PCR amplification on a target nucleotide sequence comprising, a) providing at least one pair of PCR primers, said primers each comprises i. a tag nucleotide portion (i) which does not anneal to the target nucleotide and ii. a nucleotide portion (ii) which specifically targets the 5’ end or the 3’ end regions of the target nucleotide sequence, b) performing a PCR amplification using a polymerase on the target nucleotide sequence using said PCR primers, wherein the extension step of said PCR amplification is performed for less than 50% of the time required by the polymerase to synthesize the entire target nucleotide sequence, wherein the amplification product produced by said PCR amplification comprises a 5’ end portion and a 3’ end portion of said target nucleotide sequence, characterized in that a central part of more than 10 kb, such as more than 15 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, 100 kb, 130 kb or such as more than 150 kb, of the target nucleotide sequence is absent from the amplification product.
2. The method according to claim 1, wherein the method comprises a step of sequencing said amplification product, and a step of identifying the presence of the one or more target sequences based on the sequence of the amplification product.
3. The method according to claim 1 or 2, wherein the polymerase has a replication efficiency of at least 500b/min during the extension phase of the PCR amplification, such as at least 1 kb/min or such as between 500b-3 kb/min.
4. The method according to any of the preceding claims, wherein the leaping PCR amplification comprises a) denaturation of a target nucleotide sequence, b) annealing of primers to the target nucleotide sequence, and c) extension of the primers to produce an amplification product using a polymerase, wherein c) is conducted for 5 minutes or less, such as less than 4 min, 2 min, 1 min, 45 s, 30 s, 15 s, 10 s, 8 s, 7 s, 6 s, or such as 5 s or less.
5. The method according to any of the preceding claims, wherein the leaping PCR amplification is repeated for 10-100 cycles.
6. The method according to any of the preceding claims, wherein the primers are specific to regions flanking the target nucleotide sequence.
7. The method according to any of the preceding claims, wherein said tag nucleotide portion (i) is a barcode sequence.
8. The method according to any of claims 3-6, wherein the annealing temperature is greater than, lower than, or equal to the Tm of the primers.
9. The method according to any of the preceding claims, wherein the amplification is an intramolecular amplification reaction.
10. The method according to any of the preceding claims, wherein the template sequence does not comprise a hairpin structure within 10-500 b from the primer target region.
11. The method according to any of the preceding claims, wherein the amplification product is 100 b to 5000 b long, preferably, 150-2000 b long.
12. The method according to any of the preceding claims, wherein the two nucleotide strands of the amplification product can anneal.
13. A method for identifying the presence of one or more target sequences in one or more vectorized DNA fragments comprised in a genomic library, said method comprises a) providing a genomic library comprising one or more vectorized DNA fragments, b) providing one or more PCR primer pair(s) targeting the regions on the vector flanking the genomic library cloning site of said vectorized DNA fragments, c) performing a leaping PCR method as defined in any of claims 1-11, on the vectorized DNA fragments using said primers, d) sequencing said amplification product, e) identifying the presence of the one or more target sequences based on the sequence of the amplification product.
14. The method according to claim 12, wherein the vector is a plasmid, such as pUC19, pJC720, pBACe3.6, pPAC4, or pYAC4.
15. The method according to any of claims 12 or 13, wherein the primers encompass the sequence of the genomic library cloning site.
16. The method according to any of claims 12 to 14, wherein the primers are placed within about 1000 b, such as between 0-1000 b from the cloning site of the genomic library cloning site.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24172474.9 | 2024-04-25 | ||
| EP24172474 | 2024-04-25 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025224232A1 (en) | 2025-10-30 |
Family
ID=90904570
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2025/061194 Pending WO2025224232A1 (en) | 2024-04-25 | 2025-04-24 | Pcr leaping and applications thereof |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025224232A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6323030B1 (en) * | 1994-02-17 | 2001-11-27 | Maxygen, Inc. | Methods for generating polynucleotides having desired characteristics by iterative selection and recombination |
| DE102004025650A1 (en) * | 2004-05-26 | 2006-06-08 | Schindelhauer, D., Dr. | High quality, stable, intact DNA production, for use e.g. in PCR reactions and diagnostic tests, by gel matrix electrophoresis to remove double- or single-strand broken molecules quantitatively |
| US20230193353A1 (en) * | 2020-05-07 | 2023-06-22 | Northeastern University | Methods and compositions for high-fidelity sequence analysis of individual long and ultralong nucleic acid molecules |
Non-Patent Citations (7)
| Title |
|---|
| "TEMPLATE SECONDARY STRUCTURE PROMOTES POLYMERASE JUMPING DURING PCRAMPLIFICATION", BIOTECHNIQUES, INFORMA HEALTHCARE, US, vol. 27, no. 3, 1 September 1999 (1999-09-01), pages 508 - 511, XP000849475, ISSN: 0736-6205 * |
| VIGUERA E. ET AL.: "In vitro replication slippage by DNA polymerases from thermophilic organisms", J MOL BIOL, vol. 312, 2001, pages 323 - 333, XP004449577, DOI: 10.1006/jmbi.2001.4943 |
| KLOCK HEATH E. ET AL: "Combining the polymerase incomplete primer extension method for cloning and mutagenesis with microscreening to accelerate structural genomics efforts", PROTEINS: STRUCTURE, FUNCTION, AND BIOINFORMATICS, vol. 71, no. 2, 14 November 2007 (2007-11-14), US, pages 982 - 994, XP093212666, ISSN: 0887-3585, DOI: 10.1002/prot.21786 * |
| RICE ET AL.: "EMBOSS: The European Molecular Biology Open Software Suite", TRENDS GENET., vol. 16, 2000, pages 276 - 277, XP004200114, DOI: 10.1016/S0168-9525(00)02024-2 |
| VIGUERA E ET AL: "In vitro replication slippage by DNA polymerases from thermophilic organisms", JOURNAL OF MOLECULAR BIOLOGY, ACADEMIC PRESS, UNITED KINGDOM, vol. 312, no. 2, 14 September 2001 (2001-09-14), pages 323 - 333, XP004449577, ISSN: 0022-2836, DOI: 10.1006/JMBI.2001.4943 * |
| VISWANATHAN ET AL.: "Template Secondary Structure Promotes Polymerase Jumping During PCR Amplification", BIOTECHNIQUES, 1999 |
| VISWANATHAN VK ET AL.: "Template secondary structure promotes polymerase jumping during PCR amplification", BIOTECHNIQUES, vol. 27, no. 3, September 1999 (1999-09-01), pages 508 - 11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25721254; Country of ref document: EP; Kind code of ref document: A1 |