WO2018009729A2 - Modification of DNA polymerases for in vitro applications - Google Patents
Modification of DNA polymerases for in vitro applications
- Publication number
- WO2018009729A2 (application PCT/US2017/040994)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- polymerase
- seq
- dna
- polynucleotide
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1252—DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase
Definitions
- the field of the present invention relates to novel compositions of DNA polymerases for in vitro applications.
- DNA polymerases are widely used for nucleic acid amplification, detection and sequencing.
- the most commonly used enzyme for Sanger sequencing is derived from Thermus aquaticus (Taq), whereas Bacillus stearothermophilus (Bst) DNAP is used in the 454/Roche pyrosequencing platform.
- Taq polymerase lacks proofreading activity and is unable to efficiently extend misincorporated bases. Mismatched base pairing generates truncated products that accumulate during PCR and contribute to reaction failure if the target is too long and/or the template DNA is supplied in low amounts.
- proofreading high fidelity enzymes are extremely accurate, but do not perform well over longer target distances or with low template concentration because the 3'-5' exonuclease (proofreading) activity destroys primers and affects sensitivity.
- Second and third generation instruments for massively parallel DNA sequencing can deliver megabases of data at a lower cost.
- the development of a DNAP to match the technical capabilities of new instrument platforms has not kept pace. Achieving long and accurate reads using new solid phase extension methods, terminator chemistries, and microfluidic flow technologies places new demands on currently used enzymes.
- Polymerases with increased template affinity for DNA or RNA could provide important improvements in sequencing, amplification and reverse transcription.
- NGS next generation sequencing
- DNA polymerases are well known in the art and include both DNA-dependent polymerases and RNA-dependent polymerases such as reverse transcriptase. At least five families of DNA-dependent DNA polymerases are known, although most fall into families A, B and C. Other less characterized families include D, X, Y and RT. There is little or no structural or sequence similarity among the various families. Most family A polymerases are single chain proteins that can contain multiple enzymatic functions including polymerase, 3' to 5' exonuclease activity and 5' to 3' exonuclease activity. Family B polymerases typically have a single catalytic domain with polymerase and 3' to 5' exonuclease activity, as well as accessory factors. Family C polymerases are typically multi-subunit proteins with DNA polymerizing and 3' to 5' exonuclease activity. In E. coli, three types of DNA polymerases have been found: DNA polymerases I (family A), II (family B), and III (family C). In eukaryotic cells, three different family B polymerases, DNA polymerases alpha, delta, and epsilon, are implicated in nuclear replication, and a family A polymerase, polymerase gamma, is used for mitochondrial DNA replication. Other types of DNA polymerases include phage polymerases.
- RNA polymerases typically include eukaryotic RNA polymerases I, II, and III, and bacterial RNA polymerases as well as phage and viral polymerases.
- polymerases can be DNA-dependent and RNA-dependent.
- the present invention is directed to eukaryotic DNA polymerases, in particular Pol eta, Pol iota and Pol kappa, members of Family Y DNA polymerases involved in the DNA repair by translesion synthesis and encoded by genes POLH, POLI and POLK respectively.
- the DinB Pol κ subgroup proteins are ubiquitously present from bacteria to humans, but notably absent in the completely sequenced genomes of Saccharomyces
- the E. coli DinB protein was shown to have DNA polymerase activity (designated DNA polymerase IV, or Pol IV), independently of accessory proteins such as UmuD' and RecA, which are required for UmuC-dependent DNA polymerase activity (designated Pol V).
- Members of Family Y have five common motifs to aid in binding the substrate and primer terminus and they all include the typical right-hand thumb, palm and finger domains with added domains like little finger (LF), polymerase- associated domain (PAD), or wrist.
- the active site differs between family members due to the different lesions being repaired.
- Polymerases in Family Y are low-fidelity polymerases. The importance of these polymerases is evidenced by the fact that the gene encoding DNA polymerase η is referred to as XPV, because loss of this gene results in the disease Xeroderma Pigmentosum Variant.
- Pol η is particularly important for allowing accurate translesion synthesis of DNA damage resulting from ultraviolet radiation. The functionality of Pol κ is not completely understood, but researchers have found two probable functions. Pol κ is thought to act as an extender or an inserter of a specific base at certain DNA lesions.
- the present invention provides modified DNA polymerases that have improved amplification properties and/or processivity over natural forms of the polymerases as well as other polymerases in commercial use.
- the invention also provides recombinant DNA sequences encoding such DNA polymerases, and vector plasmids and host cells suitable for the expression of these recombinant DNA sequences.
- the polymerase sequence is selected from SEQ ID NOS: 3-242.
- the polymerase may be selected from SEQ ID NOS: 156, 159, 164, 165, 166, 173, 181, 190, 214, 218, 225 and 242.
- the invention further provides a polynucleotide encoding a non-naturally occurring polymerase, wherein the polymerase has a sequence comprising SEQ ID NO: 1 modified by one or more substitutions listed in Table 3 and up to ten internal insertions, deletions or substitutions at positions other than those listed in Table 3.
- the polymerase comprises at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions listed in Table 3.
- the polymerase has at least 90, 95 or 99% sequence identity to claim 1.
- the polymerase has no substitutions other than those shown in Table 3 and conservative substitutions not affecting activity of the polymerase.
- the invention further provides a polynucleotide, which encodes an amino acid sequence at least 90, 95 or 99% sequence identical to any of SEQ ID NOS: 3-242 provided any substitutions present in the amino acid sequence specified in Table 4 are retained.
- the polymerases selected can be thermostable, retain polymerase activity and exhibit reverse transcriptase activity.
- the polymerase variants can be selected for any or all of properties that include strand displacement activity, amplification of next generation sequencing (NGS) libraries, high fidelity amplification of target sequence, amplification of amplification resistant target sequences comprising direct repeats, inverted repeats, at least 65% G+C residues or A+T residues or a sequence greater than 2 kilobases.
- any or all such properties are enhanced relative to the corresponding property of the polymerase having the amino acid sequence of SEQ ID NO: 1.
- the target sequence may be
- DNA deoxyribonucleic acid
- RNA ribonucleic acid
- DNA sequences encoding DNA polymerases and other motifs such as DNA binding proteins, antibodies and more are another aspect.
- the invention also provides a novel formulation of the DNA polymerases of the present invention and other thermostable DNA polymerases, which formulation of enzymes is capable of efficiently catalyzing the amplification by PCR (the polymerase chain reaction) of unusually long and faithful products.
- compositions comprising one or more non-natural polymerases selected from SEQ ID NOS: 3-242.
- a method comprising amplifying sequences from a target polynucleotide, for example by PCR or reverse transcription, using a modified polymerase (other than a natural sequence) selected from any of SEQ ID NOS: 3-242.
- the modified non-natural polymerase polynucleotide sequence is included in a construct operably linked to a promoter; the construct is included in a recombinant host cell.
- the target polynucleotide may be DNA or RNA and may be from a bacterial cell, a human cell or murine cell.
- the modified non-natural polymerases can amplify target polynucleotide sequences comprising amplification resistant sequence comprising direct repeats, inverted repeats, at least 65% G+C residues, or A+T residues or a sequence greater than 2 kilobases.
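The composition and length criteria named above can be checked mechanically. The following is a minimal sketch, not a method from the patent: the function names are hypothetical, the 65% threshold is assumed to apply to G+C and to A+T separately, and direct or inverted repeats are not detected.

```python
def base_fraction(seq: str, bases: str) -> float:
    """Fraction of residues in seq that belong to the given set of bases."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(b) for b in bases) / len(seq)


def resistance_flags(seq: str, threshold: float = 0.65, long_bp: int = 2000) -> dict:
    """Flag the composition and length criteria listed in the text.

    Assumption: 'at least 65%' is read as applying to G+C and to A+T
    independently. Repeat detection is out of scope for this sketch.
    """
    return {
        "gc_rich": base_fraction(seq, "GC") >= threshold,
        "at_rich": base_fraction(seq, "AT") >= threshold,
        "long_target": len(seq) > long_bp,
    }


# A short, GC-rich toy sequence (illustrative only).
print(resistance_flags("GCGCGCGGCCATGCGC"))
```

A real screen would also need a repeat finder; the flags here cover only the criteria that can be read directly off base counts and length.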
- the modified non-natural polymerases allow amplification of next generation sequencing (NGS) libraries.
- a kit comprising a polynucleotide encoding a non-natural polymerase, wherein the polymerase of SEQ ID NO: 1 comprises one or more substitutions listed in Table 3 and/or is selected from SEQ ID NOS: 3-242.
- FIGURE 1A Purified modified polymerases were tested for purity by SDS-PAGE analysis; >95% purity was observed.
- FIGURE 1B Additional purified modified polymerases were tested for purity by SDS-PAGE analysis; >95% purity was observed.
- FIGURE 2A Purified modified DNA polymerase variants were tested for ability to amplify contaminating bacterial host DNA by the E. coli 16S PCR test. Reactions were set up as described in Section 6.1.5 and analyzed by agarose gel electrophoresis.
- FIGURE 2B Additional purified modified DNA polymerase variants were tested for ability to amplify contaminating bacterial host DNA by the E. coli 16S PCR test. Reactions were set up as described in Section 6.1.5 and analyzed by agarose gel electrophoresis.
- FIGURE 3A Amplification properties of modified DNA polymerase variants were determined by PCR of a 2.8 kb PolA target DNA from E. coli genomic DNA (gDNA). PCR reactions were run at two denaturation temperatures, 94°C and 98°C, to determine thermostability and analyzed by agarose gel electrophoresis. Polymerases Accura (Acc) DNAP, GoTaq and KAPA (KA) were run as controls.
- FIGURE 3B Additional PCR reactions from Figure 3A were run at two denaturation temperatures 94°C and 98°C to determine thermostability and analyzed by agarose gel electrophoresis as above. Polymerases Accura (Acc) DNAP and KAPA (KA) were run as controls.
- Acc Accura
- KA KAPA
- FIGURE 4A Amplification properties of modified DNA polymerase variants were determined by PCR of a 5 kb target DNA from E. coli genomic DNA (gDNA) at denaturation temperatures of 94°C and 98°C. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 4B Amplification properties of additional modified DNA polymerase variants were determined by PCR of a 5 kb target DNA from E. coli genomic DNA (gDNA) at denaturation temperatures of 94°C and 98°C. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 5A Amplification properties of modified DNA polymerase variants were determined by PCR of a 10 kb target DNA from E. coli genomic DNA (gDNA) at denaturation temperatures of 94°C and 98°C. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 5B Amplification properties of additional modified DNA polymerase variants were determined by PCR of a 10 kb target DNA from E. coli genomic DNA (gDNA) at denaturation temperatures of 94°C and 98°C. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 6A Modified DNA polymerase variants efficiently amplified a longer human DNA target (5 kb) by PCR; reactions were analyzed by agarose gel electrophoresis. Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls.
- FIGURE 6B Modified DNA polymerase variants (V313 and V318 shown) efficiently amplified longer human DNA target (5 kb) by PCR and analyzed on a
- FIGURE 7 panels A and B: Modified DNA polymerase variants identified show good NGS human DNA library amplification. PCR reactions were analyzed by agarose gel electrophoresis. Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls. Several modified DNA polymerase variants successfully amplified a human DNA library, giving high yields of amplicons.
- FIGURE 8 panels I and II: Modified DNA polymerase variants successfully amplified a high GC-content NGS DNA library from Rhodobacter. PCR reactions were analyzed by agarose gel electrophoresis and on a Bioanalyser (data not shown). Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls.
- FIGURE 9 panels I and II: Modified DNA polymerase variants successfully amplified a high AT-content NGS DNA library from Staphylococcus. PCR reactions were analyzed by agarose gel electrophoresis. Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls.
- FIGURE 10 Panels A and B show modified DNA polymerase variants that show strong strand displacement activity by primer extension. Primer extension was set up and reactions analyzed by agarose gel electrophoresis.
- FIGURE 11A Modified DNA polymerase variants show high reverse transcriptase (RT) activity on a 520 base pair (bp) MS2 RNA target. RT-PCR reactions were run without (panel I) and with KF (panel II) and analyzed by agarose gel electrophoresis. Accura (Acc) DNA polymerase was run as control.
- FIGURE 11B Additional modified DNA polymerase variants show high reverse transcriptase (RT) activity on a 520 base pair (bp) MS2 RNA target. RT-PCR reactions were run without (panel I) and with KF (panel II) and analyzed by agarose gel electrophoresis. Accura (Acc) DNA polymerase was run as control.
- FIGURE 11C Modified DNA polymerase variants show high reverse transcriptase (RT) activity on a 520 base pair (bp) MS2 RNA target. A subset of samples from Figure 11A were re-analyzed on an agarose gel to confirm positives.
- RT reverse transcriptase
- FIGURE 12 Twelve DNA polymerase variants from Table 10 were tested in a HotStart version to show inhibition of polymerase activity by antibody.
- FIGURE 13 Efficient amplification of a 3 kb plasmid E. coli target was shown for the 12 variants from round 1 and two round 2 variants, V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117). PCR reaction was set up and analyzed by agarose gel electrophoresis.
- FIGURE 14 Efficient amplification of a 5 kb E. coli target was shown for the 12 variants from rounds 1 and 2, and round 2 variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117). PCR reaction was set up and analyzed by agarose gel electrophoresis.
- FIGURE 15A Efficient amplification of a 10 kb E. coli target was shown for the 12 variants from rounds 1 and 2, and round 2 variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117). PCR reaction was set up and analyzed by agarose gel electrophoresis.
- FIGURE 15B Efficient amplification of a 10 kb E. coli target was shown for the 12 variants from rounds 1 and 2, and round 2 variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117).
- PCR reaction was set up and analyzed by agarose gel electrophoresis.
- Panel I shows reactions after 35 cycles of amplification.
- Panel II shows reactions after 30 cycles of amplification.
- FIGURE 16 Efficient amplification of a 5 kb human DNA target was shown for the 12 modified polymerase variants from rounds 1 and 2, and round 2 variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117). Accura DNA polymerase and KAPA DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 17 Efficient amplification of NGS DNA libraries from human (panel I) and Rhodobacter (panel II) was shown for the 12 modified polymerase variants from Table 10 and two second round variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117). Accura DNA polymerase and KAPA DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis.
- FIGURE 18 Efficient amplification of an NGS DNA library from AT-rich Staphylococcus was shown for the 12 modified polymerase variants from Table 10 and two second round variants V103 (SEQ ID NO: 101) and V119 (SEQ ID NO: 117).
- FIGURE 19A PCR reaction optimization was carried out for modified polymerase variants (SEQ ID NOS: 159 and 190) in the presence of sugars, for example sucrose, trehalose and mannitol, and reactions were analyzed by agarose gel electrophoresis.
- FIGURE 19B PCR optimization was carried out for modified polymerase variants (SEQ ID NOS: 159 and 190) in the presence of sugars, for example trehalose, mannitol and sorbitol, and reactions were analyzed by agarose gel electrophoresis.
- FIGURE 20 PCR reaction optimization was carried out for modified polymerase variant (SEQ ID NO: 190) in the presence of BSA or L-carnitine and reactions analyzed by agarose gel electrophoresis.
- FIGURE 21A PCR reaction optimization was carried out for modified polymerase variants (SEQ ID NOS: 159 and 190) in the presence of sugars, for example sucrose, trehalose, mannitol and sorbitol, in the presence or absence of L-carnitine, and reactions were analyzed by agarose gel electrophoresis. Sugar and L-carnitine together showed an additive effect on amplification of a 3 kb CometGFP DNA target.
- FIGURE 21B PCR reaction optimization was carried out for modified polymerase variants (SEQ ID NOS: 159 and 190) in the presence of sugars, for example sucrose, trehalose, mannitol and sorbitol, in the presence or absence of L-carnitine, and reactions were analyzed by agarose gel electrophoresis. Sugar and L-carnitine together showed an additive effect on amplification of a 5 kb human gDNA target.
- FIGURE 22A Various combinations of buffer compositions were used to set up PCR on a 3 kb E. coli target DNA with modified polymerase variant (SEQ ID NO: 190). PCR reactions with or without additive were analyzed by agarose gel electrophoresis.
- FIGURE 22B Various combinations of buffer compositions were used to set up PCR on a 5 kb E. coli target DNA with modified polymerase variant (SEQ ID NO: 190). PCR reactions with or without additive were analyzed by agarose gel electrophoresis.
- FIGURE 23A PCR reactions with standard PCR buffer were tested on CometGFP DNA as template with addition of sorbitol and with modified polymerase variant (SEQ ID NO: 190). PCR reactions without and with addition of sorbitol were analyzed by agarose gel electrophoresis.
- FIGURE 23B PCR reactions with standard PCR buffer were tested on CometGFP DNA as template with addition of KCl and with modified polymerase variant (SEQ ID NO: 190). PCR reactions without and with addition of KCl were analyzed by agarose gel electrophoresis.
- FIGURE 23C PCR reactions with standard PCR buffer were tested on CometGFP DNA as template with addition of non-ionic detergents, for example Triton X-100, Tween 20, NP-40, CHAPS and Brij 8, and with modified polymerase variant (SEQ ID NO: 190). PCR reactions were analyzed by agarose gel electrophoresis.
DETAILED DESCRIPTION OF THE INVENTION
- Amplification reaction refers to any in vitro means for multiplying the copies of a target sequence of nucleic acid.
- Such methods include but are not limited to polymerase chain reaction (PCR), DNA ligase chain reaction (LCR) (see U.S. Pat. Nos. 4,683,195 and 4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al., eds, 1990)), QBeta RNA replicase, and RNA transcription-based (such as TAS and 3SR) amplification reactions, as well as others known to those of skill in the art.
- “Amplifying” refers to a step of submitting a solution to conditions sufficient to allow for amplification of a polynucleotide if all components of the reaction are intact.
- Components of an amplification reaction include, e.g., primers, a polynucleotide template, polymerase, nucleotides, and the like.
- the term "amplifying" typically refers to an exponential increase in target nucleic acid.
- amplifying as used herein can also refer to linear increases in the numbers of a select target sequence of nucleic acid, such as is obtained with cycle sequencing.
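The difference between exponential and linear amplification can be made concrete with a small calculation. This is an idealized sketch rather than a model from the source: the function names and the per-cycle `efficiency` parameter are assumptions introduced for illustration, and real reactions plateau as reagents are consumed.

```python
def exponential_copies(start_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential amplification (e.g. PCR).

    Each cycle multiplies the copy number by (1 + efficiency);
    efficiency = 1.0 means perfect doubling. The efficiency
    parameter is an assumption, not a value from the source.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(template_copies: int, cycles: int) -> int:
    """Idealized linear amplification (e.g. cycle sequencing).

    Only the original templates are copied, so each cycle adds
    one new product strand per template.
    """
    return template_copies * cycles


# Compare the two regimes for 100 starting templates over 30 cycles.
print(exponential_copies(100, 30))
print(linear_copies(100, 30))
```

Under these assumptions the exponential case grows by a factor of 2 per cycle, while the linear case grows by a fixed increment, which is why the two terms are distinguished in the definition.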
- Amplification reaction mixture refers to an aqueous solution comprising the various reagents used to amplify a target nucleic acid. These include enzymes, aqueous buffers, salts, amplification primers, target nucleic acid, and nucleoside triphosphates.
- the mixture can be either a complete or incomplete
- DNA sequence means a contiguous nucleic acid sequence.
- the sequence can be an oligonucleotide of 2 to 20 nucleotides in length to a full length genomic sequence of thousands or hundreds of thousands of base pairs.
- Domain refers to a unit of a protein or protein complex, comprising a polypeptide subsequence, a complete polypeptide sequence, or a plurality of polypeptide sequences where that unit has a defined function.
- the function is understood to be broadly defined and can be ligand binding, catalytic activity or can have a stabilizing effect on the structure of the protein.
- DNA binding domain refers to nucleic acid and both full-length polypeptides and fragments of the polypeptides that have sequence-nonspecific double-stranded DNA binding activity.
- expression system means any in vivo or in vitro biological system that is used to produce one or more gene product encoded by a polynucleotide.
- Enhances in the context of an enzyme refers to improving the activity of the enzyme, i.e., increasing the amount of product per unit enzyme per unit time.
- Two elements are "heterologous" to one another if not naturally associated.
- a nucleic acid sequence encoding a protein linked to a heterologous promoter means a promoter other than that which naturally drives expression of the protein.
- a heterologous nucleic acid flanked by transposon ends or ITRs means a heterologous nucleic acid not naturally flanked by those transposon ends or ITRs, such as a nucleic acid encoding a polypeptide other than a transposase, including an antibody heavy or light chain.
- a nucleic acid is heterologous to a cell if not normally found in the cell or in a different location (e.g., episomal or different genomic location) than the location naturally present within a cell.
- the term "host" means any prokaryotic or eukaryotic organism that can be a recipient of a nucleic acid.
- the terms "host," "host cell," "host system" and "expression host" can be used interchangeably.
- An 'isolated' polypeptide or polynucleotide means a polypeptide or polynucleotide that has been either removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized.
- a polypeptide or polynucleotide of this invention is purified, that is, it is essentially free from any other polypeptide or polynucleotide and associated cellular products or other impurities.
- "identical", in the context of two nucleic acid or polypeptide sequences, refers to the residues in the two sequences that are the same when aligned for maximum correspondence, as measured using a sequence comparison algorithm.
- nucleoside and nucleotide include those moieties which contain not only the known purine and pyrimidine bases, but also other heterocyclic bases which have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, or other heterocycles. Modified nucleosides or nucleotides can also include modifications on the sugar moiety, for example, where one or more of the hydroxyl groups are replaced with halogen or aliphatic groups, or are functionalized as ethers, amines, or the like.
- nucleotidic unit is intended to encompass nucleosides and nucleotides.
- NGS Next-generation sequencing
- An "Open Reading Frame" or "ORF" means a portion of a polynucleotide that, when translated into amino acids, contains no stop codons.
- the genetic code reads DNA sequences in groups of three base pairs, which means that a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse.
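The six reading frames can be enumerated directly from a sequence and its reverse complement. The sketch below is illustrative only: the function names and frame labels (+1 to +3, -1 to -3) are hypothetical, and the stop codons of the standard genetic code (TAA, TAG, TGA) are assumed.

```python
# Assumption: standard genetic code stop codons, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def six_frames(seq: str) -> dict:
    """Split a sequence into codons for three forward and three reverse frames."""
    frames = {}
    for label, strand in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Only complete triplets are kept; trailing bases are dropped.
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames[f"{label}{offset + 1}"] = codons
    return frames


def is_open(codons) -> bool:
    """True if the codon list contains no stop codon."""
    return not any(codon in STOP_CODONS for codon in codons)


for name, codons in six_frames("ATGAAATAGC").items():
    print(name, codons, "open" if is_open(codons) else "contains stop")
```

A frame reported as open here simply lacks a stop codon over the given stretch, which matches the definition of an open reading frame given above.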
- operably linked refers to functional linkage between two sequences such that one sequence modifies the behavior of the other.
- a first polynucleotide comprising a nucleic acid expression control sequence (such as a promoter, IRES sequence, enhancer or array of transcription factor binding sites) and a second polynucleotide are operably linked if the first polynucleotide affects transcription and/or translation of the second polynucleotide.
- a first amino acid sequence comprising a secretion signal or a subcellular localization signal and a second amino acid sequence are operably linked if the first amino acid sequence causes the second amino acid sequence to be secreted or localized to a subcellular location.
- a "promoter" means a nucleic acid sequence sufficient to direct transcription of an operably linked nucleic acid molecule. Also included in this definition are those transcription control elements (for example, enhancers) that are sufficient to render promoter-dependent gene expression controllable in a cell type-specific, tissue-specific, or temporal-specific manner, or that are inducible by external signals or agents; such elements, which are well-known to skilled artisans, may be found in a 5' or 3' region of a gene or within an intron.
- a promoter is operably linked to a nucleic acid sequence, for example, a cDNA or a gene sequence, or an effector RNA coding sequence, in such a way as to enable expression of the nucleic acid sequence, or a promoter is provided in an expression cassette into which a selected nucleic acid sequence to be transcribed can be conveniently inserted.
- Polymerase refers to an enzyme that performs template-directed synthesis of polynucleotides. The term encompasses both the full-length polypeptide or a domain that has polymerase activity.
- Processivity refers to the ability of a polymerase to remain bound to the template or substrate and perform DNA synthesis. Processivity is measured by the number of catalytic events that take place per binding event.
- PCR Polymerase chain reaction
- PCR refers to a method whereby a specific segment or subsequence of a target double-stranded DNA is amplified in a geometric progression.
- PCR is well known to those of skill in the art; see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds, 1990.
- Exemplary PCR reaction conditions typically comprise either two or three step cycles. Two step cycles have a denaturation step followed by a hybridization/elongation step. Three step cycles comprise a denaturation step followed by a hybridization step followed by a separate elongation step.
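The two cycle layouts described above can be written down as simple data structures. This is a hedged sketch: the class and function names are hypothetical, and all temperatures and hold times are placeholder assumptions (only the 94°C and 98°C denaturation temperatures appear elsewhere in this document).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float   # placeholder temperature, an assumption
    seconds: int    # placeholder hold time, an assumption


def two_step_cycle(denature_c: float = 98.0, anneal_extend_c: float = 68.0) -> List[Step]:
    """Denaturation followed by a combined hybridization/elongation step."""
    return [
        Step("denaturation", denature_c, 10),
        Step("hybridization/elongation", anneal_extend_c, 60),
    ]


def three_step_cycle(denature_c: float = 94.0, anneal_c: float = 55.0,
                     extend_c: float = 72.0) -> List[Step]:
    """Denaturation, hybridization, then a separate elongation step."""
    return [
        Step("denaturation", denature_c, 10),
        Step("hybridization", anneal_c, 30),
        Step("elongation", extend_c, 60),
    ]


def program(cycle: List[Step], n_cycles: int) -> List[Step]:
    """Repeat one cycle n_cycles times to form a full thermal program."""
    return cycle * n_cycles


for step in three_step_cycle():
    print(step)
```

Keeping the cycle as a list of steps makes it easy to vary the denaturation temperature, as is done for the thermostability comparisons in the figures.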
- Long PCR refers to the amplification of a DNA fragment of 5 kb or longer in length. Long PCR is typically performed using specially-adapted polymerases or polymerase mixtures (see, e.g., U.S. Pat. Nos. 5,436,149 and 5,512,462) that are distinct from the polymerases conventionally used to amplify shorter products.
- a "primer” refers to a polynucleotide sequence that hybridizes to a sequence on a target nucleic acid and serves as a point of initiation of nucleic acid synthesis.
- Primers can be of a variety of lengths and are often less than 50 nucleotides in length, for example 12-30 nucleotides in length.
- the length and sequences of primers for use in PCR can be designed based on principles known to those of skill in the art, see, e.g., Innis et al., supra.
- polymerase primer/template binding specificity refers to the ability of a polymerase to discriminate between correctly matched primer/templates and mismatched primer/templates.
- An "increase in polymerase primer/template binding specificity" in this context refers to an increased ability of a variant modified fusion polymerase of the invention to discriminate between matched and mismatched primer/templates in comparison to a wildtype polymerase fusion protein.
- An "improved polymerase" includes a modified polymerase and/or a sequence-non-specific double-stranded DNA binding domain joined to the polymerase or polymerase domain.
- a "modified DNA polymerase", modified DNA polymerase variant or DNA polymerase variant refers to a DNA polymerase comprising one or more mutations that modulate one or more activities of the DNA polymerase including, but not limited to, DNA polymerization activity, base analog detection activities, 3'-5' or 5'-3' exonuclease activities, processivity, improved nucleotide analog incorporation activity, proofreading, fidelity, efficiency, specificity, thermostability and intrinsic hot start capability or decreased DNA polymerization at room temperature, decreased amplification slippage on templates with trinucleotide repeat stretches or homopolymeric stretches, decreased amplification cycles, decreased extension times, reduced sensitivity to inhibitors (e.g., high salt, nucleic acid purification reagents), altered optimal reaction conditions (e.g., pH, KCl) and a decrease in the amount of polymerase needed for the applications described.
- PCR "sensitivity” refers to the ability to amplify a target nucleic acid that is present in low concentration.
- Low concentration refers to 10^4, often 10^3;
- the terms "polypeptide", "peptide" and "protein" apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
- selectable marker means a polynucleotide segment that allows one to select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, peptide, or protein, or can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds or compositions.
- Selectable markers include but are not limited to: (1) DNA segments that encode products which provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) DNA segments that encode products which are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); (3) DNA segments that encode products which suppress the activity of a gene product; (4) DNA segments that encode products which can be readily identified (e.g., phenotypic markers such as beta-galactosidase, green fluorescent protein (GFP), and cell surface proteins); (5) DNA segments that bind products which are otherwise detrimental to cell survival and/or function; (6) DNA segments that otherwise inhibit the activity of any of the DNA segments described in Nos.
- DNA segments that bind products that modify a substrate e.g. restriction endonucleases
- DNA segments that can be used to isolate a desired molecule e.g. specific protein binding sites
- DNA segments that encode a specific nucleotide sequence which can be otherwise non-functional e.g., for PCR amplification of subpopulations of molecules
- DNA segments, which when absent, directly or indirectly confer sensitivity to particular compounds e.g., antisense oligonucleotides
- Sequence identity can be determined by aligning sequences using algorithms such as BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0 (Genetics Computer Group, 575 Science Dr., Madison, Wis.), using default gap parameters, or by inspection, and selecting the best alignment (i.e., the one resulting in the highest percentage of sequence similarity over a comparison window).
- Percentage of sequence identity is calculated by comparing two optimally aligned sequences over a window of comparison, determining the number of positions at which the identical residues occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of matched and mismatched positions not counting gaps in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
- the window of comparison between two sequences is defined by the entire length of the shorter of the two sequences.
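The percent identity calculation described above can be sketched as follows. This is a minimal illustration, assuming the two sequences have already been optimally aligned to equal length and that gaps are written as "-"; the function name and the gap symbol are assumptions made for the example, not part of the original disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns, skipping any column that contains a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matched = 0
    mismatched = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gaps are not counted toward the window size
        if a == b:
            matched += 1
        else:
            mismatched += 1
    compared = matched + mismatched
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    # matched positions divided by matched + mismatched positions, times 100
    return 100.0 * matched / compared
```

For example, `percent_identity("MKT-LV", "MKSALV")` compares five ungapped columns, four of which match.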
- translation refers to the process by which a polypeptide is synthesized by a ribosome 'reading' the sequence of a polynucleotide.
- Thermally stable polymerase refers to any enzyme that catalyzes polynucleotide synthesis by addition of nucleotide units to a nucleotide chain using DNA or RNA as a template and has an optimal activity at a temperature above 45°C, or retains at least 50% of its activity at elevated temperatures, for example above 95 °C.
- thermostable refers to an enzyme which is stable to heat, is heat resistant, and functions at high temperatures, e.g., 50 to 100°C as compared, for example, to a non-thermostable form of an enzyme with a similar activity.
- thermostable nucleic acid polymerases derived from thermophilic organisms such as P. furiosus, M. jannaschii, A. fulgidus or P. horikoshii are more stable and active at elevated temperatures as compared to a nucleic acid polymerase from E. coli.
- a representative thermostable nucleic acid polymerase isolated from P. furiosus (Pfu) is described in Lundberg et al., 1991, Gene, 108:1-6.
- Additional representative temperature stable polymerases include, e.g., polymerases extracted from the thermophilic bacteria Thermus flavus, Thermus ruber, Thermus thermophilus, Bacillus stearothermophilus (which has a somewhat lower temperature optimum than the others listed), Thermus lacteus, Thermus rubens, Thermotoga maritima, or from the thermophilic archaea Thermococcus litoralis and Methanothermus fervidus.
- a "temperature profile" refers to the temperature and lengths of time of the denaturation, annealing and/or extension steps of a PCR or cycle sequencing reaction.
- a temperature profile for a PCR or cycle sequencing reaction typically consists of 10 to 60 repetitions of similar or identical shorter temperature profiles; each of these shorter profiles may typically define a two-step or three-step cycle. Selection of a temperature profile is based on various considerations known to those of skill in the art, see, e.g., Innis et al., supra.
- the extension time required to obtain an amplification product of 5 kb or greater in length is reduced compared to conventional polymerase mixtures.
- a "template" refers to a double stranded polynucleotide sequence that comprises the polynucleotide to be amplified, flanked by primer hybridization sites.
- a "target template" comprises the target polynucleotide sequence flanked by hybridization sites for a 5' primer and a 3' primer.
- "vector", "DNA vector" or "gene transfer vector" refers to a polynucleotide that is used to perform a "carrying" function for another polynucleotide.
- vectors are often used to allow a polynucleotide to be propagated within a living cell, or to allow a polynucleotide to be packaged for delivery into a cell, or to allow a polynucleotide to be integrated into the genomic DNA of a cell.
- a vector may further comprise additional functional elements, for example it may comprise a transposon.
- amino acids are grouped as follows: Group I (hydrophobic side chains): met, ala, val, leu, ile; Group II (neutral hydrophilic side chains): cys, ser, thr; Group III (acidic side chains): asp, glu; Group IV (basic side chains): asn, gln, his, lys, arg; Group V (residues influencing chain orientation): gly, pro; and Group VI (aromatic side chains): trp, tyr, phe.
- Conservative substitutions involve substitutions between amino acids in the same class.
- Non-conservative substitutions constitute exchanging a member of one of these classes for a member of another.
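The grouping above lends itself to a simple lookup. The sketch below encodes the six groups exactly as listed (using the standard three-letter codes gln and trp) and tests whether a substitution stays within one group; the names `GROUPS` and `is_conservative` are illustrative assumptions.

```python
# Amino acid groups as listed in the text (three-letter codes, lower case).
GROUPS = {
    "I": ["met", "ala", "val", "leu", "ile"],    # hydrophobic
    "II": ["cys", "ser", "thr"],                 # neutral hydrophilic
    "III": ["asp", "glu"],                       # acidic
    "IV": ["asn", "gln", "his", "lys", "arg"],   # basic (as grouped in the text)
    "V": ["gly", "pro"],                         # influence chain orientation
    "VI": ["trp", "tyr", "phe"],                 # aromatic
}
RESIDUE_TO_GROUP = {res: g for g, members in GROUPS.items() for res in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same group."""
    return RESIDUE_TO_GROUP[original.lower()] == RESIDUE_TO_GROUP[replacement.lower()]
```

Under this grouping a leu-to-ile change is conservative, while asp-to-lys is non-conservative.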
- Positions in a variant sequence are assigned the same numbers as the aligned positions in corresponding reference sequence.
- One class of methods is alignment-based, and involves aligning a set of sequences, including naturally occurring sequences. This alignment may be used to derive a phylogenetic relationship between the sequences. This relationship can also be used to calculate conservation properties for each amino acid at each position in the protein.
- a second class of methods is structural, in which substitutions are tested computationally for their effect upon a known or calculated protein structure. All these methods yield quantitative measures of the predicted favorability of replacing an amino acid with a different amino acid. The favorabilities predicted by one method may be different from the favorabilities predicted by a different method, so it is often desirable to combine the results from different methods.
- amino acid residue positions in a reference sequence are compared with the same position in an alignment of homologous sequences. Positions that exhibit a high degree of variance in homologs may have a high probability that substitutions at such positions will be active.
- One method of calculating the degree of amino acid variance is described by Gribskov (1987) Proc Natl Acad Sci USA 84, 4355.
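As a rough illustration of scoring per-position variability from an alignment, the sketch below computes the Shannon entropy of each alignment column. Note that this is a commonly used stand-in chosen for brevity, not the Gribskov measure cited above; the function names and the "-" gap symbol are assumptions.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column, ignoring gaps."""
    counts = Counter(r for r in column if r != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return abs(sum((n / total) * math.log2(n / total) for n in counts.values()))


def site_variability(alignment: list) -> list:
    """Entropy for every column of an alignment given as equal-length strings."""
    return [column_entropy("".join(col)) for col in zip(*alignment)]
```

A fully conserved column scores 0, and a column split evenly between two residues scores 1 bit, so higher values flag positions that vary more among homologs.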
- a sequence alignment can serve as the basis of a Hidden Markov model that can be used to calculate the probability that one specific residue will be followed by a second specific residue. These models also include probabilities for gaps and insertions.
- substitutions may be identified based upon the consensus sequence of the alignment.
- homologous sequences are often analogous functionally and structurally, although having been subjected separately to different selective pressures they are also likely to be optimized differently. Amino acids that differ between homologous sequences thus provide a guide to substitutions that are likely to yield functional though different proteins. Alignment of homologous sequences can therefore be used to identify candidate substitution positions.
- homologous protein sequences may be aligned (e.g., by using clustalw; Thompson et al. (1994) Nucleic Acids Res 22: 4673-80) and then a phylogenetic tree reconstructed.
- Scores for a given alignment can also be normalized to have an average value of 0.0 and a standard deviation of 1.0, or other standard procedures can be used to compare and combine scores from multiple methods. These values can then be used directly as a score. For example, all sites with a score above a certain threshold value can be selected, or all sites with a score below a certain threshold value can be eliminated. Alternatively, the most variable (e.g., least conserved) sites can be selected by ranking the sites in order of these scores, or the least variable (e.g., most conserved) sites can be eliminated by ranking the sites in order of these scores.
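The normalization and threshold selection described above can be sketched as follows. This is a minimal example assuming scores are held in a dictionary keyed by site number; the function names and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def normalize(scores: dict) -> dict:
    """Rescale scores to mean 0.0 and standard deviation 1.0."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all scores are identical; cannot normalize")
    return {site: (s - mu) / sigma for site, s in scores.items()}


def select_sites(normalized: dict, threshold: float) -> list:
    """Sites whose normalized score is above the threshold."""
    return sorted(site for site, z in normalized.items() if z > threshold)


def rank_sites(normalized: dict) -> list:
    """Sites ordered from highest to lowest normalized score."""
    return sorted(normalized, key=normalized.get, reverse=True)
```

Normalizing first makes scores from different methods comparable before they are thresholded, ranked or combined.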
- Amino acid diversity and tolerance at each site can be measured as a fitness property of each amino acid at every location. The most fit residue for that position carries a higher value (e.g., Koshi et al. (2001) Pac Symp Biocomput 1 1-202; O. Soyer, .W.
- Sites can be grouped into site-classes or treated independently. Sites and site classes most fit to change based on the substitution rate and the substitutions most favorable based on the fitness can be selected. These values of fitness may then be used directly as a score. All sites with a score above a certain threshold value could then be selected, for example, a cutoff (threshold) of 0.0 can be chosen (when the normalization of scores sets the wild type residue found in the reference to be 0.0). Alternatively, all sites with a score below a certain threshold value could be eliminated.
- Threshold values of 0.0 or below can be eliminated, thereby only including amino acid changes that have a higher fitness value than the reference wild type amino acid found in that position.
- the sites most tolerant to change could be selected by ranking the sites in order of these scores. For example, in the study of G-protein coupled receptors (GPCR) by Soyer et al. (O. Soyer, M.W. Dimmic, R.R. Neubig, and R.A. Goldstein; Pacific Symposium on Biocomputing 7:625-636 (2002)), using the 8-site class model the class #8 was identified to have the highest substitution rate and the property correlating with fitness of amino acids at these positions was identified to be "charge transfer" propensity of the amino acid. In the present invention, amino acids in the sites that carry a higher relative fitness compared to the wild type amino acid found in that position are identified as suitable for substitution.
- a substitution matrix represents the probability of one amino acid being replaced by a second amino acid across a set of positions within a set of proteins.
- the matrix can be expressed in terms of probabilities or values derived from probabilities by mathematical transformation involving probabilities of transitions or substitutions (Pij) and observed frequencies of amino acids (Fi). Matrices using such transformation include scoring matrices like PAM100, PAM250, and BLOSUM etc.
- Substitution matrices are derived from pairwise alignments of protein homologs from sequence databases. They constitute estimates of the probability that one amino acid will be changed to another while conserving function. Different substitution matrices are calculated from different sets of sequences.
- substitution matrices are calculated by selecting the protein families used to create the matrix, as well as the positions considered.
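One common form of the transformation from probabilities to scores is a log-odds ratio. The expression below is given only as an illustration; the text does not specify the exact transformation, and the reading of P_ij as the probability that amino acid i is replaced by amino acid j, with F_j the observed frequency of amino acid j, is an assumption.

```latex
S_{ij} = \log \frac{P_{ij}}{F_{j}}
```

Under this form, a positive score means that the replacement of i by j is observed more often than expected from the background frequency of j alone, and a negative score means it is observed less often.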
- a substitution matrix that best captures the observed sequences in the protein family of interest can be calculated using the Bayesian method developed by Goldstein et al. (Koshi et al. (1995) Protein Eng 8: 641-645) and used to score all candidate substitutions.
- a substitution or scoring matrix calculated by considering homologous proteins from many different protein families (e.g., Benner et al. (1994) Protein Eng 7: 1323-1332; and Tomii et al. (1996) Protein Eng 9: 27-36) can be used to score all candidate substitutions.
- Matrices derived from a variety of proteins are often used to evaluate and confirm homology of protein sequences and represent an approximation of protein evolution in general.
- Another example of an alignment-based model is the reconstruction of ancestral sequences. Evolutionary relationships between homologous sequences can be derived in the form of phylogenetic trees. Using evolutionary models, ancestral sequences can be probabilistically reconstructed. See, for example, Koshi and Goldstein (1996) J. Mol. Evol. 42, 313-320. Coupled with knowledge of functions of proteins, evolutionary analysis will also identify amino acid changes that occur in functionally distinct groups. See, for example, Zhang and Rosenberg (2002) PNAS 99, 5486-5491. Comparison of the rates of synonymous (Ks) versus non-synonymous substitutions (Ka) can also be used to quantify (e.g., using the Ka/Ks ratio) the type and degree of evolutionary constraint on substitutions.
- the codon for this residue in the gene for the protein will tend to encode the same amino acid throughout the phylogenetic tree (synonymous substitutions, high Ks).
- alterations of the corresponding codon in the gene will more frequently encode different amino acids throughout the phylogenetic tree (non-synonymous substitutions, Ka comparable with Ks).
- the ratio of frequency with which a site is replaced by a synonymous codon to the frequency with which it is replaced by a non-synonymous codon in a reconstructed phylogenetic tree provides a measure of the selective pressure (on the function of the protein) acting to conserve the identity of the amino acid at that position. Often these ratios are calculated as averages for entire sequences. However, such ratios can also be limited to specific sites or groups of positions. These ratios can also be used to weight substitutions identified by other methods from a specific homolog.
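To make the synonymous versus non-synonymous distinction concrete, the sketch below counts, for each codon position, how many homolog codons differ from the reference codon while encoding the same amino acid and how many encode a different one. It is a deliberate simplification: it compares each homolog directly against the reference rather than along a reconstructed phylogenetic tree, and it assumes in-frame, ungapped coding sequences of equal length using the standard genetic code. The function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_changes(reference: str, homologs: list) -> list:
    """Per-codon (synonymous, non_synonymous) counts relative to the reference."""
    n_codons = len(reference) // 3
    counts = [[0, 0] for _ in range(n_codons)]
    for seq in homologs:
        for i in range(n_codons):
            ref_codon = reference[3 * i:3 * i + 3]
            codon = seq[3 * i:3 * i + 3]
            if codon == ref_codon:
                continue  # identical codon, no substitution to classify
            if CODON_TABLE[codon] == CODON_TABLE[ref_codon]:
                counts[i][0] += 1  # same amino acid: synonymous
            else:
                counts[i][1] += 1  # different amino acid: non-synonymous
    return [tuple(c) for c in counts]
```

Positions with many synonymous but few non-synonymous changes are candidates for being under selective pressure to conserve the encoded amino acid.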
- Another example of an alignment-based method for identifying amino acid substitutions that are most important in differentiating protein function is a dimension-reducing technique such as principal component analysis (PCA). This has been previously described (e.g., Casari et al., 1995, Nat Struct Biol 2: 171-178; Gogos et al., 2000, Proteins 40: 98-105; and del Sol Mesa et al., 2003, J Mol Biol 326: 1289-1302).
- PCA can identify sequence features and substitutions corresponding to the desired phenotype of the protein; the scores ("loads") for these features in the direction of the desired phenotype are used as absolute scores or as filters to identify substitutions.
- candidate amino acid changes can be computationally modeled in the structure(s) and changes in the free energy computed. These computationally calculated changes in free energies resulting from the substitutions can then be used directly as a score.
- all changes can be selected that increase the free energy of the protein by less than a certain value. For example, all changes that would increase the free energy by less than 1 kcal/mol can be selected, all changes that would increase the free energy by less than 1.5 kcal/mol can be selected, all changes that would increase the free energy by less than 2 kcal/mol can be selected, or all changes that would increase the free energy by less than 2.5 kcal/mol can be selected.
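The free-energy cutoff can be applied as a simple filter. The sketch below assumes the computed free energy changes (in kcal/mol) are already available in a dictionary keyed by substitution label; the function name is an assumption, and any labels or values passed in are the caller's own data.

```python
def select_by_free_energy(ddg_by_substitution: dict, max_increase: float = 1.0) -> list:
    """Substitutions whose computed free energy increase is below the cutoff (kcal/mol)."""
    return sorted(
        name for name, ddg in ddg_by_substitution.items() if ddg < max_increase
    )
```

Calling the function with cutoffs of 1, 1.5, 2 or 2.5 kcal/mol reproduces the successively more permissive selections listed above.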
- regions of the protein that differ structurally between homologs are considered more likely to tolerate change, while those regions that are structurally conserved are likely to be less tolerant.
- Structures can be directly obtained from the database or predicted using various structure modeling software packages. Structures of homologs and mutants can be superposed on the wild type structure. See, for example, May et al. (1994) Protein Eng 7: 475-85; and Ochagavia et al. (2002) Bioinformatics 18: 637-40. Structural conservation can be calculated as the root mean squared (RMS) deviations of the backbones of the superposed chains.
- These computationally calculated RMS deviations for every position between homologous structures can then be used directly as a score.
- RMS deviations between the alpha carbons (or backbone atoms) in the structure of the target protein and one or more homologous proteins that are greater than a threshold value can be considered structurally labile and these sites can be selected.
- This threshold RMS deviation between homologous structures can be greater than 2 Å, 2.5 Å, 3 Å, 3.5 Å, 4 Å, 4.5 Å, or 5 Å.
- RMS deviations between the alpha carbons in the structure of the target protein and one or more homologous proteins that are less than a threshold value can be considered structurally conserved and these sites can be eliminated.
- This threshold RMS deviation between homologous structures can be less than 2 Å, 2.5 Å, 3 Å, 3.5 Å, 4 Å, 4.5 Å, or 5 Å.
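A per-position structural deviation score of the kind described above might be computed as follows. The sketch assumes the homolog structures have already been superposed on the target and that alpha-carbon coordinates (in Å) are supplied in equivalent residue order; the function names and data layout are assumptions.

```python
import math


def per_position_deviation(target_ca: list, homolog_cas: list) -> list:
    """RMS distance (over homologs) between equivalent alpha carbons, per position."""
    deviations = []
    for i, ref in enumerate(target_ca):
        squared = [
            sum((a - b) ** 2 for a, b in zip(ref, homolog[i]))
            for homolog in homolog_cas
        ]
        deviations.append(math.sqrt(sum(squared) / len(squared)))
    return deviations


def labile_sites(deviations: list, threshold: float = 2.0) -> list:
    """Indices of positions whose deviation exceeds the threshold (structurally labile)."""
    return [i for i, d in enumerate(deviations) if d > threshold]
```

Positions returned by `labile_sites` would be retained as candidates, while positions below the chosen threshold would be treated as structurally conserved and eliminated.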
- changes near catalytic and binding sites are highly likely to influence the activity of the protein and are good candidates for substitution.
- All amino acid substitutions that are found in one or more homologs can be tested for proximity to a binding or catalytic or regulatory site of the protein. The distance between an amino acid substitution from a binding or catalytic or regulatory site, in one or more homologs, can be used directly as a score. Alternatively, all amino acid substitutions that are found in one or more homologs and that are within a threshold distance of a binding or catalytic or regulatory site in the protein can be selected.
- This threshold distance can be less than 2 Å, 2.5 Å, 3 Å, 3.5 Å, 4 Å, 4.5 Å, 5 Å, 5.5 Å, 6 Å, 6.5 Å, or 7 Å.
- all amino acid substitutions that are found in one or more homologs and that are beyond a threshold distance of a binding or catalytic or regulatory site in the protein can be eliminated.
- This threshold distance can be more than 2 Å, 2.5 Å, 3 Å, 3.5 Å, 4 Å, 4.5 Å, 5 Å, 5.5 Å, 6 Å, 6.5 Å, or 7 Å.
- all amino acid substitutions that are found in one or more homologs can be ranked in order of proximity to a binding or catalytic or regulatory site in the protein and those that are closest to the binding or catalytic or regulatory site selected.
- the substitution closest to the binding or catalytic or regulatory site can be selected, or between 2 and 20, between 10 and 100, or the top 200 substitutions closest to the binding or catalytic or regulatory site can be selected.
- all amino acid substitutions that are found in one or more homologs can be ranked in order of proximity to a binding or catalytic or regulatory site in the protein and those that are farthest from the binding or catalytic or regulatory site eliminated.
- the substitution farthest from the binding or catalytic or regulatory site can be eliminated.
- between 2 and 20, between 10 and 100, or the top 200 substitutions farthest from the binding or catalytic or regulatory site can be eliminated.
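Ranking substitutions by proximity to a functional site could look like the sketch below. It assumes each candidate substitution is represented by one coordinate (for example, its alpha carbon) and the site by one or more coordinates, all in Å; the function names and this representation are assumptions.

```python
import math


def rank_by_proximity(substitutions: dict, site_coords: list) -> list:
    """(name, distance) pairs sorted from closest to farthest from the site."""
    def nearest(coord):
        # distance to the closest atom of the binding, catalytic or regulatory site
        return min(math.dist(coord, s) for s in site_coords)

    return sorted(
        ((name, nearest(coord)) for name, coord in substitutions.items()),
        key=lambda item: item[1],
    )


def within_threshold(ranked: list, threshold: float) -> list:
    """Names of substitutions closer to the site than the threshold distance."""
    return [name for name, distance in ranked if distance < threshold]
```

Taking `ranked[:n]` keeps the n closest substitutions, and `ranked[:-n]` drops the n farthest, matching the two selection styles described above.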
- infologs are designed variants of a given gene, for example a polymerase, where substitutions are systematically incorporated to achieve high information content enabling modern machine learning tools to de-convolute sequence-activity relationships.
- infologs are the basis of our rational approach to protein engineering in which a matrix of well-defined amino acid substitutions is used to map the targeted fitness landscape.
- libraries of ~100 mutants (typically 96) are characterized for properties/activities of interest and serve as a basis for machine-learning tools to design the next generation of infologs.
- the initial set of infologs are designed to have the same number of substitutions (approximately 3), thereby probing regions at the same Hamming distance from the reference locus in sequence space.
- Substitutions in the mutants are selected from a pool of substitutions and each set of infologs contains several variants with the same substitution albeit in presence of two completely different mutations, thereby providing us with the ability to characterize the amino acid change with respect to its additivity and context dependence.
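The two design properties mentioned above, equal Hamming distance from the reference and reuse of each substitution in different contexts, can be checked with a few lines of code. The sketch below is illustrative only; it represents each variant as a set of substitution labels, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def usage(variants: list):
    """Count how often each substitution, and each pair of substitutions, occurs."""
    singles = Counter(s for v in variants for s in v)
    pairs = Counter(
        frozenset(p) for v in variants for p in combinations(sorted(v), 2)
    )
    return singles, pairs
```

A set in which every substitution appears several times but no pair of substitutions repeats tests each change against different partners, which is what allows additivity and context dependence to be separated.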
- the functional consequences of individual substitutions can be modeled and quantitatively evaluated.
- an infolog library based on 59 amino-acid substitutions in a tau class glutathione transferase (GST) from wheat afforded increased activity against most of a number of herbicides tested (Govindarajan et al, 2015).
- SEQ ID NO: 128971 was used to identify homologs from Genbank database of non-redundant protein sequences using the BLAST program. The list of homologs used for identifying substitutions may not be limited to these homologs.
- the multiple sequence alignment of all homologs and SEQ ID NO: 1 was obtained using the clustalw program and a phylogenetic tree was constructed. The alignment was used to enumerate possible changes that can be made to SEQ ID NO: 1 that are seen in the alignments.
- DNA polymerase substitutions (relative to SEQ ID NO: 1) identified using combinations of the methods described above are shown in Tables 1 and 2.
- DNA polymerase variants were synthesized to incorporate the amino acid substitutions shown in Table 3. Specific activities of the synthesized polymerase variants were determined experimentally. The specific activities were individually modeled as a function of the substitutions by linear regression. The results provide relative weights of each substitution for activity and other properties such as thermostability and amplification of >3 kb target DNA (Table 5). Mutations that showed positive weights for selected activity profiles were used in different combinations with other substitutions to construct a library of polymerase variants. Sets of variants were designed to incorporate selected substitutions a uniform number of times. The substitutions were also distributed within a variant set so that the number of unique pairs of substitutions is high, which ensures that different substitutions are tested in a wide variety of contexts. The specific combinations of substitutions in the 3 variant sets (relative to SEQ ID NO: 1) are shown in Table 4. Amino acid sequences are given as SEQ ID NOS: 3-242.
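Modeling activity as a linear function of the substitutions present can be sketched as an ordinary least-squares fit with one indicator variable per substitution. The code below is a minimal, dependency-free illustration and not the procedure actually used to produce Table 5; the function name, the inclusion of an intercept term, and the solver (normal equations with Gauss-Jordan elimination) are all assumptions.

```python
def fit_weights(variants: list, activities: list, substitutions: list):
    """Least-squares fit of activity = intercept + sum of weights of substitutions present."""
    # Design matrix: a leading 1 for the intercept, then 1/0 per substitution.
    rows = [[1.0] + [1.0 if s in v else 0.0 for s in substitutions] for v in variants]
    n = len(substitutions) + 1
    # Augmented normal equations [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, activities))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    solution = [aug[i][n] / aug[i][i] for i in range(n)]
    return solution[0], dict(zip(substitutions, solution[1:]))
```

Substitutions with positive fitted weights are the ones that would be carried forward into the next set of variants. The fit is only well defined when the variant set is designed so that the indicator columns are not linearly dependent, which is one practical reason for spreading substitutions across many distinct combinations.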
- a nucleic acid encoding a non-natural polymerase wherein the polymerase has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% sequence identity with SEQ ID NO: 190 retaining the combination of substitutions specified in Table 4 present in SEQ ID NO: 190 or any subset thereof.
- modified polymerases other than natural sequences, comprising one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more of the substitutions shown in Table 1, Table 2 or Table 3 may also be used.
- the invention thus provides non-naturally occurring polymerases having the sequence of SEQ ID NO:1 modified by one or more of the substitutions shown in Table 3 and having zero to ten (e.g., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10) substitutions, deletions or internal substitutions at positions other than those shown in Table 3.
- the polymerases encoded by such polynucleotides can have at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or up to all of the substitutions present in Table 3.
- any modifications at positions other than those shown in Table 3 are conservative substitutions having no significant effect on polymerase activity (i.e., activity indistinguishable from an otherwise identical polymerase without the substitution within experimental error).
- Non-naturally occurring polymerases encoded by such polynucleotides are also provided.
- the invention further provides polynucleotides encoding any polymerase having the sequence of SEQ ID NO:3-242 or variants having for example at least 85%, 90%, 95%, 99% or 100% identity therewith provided that the combination of substitutions present in the SEQ ID NO. in question specified in Table 4 is retained.
- Non-naturally occurring polymerases encoded by such polynucleotides are also provided. Any variations present in a sequence other than the substitutions shown in Table 4 for that sequence are preferably conservative substitutions not significantly affecting activity or substitutions shown in Table 3.
- the substitutions present in preferred polymerase having the sequence of SEQ ID NO: 159 are M354I, L438I, D468G and A494S.
- the substitutions present in preferred polymerase having the sequence of SEQ ID NO:190 are Y342I, L438I and D468G.
- the above mentioned polymerases preferably have an enhanced property relative to a base polymerase having the sequence of SEQ ID NO:1 measured under the same conditions.
- the enhanced property can be at least one of enhanced polymerase activity, enhanced reverse transcriptase activity, enhanced thermostability, enhanced strand-displacement activity, or any combination thereof, including all of these properties.
- An activity is considered enhanced if the change is beyond experimental error (in other words, statistically significant).
- the enhancement, if quantifiable, can be for example at least 10% to 1000%, including for example 10-500%, 10-200%, 10-100%, 20-1000%, 20-500%, 20-100%, and 20-50%.
- Modified polymerases (e.g., SEQ ID NOS: 3-242) fused to various accessory proteins called processivity factors may also be used.
- Processivity factors assist polymerases in various ways— some by forming complexes with the polymerase itself (for example, thioredoxin), and some by encircling duplex DNA (for example, clamp protein)— thereby ensuring a strong, stable binding with the template (Zhuang and Ai 2010).
- a stable association of the polymerase with the template DNA is crucial for the unfettered incorporation of nucleotides, and therefore, for an efficient PCR reaction.
- One strategy demonstrating enhanced processivity and improved PCR efficiency is the approach developed and patented by Wang in 2000 and published in 2004 (Wang et al.
- Sso7d is a small protein (7 kD) capable of covalently binding to dsDNA without any preference for specific sequences.
- the binding of the Sso7d domain to a DNA polymerase is optimal to smoothly slide along the template.
- the covalent linking of the fusion protein with the polymerase does not entail any structural modifications, therefore the fusion does not interfere with the structural integrity and thermal stability and consequently, with the catalytic activity of the enzyme.
- Sso7d can be linked to both family A and family B DNA polymerases and can bind with dsDNA at ambient temperature as well as high temperatures. It can therefore help enzymes that are thermostable or otherwise. Studies carried out using the Sso7d fusion polymerase have demonstrated that the fusion protein technique improves processivity without affecting the catalytic activity or thermal stability of the enzyme.
- aptamers or antibodies fused to the modified polymerase are also used.
- the aptamer used is attached to the reactive site of the modified polymerase, and thus is inactive at room temperature.
- the temperature of the reaction solution is increased to high temperature, the three-dimensional structure of the aptamer is modified so that it is separated from polymerase so as to be active, and the reverse transcription reaction of a specifically primed target RNA can be performed, and afterwards PCR can be performed.
- Other processivity factors comprise sequences encoding polymerase fused to certain protein functional domains.
- protein functional domains can include, but are not limited to, one or more DNA binding domains, one or more nuclear localization signals, one or more flexible hinge regions that can facilitate one or more domain fusions, and combinations thereof. Fusions can be made either to the N-terminus, C-terminus, or internal regions of the polymerase protein so long as polymerase activity is retained.
- DNA binding domains used can include, but are not limited to, a helix-turn-helix domain, a Zn-finger domain, a leucine zipper domain, or a helix-loop-helix domain.
- Specific DNA binding domains used can include, but are not limited to, a Gal4 DNA binding domain, a LexA DNA binding domain, or a Zif268 DNA binding domain.
- Nuclear localization signals (NLS) used can include, but are not limited to, consensus NLS sequences, viral NLS sequences, cellular NLS sequences, and combinations thereof.
- Flexible hinge regions used can include, but are not limited to, glycine/serine linkers and variants thereof.
- a polymerase other than a naturally occurring polymerase, whose sequence comprises a polypeptide with at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% identity to SEQ ID NO: 190 provided the combination of substitutions shown in Table 4 for SEQ ID NO: 190 or a subset thereof is retained.
- Modified DNA polymerase combines a function of polymerizing DNA using RNA as a template and a function of polymerizing DNA using DNA as a template.
- the modified polymerases are active on both DNA and RNA templates and can generate amplicons from any source, for example DNA or RNA from bacterial, human or murine hosts and more.
- a modified polymerase often exhibits an increase in primer/template specificity in comparison to an unmodified polymerase comprising a wildtype sequence.
- Primer/template specificity is the ability of an enzyme to discriminate between matched primer/template duplexes and mismatched primer/template duplexes. Specificity can be determined, for example, by comparing the relative yield of two reactions, one of which employs a matched primer, and one of which employs a mismatched primer. An enzyme with increased discrimination has a higher relative yield with the matched primer than with the mismatched primer, i.e., the ratio of the yield in the reaction using the matched primer vs. the reaction using the mismatched primer is about 1 or above.
- the modified polymerase(s) typically exhibit at least a 2-fold, often 3-fold or greater increase in the ratio relative to a wild type polymerase.
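The yield-ratio comparison described above reduces to two divisions. The sketch below assumes that product yields for the matched-primer and mismatched-primer reactions have been quantified in the same units for each enzyme; the function names are assumptions.

```python
def discrimination_ratio(matched_yield: float, mismatched_yield: float) -> float:
    """Ratio of product yield with the matched primer to yield with the mismatched primer."""
    if mismatched_yield <= 0:
        raise ValueError("mismatched-primer yield must be positive")
    return matched_yield / mismatched_yield


def fold_increase(modified: tuple, wild_type: tuple) -> float:
    """Fold change in discrimination ratio of a modified polymerase over wild type.

    Each argument is a (matched_yield, mismatched_yield) pair.
    """
    return discrimination_ratio(*modified) / discrimination_ratio(*wild_type)
```

A result of 2.0 or more from `fold_increase` would correspond to the at least 2-fold increase in specificity referred to above.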
- Specificity can also be measured, for example, in a real-time PCR, where the difference in the Ct (threshold cycle) values (ΔCt) between the fully
- the Ct value represents the number of cycles required to generate a detectable amount of DNA (a "detectable" amount of DNA is typically 2 times, usually 5 times, 10 times, 100 times or more above background).
- a polymerase with enhanced specificity may be able to produce a detectable amount of DNA in a smaller number of cycles by more closely approaching the theoretical maximum amplification efficiency of PCR. Accordingly, a lower Ct value reflects a greater
- Polymerase processivity can be measured by a variety of methods known to those of ordinary skill in the art. Polymerase processivity is generally defined as the number of nucleotides incorporated during a single binding event of a modifying enzyme to a primed template. For example, a 5'FAM-labeled primer is annealed to circular or linearized ssM13mp18 DNA to form a primed template. In measuring processivity, the primed template usually is present in significant molar excess to the polymerase so that the chance of any primed template being extended more than once by the polymerase is minimized.
- the primed template is therefore mixed with the polymerase at a ratio such as approximately 4000:1 (primed DNA: DNA polymerase) in the presence of buffer and dNTPs.
- MgCb is added to initiate DNA synthesis.
- Samples are quenched at various times after initiation, and analyzed on a sequencing gel.
- the length corresponds to the processivity of the enzyme.
- the processivity of a protein of the invention i.e., a modified polymerase, is then compared to the processivity of the wild-type enzyme (an unmodified polymerase).
- the modified polymerases of the invention are expected to exhibit increased processivity relative to the unmodified polymerase.
- Enhanced efficiency can also be demonstrated by measuring the increased ability of an enzyme to produce product
- Such an analysis measures the stability of the double-stranded nucleic acid duplex indirectly by determining the amount of product obtained in a reaction.
- a PCR assay can be used to measure the amount of PCR product obtained with a short, e.g., 12 nucleotides in length, primer annealed at an elevated temperature, for example, 50°C.
- enhanced efficiency is shown by the ability of a modified polymerase to produce more product in a PCR reaction using the 12-nucleotide primer annealed at 50°C in comparison to an unmodified polymerase.
- Long PCR may be used as another method of demonstrating enhanced efficiency.
- an enzyme with enhanced efficiency typically allows the amplification of a long amplicon (>5 kb) in a shorter extension time compared to an enzyme with relatively lower efficiency.
- Assays such as salt sensitivity can also be used to demonstrate improvement in efficiency of a processive nucleic acid modifying enzyme of the invention.
- a modified polymerase of the invention can exhibit increased tolerance to high salt concentrations, i.e., a processive enzyme with increased processivity can produce more product in higher salt concentrations.
- a PCR analysis can be performed to determine the amount of product obtained in a reaction using a modified polymerase compared to an unmodified polymerase in reaction conditions with high salt, for example, 80 mM.
- the fidelity of DNA polymerase refers to its ability to accurately replicate a template.
- High-fidelity PCR utilizing modified DNA polymerase variants that couple low misincorporation rates with proofreading activity to give faithful replication of a DNA target is also contemplated.
- Modified DNA polymerase variants selected from SEQ ID NOS: 3-242 that show high fidelity and proofreading activity are claimed. Additional variants with proofreading activity and low misincorporation rates are also contemplated.
- Variants of SEQ ID NO: 1, that is, modified polymerases comprising one or more of the substitutions shown in Table 1, are anticipated to possess improved properties relative to naturally occurring DNA polymerases, conferring higher activity in amplifying sequences from difficult-to-amplify templates, including polynucleotide templates (DNA or RNA) from human cells.
- Modified polymerase variants comprising one or more of the substitutions shown in Table 2 are anticipated to possess improved properties relative to naturally occurring DNA polymerases, conferring higher activity in amplifying sequences from difficult-to-amplify templates, including polynucleotide templates from human cells.
- Modified polymerase variants comprising one or more of the substitutions shown in Table 3 are anticipated to possess improved properties relative to naturally occurring DNA polymerases, conferring higher activity in amplifying sequences from difficult-to-amplify templates, including polynucleotide templates from human cells.
- Modified polymerase variants comprising one or more combinations of substitutions from Table 3 relative to SEQ ID NO: 1 are shown in Table 4 and are anticipated to possess improved properties relative to naturally occurring DNA polymerases, conferring higher activity in amplifying sequences from difficult-to-amplify templates, including polynucleotide templates from human cells.
- DNA polymerase sequences SEQ ID NO: 3-242 are anticipated to possess improved properties conferring higher activity in amplifying sequences from difficult-to-amplify templates, including polynucleotide templates from human cells.
- the modified polymerases may be included in vectors and host cells comprising such vectors.
- the modified polymerases may be provided as a vector or a purified protein as a component of a kit.
- Kits may include other components such as buffers, salts, primers and may be used for applications such as amplification, e.g., PCR and/or sequencing (e.g., next-generation sequencing (NGS) platforms).
- Modified polymerases may be used for quantification and quality assessment of human genomic DNA samples prior to NGS library construction.
- the kit includes a qPCR Master mix with the modified polymerase, optimized for high-performance SYBR Green I-based qPCR, a pre-diluted set of DNA standards, and primer premixes targeting different portions of a highly conserved single-copy human locus. Absolute quantification is achieved with the primer pair defining the shortest fragment, whereas the additional primers are used to derive information about the amount of amplifiable template in the DNA sample.
- Quality scores (or Q-ratios) generated with the kit may be used to predict the outcome of library construction, or tailor workflows for samples of variable quality, particularly formalin-fixed paraffin-embedded (FFPE) DNA, samples obtained by laser-capture microdissection of fresh, frozen, or FFPE tissue, DNA extracted from cells collected by flow cytometry, free circulating DNA from plasma or serum, forensic samples or any other low-concentration or precious clinical sample.
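One plausible way to compute such Q-ratios is sketched below. The text does not define the calculation, so this is an assumption: each longer amplicon's measured concentration is divided by the concentration obtained with the shortest amplicon. The amplicon lengths and concentrations are hypothetical.

```python
from typing import Dict


def q_ratios(conc_by_amplicon_bp: Dict[int, float]) -> Dict[int, float]:
    """Concentration from each longer amplicon relative to the shortest amplicon.

    Assumed definition: a ratio near 1 suggests mostly intact template, while a
    lower ratio suggests that less of the template is amplifiable at that length.
    """
    if len(conc_by_amplicon_bp) < 2:
        raise ValueError("need the shortest amplicon plus at least one longer one")
    shortest = min(conc_by_amplicon_bp)
    reference = conc_by_amplicon_bp[shortest]
    if reference <= 0:
        raise ValueError("reference concentration must be positive")
    return {
        length: conc / reference
        for length, conc in sorted(conc_by_amplicon_bp.items())
        if length != shortest
    }


# Hypothetical qPCR results in ng/µl, keyed by amplicon length in bp.
example = q_ratios({41: 10.0, 129: 8.0, 305: 4.0})
```

In this made-up example the ratio falls as the amplicon gets longer, the pattern one would expect from fragmented DNA such as FFPE material.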
- Modified polymerases may be used for DNA library preparation. Modified polymerases that enable higher yields and lower amplification bias translate to higher library diversity, lower duplication rates and more uniform coverage.
- a method comprising amplifying sequences from a target polynucleotide, for example, polymerase chain reaction (PCR) or reverse transcription using a modified polymerase.
- the polymerase may be selected from SEQ ID NOS: 3-242, in particular SEQ ID NO: 190.
- kits comprising modified polymerases, other than natural sequences, comprising one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more of the substitutions shown in Table 1, Table 2 or Table 3 may also be used.
- Amplifying sequences from a target polynucleotide using modified polymerases, other than a natural sequence, that have at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% sequence identity to polymerase of SEQ ID NO: 190 is also
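The sequence identity thresholds above can be checked with a short function. This sketch assumes the two sequences have already been aligned to equal length, with `-` marking gaps, and it counts identity over all alignment columns. That is only one of several conventions, and the text does not say which one applies; the sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True when identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two hypothetical ten-residue sequences that differ at one position score 90% and would pass an 85% threshold but not a 95% one.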
- the target polynucleotide may be DNA or RNA.
- the target polynucleotide may be from a human cell or a murine cell.
- Modified DNA polymerase variants were expressed in E. coli using the rhamnose inducible system in vector pD861 (from ATUM/DNA2.0). Variants were purified in parallel using heat cut, host cell DNA removal and column chromatography and stored in appropriate storage buffer. Purified protein was characterized for i) purity on SDS-PAGE; ii) polymerase activity; iii) host cell DNA contamination test by 16S-PCR; iv) exonuclease activity and; v) thermostability test.
- DNA polymerase activity was determined using a DNA polymerase assay based on the detection of incorporated radioisotope-labeled dNTP in a DNA elongation reaction. The reaction was incubated at 70°C for 10 minutes. Activity of the DNA polymerase (DNAP) variants was compared to other available polymerases Accura DNAP, GoTaq and KAPA (Table 6).
- the parent DNA polymerase (Accura DNA polymerase) has an inherent 3'-5' proofreading exonuclease activity. Exonuclease activity of the modified DNA polymerase (DNAP) variants was measured to determine if the purified DNAP variants maintained the exonuclease activity. Nuclease activity of the modified DNA polymerase (DNAP) variants was compared to other available polymerases Accura DNAP, GoTaq and KAPA. The reaction mixture including 10 units of enzyme and 3 P-labeled DNA fragment in 25 µl 1X DNAP buffer B (Lucigen) was incubated at 65°C for 16 hours. Reactions were analyzed by trichloroacetic acid precipitation method.
- DNA polymerase variants were tested by incubating the enzymes in 1X DNAP buffer B (Lucigen) at 98°C for 2 minutes. Thermostability of the DNA polymerase (DNAP) variants was compared to other available polymerases Accura DNAP, GoTaq and KAPA (Table 0). The polymerase activity was determined as described in example 6.1.2.
- Amplification properties of DNA polymerase variants were determined by PCR of a 2.8 kb Pol A target DNA from E. coli genomic DNA (gDNA). PCR reactions were run at two denaturation temperatures, 94°C and 98°C, to determine thermostability. 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 200 µM dNTP, 10 ng E. coli gDNA, 1 µM each primer, 5 U DNA polymerase; cycling conditions: 94°C for 2 mins; 35 cycles of 94°C for 15 seconds, 60°C for 30 seconds, 72°C for 2 mins; 72°C for 10 mins; hold at 4°C.
- A second PCR reaction set was run with cycling conditions: 94°C for 2 mins; 35 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 2 mins; 72°C for 10 mins; hold at 4°C.
- Accura DNA polymerase and KAPA DNA polymerase were included as controls.
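The two cycling programs above differ only in the denaturation temperature used within each cycle. A small data structure makes that explicit and gives the total programmed hold time. The class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in °C, hold time in seconds)


@dataclass
class CyclingProgram:
    initial: Step
    cycle: List[Step]
    cycles: int
    final_extension: Step

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding ramping and the final 4°C hold."""
        per_cycle = sum(seconds for _, seconds in self.cycle)
        return self.initial[1] + self.cycles * per_cycle + self.final_extension[1]


def program(denaturation_c: float) -> CyclingProgram:
    """Program from the text, parameterized by the in-cycle denaturation temperature."""
    return CyclingProgram(
        initial=(94, 120),
        cycle=[(denaturation_c, 15), (60, 30), (72, 120)],
        cycles=35,
        final_extension=(72, 600),
    )


program_94 = program(94)
program_98 = program(98)
```

Both programs have the same programmed hold time of 6495 seconds (about 108 minutes), so any difference in yield between them reflects the denaturation temperature and not the run length.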
- Amplification properties of DNA polymerase variants were determined by PCR of a 5 kb and 10 kb target DNA from E. coli genomic DNA (gDNA). PCR reactions were run at two denaturation temperatures, 94°C and 98°C, to determine thermostability. 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 200 µM dNTP, 10 ng E. coli gDNA, 1 µM each primer, 5 U DNA polymerase; cycling conditions: 94°C for 2 mins; 35 cycles of 94°C for 15 seconds, 60°C for 30 seconds, 72°C for 2 mins; 72°C for 10 mins; hold at 4°C.
- A second PCR reaction set was run with cycling conditions: 94°C for 2 mins; 35 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 2 mins; 72°C for 10 mins; hold at 4°C.
- Accura DNA polymerase and KAPA DNA polymerase were included as controls.
- PCR reactions were analyzed by agarose gel electrophoresis.
- V302 (SEQ ID NO: 148), V303 (SEQ ID NO: 149), V305 (SEQ ID NO: 151), V307 (SEQ ID NO: 153), V308 (SEQ ID NO: 154), V309 (SEQ ID NO: 155), V310 (SEQ ID NO: 156), V311 (SEQ ID NO: 157), V313 (SEQ ID NO: 159), V314 (SEQ ID NO: 160), V315 (SEQ ID NO: 161), V316 (SEQ ID NO: 162), V320 (SEQ ID NO: 166), V321 (SEQ ID NO: 167), V322 (SEQ ID NO: 168), V324 (SEQ ID NO: 170), V327 (SEQ ID NO: 173), V331 (SEQ ID NO: 177), V334 (SEQ ID NO: 180), V335 (SEQ ID NO: 181), V339 (SEQ ID NO: 148), V303 (SEQ ID NO:
- Modified DNA polymerase variants efficiently amplified a longer human DNA target (5 kb) by PCR.
- 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 50 ng human DNA, 2.5 µM P5/P7 primer, 5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 30 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 2.5 mins; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 30 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 2.5 mins; 72°C for 10 mins; hold at 4°C.
- Accura DNA polymerase and KAPA DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis (Figure 6A) and on a Bioanalyser (Figure 6B).
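Setting up a reaction like this from stock solutions is a C1·V1 = C2·V2 calculation. The sketch below takes the 25 µl reaction volume and the 1X buffer, 300 µM dNTP and 2.5 µM primer final concentrations from the text. The stock concentrations (5X buffer, 10 mM dNTP, 25 µM primer mix) are hypothetical, since the text does not state them.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, reaction_volume_ul: float) -> float:
    """Solve C1*V1 = C2*V2 for V1; both concentrations must share a unit."""
    if stock_conc <= 0 or final_conc < 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_UL = 25.0

# (final concentration, stock concentration); stock values are assumptions.
COMPONENTS = {
    "HiF buffer (X)": (1.0, 5.0),
    "dNTP (µM)": (300.0, 10000.0),
    "P5/P7 primer (µM)": (2.5, 25.0),
}

volumes = {
    name: stock_volume_ul(final, stock, REACTION_UL)
    for name, (final, stock) in COMPONENTS.items()
}
# Volume left for template, polymerase and water.
remaining_ul = REACTION_UL - sum(volumes.values())
```

With these assumed stocks, the reaction takes 5 µl buffer, 0.75 µl dNTP and 2.5 µl primer mix, leaving 16.75 µl for template, enzyme and water.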
- Modified DNA polymerase variants identified show good NGS human DNA library amplification. 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 20 pg human DNA library, 2.5 µM P5/P7 primer, 5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 18 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 18 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C. Accura DNA polymerase and KAPA DNA polymerase were included as controls.
- Modified DNA polymerase variants V304 (SEQ ID NO: 150), V309 (SEQ ID NO: 155), V313 (SEQ ID NO: 159), V318 (SEQ ID NO: 164), V319 (SEQ ID NO: 165), V320 (SEQ ID NO: 166), V327 (SEQ ID NO: 173), V339 (SEQ ID NO: 185), V344 (SEQ ID NO: 190), V397 (SEQ ID NO: 242) efficiently amplified high GC-content NGS DNA library from Rhodobacter giving high yields of amplicons.
- Modified DNA polymerase variants successfully amplified high AT-content NGS DNA library from Staphylococcus. 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 20 pg Staphylococcus DNA library, 2.5 µM P5/P7 primer, 5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 18 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 18 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C. Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis (Figure 9) and on a Bioanalyser (data
- Modified DNA polymerase variants V313 (SEQ ID NO: 159), V320 (SEQ ID NO: 166), V344 (SEQ ID NO: 190), V368 (SEQ ID NO: 214) efficiently amplified high AT-content NGS DNA library from Staphylococcus giving high yields of amplicons.
- Primer extension was set up as follows in a 25 µl reaction mixture containing 1X DNAP buffer B, 0.5 µg M13 single-stranded DNA (ssDNA), 0.4 µM M13 primer, 250 µM dNTPs, 0.1 µg/µl BSA and 5 U of polymerase. Reactions were incubated at 65°C for 30 minutes and analyzed by agarose gel electrophoresis (Figure 10).
- RT-PCR 25 µl reaction mixture was set up as follows: 1X HF buffer, 200 µM dNTP, 1 ng MS2 RNA, 0.5 µM each primer, 5 mM PCF if needed, 2.5 U polymerase; RT-PCR cycling conditions: 65°C for 2 mins, 94°C for 2 mins; 35 cycles: 94°C for 15 seconds, 60°C for 30 seconds, 72°C for 1 minute; 72°C for 10 mins, hold at 4°C.
- RT-PCR reactions were analyzed by agarose gel electrophoresis (Figures 11A and 11B). A subset of samples was re-analyzed on an agarose gel to confirm positives (Figure 11C).
- PCR reaction and cycling conditions were the same as sections 6.1.5 and 6.1.6 above except using 2.5 U polymerase for the 3 kb and 5 kb templates.
- 25 µl PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 10 ng E. coli gDNA, 0.5 µM each primer, 2.5 U DNA polymerase; cycling conditions: 94°C for 2 mins, 35 cycles: 94°C for 15 seconds, 60°C for 30 seconds, 72°C for 5 mins; 72°C for 10 mins, hold at 4°C (Figure 15B panel I).
- A second set was run under the same cycling conditions with 30 cycles (Figure 15B panel II).
- PCR reaction and cycling conditions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 10 ng human gDNA, 0.5 µM primer, 2.5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 30 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1.5 mins; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 30 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 1.5 mins; 72°C for 10 mins; hold at 4°C.
- Accura DNA polymerase and KAPA DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis (Figure 16).
- PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 20 pg DNA library, 2.5 µM P5/P7 primer, 2.5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 18 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 18 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C. Accura DNA polymerase and KAPA DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis (Figure 17) and on a Bioanalyser (data not shown).
- PCR reactions were set up as follows: 1X HiF buffer (Lucigen), 300 µM dNTP, 20 pg Staphylococcus DNA library, 2.5 µM P5/P7 primer, 2.5 U DNA polymerase; cycling conditions: 95°C for 2 mins; 18 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C; cycling conditions for KAPA polymerase: 95°C for 3 mins; 18 cycles of 98°C for 20 seconds, 65°C for 15 seconds, 72°C for 1 minute; 72°C for 10 mins; hold at 4°C.
- Accura (Acc) DNA polymerase and KAPA (KA) DNA polymerase were included as controls. PCR reactions were analyzed by agarose gel electrophoresis (Figure 18) and by Bioanalyser (data not shown).
- Second round and third round variant screening is summarized in Table 12 below.
- PCR optimization was carried out first in the presence of sugars, for example sucrose, trehalose, mannitol (Figure 19A) and sorbitol (Figure 19B), followed by addition of bovine serum albumin (BSA) or L-carnitine (Figure 20).
- L-carnitine helped with a more productive PCR.
- Sugar and L-carnitine together showed an additive effect on amplification of a 3 kb CometGFP DNA and a 5 kb human gDNA target (Figures 21A and 21B).
- Buffer compositions, for example Tris-HCl (10-50 mM), Tris-acetate (10-50 mM) or Bis-Tris propane (10-50 mM) with a pH range of 8.0-9.3; salts, for example KCl (10-100 mM) and ammonium sulfate (5-50 mM); metal, for example magnesium chloride (1-5 mM) or magnesium sulfate (1-5 mM); non-ionic detergents, for example Triton X-100 (0.1-1%), NP-40 (0.1-1%), Tween 20 (0.1-1%), Brij58 (0.1-1%) or CHAPS (0.1-1%); and dNTPs (50-500 µM) were tested on 3 kb and 2.8 kb E. coli targets (Figures 22A and 22B respectively); buffer combinations are shown in Table 13.
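The component ranges listed above lend themselves to a simple screening grid. The sketch below stores a subset of the stated ranges, checks whether a candidate formulation falls inside them, and enumerates the low/high corner combinations for chosen components. The dictionary keys and function names are hypothetical, and the grid is illustrative only: it does not reproduce the buffer combinations of Table 13.

```python
from itertools import product
from typing import Dict, List

# (low, high) bounds taken from the ranges stated in the text.
RANGES = {
    "tris_hcl_mM": (10.0, 50.0),
    "kcl_mM": (10.0, 100.0),
    "ammonium_sulfate_mM": (5.0, 50.0),
    "mgcl2_mM": (1.0, 5.0),
    "triton_x100_pct": (0.1, 1.0),
    "dntp_uM": (50.0, 500.0),
    "pH": (8.0, 9.3),
}


def in_range(formulation: Dict[str, float]) -> bool:
    """True when every listed component lies within its stated range."""
    return all(RANGES[key][0] <= value <= RANGES[key][1]
               for key, value in formulation.items())


def corner_grid(components: List[str]) -> List[Dict[str, float]]:
    """All low/high combinations for the chosen components (2**n formulations)."""
    return [dict(zip(components, values))
            for values in product(*(RANGES[key] for key in components))]


grid = corner_grid(["kcl_mM", "mgcl2_mM", "pH"])
```

Three components give eight corner formulations. Adding intermediate levels or more components grows the grid quickly, which is one reason to screen a chosen subset of combinations.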
- PCR reactions were set up as follows: 25 µl of 1X buffer as designed (Table 12), 5 pg of plasmid DNA, 0.5 µM primers, 2.5 U of hotstart polymerase V344 (SEQ ID NO: 190) with or without additive; cycling conditions: 95°C for 2 minutes; 30 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1.5 minutes; 72°C for 10 min. A summary of results is shown in Table 14.
- PCR reactions with standard PCR buffer were tested on CometGFP DNA as template with addition of sorbitol (Figure 23A), KCl (Figure 23B) or non-ionic detergents (Figure 23C).
- PCR reactions were set up as follows: 25 µl of 1X basic buffer, pH range 8.0-9.3 (Figure 23A), 200 µM dNTP, 5 pg of CometGFP DNA, 0.5 µM primers, 15% sorbitol added to a subset of reactions, 2.5 U of hotstart polymerase V344 (SEQ ID NO: 190) with or without additive; cycling conditions: 95°C for 2 minutes; 30 cycles of 98°C for 15 seconds, 60°C for 30 seconds, 72°C for 1.5 minutes; 72°C for 10 min. Reactions were analyzed by agarose gel electrophoresis.
- Models are based on the set of infologs described and assess the relative contribution of substitutions within the set. Model weights are shown for activities that are desirable in the modified variants and are used for selection of substitutions.
- V645A 6.35 S571L 5.28 E511L 4.15 V601L 3.58
- V602F 1.66 V343E 2.23 P560Y 1.78 F576I 1.48
- V358A 0.05 V343S 0.02 Q513D 0.00 P495Y 0.00
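The weight rows above pair a substitution (wild-type residue, position, replacement residue) with a numeric model weight. A short parser can turn them into structured records for sorting or filtering. The column meaning is not recoverable from the flattened rows, so the sketch treats every pair alike; ranking across all pairs is an assumption made for illustration, and the names used are hypothetical.

```python
import re
from typing import List, NamedTuple


class WeightedSubstitution(NamedTuple):
    wild_type: str
    position: int
    substitute: str
    weight: float


# One substitution such as "V645A" followed by its weight such as "6.35".
PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])\s+(-?\d+(?:\.\d+)?)")


def parse_weights(text: str) -> List[WeightedSubstitution]:
    """Extract every substitution/weight pair found in the text."""
    return [
        WeightedSubstitution(wt, int(pos), sub, float(weight))
        for wt, pos, sub, weight in PATTERN.findall(text)
    ]


ROWS = """V645A 6.35 S571L 5.28 E511L 4.15 V601L 3.58
V602F 1.66 V343E 2.23 P560Y 1.78 F576I 1.48
V358A 0.05 V343S 0.02 Q513D 0.00 P495Y 0.00"""

entries = parse_weights(ROWS)
ranked = sorted(entries, key=lambda entry: entry.weight, reverse=True)
```

Sorting by weight puts V645A first. Position 343 appears twice, as V343E and V343S, so the parsed records can also be grouped by position to compare alternative replacements.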
Abstract
The present invention relates to DNA polymerase sequences for in vitro applications, including PCR for the creation of next-generation sequencing libraries, and for RT-PCR.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201662359059P | 2016-07-06 | 2016-07-06 | |
| US62/359,059 | 2016-07-06 | ||
| US201662360627P | 2016-07-11 | 2016-07-11 | |
| US62/360,627 | 2016-07-11 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2018009729A2 (fr) | 2018-01-11 |
| WO2018009729A3 (fr) | 2018-02-15 |
Family
ID=60913159
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2017/040994 (Ceased; WO2018009729A2 (fr)) | Modification of DNA polymerases for in vitro applications | 2016-07-06 | 2017-07-06 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018009729A2 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2021242740A3 (fr) * | 2020-05-26 | 2022-03-10 | Qiagen Beverly Llc | Polymerase enzyme |
| CN115948364A (zh) * | 2022-10-24 | 2023-04-11 | 翌圣生物科技(上海)股份有限公司 | Taq DNA polymerase mutant Taq001 and its encoding gene, expression plasmid and prokaryotic expression host |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007044671A2 (fr) * | 2005-10-06 | 2007-04-19 | Lucigen Corporation | Thermostable viral polymerases and methods of use thereof |
| WO2010091203A2 (fr) * | 2009-02-04 | 2010-08-12 | Lucigen Corporation | RNA- and DNA-copying enzymes |
-
2017
- 2017-07-06 WO PCT/US2017/040994 patent/WO2018009729A2/fr not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2018009729A3 (fr) | 2018-02-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17824937; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 17824937; Country of ref document: EP; Kind code of ref document: A2 |