EP4103702A1 - Variant family d dna polymerases - Google Patents

Variant Family D DNA polymerases

Info

Publication number
EP4103702A1
Authority
EP
European Patent Office
Prior art keywords
family
polymerase
polymerases
dna
amino acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP21710761.4A
Inventor
Andrew F. Gardner
Kelly M. ZATOPEK
Thomas C. Evans
Ece ALPASLAN
Ludovic Sauguet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institut Pasteur
New England Biolabs Inc
Original Assignee
Institut Pasteur
New England Biolabs Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institut Pasteur and New England Biolabs Inc

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/12: Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241: Nucleotidyltransferases (2.7.7)
    • C12N9/1252: DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y207/00: Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07: Nucleotidyltransferases (2.7.7)
    • C12Y207/07007: DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase

Definitions

  • a steric gate amino acid blocks ribonucleotide incorporation at low ribonucleotide concentrations, and altering the steric gate amino acid has been observed to reduce discrimination against ribonucleotides at low concentration.
  • in Family D DNA polymerases, the position of the rNTP 2'-OH in the active site had not been identified.
  • the present disclosure relates to a low resolution cryo-electron microscopy structure of the Pyrococcus abyssi (P.ab) sequence bound to DNA that provides insight into the possible structure of the active site and surrounding domains.
  • amino acid positions potentially close enough to the catalytic site of the polymerase domain (DP2) to interact with a substrate NTP were identified from structural data using PyMOL software (Schrödinger, LLC) and are shown in Tables 1-3.
  • Variant Family D polymerases are provided here that differ from wild type Family D DNA polymerases in their abilities to incorporate rNTPs.
  • the incorporation ratio of adenosine triphosphate to deoxyadenosine triphosphate (rA:dA) may be higher than that of wild type.
  • Use of wild type Family D polymerases may have been limited in the past, in part, by their rNTP/dNTP selectivity.
  • Variant Family D polymerases are not naturally occurring and have at least one substitution relative to wild type in their amino acid sequences.
  • a variant Family D polymerase may have an amino acid sequence comprising a substitution in one or more of the domains shown in Table 1 (optionally, in order, from amino- to carboxy terminal ends).
  • Each of these domains corresponds to the indicated portion of wild type Family D polymerase (e.g., Pyrococcus abyssi; SEQ ID NO:1). Amino acid residues in each of these domains may be located in the active protein near the active site (e.g., within about 20 Å).
  • a variant Family D polymerase may have an amino acid sequence comprising a substitution at one or more positions that may contact the incoming substrate nucleoside triphosphate, for example, positions shown in Table 2.
  • a variant Family D polymerase may have an amino acid sequence comprising a substitution at one or more positions that may be located in the protein near the active site (e.g., within about 12 Å), for example, positions shown in Table 3.
  • a variant Family D polymerase may comprise one or more substitutions at one or more positions selected from positions corresponding to positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO:1.
  • a variant Family D polymerase may have an amino acid sequence comprising one or more sequences having (a) at least 98% identity to domain I, (b) at least 96% identity to domain II, (c) at least 80% identity to domains III and IV, (d) at least 92% identity to domain V, (e) at least 90% identity to domains VI, VII, and IX, (f) at least 87% identity to domain VIII, (g) at least 95% identity to domain X, and (h) at least 94% identity to domain XI.
  • variant Family D polymerases may have a substitution at a position corresponding to position 923 of SEQ ID NO:1 or one or more substitutions in the triplet Pro-His-Thr corresponding to positions 922-924 of SEQ ID NO:1.
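
By way of a non-limiting illustration, the position ranges and per-domain identity thresholds recited above can be written down as data and queried. The sketch below is an assumption-laden example, not part of the disclosure: the names `CANDIDATE_RANGES`, `DOMAIN_MIN_IDENTITY`, `in_candidate_range`, and `domains_meeting_threshold` are hypothetical, and the domain boundaries of Table 1 are not reproduced, so per-domain percent identities must be supplied by the caller.

```python
from typing import Dict, List

# Inclusive position ranges of SEQ ID NO:1 as recited in the text above.
CANDIDATE_RANGES = [
    (106, 161), (243, 267), (326, 330), (361, 365), (385, 397), (441, 451),
    (657, 667), (822, 829), (919, 928), (940, 962), (981, 997),
]

# Minimum percent identity recited for each domain (I through XI).
DOMAIN_MIN_IDENTITY = {
    "I": 98, "II": 96, "III": 80, "IV": 80, "V": 92, "VI": 90,
    "VII": 90, "VIII": 87, "IX": 90, "X": 95, "XI": 94,
}


def in_candidate_range(position: int) -> bool:
    """Return True if a SEQ ID NO:1 position lies in any recited range."""
    return any(start <= position <= end for start, end in CANDIDATE_RANGES)


def domains_meeting_threshold(identities: Dict[str, float]) -> List[str]:
    """Return the domains whose supplied percent identity meets the recited minimum."""
    return [
        domain
        for domain, percent in identities.items()
        if percent >= DOMAIN_MIN_IDENTITY[domain]
    ]
```

For instance, positions 922-924 (the Pro-His-Thr triplet) all fall within the recited range 919-928, whereas position 162 lies just outside the range 106-161.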

Abstract

The present disclosure relates to polymerases (e.g., variants of Family D DNA polymerases) for polynucleotide synthesis, polynucleotide amplification, polynucleotide sequencing, cloning a polynucleotide, or combinations thereof. For example, a variant Family D polymerase may have an amino acid sequence that (a) is at least 99% identical to SEQ ID NO:1 and (b) has a substitution at a position selected from positions corresponding to positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO:1.

Description

VARIANT FAMILY D DNA POLYMERASES
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to U.S. Provisional Application No. 62/976,039 filed February 13, 2020, the entire contents of which are hereby incorporated by reference.
SEQUENCE LISTING STATEMENT
This disclosure includes a Sequence Listing submitted electronically in ASCII format under the file name “NEB-425-l_ST25.txt”. This Sequence Listing is incorporated herein in its entirety by this reference.
BACKGROUND
DNA polymerases are classified into Families A, B, C, D, X, Y and RT according to their amino acid sequences. DNA polymerases have several properties that contribute to their replicative fidelity. For example, some DNA polymerases of Families A, B, and D have proofreading 3' to 5' (3'-5') exonuclease activity. When a DNA polymerase incorporates an incorrect or modified nucleotide, for example, in a primer strand, it detects structural perturbations caused by mispairing or nucleotide modification and transfers the primer strand from the polymerase domain to the 3'-5' exonuclease active site. In addition to correcting errors after they have arisen, DNA polymerases restrict access to their active sites to prevent incorporation of ribonucleotides. For example, Families A, B, X, Y, and reverse transcriptases have a steric gate that excludes rNTPs from the active site by a steric clash between a bulky amino acid side chain in the steric gate and the 2'-OH of such rNTPs. Reducing the size of the side chain at the steric gate position allows DNA polymerases so modified to incorporate a single rNTP as efficiently as dNTP.
Polymerases with such modified steric gates have been extensively employed in molecular biology applications such as single-molecule sequencing, sequencing by synthesis, and single nucleotide polymorphism (SNP) detection. Despite the range of polymerases available now, additional DNA polymerases with different properties and features may further expand the uses to which polymerases may be put.
Structurally, PolD is a heterodimeric enzyme consisting of a large 5'-3' polymerase subunit and a small MRE11-like 3'-5' exonuclease subunit. The activity of each subunit requires the other subunit to be present. Despite having a catalytic core that resembles an RNA polymerase, Family D DNA polymerases preferentially incorporate dNTPs over rNTPs. The molecular basis for this selectivity has not yet been identified, which may have limited the use of wild type Family D polymerases. See:
Cann, I.K., Komori, K., Toh, H., Kanai, S. and Ishino, Y. (1998) A heterodimeric DNA polymerase: evidence that members of Euryarchaeota possess a distinct DNA polymerase. Proc Natl Acad Sci U S A, 95, 14250-14255.
Raia, P., Carroni, M., Henry, E., Pehau-Arnaudet, G., Brule, S., Beguin, P., Henneke, G., Lindahl, E., Delarue, M. and Sauguet, L. (2019) Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases. PLoS Biol, 17, e3000122.
Takashima, N., Ishino, S., Oki, K., Takafuji, M., Yamagami, T., Matsuo, R., Mayanagi, K. and Ishino, Y. (2019) Elucidating functions of DP1 and DP2 subunits from the Thermococcus kodakarensis family D DNA polymerase. Extremophiles, 23, 161-172.
Ishino, Y., Komori, K., Cann, I.K. and Koga, Y. (1998) A novel DNA polymerase family found in Archaea. Journal of Bacteriology, 180, 2232-2236.
Greenough, L., Menin, J.F., Desai, N.S., Kelman, Z. and Gardner, A.F. (2014) Characterization of Family D DNA polymerase from Thermococcus sp. 9 degrees N. Extremophiles, 18, 653-664.
Schermerhorn, K.M. and Gardner, A.F. (2015) Pre-steady-state Kinetic Analysis of a Family D DNA Polymerase from Thermococcus sp. 9°N Reveals Mechanisms for Archaeal Genomic Replication and Maintenance. J. Biol. Chem., 290, 21800-21810.
Astatke, M., Ng, K., Grindley, N.D. and Joyce, C.M. (1998) A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc Natl Acad Sci U S A, 95, 3402-3407.
Shen, Y., Musti, K., Hiramoto, M., Kikuchi, H., Kawarabayashi, Y. and Matsui, I. (2001) Invariant Asp-1122 and Asp-1124 are essential residues for polymerization catalysis of family D DNA polymerase from Pyrococcus horikoshii. J. Biol. Chem., 276(29), 27376-27383. http://doi.org/10.1074/jbc.M011762200.
SUMMARY
The present disclosure relates, in some embodiments, to variant Family D polymerases. For example, a variant Family D polymerase may have an amino acid sequence that (a) is according to SEQ ID NO:1 and (b) includes a substitution at a position corresponding to a position selected from positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO:1.
BRIEF DESCRIPTION OF THE FIGURE
FIGURE 1 shows an alignment of Family D polymerase sequences from Pyrococcus abyssi (P.ab) and Euryarchaeota (9°N PolD-L). Based on a cryo-electron microscopy structure of the P.ab sequence bound to DNA, amino acid residues that (a) appear to be within 20 Å of the catalytic site of polD are shown under a solid bar, (b) appear to be within 12 Å of the 2'-OH group of the substrate nucleotide are underlined, and (c) may contact the substrate nucleotide are shown in bold.
DETAILED DESCRIPTION
Compositions, methods and kits are provided here that improve, among other things, the synthesis of DNA that contains ribonucleotides using a class of polymerases that do not belong to the DNA polymerase A or B families. Family D polymerases preferentially incorporate deoxyribonucleoside triphosphates relative to larger ribonucleoside triphosphates, which may be attributed to a steric gate motif limiting substrate access to the catalytic site. Variant Family D polymerases with modifications in or modifications impacting this steric gate are provided in some embodiments disclosed herein. For example, variant Family D polymerases may have a reduced ability to exclude ribonucleotides from the active site.
Aspects of the present disclosure can be further understood in light of the embodiments, section headings, figures, descriptions and examples, none of which should be construed as limiting the entire scope of the present disclosure in any way. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the disclosure.
Each of the individual embodiments described and illustrated herein has discrete components and features which can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present teachings. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Still, certain terms are defined herein with respect to embodiments of the disclosure and for the sake of clarity and ease of reference. Sources of commonly understood terms and symbols may include: standard treatises and texts such as Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); Singleton, et al., Dictionary of Microbiology and Molecular Biology, 2d ed., John Wiley and Sons, New York (1994), and Hale & Markham, the Harper Collins Dictionary of Biology, Harper Perennial, N.Y. (1991) and the like.
As used herein and in the appended claims, the singular forms “a” and “an” include plural referents unless the context clearly dictates otherwise. For example, the term “a protein” refers to one or more proteins, i.e., a single protein and multiple proteins. It is further noted that the claims can be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements or use of a “negative” limitation.
Numeric ranges are inclusive of the numbers defining the range. All numbers should be understood to encompass the midpoint of the integer above and below the integer, i.e., the number 2 encompasses 1.5-2.5. The number 2.5 encompasses 2.45-2.55, etc. When sample numerical values are provided, each alone may represent an intermediate value in a range of values and together may represent the extremes of a range unless specified.
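
By way of a non-limiting illustration, the convention above can be read as "half a unit of the last stated digit on either side". The following sketch reproduces the two examples given (2 and 2.5) under that reading. The function name `implied_range` is hypothetical, and passing the number as a string is a design assumption made so that the number of stated decimal places is preserved.

```python
from decimal import Decimal
from typing import Tuple


def implied_range(number: str) -> Tuple[Decimal, Decimal]:
    """Return the (low, high) range a stated number is understood to encompass.

    The number is given as text so that "2" and "2.0" remain distinguishable.
    """
    value = Decimal(number)
    # Exponent of the last stated digit: 0 for "2", -1 for "2.5".
    exponent = value.as_tuple().exponent
    # Half a unit of the last stated digit: 0.5 for "2", 0.05 for "2.5".
    half_step = Decimal(5).scaleb(exponent - 1)
    return value - half_step, value + half_step
```

Under this reading, `implied_range("2")` gives 1.5 to 2.5 and `implied_range("2.5")` gives 2.45 to 2.55, matching the examples in the text.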
As used herein, “buffering agent” refers to an agent that allows a solution to resist changes in pH when acid or alkali is added to the solution. Examples of suitable non-naturally occurring buffering agents that may be used in the compositions, kits, and methods of the disclosure include, for example, any of Tris, HEPES, TAPS, MOPS, tricine, and MES.
With respect to an amino acid residue or a nucleotide base position, “corresponding to” refers to positions that lie across from one another when sequences are aligned, e.g., by the BLAST algorithm. An amino acid position in a functional or structural motif in one polymerase may correspond to a position within a functionally equivalent functional or structural motif in another polymerase.
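
By way of a non-limiting illustration, once two sequences have been aligned (e.g., by BLAST), a position in one sequence can be mapped to the position lying across from it in the other by walking the alignment columns. The sketch below assumes the alignment is already available as two equal-length gapped strings using "-" as the gap character. The function name `corresponding_position` and the short toy sequences in the usage note are hypothetical and are not taken from any sequence of the disclosure.

```python
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_query: str, ref_pos: int, gap: str = "-"
) -> Optional[int]:
    """Map a 1-based position in the reference to the aligned query position.

    Returns None when the reference residue lies across from a gap.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0    # residues seen so far in the reference
    query_count = 0  # residues seen so far in the query
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_count += 1
        if query_char != gap:
            query_count += 1
        if ref_char != gap and ref_count == ref_pos:
            return query_count if query_char != gap else None
    raise ValueError("ref_pos is outside the reference sequence")
```

With the toy alignment `MK-PHTL` over `MKAPH-L`, reference position 3 (P) corresponds to query position 4, while reference position 5 (T) lies across from a gap and has no corresponding position.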
As used herein, “DNA polymerase” refers to an enzyme that is capable of replicating DNA and optionally may have exonuclease activity. As used herein, “DNA template” refers to the DNA strand read by a DNA polymerase and of which a copy is synthesized.
As used herein, “Family D polymerase” (also, “polD” or “Family D DNA polymerase”) refers to a heterodimeric archaeal DNA polymerase having a small exonuclease subunit (DP1) and a large polymerase subunit (DP2). Family D polymerases may be produced in cells in an immature form and then undergo post-translational processing (e.g., removal of amino-terminal sequences, inteins, or both). Examples of Family D polymerases may include 9°N PolD (Euryarchaeota), Genbank Accession No. KPV61551.1 (Bathyarchaeota), Accession No. RNJ72434.1 (Thaumarchaeota), Accession No. PUA31350.1 (Aigarchaeota), Accession No. RLG53083.1 (Korarchaeota), Accession No. OLS17959.1 (Odinarchaeota), Accession No. TET59439.1 (Lokiarchaeota), Accession No. TFH09197.1 (Thorarchaeota), and Accession No. OLS25708.1 (Heimdallarchaeota).
As used herein, “fusion protein” refers to a protein composed of a plurality of polypeptide components that are un-joined in their native state. Fusion proteins may be a combination of two, three or four or more different proteins. The term polypeptide is not intended to be limited to a fusion of two heterologous amino acid sequences. A fusion protein may have one or more heterologous domains added to the N-terminus, C-terminus, and/or the middle portion of the protein. If two parts of a fusion protein are “heterologous”, they are not part of the same protein in its natural state. Examples of fusion proteins include a variant Family D polymerase fused to an SS07 DNA binding peptide (see for example, US Patent 6,627,424), a transcription factor (see for example, US Patent 10,041,051), a binding protein suitable for immobilization such as maltose binding domain (MBP), a histidine tag (“His-tag”), chitin binding domain (CBD) or a SNAP-Tag® (New England Biolabs, Ipswich, MA (see for example US Patents 7,939,284 and 7,888,090)). The binding peptide may be used to improve solubility or yield of the polymerase variant during the production of the protein reagent. Other examples of fusion proteins include a heterologous targeting sequence, a linker, an epitope tag, and a detectable fusion partner, such as a fluorescent protein, β-galactosidase, luciferase, and functionally similar peptides.
As used herein, “NTP” refers to a nucleoside triphosphate including, for example, any deoxyribonucleoside triphosphate (“dNTP”) and any ribonucleoside triphosphate (“rNTP”).
As used herein, “non-naturally occurring” refers to a polynucleotide, polypeptide, carbohydrate, lipid, or composition that does not exist in nature. Such a polynucleotide, polypeptide, carbohydrate, lipid, or composition may differ from naturally occurring polynucleotides, polypeptides, carbohydrates, lipids, or compositions in one or more respects. For example, a polymer (e.g., a polynucleotide, polypeptide, or carbohydrate) may differ in the kind and arrangement of the component building blocks (e.g., nucleotide sequence, amino acid sequence, or sugar molecules). A polymer may differ from a naturally occurring polymer with respect to the molecule(s) to which it is linked. For example, a “non-naturally occurring” protein may differ from naturally occurring proteins in its secondary, tertiary, or quaternary structure, by having a chemical bond (e.g., a covalent bond including a peptide bond, a phosphate bond, a disulfide bond, an ester bond, an ether bond, and others) to a polypeptide (e.g., a fusion protein), a lipid, a carbohydrate, or any other molecule. Similarly, a “non-naturally occurring” polynucleotide or nucleic acid may contain one or more other modifications (e.g., an added label or other moiety) to the 5'-end, the 3'-end, and/or between the 5'- and 3'-ends (e.g., methylation) of the nucleic acid. A “non-naturally occurring” composition may differ from naturally occurring compositions in one or more of the following respects: (a) having components that are not combined in nature, (b) having components in concentrations not found in nature, (c) omitting one or more components otherwise found in naturally occurring compositions, (d) having a form not found in nature, e.g., dried, freeze dried, crystalline, aqueous, and (e) having one or more additional components beyond those found in nature (e.g., buffering agents, a detergent, a dye, a solvent or a preservative).
All publications, patents, and patent applications identified in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
As used herein, “polynucleotide copy” refers to the product of polymerization activity of a DNA polymerase. A polynucleotide copy may comprise deoxyribonucleotides with or without ribonucleotides.
With reference to an amino acid, “position” refers to the place such amino acid occupies in the primary sequence of a peptide or polypeptide numbered from its amino terminus to its carboxy terminus.
As used herein, “substitution” at a position in a comparator amino acid sequence refers to any difference at that position relative to the corresponding position in a reference sequence, including a deletion, an insertion, and a different amino acid, where the comparator and reference sequences are at least 80% identical to each other. A substitution in a comparator sequence, in addition to being different than the reference sequence, may differ from all corresponding positions in naturally occurring sequences that are at least 80% identical to the comparator sequence.
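The definition of “substitution” above can be illustrated with a minimal sketch. It assumes the comparator and reference sequences have already been aligned to equal length with "-" marking gaps, and it assumes percent identity is counted as identical columns divided by alignment length; the specification does not state which identity convention or alignment tool is used, so both are assumptions. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
def percent_identity(comparator: str, reference: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned to equal length, with '-' as the gap symbol,
    and the denominator is the alignment length.
    """
    if len(comparator) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    matches = sum(1 for c, r in zip(comparator, reference) if c == r and c != "-")
    return 100.0 * matches / len(reference)


def differences(comparator: str, reference: str):
    """List (reference_position, reference_residue, comparator_residue) per differing column.

    Positions are 1-based and counted along the reference from its amino terminus.
    A '-' in the comparator is a deletion; a '-' in the reference is an insertion,
    reported at the preceding reference position.
    """
    found = []
    ref_pos = 0
    for c, r in zip(comparator, reference):
        if r != "-":
            ref_pos += 1
        if c != r:
            found.append((ref_pos, r, c))
    return found


# Hypothetical 10-residue toy sequences with one changed residue.
reference = "MKTAYIAKQR"
comparator = "MKTAYLAKQR"
print(percent_identity(comparator, reference))  # 90.0
print(differences(comparator, reference))       # [(6, 'I', 'L')]
```

Under the definition above, a difference found this way would count as a substitution only when the two sequences are at least 80% identical, which the first function can be used to check.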
As used herein, “variant Family D polymerase” (also, “variant Family D DNA polymerase”) refers to a non-naturally occurring archaeal Family D DNA polymerase that has an amino acid sequence that is less than 100% identical to the amino acid sequence of a naturally occurring DNA polymerase from archaea or has a non-naturally occurring chemical modification (e.g., a polypeptide fused to its amino terminal or carboxy terminal end or other chemical modification). A variant amino acid sequence may have at least 95%, at least 97%, at least 98% or at least 99% identity to the amino acid sequence of a naturally occurring Family D polymerase without being 100% identical to any known naturally occurring polymerase. Sequence differences may include insertions, deletions and substitutions of one or more amino acids. A variant Family D polymerase may have a small exonuclease subunit (DP1) and a large polymerase subunit (DP2), with the DP1 subunit having or lacking exonuclease activity. A variant Family D polymerase lacking exonuclease activity may comprise only such portion of the DP1 subunit as necessary to support catalytic activity of the polymerase (DP2) subunit.
In other DNA polymerase families, a steric gate amino acid blocks ribonucleotide incorporation at low ribonucleotide concentrations, and altering the steric gate amino acid has been observed to reduce discrimination against ribonucleotides at low concentration. In Family D DNA polymerases, the position of the rNTP 2'-OH in the active site had not been identified. The present disclosure relates to a low-resolution cryo-electron microscopy structure of the Pyrococcus abyssi (P.ab) sequence bound to DNA that provides insight into the possible structure of the active site and surrounding domains. For example, amino acid positions potentially close enough to the catalytic site of the polymerase domain (DP2) to interact with a substrate NTP (e.g., the rNTP 2'-OH) were identified from structural data using PyMOL software (Schrödinger, LLC) and are shown in Tables 1-3.
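The idea of selecting positions that lie close to a bound substrate can be sketched as a simple distance-cutoff filter. This is an illustration only, not the procedure used to produce Tables 1-3: it does not call PyMOL, the function name is hypothetical, and all coordinates below are invented placeholder values rather than structural data.

```python
import math


def residues_within(residue_atoms, substrate_atoms, cutoff):
    """Return sorted residue positions having any atom within `cutoff` of any substrate atom.

    residue_atoms: dict mapping residue position -> list of (x, y, z) coordinates
    substrate_atoms: list of (x, y, z) coordinates
    cutoff: distance threshold, in the same units as the coordinates (e.g., angstroms)
    """
    near = []
    for position, atoms in residue_atoms.items():
        if any(math.dist(a, s) <= cutoff for a in atoms for s in substrate_atoms):
            near.append(position)
    return sorted(near)


# Invented placeholder coordinates (not taken from any structure).
toy_residues = {
    10: [(0.0, 0.0, 0.0)],
    20: [(30.0, 0.0, 0.0)],
    30: [(0.0, 10.0, 0.0), (0.0, 40.0, 0.0)],
}
toy_substrate = [(0.0, 0.0, 5.0)]

print(residues_within(toy_residues, toy_substrate, cutoff=12.0))  # [10, 30]
```

A residue is kept if any one of its atoms falls inside the cutoff, so a larger cutoff yields a broader list of candidate positions and a smaller cutoff a narrower one.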
Variant Family D polymerases are provided here that differ from wild type Family D DNA polymerases in their abilities to incorporate rNTPs. For example, the incorporation ratio of adenosine triphosphate to deoxyadenosine triphosphate (rA:dA) may be higher than that of wild type. Use of wild type Family D polymerases may have been limited in the past, in part, by their rNTP/dNTP selectivity. Variant Family D polymerases are not naturally occurring and have at least one substitution relative to wild type in their amino acid sequences. For example, a variant Family D polymerase may have an amino acid sequence comprising a substitution in one or more of the domains shown in Table 1 (optionally, in order, from amino- to carboxy-terminal ends). Each of these domains corresponds to the indicated portion of wild type Family D polymerase (e.g., Pyrococcus abyssi; SEQ ID NO: 1). Amino acid residues in each of these domains may be located in the active protein near the active site (e.g., within about 20 Å).
Table 1
A variant Family D polymerase may have an amino acid sequence comprising a substitution at one or more positions that may contact the incoming substrate nucleoside triphosphate, for example, positions shown in Table 2.
Table 2
A variant Family D polymerase may have an amino acid sequence comprising a substitution at one or more positions that may be located in the protein near the active site (e.g., within about 12 Å), for example, positions shown in Table 3.
Table 3
In some embodiments, a variant Family D polymerase may comprise one or more substitutions at one or more positions selected from positions corresponding to positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO: 1. A variant Family D polymerase may have an amino acid sequence comprising one or more sequences having (a) at least 98% identity to domain I, (b) at least 96% identity to domain II, (c) at least 80% identity to domains III and IV, (d) at least 92% identity to domain V, (e) at least 90% identity to domains VI, VII, and IX, (f) at least 87% identity to domain VIII, (g) at least 95% identity to domain X, and (h) at least 94% identity to domain XI.
It may be desirable, in some embodiments, for variant Family D polymerases to have a substitution at a position corresponding to position 923 of SEQ ID NO: 1 or one or more substitutions in the triplet Pro-His-Thr corresponding to positions 922-924 of SEQ ID NO: 1.
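The position ranges recited above lend themselves to a simple lookup. The sketch below encodes the eleven ranges exactly as listed (1-based, inclusive, numbered against SEQ ID NO: 1) and checks whether a given position falls in any of them. The function and constant names are hypothetical, and the sketch assumes the queried position has already been mapped onto SEQ ID NO: 1 numbering; it does not assign ranges to the domains of Table 1.

```python
# Ranges recited in the description, numbered against SEQ ID NO: 1 (1-based, inclusive).
RECITED_RANGES = [
    (106, 161), (243, 267), (326, 330), (361, 365), (385, 397), (441, 451),
    (657, 667), (822, 829), (919, 928), (940, 962), (981, 997),
]


def in_recited_range(position: int) -> bool:
    """Return True if a SEQ ID NO: 1 position lies within any recited range."""
    return any(start <= position <= end for start, end in RECITED_RANGES)


def containing_range(position: int):
    """Return the (start, end) range containing the position, or None if there is none."""
    for start, end in RECITED_RANGES:
        if start <= position <= end:
            return (start, end)
    return None


print(in_recited_range(923))   # True
print(containing_range(923))   # (919, 928)
print(in_recited_range(500))   # False
```

Positions 922-924, the Pro-His-Thr triplet, all fall within the 919-928 range.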

Claims

CLAIMS What is claimed is:
1. A composition, comprising a variant Family D polymerase comprising an amino acid sequence according to SEQ ID NO: 1 and having a substitution at a position corresponding to a position selected from positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO: 1.
EP21710761.4A 2020-02-13 2021-02-12 Variant family d dna polymerases Pending EP4103702A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062976039P 2020-02-13 2020-02-13
PCT/US2021/017954 WO2021163559A1 (en) 2020-02-13 2021-02-12 Variant family d dna polymerases

Publications (1)

Publication Number Publication Date
EP4103702A1 true EP4103702A1 (en) 2022-12-21

Family

ID=74860503

Family Applications (1)

Application Number Title Priority Date Filing Date
EP21710761.4A Pending EP4103702A1 (en) 2020-02-13 2021-02-12 Variant family d dna polymerases

Country Status (2)

Country Link
EP (1) EP4103702A1 (en)
WO (1) WO2021163559A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2792651B1 (en) * 1999-04-21 2005-03-18 Centre Nat Rech Scient GENOMIC SEQUENCE AND POLYPEPTIDES OF PYROCOCCUS ABYSSI, THEIR FRAGMENTS AND USES THEREOF
US6627424B1 (en) 2000-05-26 2003-09-30 Mj Bioworks, Inc. Nucleic acid modifying enzymes
EP1410023B1 (en) 2001-04-10 2006-06-21 Ecole Polytechnique Fédérale de Lausanne Methods using o6-alkylguanine-dna alkyltransferases
US7888090B2 (en) 2004-03-02 2011-02-15 Ecole Polytechnique Federale De Lausanne Mutants of O6-alkylguanine-DNA alkyltransferase
US9963687B2 (en) 2014-08-27 2018-05-08 New England Biolabs, Inc. Fusion polymerase and method for using the same

Also Published As

Publication number Publication date
WO2021163559A1 (en) 2021-08-19


Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20220823

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)