
WO2023125605A1 - Single-molecule nanopore sequencing method - Google Patents

Single-molecule nanopore sequencing method

Info

Publication number
WO2023125605A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotide
pore
stranded
protein
binding protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/142615
Other languages
English (en)
French (fr)
Inventor
徐讯
董宇亮
季州翔
王乐乐
曾涛
黎宇翔
陈奥
章文蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to CN202280086381.2A (CN118591641A)
Priority to US18/725,327 (US20250154581A1)
Priority to EP22914857.2A (EP4458983A1)
Publication of WO2023125605A1
Anticipated expiration
Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/315 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/35 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Mycobacteriaceae (F)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90 Isomerases (5.)
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/483 Physical analysis of biological material
    • G01N33/487 Physical analysis of biological material of liquid biological material
    • G01N33/48707 Physical analysis of biological material of liquid biological material by electrical means
    • G01N33/48721 Investigating individual macromolecules, e.g. by translocation through nanopores
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/03 Fusion polypeptide containing a localisation/targetting motif containing a transmembrane segment
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/20 Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/20 Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C07K2319/22 Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a Strep-tag
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y306/00 Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/04 Hydrolases acting on acid anhydrides (3.6) acting on acid anhydrides; involved in cellular and subcellular movement (3.6.4)
    • C12Y306/04012 DNA helicase (3.6.4.12)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y306/00 Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/04 Hydrolases acting on acid anhydrides (3.6) acting on acid anhydrides; involved in cellular and subcellular movement (3.6.4)
    • C12Y306/04013 RNA helicase (3.6.4.13)

Definitions

  • the present invention relates to a method for polynucleotide sequencing using a nanopore, and in particular, the present invention relates to a method for polynucleotide sequencing using a polynucleotide binding protein covalently bound to a nanopore.
  • NGS: next-generation sequencing technology
  • Nanopore sequencing has the characteristics of fast speed, ultra-long read length, and no need for polymerase amplification and replication.
  • the sequencing speed can reach hundreds of bases per second, and the length of a single sequencing read is tens of kilobases, which can provide more continuous and complete genome assembly.
  • the principle of nanopore sequencing is as follows: transmembrane proteins with nanoscale pore diameters are inserted into a biomimetic membrane to form nanopore channels in the membrane; two electrodes are placed on either side of the membrane; once a voltage is applied, ions pass through the transmembrane protein and generate a current.
  • the analyte passes through the transmembrane protein under the action of an electric field, causing a characteristic change of the ion flow and changing the magnitude of the current (blocking current). Different analytes produce different blocking currents.
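The blocking-current idea can be illustrated with a minimal sketch. It assumes a current trace sampled at regular intervals, estimates the open-pore level as the median of the trace, and reports every run of samples that falls below a fraction of that level. The function name, the 0.7 threshold, and the example trace are illustrative assumptions, not values from the application.

```python
from statistics import median


def find_blockades(current_pa, threshold_fraction=0.7):
    """Return (start, end, mean_current) for runs below a fraction of the open-pore level.

    The open-pore level is estimated as the median of the trace, which assumes
    the pore is unoccupied for most of the recording (an assumption of this sketch).
    """
    open_level = median(current_pa)
    cutoff = threshold_fraction * open_level
    events = []
    start = None
    for i, value in enumerate(current_pa):
        if value < cutoff and start is None:
            start = i  # a blockade begins
        elif value >= cutoff and start is not None:
            segment = current_pa[start:i]  # the blockade has ended
            events.append((start, i, sum(segment) / len(segment)))
            start = None
    if start is not None:  # the trace ended while the pore was still blocked
        segment = current_pa[start:]
        events.append((start, len(current_pa), sum(segment) / len(segment)))
    return events


# Hypothetical trace in picoamperes: open pore near 100 pA with two blockades.
trace = [100, 101, 99, 60, 61, 59, 100, 100, 40, 41, 100, 99, 100]
events = find_blockades(trace)
```

Different analytes would show up here as events with different mean blocking currents; real base calling works on far noisier signals and is considerably more involved.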
  • when the sequencing library is captured by the nanopore, the motor protein slides through the spacer under the action of the electric field, and sequencing begins. Because this approach uses an adapter-motor protein complex and relies on the spacer in the adapter to block movement of the motor protein, the sequencing library cannot be stored for long after library construction, being limited by the stability and shelf life of the motor protein. Moreover, the spacer is composed of multiple modified nucleotides, which cost more than conventional nucleotides.
  • the third-generation nanopore single-molecule sequencing technology includes library construction and sequencing.
  • the current typical method connects the adapter-motor protein complex with the target fragment to be tested to construct a sequencing library.
  • the adapter is Y-shaped and is divided into an upper strand and a lower strand.
  • the upper strand contains the following three parts: (a) a leader sequence, which guides the sequencing library to move faster to the vicinity of the pore under the electric field force; (b) a spacer, which prevents the motor protein from moving along the single-stranded polynucleotide and unwinding the double-stranded polynucleotide; (c) a single-stranded polynucleotide region complementary to the sequence of the lower strand, containing a thymidine deoxynucleotide (dT) at its 3' end.
  • the lower strand consists of two parts: (i) a region complementary to the upper strand and (ii) a region complementary to the tether sequence.
  • the tether contains a cholesterol molecule at one end, which anchors it to the biomimetic membrane. During subsequent sequencing, the tether is incubated with the biomimetic membrane so that it becomes anchored, and the sequencing library can bind to the tether. After a certain voltage is applied, the leader sequence region is captured by the transmembrane protein, and the motor protein moves through the spacer under the electric field force.
  • the double-stranded polynucleotide is unwound into a single-stranded polynucleotide by the motor protein, and the single-stranded polynucleotide passes through the transmembrane protein immobilized on the biomimetic membrane under the action of the electric field and interacts with it, causing a characteristic change in the ion current.
  • without a motor protein, DNA passes through the transmembrane protein too quickly, producing electrical signals with poor resolution.
  • Motor proteins can greatly slow down the speed of DNA as it passes through transmembrane proteins, and motor proteins only hydrolyze one nucleotide at a time.
  • the function of the spacer is to prevent the motor protein from moving along the single-stranded polynucleotide and unwinding the double-stranded polynucleotide in the sequencing buffer; after the sequencing library is captured by the transmembrane protein, the electric field force overcomes this block.
  • the sequencing library cannot be stored for a long time due to the stability and storage period of the motor protein after library construction.
  • the spacer is composed of multiple modified nucleotides, which is more costly than conventional nucleotides.
  • the present application provides improved nanopore single molecule sequencing methods, thereby providing the following aspects.
  • the application provides a construct comprising a pore subunit of a transmembrane protein and a polynucleotide binding protein, wherein the subunit retains its ability to form a pore, and the polynucleotide binding protein is capable of separating the two strands of a double-stranded polynucleotide and/or controlling the movement of a single-stranded polynucleotide through the pore; and the subunit is covalently bound to the polynucleotide binding protein.
  • the covalent association is not a peptide bond formed through translation of a nucleic acid.
  • the covalent association is through an isopeptide bond.
  • the covalent association occurs by chemical ligation (e.g., click chemistry).
  • the subunit is bound to the polynucleotide binding protein by an isopeptide bond.
  • the inventors of the present application have found that the transmembrane protein and the polynucleotide binding protein provided by the present application are connected through isopeptide bonds, which has greater advantages than other connection methods.
  • in one approach, the transmembrane protein-polynucleotide binding protein complex is formed by chemical ligation, and the complex is then inserted into the membrane to form a single-channel nanopore.
  • the inventors found that the transmembrane protein or the polynucleotide binding protein must be modified for this chemical ligation method, which increases complexity and difficulty; moreover, after the chemical ligation reaction is complete, further purification is needed to obtain the transmembrane protein-polynucleotide binding protein complex and to remove the reagents required for the chemical reaction. These reagents can be detrimental to the activity of the transmembrane protein or the polynucleotide binding protein.
  • these reagents also often affect subsequent experiments; for example, some compounds may pass through the transmembrane protein under the electric field force and interfere with the current detection signal, and some compounds may affect the stability of the phospholipid membrane in subsequent experiments, causing the single-channel nanopore to fail.
  • in another approach, the modified porin is first inserted into the phospholipid bilayer membrane to form a single-channel nanopore, the polynucleotide binding protein is then added, and after incubation the transmembrane protein-polynucleotide binding protein complex is formed by chemical ligation.
  • in that case, the ligation of the transmembrane protein and the polynucleotide binding protein must be carried out in the nanopore detection reaction cell, and an appropriate voltage must be applied across the membrane to detect whether the transmembrane protein-polynucleotide binding protein complex has formed.
  • adding the additives or components needed for the chemical ligation may affect the stability of the phospholipid membrane or interfere with detecting whether the transmembrane protein-polynucleotide binding protein complex has formed, thereby affecting subsequent detection.
  • by contrast, forming the isopeptide bond requires no additional substances, so nothing is introduced that interferes with the sequencing or detection reagents used in subsequent experiments.
  • the isopeptide bond is formed by a protein-protein binding pair consisting of a first member (e.g., a first peptide tag) and a second member (e.g., a second peptide tag), wherein the first member and the second member are bound by an isopeptide bond; the first member is optionally linked through a first linker (e.g., a rigid or flexible linker, such as a peptide linker comprising one or more glycines and/or one or more serines) to the N-terminus or C-terminus of the transmembrane protein pore subunit to form the first component, and the second member is optionally linked through a second linker (e.g., a rigid or flexible linker, such as a peptide linker comprising one or more glycines and/or one or more serines) to the N-terminus or C-terminus of the polynucleotide binding protein to form the second component.
  • the first linker is a peptide linker having the amino acid sequence shown in SEQ ID NO: 15.
  • the second linker is a peptide linker having the amino acid sequence shown in SEQ ID NO: 14.
  • the protein-protein binding pair is selected from: SpyCatcher/SpyTag pair, SpyTag002/SpyCatcher002 pair, SpyTag003/SpyCatcher003 pair, isopeptag-N/pilin-N pair, isopeptag/pilin-C pair, SnoopTag/SnoopCatcher pair.
  • the isopeptide bond is formed by the SpyTag003/SpyCatcher003 pair.
  • the SpyCatcher003 is linked to the N-terminus of the subunit.
  • said SpyTag003 is linked to the N-terminus of said polynucleotide binding protein.
  • the SpyCatcher003 has the amino acid sequence shown in SEQ ID NO:4.
  • the SpyTag003 has the amino acid sequence shown in SEQ ID NO:1.
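The layout of the two components described above (binding-pair member, optional linker, then protein, attached at the N- or C-terminus) can be sketched as simple string assembly. The helper name and all bracketed strings are placeholders chosen for illustration; they are not the sequences of SEQ ID NO: 1, 4, 14 or 15.

```python
def build_component(member, protein, linker="", terminus="N"):
    """Join a binding-pair member to a protein, optionally through a peptide linker.

    terminus="N" places the member before the protein (N-terminal fusion);
    terminus="C" places it after the protein (C-terminal fusion).
    """
    if terminus == "N":
        return member + linker + protein
    if terminus == "C":
        return protein + linker + member
    raise ValueError("terminus must be 'N' or 'C'")


# Placeholder strings only (hypothetical), standing in for real amino acid sequences.
CATCHER, TAG = "[SpyCatcher003]", "[SpyTag003]"
PORE_SUBUNIT, HELICASE = "[pore-subunit]", "[helicase]"
LINKER_1, LINKER_2 = "[linker1]", "[linker2]"

# SpyCatcher003 at the N-terminus of the subunit, SpyTag003 at the N-terminus
# of the polynucleotide binding protein, as in the embodiments listed above.
first_component = build_component(CATCHER, PORE_SUBUNIT, LINKER_1, terminus="N")
second_component = build_component(TAG, HELICASE, LINKER_2, terminus="N")
```

The isopeptide bond itself forms between the two expressed proteins after translation, so it does not appear in either string; the sketch only fixes the order of the parts within each component.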
  • the subunit is selected from a subunit of hemolysin, MspA, FraC, ClyA, PA63, CsgG, GspD, XcpQ, Wza, SP1, Phi29 connector, SPP1 connector, T3 connector, T4 connector, T7 connector, potassium ion channel protein, sodium ion channel protein, or calcium ion channel protein.
  • the subunit is further linked to an additional polypeptide selected from a tag, an enzyme cleavage site, a signal or guide peptide, a detectable label, or any combination thereof.
  • the subunit has the amino acid sequence shown in SEQ ID NO: 3 or 17.
  • the sequence shown here does not contain an amino acid (e.g., methionine (Met)) encoded by a start codon (e.g., ATG) at its N-terminus.
  • when a protein is expressed, the first position of the resulting polypeptide chain is often the amino acid encoded by the initiation codon (such as Met).
  • the subunits of the present invention not only include amino acid sequences that do not contain the amino acid encoded by the initiation codon (such as Met) at its N-terminus, but also include amino acid sequences that include the amino acid (such as Met) encoded by the initiation codon at its N-terminus. Therefore, a sequence further comprising an amino acid encoded by an initiation codon (such as Met) at the N-terminus of the above amino acid sequence is also within the protection scope of the present invention.
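As a small illustration of the point about the initiator residue, the sketch below treats a listed sequence and the same sequence with an N-terminal Met as the two covered variants. The function names and the example peptide are hypothetical and are not taken from the sequence listing.

```python
def covered_variants(listed_sequence):
    """Return the listed sequence and the same sequence with an N-terminal Met."""
    return (listed_sequence, "M" + listed_sequence)


def matches_listed_sequence(candidate, listed_sequence):
    """True if the candidate equals the listed sequence with or without the initiator Met."""
    return candidate in covered_variants(listed_sequence)


# Hypothetical short peptide standing in for a listed subunit sequence.
listed = "GSAKDE"
```

Here `matches_listed_sequence("MGSAKDE", listed)` and `matches_listed_sequence("GSAKDE", listed)` are both true, while a sequence with any other N-terminal extension is not matched.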
  • the polynucleotide binding protein is selected from nucleic acid helicases, such as DNA helicases or RNA helicases.
  • the polynucleotide binding protein is selected from Dda, UvrD, Rep, RecQ, PcrA, eIF4A, NS3, Rep, gp41 or T7gp4.
  • the polynucleotide binding protein is further linked to an additional polypeptide selected from a tag, an enzyme cleavage site, a signal peptide or guide peptide, a detectable label, or any combination thereof.
  • the polynucleotide binding protein has an amino acid sequence as shown in SEQ ID NO: 2 or 16.
  • the present application also provides a pore for polynucleotide sequencing comprising at least one construct as described above.
  • the pore is a transmembrane protein pore.
  • the pore comprises at least one construct as described above together with other subunits required to form the pore.
  • the pore comprises sufficient quantities of other subunits required for pore formation.
  • the other subunit required for pore formation is the same or different from the pore subunit of the transmembrane protein in the construct.
  • the other subunits required for pore formation are the same as the transmembrane protein pore subunits in the construct, or, the other subunits required for pore formation are the transmembrane protein pore subunits in the construct.
  • the pore comprises a construct as described above and other subunits required to form the pore.
  • the pore is formed from 1 of the constructs and 7 other subunits required for pore formation.
  • the present application also provides an isolated nucleic acid encoding a construct as described above.
  • the isolated nucleic acid molecule comprises a first nucleotide sequence encoding the first component and a second nucleotide sequence encoding the second component, the first component and the second component are as defined above, said first and second nucleotide sequences are present on the same or different isolated nucleic acid molecules.
  • the first component is a first protein in which the first member is optionally connected to the N-terminus or C-terminus of the transmembrane protein pore subunit through a first peptide linker;
  • the second component is a second protein formed by the second member optionally linked to the N-terminus or C-terminus of the polynucleotide binding protein via a second peptide linker.
  • the present application also provides a vector comprising the isolated nucleic acid molecule as described above.
  • the vector is a cloning vector or an expression vector.
  • the present application also provides a host cell comprising the isolated nucleic acid molecule as described above or the vector as described above.
  • the present application also provides methods of making a construct or said pore as described above.
  • a host cell containing a first nucleotide sequence encoding a first component and a second nucleotide sequence encoding a second component is cultured, the first component and the second component being as defined above, and the construct is recovered from the host cell culture; wherein the first component comprising a transmembrane protein pore subunit and the second component comprising a polynucleotide binding protein are capable of forming a complex linked by an isopeptide bond during expression and/or recovery;
  • the method of making pores as described above comprises the following steps:
  • the multimer in step (1) is an octamer formed by 1 first component and 7 other subunits required for pore formation.
  • the subunits in the first component are the same as the other subunits required for pore formation.
  • step (1) comprises the step of: culturing a host cell containing a first nucleotide sequence encoding the first component and a third nucleotide sequence encoding the other subunits required for pore formation, to obtain pores free of the polynucleotide binding protein.
  • step (1) further includes the step of purifying the polynucleotide-binding-protein-free pores.
  • the first component contains a first purification tag and the other subunits required for pore formation contain a second purification tag.
  • the first purification tag is different from the second purification tag, for example, the first purification tag is a His tag and the second purification tag is a Strep tag.
  • the polynucleotide-binding-protein-free pores are purified by means of the first purification tag and the second purification tag.
  • purification by means of the first purification tag and the second purification tag yields polynucleotide-binding-protein-free pores formed from 1 first component and 7 other subunits required for pore formation.
  • the method further includes the step of removing the first purification tag and/or the second purification tag.
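A rough way to reason about the 1:7 octamer is a binomial model. The sketch below assumes, purely for illustration, that the eight positions of a pore are filled independently and at random from a pool in which a given fraction of subunits is the first component; the application does not state this model, and the function names and the 1/8 fraction are hypothetical.

```python
from math import comb


def composition_probabilities(fraction_first_component, subunits=8):
    """Probability of each count k of first components in a pore of `subunits` positions.

    Assumes independent, random incorporation (an assumption of this sketch).
    """
    p = fraction_first_component
    return [
        comb(subunits, k) * p**k * (1 - p) ** (subunits - k)
        for k in range(subunits + 1)
    ]


def one_to_seven_share_after_two_tag_purification(probabilities):
    """Share of 1:7 pores among pores carrying at least one subunit of each kind.

    Purifying on both tags removes pores that lack either tag entirely,
    that is, the k = 0 and k = subunits compositions.
    """
    retained = sum(probabilities[1:-1])
    return probabilities[1] / retained


probabilities = composition_probabilities(1 / 8)
share = one_to_seven_share_after_two_tag_purification(probabilities)
```

Under these assumptions, lowering the fraction of first component in the pool raises the share of 1:7 pores among those retained by both tags, which is one reason the expression ratio of the two subunit types could matter in practice.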
  • the present application also provides a method for sequencing a target polynucleotide, comprising:
  • the polynucleotide to be tested is a double-stranded polynucleotide
  • the polynucleotide binding protein assists in separating the two strands to provide a single-stranded polynucleotide and controls the movement of the single-stranded polynucleotide through the pore
  • the double-stranded polynucleotide comprises at least one single-stranded overhang (such as a 5' end overhang and/or a 3' end overhang), and the single-stranded overhang contains a leader sequence that guides the nucleic acid strand to which it is attached into the pore.
  • the double stranded polynucleotide may contain gaps and/or non-terminal overhangs.
  • the leader sequence enters the pore generally following a field generated by the applied potential.
  • the leader sequence generally includes a polymer.
  • the polymer is preferably negatively charged.
  • the polymer is preferably a polynucleotide, such as DNA or RNA, a modified nucleotide (e.g., abasic DNA), PNA, LNA, polyethylene glycol (PEG) or a polypeptide.
  • the leader sequence comprises a single-stranded polynucleotide.
  • the leader sequence includes a single-stranded DNA sequence, such as a poly(dT) portion.
  • the leader sequence is 10 to 150 nucleotides in length, such as 20 to 150 nucleotides in length. Those skilled in the art can adjust the length of the leader sequence according to the pore used.
  • the polynucleotide binding protein controls movement of the single-stranded polynucleotide through the pore in a 5' to 3' direction
  • the double-stranded polynucleotide comprises at least one 5' end overhang, the 5' end overhang containing the leader sequence
  • "the single-stranded polynucleotide moves through the pore in the 5' to 3' direction" means that the nucleotide residues of the single-stranded polynucleotide pass through the pore sequentially from the 5' end to the 3' end.
  • the polynucleotide binding protein controls movement of the single-stranded polynucleotide through the pore in a 3' to 5' direction
  • the double-stranded polynucleotide comprises at least one 3' end overhang, the 3' end overhang containing the leader sequence
  • "the single-stranded polynucleotide moves through the pore in the 3' to 5' direction" means that the nucleotide residues of the single-stranded polynucleotide pass through the pore sequentially from the 3' end to the 5' end.
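The relationship between translocation direction and the order in which residues reach the pore can be sketched as follows. Strands are written 5' to 3' as plain strings; the function names, the direction labels, and the example strands with a poly(dT) leader are illustrative assumptions.

```python
def passage_order(strand_5_to_3, direction):
    """Return the residues in the order they pass through the pore.

    strand_5_to_3 is written 5' to 3'. direction is "5to3" or "3to5".
    """
    if direction == "5to3":
        return strand_5_to_3  # the 5' end enters first
    if direction == "3to5":
        return strand_5_to_3[::-1]  # the 3' end enters first
    raise ValueError("direction must be '5to3' or '3to5'")


def leader_end(direction):
    """Return which end must carry the leader so that it enters the pore first."""
    if direction == "5to3":
        return "5'"
    if direction == "3to5":
        return "3'"
    raise ValueError("direction must be '5to3' or '3to5'")


# Hypothetical strands: a poly(dT) leader followed or preceded by a short insert.
five_prime_leader = "TTTTTACGG"  # leader at the 5' end
three_prime_leader = "ACGGTTTTT"  # leader at the 3' end
```

In both cases the leader residues are the first to pass: `passage_order(five_prime_leader, "5to3")` and `passage_order(three_prime_leader, "3to5")` each begin with the run of T residues.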
  • the test polynucleotide is a linear double-stranded polynucleotide.
  • the linear double-stranded polynucleotide comprises at least one single-stranded overhang (e.g., a 5' overhang and/or a 3' overhang) at each of its two ends, the single-stranded overhang containing the leader sequence.
  • the linear double-stranded polynucleotide comprises at least one single-stranded overhang (e.g., a 5' overhang and/or a 3' overhang) at one end thereof, the single-stranded overhang comprising the leader sequence; and the linear double-stranded polynucleotide contains a bridging moiety at its other end, the bridging moiety covalently connecting the two single strands of the linear double-stranded polynucleotide.
  • the bridging moiety covalently links the ends of the two single strands of the linear double stranded polynucleotide.
  • the bridging moiety covalently links the 3' end (or 5' end) of one single strand of the linear double-stranded polynucleotide to the 5' end (or 3' end) of the other single strand.
  • the bridging moiety is a hairpin oligonucleotide adapter.
  • the linear double-stranded polynucleotide comprising the oligonucleotide adapter substantially behaves as a duplex comprising a hairpin structure formed by one oligonucleotide strand.
  • the test polynucleotide is a circular double-stranded polynucleotide.
  • the circular double-stranded polynucleotide is obtained by performing rolling circle amplification with a primer containing the leader sequence and a carrier sequence as a template.
  • the amplification process and the sequencing process are performed in the same system.
  • the circular double-stranded polynucleotide is separated into two single strands under the action of the polynucleotide binding protein; one single strand passes through the pore, and the other strand serves as a template to generate a new circular double-stranded polynucleotide.
  • the polynucleotide to be tested is a single-stranded polynucleotide
  • the polynucleotide binding protein controls the movement of the single-stranded polynucleotide through the pore; wherein the single-stranded polynucleotide comprises, at at least one of its two ends, a leader sequence that guides the nucleic acid strand to which it is attached into the pore.
  • the polynucleotide binding protein controls movement of the single-stranded polynucleotide through the pore in a 5' to 3' direction, and the single-stranded polynucleotide contains the leader sequence at least at its 5' end.
  • "the single-stranded polynucleotide moves through the pore in the 5' to 3' direction" means that the nucleotide residues of the single-stranded polynucleotide pass through the pore sequentially from the 5' end to the 3' end.
  • the polynucleotide binding protein controls movement of the single-stranded polynucleotide through the pore in a 3' to 5' direction, and the single-stranded polynucleotide contains the leader sequence at least at its 3' end.
  • "the single-stranded polynucleotide moves through the pore in the 3' to 5' direction" means that the nucleotide residues of the single-stranded polynucleotide pass through the pore sequentially from the 3' end to the 5' end.
  • the test polynucleotide is coupled to or near the outer edge of the pore.
  • one or more tethers are conjugated to or near the outer edge of the pore, the tethers comprising a capture sequence having sequence complementarity to a portion of the polynucleotide to be detected.
  • the test polynucleotide is coupled to or near the outer edge of the pore by hybridization of the capture sequence to a complementary region in the test polynucleotide.
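Capture by hybridization comes down to finding a region of the test polynucleotide that is the reverse complement of the capture sequence. The sketch below assumes DNA written 5' to 3' with only A, C, G and T, and requires a perfect match; the function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def capture_site(target_5_to_3, capture_5_to_3):
    """Return the index in the target where the capture sequence can hybridize, or -1.

    Only a perfect, full-length match is considered (a simplification of this sketch).
    """
    return target_5_to_3.find(reverse_complement(capture_5_to_3))


# Hypothetical sequences: a poly(dT) leader followed by a capture-complementary region.
capture = "GATTACA"
target = "TTTTTTGTAATCGGC"
site = capture_site(target, capture)
```

A designer choosing a specific capture sequence would place its complement in the overhang or adjacent strand of the test polynucleotide; a non-specific capture sequence would instead target a region common to all library molecules, such as part of the adapter.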
  • the portion of the polynucleotide to be detected comprises a part of the single-stranded overhang or is identical to the single-stranded overhang.
  • the portion of the polynucleotide to be tested includes a portion of the single-stranded overhang.
  • a portion of the single-stranded overhang is located 5' or 3' to the leader sequence.
  • a portion of said single-stranded overhang passes through said pore after said leader sequence.
  • the portion of the polynucleotide to be tested includes a portion of a strand aligned with the single-stranded overhang.
  • the "strand aligned with the single-stranded overhang" refers to (i) the oligonucleotide strand on which the single-stranded overhang is located, or (ii) where an oligonucleotide strand hybridizes to that same strand, the hybridized strand.
  • when the test polynucleotide is a single-stranded polynucleotide as defined above, the portion of the test polynucleotide comprises a portion of the single-stranded polynucleotide.
  • one or more tethers are conjugated to or near the outer edge of the pore, the tethers comprising a capture sequence, and the capture sequence is linked to the polynucleotide to be detected through a linker, wherein the linker contains a first region complementary to the capture sequence and a second region having sequence complementarity to the portion of the polynucleotide to be detected.
  • the portion of the polynucleotide to be detected comprises a part of the single-stranded overhang or is identical to the single-stranded overhang.
  • the portion of the polynucleotide to be tested includes a portion of the single-stranded overhang.
  • a portion of the single-stranded overhang is located 5' or 3' to the leader sequence.
  • a portion of the single-stranded overhang passes through the pore after the leader sequence.
  • the portion of the polynucleotide to be tested includes a portion of a strand aligned with the single-stranded overhang.
  • when the polynucleotide to be tested is a single-stranded polynucleotide as defined above, the portion of the polynucleotide to be tested comprises a portion of the single-stranded polynucleotide.
  • the pores are disposed in the membrane.
  • the membrane is an amphiphilic layer, such as a lipid bilayer (e.g., a phospholipid bilayer) or a polymer membrane (e.g., di-block or tri-block).
  • the tether is covalently or non-covalently attached to the pore or to a portion of the membrane near the pore.
  • the tether is anchored to a portion of the membrane near the pore by a cholesterol molecule.
  • the cholesterol molecule is linked to the capture sequence of the tether by a spacer molecule.
  • the spacer molecule consists of 4×iSp18.
  • the tether is used to enrich the polynucleotide to be tested near the pore, specifically or non-specifically. Those skilled in the art can therefore design the length, number and capture sequence (specific or non-specific) of the tethers as needed; this application places no limitation on them.
  • step (b) and step (c) are performed in solution.
  • the solution is ionic and the property measured is the ionic current through the nanopore.
  • step (b) a potential difference across the nanopore is provided to allow single stranded nucleic acid to enter said nanopore.
  • the present application also provides a device for sequencing a target polynucleotide, comprising: (i) a membrane; and (ii) a plurality of transmembrane protein pores in the membrane, the transmembrane protein pores being selected from the pores described above;
  • the device further comprises instructions for performing the method of sequencing a target polynucleotide as described above;
  • the membrane is an amphiphilic layer, such as a lipid bilayer (e.g., a phospholipid bilayer) or a polymer membrane (e.g., di-block or tri-block).
  • the device does not comprise a polynucleotide binding protein provided separately from (ii).
  • the present application also provides a kit comprising the above-mentioned construct, pore, isolated nucleic acid molecule, vector, and/or host cell.
  • the kit further comprises reagents for nanopore sequencing.
  • the present application also provides the use of the above-mentioned construct, pore, isolated nucleic acid molecule, vector, host cell, device, or kit for polynucleotide sequencing.
  • as used herein, the terms "anneal", "annealing", "hybridize" and "hybridizing" and the like refer to the formation of a complex between nucleotide sequences that have sufficient complementarity to form a complex via Watson-Crick base pairing.
  • for the purposes of the present invention, nucleic acid sequences that are "complementary to" each other, or that "hybridize" or "anneal" to each other, should be able to form, or do form, a "hybrid" or "complex" that is sufficiently stable to serve the intended purpose.
  • it is not required that every nucleic acid base within the sequence displayed by one nucleic acid molecule be capable of base pairing or complexing with every nucleic acid base within the sequence displayed by a second nucleic acid molecule in order for the two nucleic acid molecules, or the corresponding sequences displayed therein, to be "complementary" to, or to "anneal" or "hybridize" with, each other.
  • the terms “complementary” or “complementarity” are used when referring to a sequence of nucleotides related by the base pairing rules. For example, the sequence 5'-A-G-T-3' is complementary to the sequence 3'-T-C-A-5'. Complementarity can be “partial,” wherein only some of the nucleic acid bases match according to the base pairing rules. Alternatively, there may be “perfect” or “total” complementarity between nucleic acids.
  • as used herein, the term "polynucleotide", such as a nucleic acid, refers to a macromolecule containing two or more nucleotides.
  • a polynucleotide or nucleic acid may comprise any combination of nucleotides.
  • the nucleotides may be naturally occurring or synthetic.
  • the nucleotides may be modified or unmodified. In certain embodiments, the nucleotides are unmodified.
  • the sequencing library constructed by the nanopore sequencing method of the present invention does not contain motor protein, has a longer storage period, and greatly reduces the requirements for library construction conditions (for example, there is no need to consider whether the motor protein is denatured when building a library).
  • the sequencing method of the present invention does not need to use spacers, which can reduce costs.
  • the sequencing starts immediately after the sequencing library is captured by the transmembrane protein, which can improve the detection efficiency.
  • in the prior art, the presence of spacers (usually specially modified nucleotides) in the library places many restrictions on how the library can be constructed.
  • for example, sequencing libraries cannot be constructed by conventional nucleic acid amplification alone.
  • the nanopore sequencing method provided by the present invention does not need spacers. More library construction methods can therefore be used, and the method combines more readily with other techniques, such as rolling circle amplification, when constructing sequencing libraries.
  • because the method of the present invention, as mentioned above, is more tolerant of library structure and construction method, in some embodiments the sequencing process and the library construction process can be carried out in the same system, which further saves cost and improves efficiency.
  • a motor protein may have multiple binding sites on the nucleic acid to be tested, so an adapter may carry different numbers of motor proteins, leading to differences in performance such as sequencing speed. If the library molecules contain motor proteins and spacers, the adapter complex generally has to be purified to obtain a single adapter-complex species.
  • the sequencing library without motor proteins and spacers provided by the present invention can well avoid this problem.
  • Figure 1 shows a schematic diagram of the core construction of the expression vectors pET.28a(+)-6*His-Spytag003-DDA, pET.28a(+)-6*His-Spycatcher003-MspA and pET.21a(+)-strep-MspA.
  • FIG. 2 shows a schematic structural diagram of the motor protein-porin complex prepared in Example 1.
  • Fig. 3 shows the SDS-PAGE gel image of complexes of Spycatcher003-MspA and MspA in different ratios after ion-exchange purification in Example 1; so as not to disrupt the oligomeric state of MspA, 5× non-denaturing, non-reducing protein loading buffer (P0016N, Beyotime) was added when preparing the samples.
  • Fig. 4 shows the size-exclusion chromatography (molecular sieve) elution profile obtained after the motor protein Spytag003-DDA was reconstituted with the porin Spycatcher003-MspA(MspA)7 in Example 1.
  • Fig. 5 shows the SDS-PAGE gel image of the reconstitution of the motor protein Spytag003-DDA with the porin Spycatcher003-MspA(MspA)7 in Example 1; lane 1 shows the motor protein Spytag003-DDA alone. Lane 2 shows the porin Spycatcher003-MspA(MspA)7 alone. Lane 3 shows the mixed sample after incubation of Spycatcher003-MspA(MspA)7 with Spytag003-DDA, before size-exclusion chromatography. Lane 4 shows the protein ladder (26616, Thermo Scientific).
  • Lanes 5-9 show the protein in the elution peak of the Spycatcher003-MspA(MspA)7/Spytag003-DDA complex after size-exclusion chromatography, corresponding to the first peak in Figure 4; lane 5 shows a single complex band.
  • FIG. 6 shows the distribution of open-pore currents at different voltages (0.05 V, 0.1 V, 0.14 V, 0.18 V) when the complex is inserted into the membrane in Example 2.
  • Figure 7 is a schematic diagram of the structure of the double-stranded polynucleotide sequencing library constructed in Example 2 after binding to the tether, where 1 is the leader sequence, which includes the motor protein binding region, 2 is the double-stranded DNA portion (containing the DNA fragment to be tested), and 3 is the tether.
  • FIG. 8 shows the electrical signal recorded in Example 2 as the constructed double-stranded polynucleotide sequencing library is unwound into single strands by Spytag003-DDA and translocated through the Spycatcher003-MspA(MspA)7 nanopore.
  • FIG. 8A shows a current amplitude diagram of a sequencing event
  • FIG. 8B is an enlarged diagram of a section of the signal in FIG. 8A .
  • Cholesterol is attached to the 5' end of the polynucleotide through 4×iSp18, where "iSp18" is an 18-atom hexa-ethyleneglycol spacer; its structure is shown in Formula I:
  • Example 1: Expression and purification of the Spytag003-containing motor protein DDA and the Spycatcher003-tagged MspA pore octamer
  • each gene fragment was designed and synthesized.
  • 6*His-Spytag003-DDA and 6*His-Spycatcher003-MspA were each cloned into the ampicillin-resistant pET.28a(+) vector to obtain the recombinant plasmids pET.28a(+)-6*His-Spytag003-DDA and pET.28a(+)-6*His-Spycatcher003-MspA.
  • Strep-MspA was cloned into the kanamycin-resistant pET.21a(+) vector to obtain the recombinant plasmid pET.21a(+)-strep-MspA.
  • the bacterial cells obtained in step 2(1) above were lysed by high-pressure homogenization and purified by nickel column, ion-exchange column, size-exclusion chromatography and other methods to give Spytag003-DDA protein of good purity.
  • the present invention obtains an MspA octamer fused to only one molecule of Spycatcher003 by the following method.
  • the complexes were eluted sequentially from a nickel column with an imidazole gradient (25 mM-250 mM imidazole) and from an ion-exchange column with a salt gradient (Buffer A: 100 mM; Buffer B: 1000 mM NaCl), separating the MspA octamer formed by Spycatcher003-MspA and MspA in a ratio of 1:7, as shown in Figure 3; the Spycatcher003-tagged MspA octamer was thus obtained.
  • Example 2 Constructing a nanopore biosensor and applying it to sequencing
  • the DNA to be tested (2 μg) was end-repaired: the reagents in Table 2 were mixed and incubated in a PCR instrument at 20°C for 10 min and then 65°C for 10 min, followed by magnetic bead purification.
  • the adapter was formed by annealing the adapter top strand (SEQ ID 10) and bottom strand (SEQ ID 11).
  • Figure 6 shows the open-pore current of a single nanopore in buffer (0.3 M KCl, 25 mM HEPES, 1 mM EDTA, 5 mM ATP, 25 mM MgCl2): ~35 pA at 50 mV, ~75 pA at 100 mV, ~110 pA at 140 mV and ~150 pA at 180 mV.
  • FIG. 8A shows the current amplitude map of the sequencing event
  • FIG. 8B is an enlarged view of the sequencing event in FIG. 8A.
  • the results show that the motor protein in the complex is able to separate the two strands of the double-stranded polynucleotide and control the movement of the single-stranded polynucleotide through the pore formed by the complex, and complete the sequence determination of the polynucleotide.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Nanotechnology (AREA)
  • Hematology (AREA)
  • Urology & Nephrology (AREA)
  • Food Science & Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

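A method for sequencing polynucleotides using a nanopore; specifically, a method for sequencing polynucleotides in which a polynucleotide binding protein is covalently bound to the nanopore.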

Description

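Single-molecule nanopore sequencing method
Technical Field
The present invention relates to methods for sequencing polynucleotides using nanopores; specifically, it relates to methods for sequencing polynucleotides in which a polynucleotide binding protein is covalently bound to the nanopore.
Background
Second-generation sequencing (next generation sequencing, NGS) laid the foundation for the development of genomics and has accelerated basic research and clinical medicine. However, NGS still has a number of problems, for example: expensive instruments, high memory requirements and costs for data processing, and short read lengths.
Third-generation nanopore single-molecule sequencing technology has since been developed. Nanopore sequencing is fast, gives ultra-long reads, and requires no polymerase amplification. Sequencing speed can reach several hundred bases per second, and a single read can span tens of kilobases, which allows more contiguous and more complete genome assemblies. The principle of nanopore sequencing is as follows: a transmembrane protein with a nanometer-scale diameter is inserted into a biomimetic membrane, forming a nanopore channel in the membrane; two electrodes are placed on either side of the membrane; when a voltage is applied, ions pass through the transmembrane protein and generate a current. Driven by the electric field, the analyte passes through the transmembrane protein and causes a characteristic change in the ion flow, altering the magnitude of the current (the blockade current). Different analytes produce blockade currents of different magnitudes.
In typical nanopore sequencing, once the sequencing library is captured by the nanopore, the motor protein slides past the spacer under the electric field force and sequencing begins. Because an adapter-motor protein complex is used, with a spacer in the adapter stalling the motor protein, the sequencing library cannot be stored for long after construction owing to the limited stability and shelf life of the motor protein. In addition, the spacer is composed of multiple modified nucleotides, which cost more than conventional nucleotides.
There is therefore a need for improved nanopore single-molecule sequencing methods with reduced cost, improved accuracy and/or higher efficiency.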
发明内容
第三代纳米孔单分子测序技术包括构建文库和测序,目前典型的方法将接头-马达蛋白复合物与待测目的片段进行连接以构建测序文库。其中,接头呈”Y”字形,分为上链和下链。上链包含以下三个部分(a)引导序列,在电场力作用下可引导测序文库更快移动至孔道附近;(b)间隔器,可阻止马达蛋白沿着单链多核苷酸移动并解旋双链多核苷酸;(c)与下链部分序列互补的区域,在3‘端含有一个胸腺嘧啶脱氧核苷酸(dT)的单链多核苷酸。下链包括两个部分(i)与上链互补的区域和(ii)与拘束器(tether)序列互补 的区域。拘束器在一端含有胆固醇分子,可以锚定仿生膜。在随后的测序中,拘束器与仿生膜孵育,使其锚定。测序文库可与拘束器进行结合。施加一定的电压后,引导序列区域被跨膜蛋白所捕获;在电场力作用下马达蛋白移动穿过间隔器。在测序过程中,双链多核苷酸在马达蛋白作用下解旋为单链多核苷酸,单链多核苷酸在电场力作用下穿过固定在仿生膜的跨膜蛋白并与之发生相互作用,引起离子流的特征变化。
DNA在通过跨膜蛋白时速度较快,产生的电信号分辨率较差。马达蛋白可大大降低DNA在通过跨膜蛋白时的速度,并且马达蛋白每次只水解一个核苷酸。间隔器的作用是在测序缓冲液中阻止马达蛋白沿着单链多核苷酸移动和解旋双链多核苷酸;测序文库被跨膜蛋白捕获后在电场力作用下使阻滞失效。
然而,由于采用了接头-马达蛋白复合物,并通过接头中的间隔器阻滞马达蛋白移动,这导致了建库后由于马达蛋白的稳定性和保存周期问题,测序文库无法长期保存。并且,间隔器由多个修饰核苷酸组成,较常规核苷酸成本高。
本申请提供了改进的纳米孔单分子测序方法,由此提供了以下方面。
构建体
因此,在一方面,本申请提供了一种包含跨膜蛋白孔亚基和多核苷酸结合蛋白的构建体,其中,所述亚基保留其形成孔的能力,所述多核苷酸结合蛋白能够分开双链多核苷酸的两条链和/或控制单链多核苷酸移动穿过所述孔;并且,所述亚基与所述多核苷酸结合蛋白共价结合。
在某些实施方式中,所述共价结合不是通过核酸的翻译产生的肽键。
在某些实施方式中,所述共价结合通过异肽键实现。
在某些实施方式中,所述共价结合通过化学连接(例如,点击化学)产生。
在某些实施方案中,所述亚基与所述多核苷酸结合蛋白通过异肽键结合。
特别地,本申请发明人发现,本申请提供的跨膜蛋白和多核苷酸结合蛋白通过异肽键进行连接,较其他连接方式有较大的优势。
例如,当采用化学连接方式形成跨膜蛋白-多核苷酸结合蛋白复合物后,再将复合物插入膜中形成单通道纳米孔这种方式时,发明人发现跨膜蛋白或多核苷酸结合蛋白需要针对该化学连接方式进行改造,增加复杂度和难度;同时,化学连接反应完成后,需要进一步纯化以获得跨膜蛋白-多核苷酸结合蛋白复合物,去除化学反应所需的试剂。化学反应的试剂会对跨膜蛋白或多核苷酸结合蛋白活性不友好。同时,如果不纯化,这些试剂往往会影响后续进一步实验,例如:部分化合物可能在电场力作用下穿过跨膜蛋白, 对电流检测信号存在干扰;或部分化合物对后续实验磷脂膜的稳定性存在影响,造成单通道纳米孔失效。
另一方面,当采用将改造后的孔蛋白插入磷脂双分子层膜上,形成单通道纳米孔,然后加入多核苷酸结合蛋白,孵育后通过化学连接形成跨膜蛋白-多核苷酸结合蛋白复合物这种方式时,发明人发现所述跨膜蛋白与多核苷酸结合蛋白的连接需要在纳米孔检测反应池中进行,并需要在膜两侧施加适当的电压,用于检测是否形成跨膜蛋白-多核苷酸结合蛋白复合物。如果在测序用化学反应池中进行化学连接反应,添加实现化学连接的添加物或必要成分,可能对磷脂膜的稳定性造成影响或对检测是否形成跨膜蛋白-多核苷酸结合蛋白复合物造成影响,从而影响后续进一步检测。
而本发明中,异肽键的连接无需额外添加物质,对后续进一步实验的测序试剂或检测试剂无干扰因素。
在某些实施方案中,所述异肽键由蛋白质-蛋白质结合对形成,所述蛋白质-蛋白质结合对由第一成员(例如第一肽标签)和第二成员(例如第二肽标签)组成,其中所述第一成员和第二成员由异肽键结合,并且,所述第一成员任选地通过第一接头(例如刚性或柔性接头,如包含一个或多个甘氨酸和/或一个或多个丝氨酸的肽接头)连接于所述跨膜蛋白孔亚基的N末端或C末端以形成第一组分,所述第二成员任选地通过第二接头(例如刚性或柔性接头,如包含一个或多个甘氨酸和/或一个或多个丝氨酸的肽接头)连接于所述多核苷酸结合蛋白的N末端或C末端以形成第二组分。
在某些实施方案中,所述第一接头为具有如SEQ ID NO:15所示的氨基酸序列的肽接头。
在某些实施方案中,所述第二接头为具有如SEQ ID NO:14所示的氨基酸序列的肽接头。
在某些实施方案中,所述蛋白质-蛋白质结合对选自:SpyCatcher/SpyTag对、SpyTag002/SpyCatcher002对、SpyTag003/SpyCatcher003对、isopeptag-N/pilin-N对,isopeptag/pilin-C对,SnoopTag/SnoopCatcher对。
在某些实施方案中,所述异肽键由SpyTag003/SpyCatcher003对形成。
在某些实施方案中,所述SpyCatcher003连接至所述亚基的N端。
在某些实施方案中,所述SpyTag003连接至所述多核苷酸结合蛋白的N端。
在某些实施方案中,所述SpyCatcher003具有如SEQ ID NO:4所示的氨基酸序列。
在某些实施方案中,所述SpyTag003具有如SEQ ID NO:1所示的氨基酸序列。
在某些实施方案中,所述亚基选自源于hemolysin,MspA,Frac,ClyA,PA63,CsgG,GspD,XcpQ,Wza,SP1,Phi29 connector,SPP1 connector,T3 connector,T4 connector,T7 connector,钾离子离子通道蛋白,钠离子离子通道蛋白,钙离子离子通道蛋白的亚基。
在某些实施方案中,所述亚基还连接有另外的多肽,另外的多肽选自标签、酶切位点、信号肽或导肽、可检测的标记,或其任何组合。
在某些实施方案中,所述亚基具有如SEQ ID NO:3或17所示的氨基酸序列。此处所示序列在其N端不包含起始密码子(如ATG)编码的氨基酸(如甲硫氨酸(Met))。本领域技术人员理解,在通过基因工程制备蛋白的过程中,由于起始密码子的作用,所产生的多肽链第一位经常为起始密码子编码的氨基酸(如Met)。本发明的亚基不仅囊括在其N末端不包含起始密码子编码的氨基酸(如Met)的氨基酸序列,也囊括在其N末端包含起始密码子编码的氨基酸(如Met)的氨基酸序列。因此,在上述氨基酸序列的N端进一步包含起始密码子编码的氨基酸(如Met)的序列也在本发明的保护范围内。
在某些实施方案中,所述多核苷酸结合蛋白选自核酸解旋酶,例如DNA解旋酶或RNA解旋酶。
在某些实施方案中,所述多核苷酸结合蛋白选自Dda,UvrD,Rep,RecQ,PcrA,eIF4A,NS3,Rep,gp41或T7gp4。
在某些实施方案中,所述多核苷酸结合蛋白还连接有另外的多肽,另外的多肽选自标签、酶切位点、信号肽或导肽、可检测的标记,或其任何组合。
在某些实施方案中,所述多核苷酸结合蛋白具有如SEQ ID NO:2或16所示的氨基酸序列。
用于多核苷酸测序的孔
在另一方面,本申请还提供了用于多核苷酸测序的孔,其包含至少一个如上所述的构建体。
在某些实施方案中,所述孔是跨膜蛋白孔。
在某些实施方案中,所述孔包含至少一个如上所述的构建体以及形成孔所需的其他亚基。
在某些实施方案中,所述孔包含数量足够的形成孔所需的其他亚基。
在某些实施方案中,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同或不相同。
在某些实施方案中,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同,或者,所述形成孔所需的其他亚基为所述构建体中的跨膜蛋白孔亚基的旁系同源物、同源物或其变体。
在某些实施方案中,所述孔包含一个如上所述的构建体以及形成孔所需的其他亚基。
在某些实施方案中,所述孔包含数量足够的形成孔所需的其他亚基。
在某些实施方案中,所述孔由1个所述构建体以及7个形成孔所需的其他亚基形成。
在某些实施方案中,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同或不相同。
在某些实施方案中,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同,或者,所述形成孔所需的其他亚基为所述构建体中的跨膜蛋白孔亚基的旁系同源物、同源物或其变体。
分离的核酸
在另一方面,本申请还提供了分离的核酸,其编码如上所述的构建体。
在某些实施方案中,所述分离的核酸分子包含编码所述第一组分的第一核苷酸序列以及编码所述第二组分的第二核苷酸序列,所述第一组分和第二组分如上文中定义,所述第一核苷酸序列和第二核苷酸序列存在于相同或不同的分离的核酸分子上。
在某些实施方案中,所述第一组分为所述第一成员任选地通过第一肽接头连接于所述跨膜蛋白孔亚基的N末端或C末端形成的第一蛋白;所述第二组分为所述第二成员任选地通过第二肽接头连接于所述多核苷酸结合蛋白的N末端或C末端形成的第二蛋白。
载体
在另一方面,本申请还提供了载体,其包含如上所述的分离的核酸分子。在某些实施方案中,所述载体为克隆载体或表达载体。
宿主细胞
在另一方面,本申请还提供了宿主细胞,其包含如上所述的分离的核酸分子或如上所述的载体。
制备方法
在某些实施方案中,所述第一组分为所述第一成员任选地通过第一肽接头连接于所 述跨膜蛋白孔亚基的N末端或C末端形成的第一蛋白;所述第二组分为所述第二成员任选地通过第二肽接头连接于所述多核苷酸结合蛋白的N末端或C末端形成的第二蛋白。
在这种实施方案中,本申请还提供了制备如上所述的构建体或所述孔的方法。
制备如上所述的构建体的方法包括:
在允许蛋白表达的条件下,培养含有编码第一组分的第一核苷酸序列以及编码第二组分的第二核苷酸序列的宿主细胞,所述第一组分和第二组分如上文中定义,和,从培养的宿主细胞培养物中回收所述构建体;其中,所述包含跨膜蛋白孔亚基的第一组分和所述包含多核苷酸结合蛋白的第二组分在表达和/或回收过程中能形成异肽键连接的复合物;
或,
在允许蛋白表达的条件下,分别培养含有编码第一组分的第一核苷酸序列的第一宿主细胞、以及含有编码第二组分的第二核苷酸序列的第二宿主细胞,以分别获得包含跨膜蛋白孔亚基的第一组分、以及包含多核苷酸结合蛋白的第二组分,所述第一组分和第二组分如上文中定义;将获得的所述第一组分和所述第二组分孵育,从而形成由异肽键连接的复合物。
制备如上所述的孔的方法包括以下步骤:
(1)使第一组分与形成孔所需的其他亚基接触以形成多聚体,所述多聚体为不含多核苷酸结合蛋白的孔,其中,所述第一组分如上文中定义,所述形成孔所需的其他亚基如上文中定义;
(2)将所述多聚体与第二组分接触,所述第二组分如上文中定义,使得所述第二组分与所述多聚体中的第一组分形成异肽键连接,以形成连接多核苷酸结合蛋白的孔。
在某些实施方案中,步骤(1)中所述多聚体是由1个第一组分与7个形成孔所需的其他亚基形成的八聚体。
在某些实施方案中,所述第一组分中的亚基与所述形成孔所需的其他亚基相同。
在某些实施方案中,步骤(1)包括以下步骤:培养含有编码第一组分的第一核苷酸序列和编码形成孔所需的其他亚基的第三核苷酸序列的宿主细胞,以获得不含多核苷酸结合蛋白的孔。
在某些实施方案中,步骤(1)还包括纯化所述不含多核苷酸结合蛋白的孔的步骤。
在某些实施方案中,所述第一组分含有第一纯化标签,所述形成孔所需的其他亚基含有第二纯化标签。
在某些实施方案中,所述第一纯化标签与所述第二纯化标签不同,例如,所述第一纯化标签为His标签,所述第二纯化标签为Strep标签。
在某些实施方案中,所述方法步骤(1)中,通过所述第一纯化标签和所述第二纯化标签纯化所述不含多核苷酸结合蛋白的孔。在某些实施方案中,所述方法步骤(1)中,通过所述第一纯化标签和所述第二纯化标签纯化由1个第一组分与7个形成孔所需的其他亚基形成的所述不含多核苷酸结合蛋白的孔。
在某些实施方案中,所述方法在步骤(1)之后,还包括去除所述第一纯化标签和/或所述第二纯化标签的步骤。
测序方法
在另一方面,本申请还提供了对靶多核苷酸进行测序的方法,其包括:
(a)提供如上所述的孔,以及待测多核苷酸;
(b)将所述待测多核苷酸与所述孔接触,使得所述孔的多核苷酸结合蛋白与所述待测多核苷酸结合,所述多核苷酸结合蛋白控制与其结合的单链多核苷酸穿过所述孔;和,
(c)在所述单链多核苷酸相对于所述孔移动时获取一个或多个测量值,其中所述测量值可用于表示所述待测多核苷酸的序列信息。
在某些实施方案中,在步骤(a)中,所述待测多核苷酸为双链多核苷酸,所述多核苷酸结合蛋白帮助分开所述两条链以提供单链多核苷酸,并控制所述单链多核苷酸移动通过所述孔;所述双链多核苷酸包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有前导序列,所述前导序列引领与其衔接的核酸链进入所述孔。
在某些实施方案中,所述双链多核苷酸可含有缺口和/或在非末端的悬突。
在某些实施方案中,所述前导序列通常顺着所施加电势产生的场进入所述孔中。
在某些实施方案中,所述前导序列通常包括聚合物。该聚合物优选带负电荷。聚合物优选为多核苷酸,例如DNA或RNA,修饰的核苷酸(例如无碱基的DNA),PNA,LNA,聚乙二醇(PEG)或多肽。
在某些优选的实施方案中,所述前导序列包含单链多核苷酸。在某些实施方案中,所述前导序列包括单链DNA序列,如聚dT(polydT)部分。在某些优选的实施方案中,所述前导序列为10至150个核苷酸长度,例如20至150个核苷酸长度。本领域技术人员可根据使用的所述孔调整所述前导序列的长度。
在某些实施方案中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿5’端至3’端的方向移动通过所述孔,并且,所述双链多核苷酸包含至少一个5’端悬突,所述5’端悬突 含有所述前导序列。
本领域技术人员易于理解,“单链多核苷酸沿5’端至3’端的方向移动通过所述孔”是指,所述单链多核苷酸中从5’端至3’端的核苷酸残基依次通过所述孔。
在某些实施方案中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿3’端至5’端的方向移动通过所述孔,并且,所述双链多核苷酸包含至少一个3’端悬突,所述3’端悬突含有所述前导序列。
本领域技术人员易于理解,“单链多核苷酸沿3’端至5’端的方向移动通过所述孔”是指,所述单链多核苷酸中从3’端至5’端的核苷酸残基依次通过所述孔。
在某些实施方案中,所述待测多核苷酸为线性双链多核苷酸。
在某些实施方案中,所述线性双链多核苷酸在其两个末端各包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有所述前导序列。
在某些实施方案中,所述线性双链多核苷酸在其一个末端包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有所述前导序列;并且,所述线性双链多核苷酸在其另一个末端含有桥连部分,所述桥连部分共价连接所述线性双链多核苷酸的两条单链。在某些实施方案中,所述桥连部分共价连接所述线性双链多核苷酸的两条单链的末端。例如,所述桥连部分将所述线性双链多核苷酸的其中一条单链的3’端(或5’端)与另一条单链的5’端(或3’端)共价连接。
在某些实施方案中,所述桥连部分为发夹类寡核苷酸衔接子。在某些实施方案中,所述含有所述寡核苷酸衔接子的线性双链多核苷酸在实质上表现为为由一条寡核苷酸链形成的含有发夹结构的双链体。
在某些实施方案中,所述待测多核苷酸为环状双链多核苷酸。
在某些实施方案中,所述环状双链多核苷酸由含有所述前导序列的引物以载体序列为模板进行滚环扩增得到。
在某些实施方案中,所述扩增过程与测序过程在同一体系中进行。例如,所述环状双链多核苷酸在所述多核苷酸结合蛋白作用下被分开为两条单链,其中一条单链通过所述孔,另一条链作为模板生成新的环状双链多核苷酸。
在某些实施方案中,在步骤(a)中,所述待测多核苷酸为单链多核苷酸,所述多核苷酸结合蛋白控制所述单链多核苷酸移动通过所述孔;其中,所述单链多核苷酸在其两个末端中的至少一个包含前导序列,所述前导序列引导与其衔接的核酸链进入孔。在某些实施方案中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿5’端至3’端的方向移动 通过所述孔,并且,所述单链多核苷酸至少在其5’端包含所述前导序列。
本领域技术人员易于理解,“单链多核苷酸沿5’端至3’端的方向移动通过所述孔”是指,所述单链多核苷酸中从5’端至3’端的核苷酸残基依次通过所述孔。
在某些实施方案中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿3’端至5’端的方向移动通过所述孔,并且,所述单链多核苷酸至少在其3’端包含所述前导序列。
本领域技术人员易于理解,“单链多核苷酸沿3’端至5’端的方向移动通过所述孔”是指,所述单链多核苷酸中从3’端至5’端的核苷酸残基依次通过所述孔。
在某些实施方案中,所述待测多核苷酸偶联至所述孔的外缘或附近。
在某些实施方案中,所述孔的外缘或附近缀合一个或多个拘束器(tether),所述拘束器包含与所述待测多核苷酸的部分具有序列互补性的捕获序列,通过捕获序列与待测多核苷酸中互补区域的杂交,使所述待测多核苷酸偶联至所述孔的外缘或附近。
在某些实施方案中,当所述待测多核苷酸为如上文中定义的双链多核苷酸时,所述待测多核苷酸的部分包括所述单链悬突的一部分或者与所述单链悬突对准的链的一部分。在某些实施方案中,所述待测多核苷酸的部分包括所述单链悬突的一部分。在某些实施方案中,所述单链悬突的一部分位于所述前导序列的5’端或3’端。在某些优选的实施方案中,所述单链悬突的一部分在所述前导序列之后通过所述孔。在某些实施方案中,所述待测多核苷酸的部分包括与所述单链悬突对准的链的一部分。
如本文所用,术语“与所述单链悬突对准的链”是指与(i)所述单链悬突所在的寡核苷酸链,或,(ii)与所述单链悬突所在的寡核苷酸链杂交于同一条链的寡核酸链,相杂交的链。
在某些实施方案中,当所述待测多核苷酸为如上文中定义的单链多核苷酸时,所述待测多核苷酸的部分包括所述单链多核苷酸的一部分。
在某些实施方案中,所述孔的外缘或附近缀合一个或多个拘束器(tether),所述拘束器含有捕获序列,所述捕获序列通过连接体与所述待测多核苷酸连接;其中,所述连接体含有与所述捕获序列互补的第一区域,以及,与所述待测多核苷酸的部分具有序列互补性的第二区域。
在某些实施方案中,当所述待测多核苷酸为如上文中定义的双链多核苷酸时,所述待测多核苷酸的部分包括所述单链悬突的一部分或者与所述单链悬突对准的链的一部分。在某些实施方案中,所述待测多核苷酸的部分包括所述单链悬突的一部分。在某些实施方案中,所述单链悬突的一部分位于所述前导序列的5’端或3’端。在某些优选的实施方 案中,所述单链悬突的一部分在所述前导序列之后通过所述孔。在某些实施方案中,所述待测多核苷酸的部分包括与所述单链悬突对准的链的一部分。
在某些实施方案中,当所述待测多核苷酸为如上文中定义的单链多核苷酸时,所述待测多核苷酸的部分包括所述单链多核苷酸的一部分。
在某些实施方案中,所述孔置于膜中。
在某些实施方案中,所述膜是两亲性层,例如脂质双层(例如磷脂双分子层)或高分子聚合物膜(例如di-block、tri-block)。
在某些实施方案中,所述拘束器与所述孔或所述膜在所述孔附近的一部分共价或非共价连接。
在某些实施方案中,所述拘束器通过胆固醇分子锚定于所述膜在所述孔附近的一部分。
在某些实施方案中,所述胆固醇分子通过间隔分子连接至拘束器的所述捕获序列。在某些实施方案中,所述间隔分子由4×iSp18组成。
易于理解,所述拘束器用于在所述孔附近特异性或非特异性富集所述待测多核苷酸。因此,本领域技术人员可根据需要设计所述拘束器的长度、数量以及捕获序列(特异性或非特异性捕获序列),本申请对此不做限制。
在某些实施方案中,步骤(b)和步骤(c)在溶液中进行。
在某些实施方案中,所述溶液是离子的并且所测量的性质是通过所述纳米孔的离子电流。
在某些实施方案中,步骤(b)中,提供纳米孔上的电位差以便容许单链核酸进入所述纳米孔。
装置
在另一方面,本申请还提供了用于对靶多核苷酸进行测序的装置,包含:(i)膜;(ii)所述膜中的多个跨膜蛋白孔,选自如上所述的孔的跨膜蛋白孔;
在某些实施方案中,所述装置还包含用于实施如上所述的对靶多核苷酸进行测序的方法的说明书;
在某些实施方案中,所述膜是两亲性层,例如脂质双层(例如磷脂双分子层)或高分子聚合物膜(例如di-block、tri-block)。
在某些实施方案中,所述装置不包含与(ii)分开提供的多核苷酸结合蛋白。
试剂盒
在另一方面,本申请还提供了试剂盒,其包含如上所述的构建体,孔,分离的核酸分子,载体,和/或,宿主细胞。
在某些实施方案中,所述试剂盒还包含用于纳米孔测序的试剂。
用途
在另一方面,本申请还提供了如上所述的构建体,孔,分离的核酸分子,载体,宿主细胞,装置,或试剂盒用于多核苷酸测序的用途。
术语定义
在本发明中,除非另有说明,否则本文中使用的科学和技术名词具有本领域技术人员所通常理解的含义。并且,本文中所用的病毒学、生物化学、免疫学实验室操作步骤均为相应领域内广泛使用的常规步骤。同时,为了更好地理解本发明,下面提供相关术语的定义和解释。
当本文使用术语“例如”、“如”、“诸如”、“包括”、“包含”或其变体时,这些术语将不被认为是限制性术语,而将被解释为表示“但不限于”或“不限于”。
除非本文另外指明或根据上下文明显矛盾,否则术语“一个”和“一种”以及“该”和类似指称物在描述本发明的上下文中(尤其在以下权利要求的上下文中)应被解释成覆盖单数和复数。
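As used herein, the terms "anneal", "annealing", "hybridize" or "hybridizing" and the like refer to the formation of a complex between nucleotide sequences that have sufficient complementarity to form a complex via Watson-Crick base pairing. For the purposes of the present invention, nucleic acid sequences that are "complementary to" each other, or that "hybridize" or "anneal" to each other, should be able to form, or do form, a "hybrid" or "complex" that is sufficiently stable to serve the intended purpose. It is not required that every nucleic acid base within the sequence displayed by one nucleic acid molecule be capable of base pairing or complexing with every nucleic acid base within the sequence displayed by a second nucleic acid molecule in order for the two nucleic acid molecules, or the corresponding sequences displayed therein, to be "complementary" to, or to "anneal" or "hybridize" with, each other. As described herein, the terms "complementary" or "complementarity" are used in reference to nucleotide sequences related by the base pairing rules. For example, the sequence 5'-A-G-T-3' is complementary to the sequence 3'-T-C-A-5'. Complementarity can be "partial", in which only some of the nucleic acid bases match according to the base pairing rules. Alternatively, there may be "complete" or "total" complementarity between nucleic acids.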
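The base pairing rule in this definition can be illustrated with a short Python sketch. It is not part of the application; the function names are hypothetical, and it assumes plain DNA bases (A, C, G, T) and two strands of equal length written in opposite orientations.

```python
# Watson-Crick pairs for DNA (illustrative helper, not taken from the application)
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_3to5(seq_5to3: str) -> str:
    """Return the complementary strand written 3'->5', base by base."""
    return "".join(PAIRS[b] for b in seq_5to3.upper())

def reverse_complement(seq_5to3: str) -> str:
    """Return the complementary strand written in the usual 5'->3' direction."""
    return complement_3to5(seq_5to3)[::-1]

def fraction_paired(top_5to3: str, bottom_3to5: str) -> float:
    """Fraction of aligned positions that obey the pairing rules (1.0 = total complementarity)."""
    if not top_5to3 or len(top_5to3) != len(bottom_3to5):
        raise ValueError("strands must be non-empty and of equal length for this simple check")
    matches = sum(PAIRS.get(a) == b for a, b in zip(top_5to3.upper(), bottom_3to5.upper()))
    return matches / len(top_5to3)

print(complement_3to5("AGT"))         # the 3'-T-C-A-5' strand from the example in the text
print(fraction_paired("AGT", "TCA"))  # every position pairs: total complementarity
print(fraction_paired("AGT", "TCG"))  # two of three positions pair: partial complementarity
```

A capture sequence on a tether would be judged in the same way against the region of the library it is meant to hybridize to, although real hybridization stability also depends on length, temperature and salt, which this sketch ignores.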
如本文所用,术语“多核苷酸”如核酸是含有两个或更多个核苷酸的大分子。多核苷酸或核酸可包括任何核苷酸的任意组合。所述核苷酸可以是天然存在的或人工合成 的。所述核苷酸可以是含有修饰的或者不含有修饰的。在某些实施方案中,所述核苷酸是不含有修饰的。
发明的有益效果
本发明的纳米孔测序方法具备以下一项或多项有益效果:
(1)本发明的纳米孔测序方法所构建的测序文库不含有马达蛋白,保存周期更长,对建库条件的要求大大降低(例如,建库时无需考虑马达蛋白是否变性的问题)。此外,由于测序文库不含马达蛋白,本发明的测序方法无需使用间隔器,可降低成本。同时,在本发明中,测序文库被跨膜蛋白捕获后即刻开始测序,可提高检测效率。
(2)特别地,一方面,现有技术中,由于文库中间隔器(通常为经修饰的特殊核苷酸)的存在,导致文库构建方式受到很多限制。例如,无法仅通过常规的核酸扩增构建测序文库。而本发明提供的纳米孔测序方法无需使用间隔器,因此,构建测序文库时可采用更多文库的构建方式或可以更好的与其他技术方案进行结合,例如滚环扩增。另一方面,正由于如前所述的本发明方法对文库结构和构建方式具有更大的包容性,导致在某些实施方案中,测序过程和文库构建过程可以在同一体系中进行,可以进一步节约成本,提高效率。
(3)此外,马达蛋白与待测核酸可能有多个结合位点,即一个接头上可能含有不同个数的马达蛋白,导致测序时测序速度等性能存在差异。文库分子中如果含有马达蛋白和间隔器,一般需要对接头复合物进行纯化获得单一的接头复合物,本发明提供的不含有马达蛋白和间隔器的测序文库能很好的避免该问题。
下面将结合附图和实施例对本发明的实施方案进行详细描述,但是本领域技术人员将理解,下列附图和实施例仅用于说明本发明,而不是对本发明的范围的限定。根据附图和优选实施方案的下列详细描述,本发明的各种目的和有利方面对于本领域技术人员来说将变得显然。
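Brief Description of the Drawings
Figure 1 shows a schematic diagram of the core construction of the expression vectors pET.28a(+)-6*His-Spytag003-DDA, pET.28a(+)-6*His-Spycatcher003-MspA and pET.21a(+)-strep-MspA.
Figure 2 shows a schematic diagram of the structure of the motor protein-porin complex prepared in Example 1.
Figure 3 shows the SDS-PAGE gel image of complexes of Spycatcher003-MspA and MspA in different ratios after ion-exchange purification in Example 1; so as not to disrupt the oligomeric state of MspA, 5× non-denaturing, non-reducing protein loading buffer (P0016N, Beyotime) was added when preparing the samples.
Figure 4 shows the size-exclusion chromatography (molecular sieve) elution profile obtained after the motor protein Spytag003-DDA was reconstituted with the porin Spycatcher003-MspA(MspA)7 in Example 1.
Figure 5 shows the SDS-PAGE gel image of the reconstitution of the motor protein Spytag003-DDA with the porin Spycatcher003-MspA(MspA)7 in Example 1. Lane 1 shows the motor protein Spytag003-DDA alone. Lane 2 shows the porin Spycatcher003-MspA(MspA)7 alone. Lane 3 shows the mixed sample after incubation of Spycatcher003-MspA(MspA)7 with Spytag003-DDA, before size-exclusion chromatography. Lane 4 shows the protein ladder (26616, Thermo Scientific). Lanes 5-9 show the protein in the elution peak of the Spycatcher003-MspA(MspA)7/Spytag003-DDA complex after size-exclusion chromatography, corresponding to the first peak in Figure 4; lane 5 shows a single complex band.
Figure 6 shows the distribution of open-pore currents at different voltages (0.05 V, 0.1 V, 0.14 V, 0.18 V) when the complex is inserted into the membrane in Example 2.
Figure 7 is a schematic diagram of the structure of the double-stranded polynucleotide sequencing library constructed in Example 2 after binding to the tether, where 1 is the leader sequence, which includes the motor protein binding region, 2 is the double-stranded DNA portion (containing the DNA fragment to be tested), and 3 is the tether.
Figure 8 shows the electrical signal recorded in Example 2 as the constructed double-stranded polynucleotide sequencing library is unwound into single strands by Spytag003-DDA and translocated through the Spycatcher003-MspA(MspA)7 nanopore. Figure 8A shows the current amplitude trace of one sequencing event, and Figure 8B is an enlarged view of a section of the signal in Figure 8A.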
Sequence Information
The sequences referred to in this application are described in the table below.
Table 1: Sequence information
Figure PCTCN2022142615-appb-000001
Figure PCTCN2022142615-appb-000002
Figure PCTCN2022142615-appb-000003
Figure PCTCN2022142615-appb-000004
Note: in the tether sequence shown in SEQ ID NO: 12, cholesterol is attached to the 5' end of the polynucleotide through 4×iSp18, where "iSp18" is an 18-atom hexa-ethyleneglycol spacer whose structure is shown in Formula I:
Figure PCTCN2022142615-appb-000005
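Detailed Description
The invention is now described with reference to the following examples, which are intended to illustrate the invention, not to limit it.
Unless otherwise specified, the molecular biology experimental methods and immunoassays used in the present invention were carried out essentially as described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., Short Protocols in Molecular Biology, 3rd edition, John Wiley & Sons, Inc., 1995; restriction enzymes were used under the conditions recommended by the manufacturer. Those skilled in the art will appreciate that the examples describe the invention by way of illustration and are not intended to limit the scope of the claimed invention.
Example 1: Expression and purification of the Spytag003-containing motor protein DDA and the Spycatcher003-tagged MspA pore octamer
1. Construction of recombinant expression plasmids
Each gene fragment was designed and synthesized according to the expression vector construction scheme shown in Figure 1. 6*His-Spytag003-DDA and 6*His-Spycatcher003-MspA were each cloned into the ampicillin-resistant pET.28a(+) vector to give the recombinant plasmids pET.28a(+)-6*His-Spytag003-DDA and pET.28a(+)-6*His-Spycatcher003-MspA, respectively. Strep-MspA was cloned into the kanamycin-resistant pET.21a(+) vector to give the recombinant plasmid pET.21a(+)-strep-MspA.
2. Transformation of the recombinant plasmids and large-scale expression of the target proteins
(1) The recombinant plasmid pET.28a(+)-6*His-Spytag003-DDA was transformed into DE3 competent cells. A single colony was picked, inoculated into 5 mL of LB medium containing kanamycin, and cultured overnight at 37°C with shaking. The culture was then transferred into 1 L of LB medium and grown at 37°C with shaking to OD600 = 0.6-0.8, cooled to 16°C, and induced overnight with IPTG at a final concentration of 500 μM. The cells were harvested.
(2) The recombinant plasmids pET.28a(+)-6*His-Spycatcher003-MspA and pET.21a(+)-strep-MspA were co-transformed into DE3 competent cells. A single colony was picked, inoculated into 5 mL of LB medium containing both kanamycin and ampicillin, and cultured overnight at 37°C with shaking. The culture was then transferred into 1 L of LB medium and grown at 37°C with shaking to OD600 = 0.6-0.8, cooled to 16°C, and induced overnight with IPTG at a final concentration of 500 μM. The cells were harvested.
3. Purification of Spytag003-DDA
The cells obtained in step 2(1) above were lysed by high-pressure homogenization and purified by nickel column, ion-exchange column, size-exclusion chromatography and other methods to give Spytag003-DDA protein of good purity.
4. Purification of the 1:7 complex of Spycatcher003-MspA and MspA
In the present invention, an MspA octamer fused to only a single molecule of Spycatcher003 is obtained by the following method. First, the cells obtained in step 2(2) above, which co-express 6*His-Spycatcher003-MspA and Strep-MspA, were lysed by high-pressure homogenization (lysis buffer: 150 mM NaCl, 20 mM Tris, 1% DDM) and purified on a Strep column (wash buffer: 150 mM NaCl, 20 mM Tris, 0.05% Tween 20; elution buffer: 150 mM NaCl, 20 mM Tris, 10 mM desthiobiotin), giving complexes of the two proteins in different ratios. Because complexes with different ratios of the two proteins differ in charge and in affinity for the nickel column, they were then separated by sequential elution from a nickel column with an imidazole gradient (25 mM-250 mM imidazole) and from an ion-exchange column with a salt gradient (Buffer A: 100 mM; Buffer B: 1000 mM NaCl), yielding the MspA octamer formed from Spycatcher003-MspA and MspA in a 1:7 ratio, as shown in Figure 3. The Spycatcher003-tagged MspA octamer was thus obtained.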
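Why co-expression gives "complexes of the two proteins in different ratios" can be pictured with a simple probability sketch. The source does not state how the subunits assemble; the code below assumes, purely for illustration, that each of the eight positions is filled independently with a tagged subunit with probability `p_tagged` (a binomial model). The value 0.125 is a hypothetical input, not a measured one.

```python
from math import comb

def octamer_distribution(p_tagged: float, n_subunits: int = 8) -> list:
    """P(k tagged subunits) for k = 0..n under the random-assembly (binomial) assumption."""
    return [
        comb(n_subunits, k) * p_tagged**k * (1 - p_tagged) ** (n_subunits - k)
        for k in range(n_subunits + 1)
    ]

# Hypothetical case: one in eight expressed subunits carries Spycatcher003
dist = octamer_distribution(p_tagged=0.125)
for k, prob in enumerate(dist):
    print(f"{k} tagged : {8 - k} untagged -> {prob:.3f}")
```

Under this assumption the 1:7 species is always accompanied by untagged and multiply tagged octamers, which is consistent with the need for the nickel-column and ion-exchange steps described above to isolate it.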
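5. Covalent reconstitution of the motor protein-porin complex
The Spycatcher003-tagged MspA octamer obtained in step 4 above was mixed with the Spytag003-DDA obtained in step 3 above at a ratio of 1:1.2. After reaction at 25°C for 30 min, excess Spytag003-DDA protein was removed by size-exclusion chromatography on Superose 6 Increase (purification buffer: 150 mM NaCl, 20 mM Tris, 0.05% Tween 20), and the DDA-Spytag/Spycatcher-MspA complex was collected. The results are shown in Figures 4 and 5.
Example 2: Construction of a nanopore biosensor and its application to sequencing
(1) Construction of the double-stranded polynucleotide sequencing library
Materials and methods:
Adapter top strand (SEQ ID 10)
Adapter bottom strand (SEQ ID 11)
Tether sequence (SEQ ID 12)
DNA to be tested: pUC57 vector (SEQ ID 13)
The DNA to be tested (2 μg) was first end-repaired: the reagents in Table 2 were mixed and incubated in a PCR instrument at 20°C for 10 min and then 65°C for 10 min, followed by magnetic bead purification.
Table 2: End-repair reaction
NEB Next FFPE DNA Repair Buffer 3.5 μl
NEB Next FFPE DNA Repair Mix 2 μl
Ultra II End-prep reaction Buffer 3.5 μl
Ultra II End-prep enzyme Mix 3 μl
DNA sample 48 μl
The adapter was formed by annealing the adapter top strand (SEQ ID 10) and bottom strand (SEQ ID 11).
The product was then ligated to the adapter using the reaction in Table 3 at 25°C for 30 min, followed by magnetic bead purification.
Table 3: Ligation reaction
Adapter 5 μl
DNA sample (end-repair product above + adapter) 60 μl
NEB Ligation Buffer 25 μl
NEB Next Quick T4 DNA Ligase 10 μl
Finally, unligated adapters and DNA fragments were removed by enzymatic digestion using the reaction in Table 4 at 37°C for 5 min, followed by magnetic bead purification, completing library construction.
Table 4: Digestion reaction
NEB Buffer 4 2.5 μl
T7 Exonuclease 0.5 μl
RNase-free H2O 2 μl
DNA sample 20 μl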
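The three reaction mixes in Tables 2-4 can be written down as data and totalled, which is a quick way to check pipetting volumes before setting up the reactions. This is only a bookkeeping sketch: the dictionary keys are shortened labels chosen here, and the volumes are copied from the tables in microlitres.

```python
# Volumes in microlitres, copied from Tables 2-4; the labels are abbreviated for readability
reactions = {
    "end repair (Table 2)": {
        "FFPE DNA Repair Buffer": 3.5,
        "FFPE DNA Repair Mix": 2.0,
        "End-prep reaction Buffer": 3.5,
        "End-prep enzyme Mix": 3.0,
        "DNA sample": 48.0,
    },
    "ligation (Table 3)": {
        "Adapter": 5.0,
        "DNA sample": 60.0,
        "Ligation Buffer": 25.0,
        "Quick T4 DNA Ligase": 10.0,
    },
    "digestion (Table 4)": {
        "NEB Buffer 4": 2.5,
        "T7 Exonuclease": 0.5,
        "RNase-free H2O": 2.0,
        "DNA sample": 20.0,
    },
}

def total_volume(mix: dict) -> float:
    """Sum of all component volumes in one reaction mix."""
    return sum(mix.values())

for name, mix in reactions.items():
    print(f"{name}: {total_volume(mix):.1f} ul total, "
          f"DNA sample is {mix['DNA sample'] / total_volume(mix):.0%} of the volume")
```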
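(2) Construction of the nanopore biosensor and sequencing
The motor protein Spytag003-DDA and the porin Spycatcher003-MspA(MspA)7 were incubated at an equimolar ratio in 1×PBS (Sangon, E607020-0500) for 30 min (temperature: 25°C) to form the transmembrane protein-motor protein complex. Under the electric field force (applied voltage 0.18 V), the complex inserted into a prepared phospholipid bilayer membrane to form a single-channel nanopore. Current signals were recorded with a patch clamp amplifier or another electrical signal acquisition device (for the method, see: Ji, Zhouxiang, and Peixuan Guo. "Channel from bacterial virus T7 DNA packaging motor for the differentiation of peptides composed of a mixture of acidic and basic amino acids." Biomaterials 214 (2019): 119222.). The open-pore current of the nanopore differs at different voltages; whether a single channel has formed can be judged from the magnitude of the open-pore current or from the conductance. Figure 6 shows the open-pore current of a single nanopore in buffer (0.3 M KCl, 25 mM HEPES, 1 mM EDTA, 5 mM ATP, 25 mM MgCl2): ~35 pA at 50 mV, ~75 pA at 100 mV, ~110 pA at 140 mV and ~150 pA at 180 mV.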
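The single-channel check described here (judging from open-pore current or conductance) can be sketched with Ohm's law, G = I / V. The currents below are the approximate values reported for Figure 6. The `channel_count` helper, its reference conductance `unit_ns` and its tolerance are hypothetical, and it assumes that identical pores add conductance linearly; none of this is specified in the source.

```python
# Approximate open-pore currents reported for Figure 6: voltage (mV) -> current (pA)
open_pore_pa = {50: 35.0, 100: 75.0, 140: 110.0, 180: 150.0}

def conductance_ns(current_pa: float, voltage_mv: float) -> float:
    """Ohm's law G = I / V; picoamperes divided by millivolts gives nanosiemens."""
    return current_pa / voltage_mv

def channel_count(current_pa: float, voltage_mv: float, unit_ns: float, tol: float = 0.25):
    """Estimate the number of open pores, or None if the reading fits no whole number.

    unit_ns is an assumed single-pore conductance supplied by the user.
    """
    ratio = conductance_ns(current_pa, voltage_mv) / unit_ns
    n = round(ratio)
    return n if n >= 1 and abs(ratio - n) <= tol else None

for mv, pa in open_pore_pa.items():
    print(f"{mv} mV: {conductance_ns(pa, mv):.2f} nS")
```

With these numbers the apparent conductance stays below 1 nS at every voltage. A reading close to twice the single-pore value at the same voltage would suggest that two pores had inserted.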
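After the single channel had formed, the double-stranded polynucleotide sequencing library constructed above was added to the chamber.
2 μg of the double-stranded polynucleotide sequencing library and 6 μl of tether (concentration: 1 μM) were added to 300 μl of sequencing buffer (0.3 M KCl, 25 mM HEPES, 1 mM EDTA, 5 mM ATP, 25 mM MgCl2), and the mixture was added to the nanopore system. A schematic of a library molecule bound to the tether is shown in Figure 7. Under the electric field force, the 5' overhang of the double-stranded polynucleotide sequencing library is captured by the transmembrane protein-motor protein complex and passes through the nanopore, so that changes in current amplitude are observed. The complex is able to capture double-stranded polynucleotides continuously and return to the open-pore current; the polynucleotide is identified as it translocates through the nanopore, and the current signal is shown in Figure 8 (voltage: 0.14 V). Figure 8A shows the current amplitude trace of a sequencing event, and Figure 8B is an enlarged view of the sequencing event in Figure 8A. The results show that the motor protein in the complex is able to separate the two strands of the double-stranded polynucleotide and control the movement of the single-stranded polynucleotide through the pore formed by the complex, completing the sequence determination of the polynucleotide.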
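The working concentrations implied by this mix follow from the dilution relation C1·V1 = C2·V2. The sketch below takes the final volume as roughly 300 μl, which is an assumption: the text does not give the library volume or say whether the 6 μl of tether is counted within the 300 μl.

```python
def final_concentration_nm(stock_um: float, added_ul: float, total_ul: float) -> float:
    """C1*V1 = C2*V2, with the stock given in uM and the result returned in nM."""
    return stock_um * 1000.0 * added_ul / total_ul

# 6 ul of 1 uM tether into an assumed final volume of 300 ul
tether_nm = final_concentration_nm(stock_um=1.0, added_ul=6.0, total_ul=300.0)

# 2 ug of library in the same assumed 300 ul, expressed in ng/ul
library_ng_per_ul = 2000.0 / 300.0

print(f"tether ~ {tether_nm:.1f} nM, library ~ {library_ng_per_ul:.1f} ng/ul")
```

If the 6 μl is added on top of the 300 μl, passing `total_ul=306.0` lowers the tether estimate only slightly.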
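Although specific embodiments of the invention have been described in detail, those skilled in the art will understand that, in light of all the teachings that have been disclosed, various modifications and changes can be made to the details, and that such changes all fall within the scope of protection of the invention. The full scope of the invention is given by the appended claims and any equivalents thereof.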

Claims (31)

  1. 一种包含跨膜蛋白孔亚基和多核苷酸结合蛋白的构建体,其中,所述亚基保留其形成孔的能力,所述多核苷酸结合蛋白能够分开双链多核苷酸的两条链和/或控制单链多核苷酸移动穿过所述孔;并且,所述亚基与所述多核苷酸结合蛋白共价结合;
    优选地,所述共价结合不是通过核酸的翻译产生的肽键;
    优选地,所述共价结合通过异肽键实现;
    优选地,所述共价结合通过化学连接(例如,点击化学)产生。
  2. 权利要求1的构建体,其中,所述亚基与所述多核苷酸结合蛋白通过异肽键结合。
  3. 权利要求1或2的构建体,其中,所述异肽键由蛋白质-蛋白质结合对形成,所述蛋白质-蛋白质结合对由第一成员(例如第一肽标签)和第二成员(例如第二肽标签)组成,其中所述第一成员和第二成员由异肽键结合,并且,所述第一成员任选地通过第一接头(例如刚性或柔性接头,如包含一个或多个甘氨酸和/或一个或多个丝氨酸的肽接头)连接于所述跨膜蛋白孔亚基的N末端或C末端以形成第一组分,所述第二成员任选地通过第二接头(例如刚性或柔性接头,如包含一个或多个甘氨酸和/或一个或多个丝氨酸的肽接头)连接于所述多核苷酸结合蛋白的N末端或C末端以形成第二组分;
    优选地,所述蛋白质-蛋白质结合对选自:SpyCatcher/SpyTag对、SpyTag002/SpyCatcher002对、SpyTag003/SpyCatcher003对、isopeptag-N/pilin-N对,isopeptag/pilin-C对,SnoopTag/SnoopCatcher对。
  4. 权利要求1-3任一项的构建体,其中,所述异肽键由SpyTag003/SpyCatcher003对形成;
    优选地,所述SpyCatcher003连接至所述亚基的N端;
    优选地,所述SpyTag003连接至所述多核苷酸结合蛋白的N端;
    优选地,所述SpyCatcher003具有如SEQ ID NO:4所示的氨基酸序列;
    优选地,所述SpyTag003具有如SEQ ID NO:1所示的氨基酸序列。
  5. 权利要求1-4任一项的构建体,其中,所述亚基选自源于hemolysin,MspA,Frac,ClyA,PA63,CsgG,GspD,XcpQ,Wza,SP1,Phi29 connector,SPP1 connector,T3 connector,T4 connector,T7 connector,钾离子离子通道蛋白,钠离子离子通道蛋白,钙离子离子通道蛋白的亚基;
    优选地,所述亚基还连接有另外的多肽,另外的多肽选自标签、酶切位点、信号肽或导肽、可检测的标记,或其任何组合;
    优选地,所述亚基具有如SEQ ID NO:3或17所示的氨基酸序列。
  6. 权利要求1-5任一项的构建体,其中,所述多核苷酸结合蛋白选自核酸解旋酶,例如DNA解旋酶或RNA解旋酶;
    优选地,所述多核苷酸结合蛋白选自Dda,UvrD,Rep,RecQ,PcrA,eIF4A,NS3,Rep,gp41或T7gp4;
    优选地,所述多核苷酸结合蛋白还连接有另外的多肽,另外的多肽选自标签、酶切位点、信号肽或导肽、可检测的标记,或其任何组合;
    优选地,所述多核苷酸结合蛋白具有如SEQ ID NO:2或16所示的氨基酸序列。
  7. 用于多核苷酸测序的孔,其包含至少一个权利要求1-6任一项的构建体;
    优选地,所述孔是跨膜蛋白孔。
  8. 权利要求7的孔,其包含至少一个权利要求1-6任一项的构建体以及形成孔所需的其他亚基;
    优选地,所述孔包含数量足够的形成孔所需的其他亚基;
    优选地,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同或不相同;
    优选地,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同,或者,所述形成孔所需的其他亚基为所述构建体中的跨膜蛋白孔亚基的旁系同源物、同源物或其变体。
  9. 权利要求7或8的孔,其包含一个权利要求1-6任一项的构建体以及形成孔所需的其他亚基;
    优选地,所述孔包含数量足够的形成孔所需的其他亚基;
    优选地,所述孔由1个所述构建体以及7个形成孔所需的其他亚基形成;
    优选地,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同或不相同;
    优选地,所述形成孔所需的其他亚基与所述构建体中的跨膜蛋白孔亚基相同,或者,所述形成孔所需的其他亚基为所述构建体中的跨膜蛋白孔亚基的旁系同源物、同源物或其变体。
  10. 分离的核酸,其编码权利要求1-6任一项的构建体。
  11. 权利要求10的分离的核酸分子,其中,所述分离的核酸分子包含编码所述第一组分的第一核苷酸序列以及编码所述第二组分的第二核苷酸序列,所述第一组分和第二组分如权利要求3或4中定义,所述第一核苷酸序列和第二核苷酸序列存在于相同或不同的分离的核酸分子上;
    优选地,所述第一组分为所述第一成员任选地通过第一肽接头连接于所述跨膜蛋白孔亚基的N末端或C末端形成的第一蛋白;所述第二组分为所述第二成员任选地通过第二肽接头连接于所述多核苷酸结合蛋白的N末端或C末端形成的第二蛋白。
  12. 载体,其包含权利要求10或11的分离的核酸分子;优选地,所述载体为克隆载体或表达载体。
  13. 宿主细胞,其包含权利要求10或11的分离的核酸分子或权利要求12所述的载体。
  14. 制备权利要求7-9任一项的孔的方法,所述方法包括以下步骤:
    (1)使第一组分与形成孔所需的其他亚基接触以形成多聚体,所述多聚体为不含多核苷酸结合蛋白的孔,其中,所述第一组分如权利要求3或4中定义,所述形成孔所需的其他亚基如权利要求8或9中定义;
    (2)将所述多聚体与第二组分接触,所述第二组分如权利要求3或4中定义,使得所述 第二组分与所述多聚体中的第一组分形成异肽键连接,以形成连接多核苷酸结合蛋白的孔;
    优选地,步骤(1)中所述多聚体是由1个第一组分与7个形成孔所需的其他亚基形成的八聚体;
    优选地,所述第一组分中的亚基与所述形成孔所需的其他亚基相同;
    优选地,步骤(1)包括以下步骤:培养含有编码第一组分的第一核苷酸序列和编码形成孔所需的其他亚基的第三核苷酸序列的宿主细胞,以获得不含多核苷酸结合蛋白的孔;
    优选地,步骤(1)还包括纯化所述不含多核苷酸结合蛋白的孔的步骤。
  15. 制备权利要求7-9任一项的孔的方法,所述方法包括以下步骤:
    (1)提供权利要求1-6任一项的构建体;
    (2)将所述构建体与形成孔所需的其他亚基进行孵育,形成多聚体,所述多聚体为连接多核苷酸结合蛋白的孔;
    优选地,步骤(2)中,将一个或多个所述构建体与一个或多个形成孔所需的其他亚基进行孵育;
    优选地,步骤(1)中所述多聚体是由1个构建体与7个形成孔所需的其他亚基形成的八聚体。
  16. 对靶多核苷酸进行测序的方法,其包括:
    (a)提供权利要求7-9任一项的孔,以及待测多核苷酸;
    (b)将所述待测多核苷酸与所述孔接触,使得所述孔的多核苷酸结合蛋白与所述待测多核苷酸结合,所述多核苷酸结合蛋白控制与其结合的单链多核苷酸穿过所述孔;和,
    (c)在所述单链多核苷酸相对于所述孔移动时获取一个或多个测量值,其中所述测量值可用于表示所述待测多核苷酸的序列信息。
  17. 权利要求16的方法,其中,在步骤(a)中,所述待测多核苷酸为双链多核苷酸,所述多核苷酸结合蛋白帮助分开所述两条链以提供单链多核苷酸,并控制所述单链多核苷酸移动通过所述孔;所述双链多核苷酸包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有前导序列,所述前导序列引领与其衔接的核酸链进入所述孔。
  18. 权利要求17的方法,其中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿5’端至3’端的方向移动通过所述孔,并且,所述双链多核苷酸包含至少一个5’端悬突,所述5’端悬突含有所述前导序列。
  19. 权利要求17的方法,其中,所述多核苷酸结合蛋白控制所述单链多核苷酸沿3’端至5’端的方向移动通过所述孔,并且,所述双链多核苷酸包含至少一个3’端悬突,所述3’端悬突含有所述前导序列。
  20. 权利要求17-19任一项的方法,其中,所述待测多核苷酸为线性双链多核苷酸;
    优选地,所述线性双链多核苷酸在其两个末端各包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有所述前导序列;
    优选地,所述线性双链多核苷酸在其一个末端包含至少一个单链悬突(例如5’端悬突和/或3’端悬突),所述单链悬突含有所述前导序列;并且,所述线性双链多核苷酸在其另一个末端含有桥连部分,所述桥连部分共价连接所述线性双链多核苷酸的两条单链。
  21. 权利要求17-19任一项的方法,其中,所述待测多核苷酸为环状双链多核苷酸;
    优选地,所述环状双链多核苷酸由含有所述前导序列的引物以载体序列为模板进行滚环扩增得到。
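  22. The method of claim 16, wherein, in step (a), the polynucleotide to be tested is a single-stranded polynucleotide, and the polynucleotide binding protein controls the movement of the single-stranded polynucleotide through the pore; wherein the single-stranded polynucleotide comprises a leader sequence at at least one of its two ends, the leader sequence guiding the nucleic acid strand joined to it into the pore.
  23. The method of claim 22, wherein the polynucleotide binding protein controls the movement of the single-stranded polynucleotide through the pore in the 5' to 3' direction, and the single-stranded polynucleotide comprises the leader sequence at least at its 5' end.
  24. The method of claim 22, wherein the polynucleotide binding protein controls the movement of the single-stranded polynucleotide through the pore in the 3' to 5' direction, and the single-stranded polynucleotide comprises the leader sequence at least at its 3' end.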
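  25. The method of any one of claims 16-24, wherein the polynucleotide to be tested is coupled to or near the outer rim of the pore;
    preferably, one or more tethers are conjugated to or near the outer rim of the pore, the tether comprising a capture sequence having sequence complementarity to a portion of the polynucleotide to be tested, and the polynucleotide to be tested is coupled to or near the outer rim of the pore by hybridization of the capture sequence to the complementary region in the polynucleotide to be tested;
    preferably, when the polynucleotide to be tested is as defined in any one of claims 17-21, the portion of the polynucleotide to be tested comprises a part of the single-stranded overhang or a part of the strand aligned with the single-stranded overhang;
    preferably, when the polynucleotide to be tested is as defined in any one of claims 22-24, the portion of the polynucleotide to be tested comprises a part of the single-stranded polynucleotide.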
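  26. The method of any one of claims 16-25, wherein the pore is disposed in a membrane;
    preferably, the membrane is an amphiphilic layer, for example a lipid bilayer (for example a phospholipid bilayer) or a polymer membrane (for example di-block or tri-block).
  27. The method of any one of claims 16-26, wherein step (b) and step (c) are carried out in a solution;
    preferably, the solution is ionic and the measured property is the ionic current through the nanopore.
  28. The method of any one of claims 16-27, wherein, in step (b), a potential difference is provided across the nanopore so as to allow the single-stranded nucleic acid to enter the nanopore.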
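  29. A device for sequencing a target polynucleotide, comprising: (i) a membrane; and (ii) a plurality of transmembrane protein pores in the membrane, the transmembrane protein pores being selected from the pores of any one of claims 7-9;
    preferably, the device further comprises instructions for carrying out the method of any one of claims 16-28;
    preferably, the membrane is an amphiphilic layer, for example a lipid bilayer (for example a phospholipid bilayer) or a polymer membrane (for example di-block or tri-block);
    preferably, the device does not comprise a polynucleotide binding protein provided separately from (ii).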
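  30. A kit comprising the construct of any one of claims 1-6, the pore of any one of claims 7-9, the isolated nucleic acid molecule of claim 10 or 11, the vector of claim 12, and/or the host cell of claim 13;
    preferably, the kit further comprises reagents for nanopore sequencing.
  31. Use of the construct of any one of claims 1-6, the pore of any one of claims 7-9, the isolated nucleic acid molecule of claim 10 or 11, the vector of claim 12, the host cell of claim 13, the device of claim 29, or the kit of claim 30 for polynucleotide sequencing.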
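
The pairing between translocation direction and leader placement recited in claims 18, 19, 23 and 24 can be sketched as a small consistency check. This is only an illustrative sketch: the `Analyte` class, the direction strings, and the function names are hypothetical and do not come from the application.

```python
from dataclasses import dataclass

# Hypothetical labels for the two translocation directions named in the claims.
FIVE_TO_THREE = "5'->3'"
THREE_TO_FIVE = "3'->5'"


@dataclass(frozen=True)
class Analyte:
    """Hypothetical description of a polynucleotide to be tested."""
    stranded: str            # "double" (claims 17-19) or "single" (claims 22-24)
    leader_ends: frozenset   # ends carrying a leader sequence: subset of {"5'", "3'"}


def required_leader_end(direction: str) -> str:
    """Return the end that must carry the leader for the given direction.

    Claims 18 and 23 pair 5'->3' movement with a leader at the 5' end
    (5' overhang for a duplex); claims 19 and 24 pair 3'->5' movement
    with a leader at the 3' end.
    """
    if direction == FIVE_TO_THREE:
        return "5'"
    if direction == THREE_TO_FIVE:
        return "3'"
    raise ValueError(f"unknown direction: {direction!r}")


def leader_matches_direction(analyte: Analyte, direction: str) -> bool:
    """True if the analyte carries a leader on the end required by the direction."""
    return required_leader_end(direction) in analyte.leader_ends
```

For example, under these assumptions a duplex with a leader only on a 5' overhang satisfies the check for a binding protein that moves the strand 5' to 3', but not for one that moves it 3' to 5'.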
PCT/CN2022/142615 2021-12-30 2022-12-28 Single molecule nanopore sequencing method Ceased WO2023125605A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280086381.2A CN118591641A (zh) 2021-12-30 2022-12-28 Single molecule nanopore sequencing method
US18/725,327 US20250154581A1 (en) 2021-12-30 2022-12-28 Single molecule nanopore sequencing method
EP22914857.2A EP4458983A1 (en) 2021-12-30 2022-12-28 Single molecule nanopore sequencing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111657436 2021-12-30
CN202111657436.4 2021-12-30

Publications (1)

Publication Number Publication Date
WO2023125605A1 (zh) 2023-07-06

Family

ID=86997943

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/142615 Ceased WO2023125605A1 (zh) 2021-12-30 2022-12-28 Single molecule nanopore sequencing method

Country Status (4)

Country Link
US (1) US20250154581A1 (zh)
EP (1) EP4458983A1 (zh)
CN (1) CN118591641A (zh)
WO (1) WO2023125605A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
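CN106574300A (zh) * 2014-05-02 2017-04-19 Oxford Nanopore Technologies Ltd Method for improving the movement of a target polynucleotide relative to a transmembrane pore
CN110168104A (zh) * 2016-12-01 2019-08-23 Oxford Nanopore Technologies Ltd Methods and systems for characterizing analytes using nanopores
CN110709412A (zh) * 2017-04-24 2020-01-17 Oxford University Innovation Limited Proteins and peptide tags with enhanced rate of spontaneous isopeptide bond formation and uses thereof
CN111154845A (zh) * 2018-11-08 2020-05-15 Siemens Healthcare GmbH Direct RNA nanopore sequencing with the aid of a stem-loop reverse polynucleotide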
WO2021181111A1 (en) * 2020-03-13 2021-09-16 Oxford University Innovation Limited System for covalently linking proteins

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
F. M. AUSUBEL ET AL.: "Compiled Laboratory Guide to Molecular Biology", 1995, JOHN WILEY & SONS, INC.
J. SAMBROOK ET AL.: "Molecular Cloning: Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
JI, ZHOUXIANG; GUO, PEIXUAN: "Channel from bacterial virus T7 DNA packaging motor for the differentiation of peptides composed of a mixture of acidic and basic amino acids", BIOMATERIALS, vol. 214, 2019, pages 119222
WANG, DONGSHUAI ET AL.: "Application of Nanopore Sequencing Technology in Microbial Genomics", JOURNAL OF PREVENTIVE MEDICINE OF CHINESE PEOPLE'S LIBERATION ARMY, vol. 39, no. 1, 31 January 2021 (2021-01-31), pages 106 - 109 *

Also Published As

Publication number Publication date
US20250154581A1 (en) 2025-05-15
EP4458983A1 (en) 2024-11-06
CN118591641A (zh) 2024-09-03

Similar Documents

Publication Publication Date Title
JP7390027B2 (ja) Kits for analysis using nucleic acid encoding and/or labels
AU2018270075B2 (en) Transmembrane pore consisting of two CsgG pores
US9678080B2 (en) Bis-biotinylation tags
US11168321B2 (en) Methods of creating and screening DNA-encoded libraries
CN109219665B (zh) Method for modifying a template double-stranded polynucleotide for characterization
US20240240245A1 (en) Method
JP2023530155A (ja) Method of characterizing a polynucleotide moving through a nanopore
CN104350067A (zh) Mutant lysenin pores
JP2019516361A (ja) Alpha-hemolysin variants and uses thereof
JP7157164B2 (ja) Alpha-hemolysin variants and uses thereof
CN117210535B (zh) Library construction method for direct RNA sequencing
US20220403368A1 (en) Methods and systems for preparing a nucleic acid construct for single molecule characterisation
CN117384878A (zh) Modified CtPif1 helicase and use thereof
CN111518787A (zh) Compound that recognizes gapped G-quadruplexes, and preparation method and use thereof
EP1421189A2 (en) Method and device for integrated protein expression, purification and detection
WO2024109455A1 (zh) RNA-DNA chimeric adapter and use thereof
CN115747211B (zh) Design and use of a sequencing adapter for nanopore sequencing
WO2023125605A1 (zh) Single molecule nanopore sequencing method
CN117337333A (zh) Methods for complement strand sequencing
WO2002034907A1 (en) Method of synthesizing single-stranded nucleic acid
CN120210151A (zh) Mutant MuA transposase and use thereof
CN118256468B (zh) Modified ToPif1 helicase and use thereof
WO2024056038A1 (zh) Modified CaPif1 helicase and use thereof
WO2023194713A1 (en) Method
CN118086286A (zh) DNA-peptide conjugate, ARCFU-like helicase, and use thereof in detecting peptide fragments

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 22914857; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 202280086381.2; Country of ref document: CN
WWE WIPO information: entry into national phase. Ref document number: 2022914857; Country of ref document: EP
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 2022914857; Country of ref document: EP; Effective date: 20240730
WWP WIPO information: published in national office. Ref document number: 18725327; Country of ref document: US