
WO2025010736A1 - Nucleic acid polypeptide complex and use thereof in peptide sequencing - Google Patents


Info

Publication number
WO2025010736A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
acid fragment
polypeptide
fragment
stranded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/107311
Other languages
French (fr)
Chinese (zh)
Inventor
徐讯
季州翔
乔雨宸
黎宇翔
罗凤琴
曾涛
王冀
董宇亮
章文蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to PCT/CN2023/107311
Publication of WO2025010736A1
Pending (current legal status)
Anticipated expiration


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids

Definitions

  • The present invention belongs to the technical field of sequencing, and in particular relates to a nucleic acid-polypeptide complex and its application in peptide sequencing.
  • Proteomics is one of the core areas of life science research after the completion of human genome sequencing. Fully analyzing the composition of various proteins can enable humans to have a deeper understanding of the cell regulatory mechanisms related to proteins, thereby understanding the relationship between proteins and human diseases. Therefore, in order to accurately obtain various protein sequence information, the development of single-molecule protein sequencing technology is particularly valuable.
  • The technology of DNA sequencing using nanopores has become increasingly mature. Its principle is that, under the action of an electric field force, a DNA strand bound to a helicase passes through the nanopore at a relatively constant speed. Because different bases block the current to different degrees as they pass through the nanopore, the sequence can be read by analyzing the correspondence between the current changes and the bases. Similarly, measuring the changes in the current signal as a polypeptide chain passes through the nanopore, and analyzing their correspondence with the different amino acids, can be used to read a polypeptide sequence.
  • Although the speed at which some peptides pass through the nanopore can be reduced by modifying the pore protein or optimizing the biochemical experimental conditions, thereby improving the accuracy of peptide sequencing, controlling the uniformity with which peptides pass through the nanopore channel remains crucial for the precision and accuracy of peptide sequencing.
  • Oxford Nanopore Technologies Limited (WO2021/111125A1) disclosed a protein sequencing solution based on oligonucleotide-controlled protein speed control.
  • the design is to first synthesize the adaptor-peptide-dsDNA tail complex, then partially anneal the single-stranded DNA to form double-stranded DNA, and then combine it with the oligonucleotide control protein to read the electrical signal through the nanopore (as shown in Figure 3).
  • This method can solve the problem that neutral and positively charged peptides cannot penetrate the pore.
  • Its design is also to first synthesize a ssDNA-peptide-ssDNA (polyT) complex, bind to the MTA helicase without annealing to form a double strand, and drive the polypeptide fragment through the MspA-M2 nanopore protein through the regular sliding of the MTA helicase on the ssDNA chain to measure the sequencing electrical signal (as shown in Figure 4).
  • The technical problem to be solved by the present invention is that the prior art lacks a method capable of sequencing any polypeptide by passing it through a nanopore; to address this defect, the invention provides a nucleic acid polypeptide complex and its application in peptide sequencing.
  • The nucleic acid polypeptide complex of the present invention solves the problem that only negatively charged polypeptides can pass through the pore in the polymerase-based protein sequencing route, and enables through-pore sequencing of polypeptides of any charge (positive, negative, neutral).
  • the present invention solves the above technical problems through the following technical solutions.
  • the first aspect of the present invention provides a nucleic acid-polypeptide complex, wherein the nucleic acid-polypeptide complex comprises a first nucleic acid fragment, a polypeptide to be detected, and a composite nucleic acid fragment connected in sequence;
  • the composite nucleic acid fragment comprises a nucleic acid sequence that can form a hairpin structure, and after forming the hairpin structure, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be detected;
  • the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be detected.
  • the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, and the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure. After the hairpin structure is formed, the third nucleic acid fragment is complementary to a part of the second nucleic acid fragment to form a double-stranded nucleic acid, and the other part of the second nucleic acid fragment is a single-stranded nucleic acid.
  • The end of the second nucleic acid fragment and the end of the third nucleic acid fragment can be joined covalently (by chemical connection, enzymatic connection, etc.), or no covalent connection reaction is required: because of the partial double-stranded structure between the third nucleic acid fragment and the second nucleic acid fragment, the two form a stable composite nucleic acid fragment.
  • the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence
  • the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure, after the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other to form a double-stranded nucleic acid
  • the second nucleic acid fragment is a single-stranded nucleic acid, and one end of the second nucleic acid fragment is connected to the double-stranded nucleic acid, and the other end of the second nucleic acid fragment is connected to the polypeptide to be tested.
  • the junction between the second nucleic acid fragment end and the third nucleic acid fragment end needs to be connected by chemical, biological, enzymatic or other connection methods to form a covalent bond to form a stable composite nucleic acid fragment.
  • the nucleic acid-polypeptide complex further includes a fourth nucleic acid fragment, the 5' end nucleic acid sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment to form a double-stranded chain, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring portion.
  • the stability of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the stability of the double-stranded nucleic acid formed by the complementarity of the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
  • the Tm value of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the Tm value of the double-stranded nucleic acid formed by the complementarity of the 5' end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
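The Tm condition above can be illustrated with a rough calculation. The Python sketch below uses the Wallace rule (about 2 °C per A/T and 4 °C per G/C), which is only a crude approximation intended for short oligonucleotides; the source does not say how Tm is to be determined, and both sequences are hypothetical placeholders rather than sequences from this document.

```python
def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hairpin_is_more_stable(hairpin_stem: str, blocker_duplex: str) -> bool:
    """True if the hairpin stem has the higher estimated Tm of the two duplexes."""
    return wallace_tm(hairpin_stem) > wallace_tm(blocker_duplex)


# Hypothetical example sequences (NOT taken from the document):
HAIRPIN_STEM = "GCGGCCGCTAGCGGCCGCAT"  # paired region of the hairpin
BLOCKER_DUPLEX = "ATCGTATTAGCATTA"     # region paired with the fourth fragment

if __name__ == "__main__":
    print(wallace_tm(HAIRPIN_STEM), wallace_tm(BLOCKER_DUPLEX))
    print(hairpin_is_more_stable(HAIRPIN_STEM, BLOCKER_DUPLEX))
```

For real designs a nearest-neighbor model with salt correction would be more appropriate than this estimate; the sketch only shows the comparison being made.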
  • the length of the first nucleic acid fragment is 5 to 200 nt, preferably 10 to 50 nt; the length of the composite nucleic acid fragment is 80 to 2000 nt, preferably 50 to 200 nt.
  • the length of the second nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt.
  • the length of the third nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt.
  • the length of the fourth nucleic acid fragment is 20-500 nt, preferably 25-100 nt.
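The length ranges stated above can be collected into a simple check. The following Python sketch encodes only the broad ranges given in the text (not the preferred sub-ranges); the dictionary keys and the function name `check_fragment_lengths` are illustrative assumptions, not part of the source.

```python
from typing import Dict, List

# Broad length ranges in nucleotides (nt), as stated in the text.
LENGTH_RANGES_NT = {
    "first": (5, 200),
    "composite": (80, 2000),
    "second": (40, 1000),
    "third": (40, 1000),
    "fourth": (20, 500),
}


def check_fragment_lengths(lengths_nt: Dict[str, int]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all lengths fit."""
    problems = []
    for name, length in lengths_nt.items():
        if name not in LENGTH_RANGES_NT:
            problems.append(f"{name}: unknown fragment name")
            continue
        low, high = LENGTH_RANGES_NT[name]
        if not low <= length <= high:
            problems.append(f"{name}: {length} nt is outside {low}-{high} nt")
    return problems


if __name__ == "__main__":
    # Hypothetical design values, chosen only to exercise the check.
    print(check_fragment_lengths({"first": 25, "second": 54, "fourth": 15}))
```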
  • the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment is DNA or a nucleic acid analog.
  • the nucleic acid analog is LNA or PNA.
  • the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment are single-stranded DNA.
  • the anchoring moiety is selected from any one or a combination of lipids, fatty acids, sterols, carbon nanotubes, polypeptides, proteins and/or amino acids; preferably cholesterol, palmitate or tocopherol.
  • the sequence of the first nucleic acid fragment is as shown in SEQ ID NO:2.
  • the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO:1.
  • the sequence of the third nucleic acid fragment is as shown in SEQ ID NO:4.
  • the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO:5.
  • The second aspect of the present invention provides a method for preparing the nucleic acid-polypeptide complex as described in the first aspect, the method comprising the step of reacting the composite nucleic acid fragment or the second nucleic acid fragment, modified with maleimide at the 5' end, with the polypeptide to be tested through a thiol-maleimide addition reaction to obtain the nucleic acid-polypeptide complex;
  • the method includes the step of obtaining the nucleic acid-polypeptide complex by reacting the first nucleic acid fragment whose 3' end is modified with DBCO with the polypeptide to be tested through an azide-DBCO click chemistry reaction.
  • the third aspect of the present invention provides a nucleic acid-protein complex, wherein the nucleic acid-protein complex comprises a polymerase and the nucleic acid-polypeptide complex as described in the first aspect;
  • the polymerase binds to the single-stranded nucleic acid at the interface between the single-stranded nucleic acid and the double-stranded nucleic acid in the nucleic acid-polypeptide complex.
  • the polymerase is a polymerase capable of synthesizing double-stranded DNA using the amplified single-stranded DNA as a template.
  • the polymerase is a DNA polymerase.
  • the polymerase is, for example, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, enzyme, Bsu Large Fragment DNA polymerase, Klenow Fragment DNA polymerase, or any combination thereof.
  • the polymerase should be adaptively adjusted to a polymerase corresponding to nucleic acid.
  • the fourth aspect of the present invention provides a sequencing complex, the sequencing complex comprising a nanopore and the nucleic acid polypeptide complex as described in the first aspect or the nucleic acid protein complex as described in the third aspect; the nucleic acid polypeptide complex or the nucleic acid protein complex can move axially relative to the nanopore;
  • the first nucleic acid fragment and the polypeptide sequence to be detected pass through the nanopore and move axially relative to the nanopore.
  • the nanopore is embedded in an electrically insulating film.
  • the nanopore is a transmembrane protein pore or a solid-state pore.
  • the transmembrane protein pore is selected from hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, Phi29 connector protein, InvG, GspD or any combination thereof.
  • the polypeptide is a tag, an enzyme cleavage site, a signal peptide or a leader peptide, a detectable marker or any combination thereof.
  • the electrically insulating film is an amphiphilic film, a high molecular polymer film or any combination thereof.
  • the electrical insulating membrane is a phospholipid bilayer, a diblock copolymer or a triblock copolymer.
  • the polymerase synthesizes a double-stranded nucleic acid using the composite nucleic acid fragment or the single-stranded nucleic acid of the second nucleic acid fragment from the end of the double-stranded nucleic acid as a template, and pulls the polypeptide to be tested from the second electric field end through the nanopore to the first electric field end, generating and recording the current change in the nanopore.
  • the first nucleic acid fragment is used to provide negative charge, and provides a pulling force to pass through the nanopore under the action of the electric field force.
  • the hairpin structure of the composite nucleic acid fragment provides a double-stranded region and an exposed single-stranded region for the nucleic acid-polypeptide complex, which is conducive to the polymerase correctly identifying the starting site of double-stranded synthesis for amplification and sequencing.
  • The fourth nucleic acid fragment can be used as a switch for automated sequencing that prevents the polymerase from starting to synthesize double-stranded DNA prematurely. While the fourth nucleic acid fragment is bound to the composite nucleic acid fragment or the second nucleic acid fragment, the polymerase cannot recognize the start site of amplification and amplification cannot proceed; once the fourth nucleic acid fragment is separated from the composite nucleic acid fragment or the second nucleic acid fragment, the polymerase recognizes the exposed start site and amplification begins.
  • the anchoring portion of the fourth nucleic acid fragment may be used to aggregate the complex to the periphery of the pore.
  • the third nucleic acid fragment is not directly connected to the second nucleic acid fragment by chemical bonds or enzymes, but is only connected to the second nucleic acid fragment by a hairpin structure formed by annealing.
  • the third nucleic acid fragment provides an annealing site, thereby combining the second nucleic acid fragment and the third nucleic acid fragment without using chemical/enzymatic connection.
  • the spacer is used to present a more obvious peak on the current curve with a smaller volume, so as to facilitate the determination of the starting position of the polypeptide signal.
  • the spacer is, for example, abasic deoxynucleoside, isp6, isp18, polyT, etc.
  • the first electric field end is the nanopore end in a negative electric field
  • the second electric field end is the nanopore end in a positive electric field
  • the fourth nucleic acid fragment is separated from the composite nucleic acid fragment or the second nucleic acid fragment under the action of an electric field force.
  • the electric field force is formed by applying a voltage, and the voltage is above 50 mV.
  • the voltage is 100 mV-300 mV.
  • the method further comprises step (3): analyzing the current change to identify the amino acid sequence of the polypeptide to be detected.
  • the sample processing module processes the polypeptide to be tested to form a nucleic acid-protein complex as described in the third aspect, and transfers it to the nanopore sequencing module; the nanopore sequencing module sequences the polypeptide to be tested under the action of the power module, and the detection and analysis module detects and analyzes the current changes during the sequencing process;
  • the sample processing module processes the polypeptide to be tested to form a nucleic acid-polypeptide complex as described in the first aspect, and transfers it to a nanopore sequencing module;
  • the nanopore sequencing module includes a polymerase, and sequences the polypeptide to be tested under the action of the power module, and the detection and analysis module detects and analyzes the current changes during the sequencing process.
  • a seventh aspect of the present invention provides a kit, the kit comprising a first nucleic acid fragment and a composite nucleic acid fragment, wherein, the composite nucleic acid fragment comprises a nucleic acid sequence that can form a hairpin structure, and after forming the hairpin structure, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be detected;
  • the first nucleic acid fragment can be linked to the polypeptide to be detected and the composite nucleic acid fragment in sequence.
  • the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be detected.
  • the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, and the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure. After the hairpin structure is formed, the third nucleic acid fragment is complementary to a part of the second nucleic acid fragment to form a double-stranded nucleic acid, and the other part of the second nucleic acid fragment is a single-stranded nucleic acid.
  • the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence
  • the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure, after the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other to form a double-stranded nucleic acid
  • the second nucleic acid fragment is a single-stranded nucleic acid, and one end of the second nucleic acid fragment is connected to the double-stranded nucleic acid, and the other end of the second nucleic acid fragment is connected to the polypeptide to be tested.
  • the kit further comprises a fourth nucleic acid fragment, the 5' end nucleic acid sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment to form a double strand, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring portion.
  • the stability of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the stability of the double-stranded nucleic acid formed by the complementarity of the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
  • the Tm value of the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is higher than the Tm value of the double-stranded nucleic acid formed by the complementarity of the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
  • the anchoring moiety is selected from any one or a combination of lipids, fatty acids, sterols, polymer materials, carbon nanotubes, polypeptides, proteins and/or amino acids.
  • the anchoring moiety is cholesterol, palmitate or tocopherol.
  • the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment is DNA or a nucleic acid analog.
  • the nucleic acid analog is LNA or PNA.
  • the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment are preferably single-stranded DNA.
  • the sequence of the first nucleic acid fragment is as shown in SEQ ID NO:2.
  • the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO:1.
  • the sequence of the third nucleic acid fragment is as shown in SEQ ID NO:4.
  • the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO:5.
  • the kit further comprises any one or a combination of a polymerase, a nanopore, an electrical insulating membrane and a reaction buffer.
  • An eighth aspect of the present invention provides a use of the nucleic acid-polypeptide complex as described in the first aspect, the nucleic acid-protein complex as described in the third aspect, or the sequencing complex as described in the fourth aspect in nanopore sequencing.
  • the present invention can construct a DNA-polypeptide-DNA ternary complex and use negatively charged DNA as a guide sequence to pull polypeptides of different charges into the nanopore, thereby realizing the detection of any polypeptide sequence.
  • FIG. 1 is a schematic diagram of protein sequencing technology based on Hel308 rate control.
  • FIG. 2 is a schematic diagram of peptide sequencing technology using Phi29 DNA polymerase to control the rate.
  • FIG. 3 is a schematic diagram of protein sequencing technology based on oligonucleotide-controlled protein and rate control.
  • FIG. 4 is a schematic diagram of protein sequencing technology based on MTA helicase rate control.
  • FIG. 5 is a schematic diagram of a protein sequencing process according to an embodiment of the present invention, wherein A is a schematic diagram of a sequencing complex; and B is a schematic diagram of a sequencing process.
  • FIG. 6 is a chemical reaction process for connecting DNA1 to a polypeptide according to Example 1: DNA1 containing a maleimide modification at the 5' end is connected to the N-terminal cysteine residue of the polypeptide via a thiol-maleimide addition reaction to obtain a DP complex, and is further connected to DNA2 containing a DBCO modification at the 3' end via a click chemistry reaction with the azide-modified lysine on the C-terminal side chain of the polypeptide to obtain a DPD complex.
  • FIG. 7 is a chromatogram showing the purification of (a) DNA1-Peptide1, (b) DNA1-Peptide2, and (c) DNA1-Peptide3 by high performance liquid chromatography (HPLC) according to Example 1.
  • FIG. 8 is a chromatogram showing the purification of (a) DNA1-Peptide1-DNA2, (b) DNA1-Peptide2-DNA2, and (c) DNA1-Peptide3-DNA2 by high performance liquid chromatography (HPLC) according to Example 2.
  • FIG. 9 is an HPLC purity analysis chromatogram and mass spectrometry characterization diagram of DNA1-Peptide-DNA2 shown in Example 2, wherein (a) and (b) correspond to DNA1-Peptide1-DNA2, (c) and (d) correspond to DNA1-Peptide2-DNA2, and (e) and (f) correspond to DNA1-Peptide3-DNA2.
  • FIG. 10 is a diagram showing the basic principle of detecting the DPD complex using a nanopore according to Example 3.
  • FIG. 11 shows the nanopore test results of DNA3 according to Example 3.
  • FIG. 12 shows the nanopore test results of DNA3 according to Example 3.
  • FIG. 13 shows the nanopore test results of DNA1-Peptide1-DNA2 according to Example 4.
  • FIG. 14 shows the nanopore test results of DNA1-Peptide1-DNA2 according to Example 4.
  • FIG. 15 shows the nanopore test results of DNA1-Peptide2-DNA2 according to Example 5.
  • FIG. 16 shows the nanopore test result of DNA1-Peptide2-DNA2 according to Example 5.
  • FIG. 17 shows the nanopore test results of DNA1-Peptide3-DNA2 according to Example 6.
  • FIG. 18 shows the nanopore test results of DNA1-Peptide3-DNA2 according to Example 6.
  • The term “nucleotide sequence” or “nucleic acid sequence” as used herein refers to a polymeric form of nucleotides of any length, whether ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Therefore, this term includes double-stranded and single-stranded DNA, as well as RNA.
  • The term “nucleic acid” as used herein refers to a single-stranded nucleotide sequence in which the 3' and 5' ends of adjacent nucleotides are connected by phosphodiester bonds. Polynucleotides can be composed of deoxyribonucleotide bases or ribonucleotide bases.
  • Nucleic acids can be synthesized and manufactured in vitro, or isolated from natural sources. Nucleic acids can further include modified DNA or RNA, such as DNA or RNA that has been methylated, or RNA that has been subjected to post-transcriptional modification, such as 5' capping with 7-methylguanosine, 3' processing such as cleavage and polyadenylation, and splicing.
  • Nucleic acids can also include synthetic nucleic acids (XNA), such as hexitol nucleic acids (HNA), cyclohexene nucleic acids (CeNA), threose nucleic acids (TNA), glycerol nucleic acids (GNA), locked nucleic acids (LNA) and peptide nucleic acids (PNA).
  • the size of a nucleic acid is usually expressed as the number of nucleotides (nt) in the case of a single-stranded polynucleotide.
  • The terms “polypeptide” and “peptide” are used interchangeably herein to refer to polymers of amino acid residues, as well as variants and synthetic analogs thereof. Thus, these terms apply to amino acid polymers in which one or more amino acid residues are synthetic non-naturally occurring amino acids, such as chemical analogs of corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • Polypeptides may also undergo maturation or post-translational modification processes, which may include, but are not limited to, glycosylation, proteolytic cleavage, lipidation, signal peptide cleavage, propeptide cleavage, phosphorylation, and the like.
  • Peptides may be prepared using recombinant techniques, such as by expressing recombinant or synthetic polynucleotides.
  • The term “protein” is used to describe a folded polypeptide having a secondary or tertiary structure.
  • a protein may consist of a single polypeptide or may include multiple polypeptides assembled to form a multimer.
  • a multimer may be a homo-oligomer or a hetero-oligomer.
  • a protein may be a naturally occurring or wild-type protein or a modified or non-naturally occurring protein.
  • a protein may differ from a wild-type protein, for example, by the addition, substitution or deletion of one or more amino acids.
  • The term “anchoring” refers to fixing a substance in a relatively stable manner within a structure, or within a certain range on its surface, by relying on polarity, intermolecular forces, chemical bonds, etc.
  • the present invention constructs a DNA-peptide-DNA ternary complex structure through thiol-maleimide addition reaction and azide-DBCO click chemistry reaction, and purifies it by HPLC to obtain a pure product, which is used as a substrate for polypeptide nanopore sequencing.
  • DNA1 is a DNA fragment amplified by Phi29 DNA polymerase
  • DNA2 is a DNA fragment that guides the polypeptide into the pore.
  • the synthesized DNA-peptide-DNA complex is combined with the hairpin DNA fragment, the protective DNA fragment and the Phi29 DNA polymerase through annealing and co-incubation to form a sequencing complex.
  • The complex is mixed with the dNTP solution and added to the solution chamber of the sequencing chip. Because DNA5 carries a sterol structure at its end, the entire annealed library is anchored on the membrane, which brings it to the nanopore. Under the action of the electric field force, the chain containing DNA-peptide-DNA moves toward the trans end, thereby tearing apart the double-stranded interaction between DNA5 and DNA1 and exposing the 3' end of DNA4, from which synthesis of the DNA double strand starts. As the DNA double strand is continuously synthesized, the polypeptide is stably pulled through the nanopore and an electrical signal is obtained (the schematic diagram of the process is shown in Figure 5).
  • DNA1 (SEQ ID NO: 1):
  • /Maleimide/NAGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC wherein /Maleimide/ can be connected to the C-terminus of the polypeptide.
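As a small illustration of the notation above, the following Python sketch separates the `/Maleimide/` modification tag from the DNA1 sequence and extracts positions 2 to 55 (1-based, inclusive), which the text identifies as the second nucleic acid fragment. The function names are illustrative assumptions; the sequence string is copied from the line above, with `N` standing for the abasic spacer.

```python
import re
from typing import Optional, Tuple

# DNA1 as written in the text: a 5' modification tag followed by the sequence.
DNA1_ANNOTATED = "/Maleimide/NAGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC"


def split_modification(annotated: str) -> Tuple[Optional[str], str]:
    """Split '/Tag/SEQUENCE' into (tag, sequence); tag is None if absent."""
    match = re.fullmatch(r"(?:/([^/]+)/)?([A-Za-z]+)", annotated)
    if match is None:
        raise ValueError(f"unrecognized annotated sequence: {annotated!r}")
    return match.group(1), match.group(2).upper()


def positions(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq, 1-based and inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions out of range")
    return seq[start - 1:end]


if __name__ == "__main__":
    tag, dna1 = split_modification(DNA1_ANNOTATED)
    second_fragment = positions(dna1, 2, 55)
    print(tag, len(dna1), len(second_fragment))
```

Taking positions 2 to 55 drops only the leading `N` spacer, so the extracted fragment is the unmodified base sequence of DNA1.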
  • DNA2 (SEQ ID NO:2):
  • DNA3 (SEQ ID NO:3):
  • DNA4 (SEQ ID NO:4):
  • N (abasic spacer) is a spacer of abasic deoxynucleoside, and its structure is shown below:
  • LYS(N3) is an azidation modification of the Lys side chain (the azide group replaces the original amino group), and its structural formula is as follows:
  • DNA3 is the control sequence (Control) used to verify the present invention, which simulates the combined sequence of DNA2+DNA1. Specifically, 25-mer polyT is used to simulate DNA2, and then the complete DNA1 sequence except the maleimide modification group is added to its 3' end.
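The DNA3 sequence itself is not reproduced in this text. The Python sketch below only rebuilds a sequence that follows the description above (a 25-mer polyT standing in for DNA2, followed at its 3' end by the DNA1 sequence without the maleimide group); whether the result matches SEQ ID NO:3 exactly is an assumption, and the function name is hypothetical.

```python
# DNA1 sequence from the text, written 5'->3' without the maleimide tag.
# 'N' denotes the abasic spacer.
DNA1 = "NAGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC"


def build_control_sequence(dna1: str, polyt_length: int = 25) -> str:
    """Return polyT (standing in for DNA2) followed by DNA1, written 5'->3'."""
    if polyt_length <= 0:
        raise ValueError("polyt_length must be positive")
    return "T" * polyt_length + dna1


if __name__ == "__main__":
    control = build_control_sequence(DNA1)
    print(len(control))
    print(control)
```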
  • DNA4 is used to anneal with DNA3 or DNA1 in DNA1-Peptide-DNA2 to form a hairpin-shaped double-stranded structure.
  • DNA5 is used to anneal with DNA3 or DNA1 in DNA1-Peptide-DNA2 to form a protective fragment of a 15bp double-stranded structure, which prevents the polymerase from starting to synthesize the DNA double strand prematurely.
  • DNA1 is the amplified DNA sequence shown in FIG5, and its sequence is SEQ ID NO: 1.
  • Peptide1, Peptide2, and Peptide 3 are the peptides to be tested shown in FIG5, and their sequences are SEQ ID NO: 6-8, respectively.
  • the preparation method of each DP-linked product is as follows:
  • 50 mM EDTA
  • DNA1-Peptide (DP) ligation product fraction was purified and collected using a 1260 Infinity II high performance liquid chromatograph (Agilent), and the fraction was lyophilized overnight for subsequent ligation reactions.
  • Figure 6 shows the chemical reaction process of connecting DNA1 to peptides.
  • Figure 7 shows the HPLC purification chromatograms of (a) DNA1-Peptide1, (b) DNA1-Peptide2, and (c) DNA1-Peptide3, where the highest peak between 10 and 17 minutes corresponds to the DP complex fractions taken.
  • In DNA1-Peptide1-DNA2, DNA2 is the guide DNA sequence shown in Figure 5, and its sequence is SEQ ID NO: 2.
  • the preparation method of each is as follows:
  • DNA2 powder was fully dissolved in pure water, and its concentration was quantitatively detected by Qubit ssDNA detection kit (ThermoFisher).
  • the DP connection product and DNA2 were in a molar ratio of 1:7.5.
  • the final reaction system was adjusted to 100 μL with pure water, vortexed to mix, and incubated in a metal bath at 25°C for 16 hours.
  • the 1260 Infinity II high performance liquid chromatograph (Agilent) was used for purification and the DNA1-Peptide-DNA2 (DPD) connection product fraction was collected. The concentration of the fractions was quantitatively detected using the Qubit ssDNA detection kit (ThermoFisher). A small amount of the purified product was retained in an EP tube for HPLC purity identification and mass spectrometry characterization. The remaining sample was packaged, freeze-dried overnight, and stored at -80°C.
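The 1:7.5 molar ratio of DP product to DNA2 used in the ligation above translates into a simple amount and volume calculation. The sketch below is illustrative only: the function names and the example numbers (100 pmol of DP product, a 100 µM DNA2 stock) are assumptions, not values from the source.

```python
def dna2_amount_pmol(dp_pmol: float, ratio_dna2: float = 7.5) -> float:
    """Amount of DNA2 (pmol) for a given amount of DP product at DP:DNA2 = 1:ratio_dna2."""
    return dp_pmol * ratio_dna2


def volume_ul(amount_pmol: float, stock_um: float) -> float:
    """Volume (uL) of a stock at `stock_um` uM that contains `amount_pmol` pmol.

    1 uM equals 1 pmol/uL, so volume = amount / concentration.
    """
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    return amount_pmol / stock_um


# Hypothetical example: 100 pmol of DP product and a 100 uM DNA2 stock.
needed = dna2_amount_pmol(100.0)
print(needed, volume_ul(needed, 100.0))
```

Whatever volume results, the reaction is then brought to the stated final volume with pure water.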
  • Figure 8 shows the HPLC purification chromatograms of (a) DNA1-Peptide1-DNA2, (b) DNA1-Peptide2-DNA2, and (c) DNA1-Peptide3-DNA2, in which the two highest peaks between 14 and 18 minutes correspond to the DPD complex fractions taken.
  • Figure 9 shows (a) HPLC purity analysis chromatogram and (b) mass spectrum of DNA1-Peptide1-DNA2; (c) HPLC purity analysis chromatogram and (d) mass spectrum of DNA1-Peptide2-DNA2; (e) HPLC purity analysis chromatogram and (f) mass spectrum of DNA1-Peptide3-DNA2.
  • HPLC purity analysis shows that the purity of the three DPD complexes is above 85%, and the highest can exceed 95%.
  • the experimental molecular weight shown in the mass spectrum also matches the theoretical molecular weight of the three DPD complexes, respectively, proving that three DPD conjugates were obtained in this embodiment.
  • DNA3 is SEQ ID NO: 3;
  • DNA4 is the hairpin fragment shown in Figure 5, and its sequence is SEQ ID NO: 4;
  • DNA5 is the protection fragment shown in Figure 5, and its sequence is SEQ ID NO: 5.
  • DNA3, DNA4 and DNA5 powders were fully dissolved in pure water, and their concentrations were quantitatively detected by Qubit ssDNA detection kit (ThermoFisher).
  • DNA3, DNA4 and DNA5 aqueous solutions were added to EP tubes at a molar ratio of 1:1:2, and annealed at room temperature for 20 minutes.
  • the concentration of the annealed complex was quantitatively detected using Qubit dsDNA HS detection kit (ThermoFisher), and then an appropriate amount of annealed complex was taken and incubated with 1 μL Phi29 DNA polymerase (NEB) in a total volume of 8 μL 1× PBS for 30 minutes, wherein the final concentration of the annealed complex in the incubation mixture was 75nM.
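The "appropriate amount" of annealed complex follows from the dilution relation C1·V1 = C2·V2. The sketch below assumes the stock concentration is already expressed in nM (the Qubit kit reports a mass concentration, and the conversion to molarity is not covered here) and that the rest of the 8 μL is made up with buffer; the function names and the 300 nM example stock are hypothetical.

```python
def complex_volume_ul(stock_nm: float, final_nm: float = 75.0, total_ul: float = 8.0) -> float:
    """Volume of annealed-complex stock needed so that the mixture is `final_nm`."""
    if stock_nm < final_nm:
        raise ValueError("stock is more dilute than the target concentration")
    return final_nm * total_ul / stock_nm


def buffer_volume_ul(complex_ul: float, polymerase_ul: float = 1.0, total_ul: float = 8.0) -> float:
    """Volume left for buffer after the complex and the polymerase are added."""
    remaining = total_ul - polymerase_ul - complex_ul
    if remaining < 0:
        raise ValueError("components exceed the total volume")
    return remaining


# Hypothetical stock at 300 nM.
v = complex_volume_ul(300.0)
print(v, buffer_volume_ul(v))
```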
  • a patch clamp amplifier was used to collect current signals.
  • the electrolytic cell was divided into two chambers by a Teflon membrane with micron-sized pores (diameter 50-200 μm) in the middle: a cis chamber and a trans chamber; a pair of Ag/AgCl electrodes were placed in each chamber; after a layer of bimolecular phospholipid membrane was formed at the micropores of the two chambers, the nanopore protein MspA was added; after a single nanopore protein MspA was inserted into the phospholipid membrane, the above-mentioned co-incubation mixture and 30 μL 10 mM dNTP (NEB) were added, 180mV was applied, and the current data was recorded.
  • Figures 11 and 12 show the signal data of two parallel DNA3 nanopore experiments, where (a) represents the overall through-pore signal from the opening current to the 25-mer polyT signal at the 5' end of DNA3. Under a fixed voltage of 180mV, the nanopore current first drops from the opening current of about 150pA to about 50pA, then rises back to about 70-80pA and oscillates for a period of time. This signal corresponds to the pore entry of the annealing complex, at which point the protective fragment DNA5 has not yet detached from the complex. When DNA5 is torn away under the electric field force, the electrical signal immediately shows three signal valleys of 40-50pA, as shown in (b), followed by a signal peak of about 100pA.
  • This characteristic signal interval corresponds to the abasic spacer in the DNA3 sequence and a relatively continuous repeating sequence at its 3' end.
  • the nanopore current drops to a stable platform of about 60pA, corresponding to the 25-mer polyT at the 5' end of DNA3.
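The characteristic interval described above (three valleys of 40-50pA followed by a peak of about 100pA) could be located in a step-segmented trace with a simple rule-based search. The sketch below is an assumption-laden illustration, not the analysis used in the source: it works on a list of mean current levels per step, and the thresholds, the `window` parameter and the example trace are all hypothetical.

```python
def find_signature(levels, valley=(40.0, 50.0), peak_min=90.0, n_valleys=3, window=6):
    """Return the index of the peak that closes the valley pattern, or None.

    `levels` holds mean current levels in pA, one per detected step.
    A valley is a level inside `valley` that is lower than both neighbours.
    The peak must be preceded by at least `n_valleys` valleys that lie no
    more than `window` steps before it, so an isolated earlier drop (for
    example the pore-entry event) does not count towards the pattern.
    """
    valley_idx = []
    for i in range(1, len(levels)):
        prev_higher = levels[i - 1] > levels[i]
        next_higher = i + 1 < len(levels) and levels[i + 1] > levels[i]
        if prev_higher and next_higher and valley[0] <= levels[i] <= valley[1]:
            valley_idx.append(i)
        elif levels[i] >= peak_min:
            recent = [v for v in valley_idx if i - v <= window]
            if len(recent) >= n_valleys:
                return i
    return None


# Hypothetical trace: opening current, pore entry, oscillation, three valleys,
# peak, then a plateau near 60 pA.
trace = [150, 50, 75, 72, 78, 45, 70, 42, 68, 48, 100, 60, 60, 60]
print(find_signature(trace))
```

Everything after the returned index would then be compared between the control and the peptide-containing constructs.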
  • DNA4 is the hairpin fragment shown in Figure 5, and its sequence is SEQ ID NO: 4;
  • DNA5 is the protection fragment shown in Figure 5, and its sequence is SEQ ID NO: 5.
  • the concentration of the annealing complex was quantitatively detected using the Qubit dsDNA HS detection kit (ThermoFisher), and then an appropriate amount of the annealing complex was taken and incubated with 1 μL of Phi29 DNA polymerase (NEB) in a total volume of 8 μL of 1× PBS for 30 minutes, wherein the final concentration of the annealing complex in the incubation mixture was 75 nM.
  • a patch clamp amplifier was used to collect current signals.
  • the electrolytic cell was divided into two chambers by a Teflon membrane with micron-sized pores (diameter 50-200 μm) in the middle: a cis chamber and a trans chamber; a pair of Ag/AgCl electrodes were placed in each chamber; after a layer of bimolecular phospholipid membrane was formed at the micropores of the two chambers, nanopore protein was added; after a single nanopore protein was inserted into the phospholipid membrane, the above-mentioned co-incubation mixture and 30 μL 10 mM dNTP (NEB) were added, 180mV was applied, and the current data was recorded.
  • Figures 13 and 14 show two parallel DNA1-Peptide1-DNA2 nanopore experimental signal data, respectively, where (a) represents the nanopore current data from the opening current to the current signal fluctuating steadily in the 80-100pA interval. It can be seen that, similar to Figures 11 and 12, the pore entry signal of the annealing complex and the characteristic signal intervals of the three current signal valleys are successfully identified.
  • Because the peptide Peptide1 in DNA1-Peptide1-DNA2 is sequenced immediately after DNA1, whereas in Example 3 the 25-mer polyT of DNA3 is sequenced at that point, the current fluctuation trends after the characteristic signal interval are clearly different: DNA1-Peptide1-DNA2 generates a current of "40pA → 70pA → 80pA → 70pA → 80pA", as shown in Figure 13(b) and Figure 14(b), whereas the polyT of DNA3 generates a relatively stable current of about 60pA.
  • After passing through the nanopore, DNA1-Peptide1-DNA2 produces a characteristic signal interval with three signal valleys similar to that shown in Example 3, and the fluctuating signal produced by Peptide1 differs from the signal produced when the 25-mer polyT passes through the pore. Together, these observations show that negatively charged polypeptides can be captured by nanopores and detected under the speed control of Phi29 DNA polymerase.
  • DNA1-Peptide1-DNA2 was replaced by DNA1-Peptide2-DNA2.
  • Figures 15 and 16 show two parallel DNA1-Peptide2-DNA2 nanopore experimental signal data, where (a) represents the nanopore current data from the opening current to the current signal fluctuating steadily in the 80-100pA interval. It can be seen that, similar to Figures 11 and 12, the pore entry signal of the annealing complex and the characteristic signal intervals of the three current signal valleys are successfully identified. Similar to Figures 13 and 14, since the sequence of DNA1 in DNA1-Peptide2-DNA2 is consistent with the 3' end sequence of DNA3, a characteristic signal interval of "three consecutive signal valleys" will be generated when the corresponding DNA3 in Example 3 passes through the nanopore, and the third signal valley of the characteristic interval can be roughly estimated as the starting end of the polypeptide signal.
  • Because the peptide Peptide2 in DNA1-Peptide2-DNA2 is sequenced immediately after DNA1, whereas in Example 3 the 25-mer polyT of DNA3 is sequenced at that point, the current fluctuation trends after the characteristic signal interval are clearly different: DNA1-Peptide2-DNA2 generates a current of "40pA → 60pA → 40pA → 70pA → 85pA", as shown in Figures 15(b) and 16(b), whereas the polyT of DNA3 generates a relatively stable current of about 60pA.
  • After passing through the nanopore, DNA1-Peptide2-DNA2 produces a characteristic interval with three signal valleys similar to that shown in Example 3, and the fluctuating signal produced by Peptide2 differs from the signal produced when the 25-mer polyT passes through the pore. Together, these observations show that electrically neutral peptides can be captured by nanopores and detected under the speed control of Phi29 DNA polymerase.
  • DNA1-Peptide1-DNA2 was replaced by DNA1-Peptide3-DNA2.
  • Figures 17 and 18 show two parallel DNA1-Peptide3-DNA2 nanopore experimental signal data, respectively, where (a) represents the nanopore current data from the opening current to the current signal fluctuating steadily in the 30pA interval. It can be seen that, similar to Figures 11 and 12, the pore entry signal of the annealing complex and the characteristic signal intervals of the three current signal valleys were successfully identified. Similar to Figures 13-16, since the sequence of DNA1 in DNA1-Peptide3-DNA2 is consistent with the 3' end sequence of DNA3 in Example 3, a characteristic signal interval of "three consecutive signal valleys" will be generated when the corresponding DNA3 in Example 3 passes through the nanopore, and the third signal valley of the characteristic signal interval can be roughly estimated as the starting end of the polypeptide signal.
  • Because the peptide Peptide3 in DNA1-Peptide3-DNA2 is sequenced immediately after DNA1, whereas in Example 3 the 25-mer polyT of DNA3 is sequenced at that point, the current fluctuation trends after the characteristic signal interval are clearly different: DNA1-Peptide3-DNA2 generates a current of "50pA → 85pA → 95pA → 120pA → 30pA", as shown in Figure 17(b) and Figure 18(b), whereas the polyT of DNA3 generates a relatively stable current of about 60pA. The current generated by DNA1-Peptide3-DNA2 after passing through the nanopore is similar to that in Example 3.
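The contrast drawn in these examples, a fluctuating peptide signal versus a stable polyT plateau, can be expressed numerically as the spread of the current levels that follow the characteristic interval. The sketch below uses the approximate level sequences quoted in the text; representing the polyT plateau as five identical 60pA levels and the 10pA tolerance are assumptions made for illustration.

```python
# Approximate level sequences (pA) quoted after the characteristic interval.
PATTERNS = {
    "Peptide1": [40, 70, 80, 70, 80],
    "Peptide2": [40, 60, 40, 70, 85],
    "Peptide3": [50, 85, 95, 120, 30],
    "polyT (DNA3)": [60, 60, 60, 60, 60],  # assumed stand-in for "about 60pA"
}


def spread(levels):
    """Difference between the highest and lowest level, in pA."""
    return max(levels) - min(levels)


def looks_like_plateau(levels, tolerance=10.0):
    """True when all levels stay within `tolerance` pA of each other."""
    return spread(levels) <= tolerance


for name, levels in PATTERNS.items():
    print(name, spread(levels), looks_like_plateau(levels))
```

A spread-based rule like this only separates "plateau" from "fluctuating"; it says nothing about which amino acids produced the levels.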


Abstract

Provided are a nucleic acid polypeptide complex and the use thereof in peptide sequencing. The nucleic acid polypeptide complex comprises a first nucleic acid fragment, a polypeptide to be tested and a composite nucleic acid fragment which are connected in sequence, wherein the composite nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be tested. The detection of any polypeptide sequence can be achieved by means of guiding the polypeptide with different charges by the nucleic acid sequence to pass through the nanopore.

Description

Nucleic acid-polypeptide complex and its application in peptide sequencing

Technical Field

The present invention belongs to the technical field of sequencing, and in particular relates to a nucleic acid-polypeptide complex and its application in peptide sequencing.

Background Art

Proteomics has been one of the core areas of life science research since the completion of human genome sequencing. Fully resolving the composition of various proteins enables a deeper understanding of protein-related cell regulatory mechanisms and, in turn, of the relationship between proteins and human diseases. The development of single-molecule protein sequencing technology, which aims to obtain protein sequence information accurately, is therefore particularly valuable.

Nanopore-based DNA sequencing has become increasingly mature. Its principle is that, under the action of an electric field force, a DNA strand bound to a helicase passes through the nanopore at a relatively constant speed. Because different bases block the current to different extents as they pass through the nanopore, sequencing can be achieved by analyzing the correspondence between the current changes and the bases during this process. Similarly, a polypeptide sequence can be read by measuring the changes in the current signal as the polypeptide chain passes through the nanopore and analyzing their correspondence with different amino acids.

In 2004, Sutherland, Todd C. et al. used nanopore detection technology to explore the detection of peptide structure (Sutherland, Todd C., et al. "Structure of peptides investigated by nanopore analysis." Nano letters 4.7 (2004): 1273-1277.). In 2018, Ji, Zhouxiang et al. constructed a nanopore sensor based on a bacteriophage motor protein channel that can detect differences of a single amino acid (Ji, Zhouxiang, et al. "Nano-channel of viral DNA packaging motor as single pore to differentiate peptides with single amino acid difference." Biomaterials 182 (2018): 227-233.).

Modifying the pore protein or optimizing the biochemical experimental conditions can reduce the translocation speed of some polypeptides and thereby improve the accuracy of polypeptide sequencing. However, controlling the polypeptide so that it passes through the nanopore channel uniformly is crucial for both the precision and the accuracy of polypeptide sequencing.

In 2021, the laboratory of Cees Dekker published an article on protein sequencing with Hel308-based speed control (Brinkerhoff, Henry, et al. "Multiple rereads of single proteins at single-amino acid resolution using nanopores." Science 374.6574 (2021): 1509-1513.). As shown in Figure 1, the polypeptide and the nucleic acid are first joined by chemical coupling to form a polypeptide-nucleic acid complex, which is incubated with Hel308 to form a peptide-DNA-Hel308 complex. After the complex is captured by the nanopore under the action of the electric field force, the complementary DNA strand is separated; the DNA then moves from below the membrane to above it under the action of Hel308; when the polypeptide is located in the sensing region of the pore, its sequence is read. The key point of this method is to use the translocation activity of Hel308 to control the speed.

In 2021, Huang Shuo's team at Nanjing University reported polypeptide sequencing with the speed controlled by Phi29 DNA polymerase (Yan, Shuanghong, et al. "Single molecule ratcheting motion of peptides in a Mycobacterium smegmatis Porin A (MspA) nanopore." Nano letters 21.15 (2021): 6703-6710.). As shown in Figure 2, Phi29 DNA polymerase is incubated with the synthesized ssDNA-peptide complex to form a complex, and DNA amplification by the polymerase is used to control the speed of polypeptide movement indirectly.

In 2021, Oxford Nanopore Technologies Limited (WO2021/111125A1) disclosed a protein sequencing scheme in which the speed is controlled by an oligonucleotide control protein. In this design, an adaptor-peptide-dsDNA tail complex is first synthesized, part of the single-stranded DNA is then annealed to form double-stranded DNA, and, after binding to the oligonucleotide control protein, the electrical signal is read through the nanopore (as shown in Figure 3). This method can solve the problem that neutral and positively charged polypeptides cannot pass through the pore.

Chen, Zhijie et al. (Chen, Zhijie, et al. "Controlled movement of ssDNA conjugated peptide through Mycobacterium smegmatis porin A (MspA) nanopore by a helicase motor for peptide sequencing application." Chemical science 12.47 (2021): 15750-15756.) disclosed a similar protein sequencing scheme with the speed controlled by MTA helicase. In this design, an ssDNA-peptide-ssDNA (polyT) complex is first synthesized and bound to the MTA helicase without annealing to form a double strand; the regular sliding of the MTA helicase along the ssDNA strand drives the polypeptide fragment through the MspA-M2 nanopore protein, and the sequencing electrical signal is measured (as shown in Figure 4). This method can solve the problem that neutral polypeptides cannot pass through the pore.

In the protein sequencing method published by Cees Dekker, the polypeptide must first be captured by the nanopore, so the method uses polypeptides composed entirely of amino acids with negatively charged side chains (aspartic acid and glutamic acid). This scheme therefore cannot detect neutral or positively charged polypeptides. The method reported by Huang Shuo's team has a similar problem.

The technical solutions disclosed in patent document WO2021/111125A1 of Oxford Nanopore Technologies Limited and in the article by Chen, Zhijie can solve the translocation problem for neutral polypeptides. However, most natural wild-type helicases differ considerably in activity between unwinding and translocation, so the speeds of unwinding and translocation are difficult to keep constant. To obtain a high-resolution polypeptide sequencing signal, the helicase should ideally move forward (translocate) by one base for every base it unwinds. Obtaining a helicase that truly works well for polypeptide sequencing therefore requires screening a large number of mutants by protein engineering, which is difficult.

Summary of the invention

The technical problem to be solved by the present invention is that the prior art lacks a method capable of translocation sequencing of arbitrary polypeptides. To address this, the present invention provides a nucleic acid-polypeptide complex and its application in peptide sequencing. The nucleic acid-polypeptide complex of the present invention solves the problem that polymerase-based protein sequencing routes only allow negatively charged polypeptides to pass through the pore, and enables translocation sequencing of polypeptides of any charge (positive, negative, or neutral).

The present invention solves the above technical problem through the following technical solutions.

The first aspect of the present invention provides a nucleic acid-polypeptide complex, wherein the nucleic acid-polypeptide complex comprises a first nucleic acid fragment, a polypeptide to be tested, and a composite nucleic acid fragment connected in sequence;

wherein the composite nucleic acid fragment comprises a nucleic acid sequence that can form a hairpin structure, and after the hairpin structure is formed, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be tested;

In some preferred embodiments of the present invention, the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be tested.

In some embodiments of the present invention, the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, and the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure. After the hairpin structure is formed, the third nucleic acid fragment is complementary to a part of the second nucleic acid fragment and forms a double-stranded nucleic acid with it, and the other part of the second nucleic acid fragment is a single-stranded nucleic acid.

In the above scheme, the junction between the end of the second nucleic acid fragment and the end of the third nucleic acid fragment in the composite nucleic acid fragment can be covalently linked by chemical ligation, enzymatic ligation, or the like, or can be left without a covalent ligation reaction. In the latter case, the partial double strand between the third nucleic acid fragment and the second nucleic acid fragment acts like a "staple", so that the two form a stable composite nucleic acid fragment.

In other embodiments of the present invention, the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, and the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure. After the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other and form a double-stranded nucleic acid, the second nucleic acid fragment is a single-stranded nucleic acid, one end of the second nucleic acid fragment is connected to the double-stranded nucleic acid, and the other end of the second nucleic acid fragment is connected to the polypeptide to be tested.

In the above scheme, the junction between the end of the second nucleic acid fragment and the end of the third nucleic acid fragment in the composite nucleic acid fragment needs to be linked through a covalent bond formed by chemical, biological, enzymatic, or other ligation methods, so as to form a stable composite nucleic acid fragment.

In some embodiments of the present invention, the nucleic acid-polypeptide complex further includes a fourth nucleic acid fragment, the 5'-end nucleic acid sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment and forms a double strand with it, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring moiety.

In some preferred embodiments of the present invention, after the hairpin structure is formed, the stability of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the stability of the double-stranded nucleic acid formed between the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.

In other preferred embodiments of the present invention, the Tm value of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the Tm value of the double-stranded nucleic acid formed between the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
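The Tm condition above can be checked roughly at design time. The sketch below uses the Wallace rule, Tm ≈ 2·(A+T) + 4·(G+C) in °C, which is only a coarse estimate for short oligonucleotides; it is not stated in the source as the method used, and both example sequences are hypothetical placeholders rather than the actual hairpin stem or protective-fragment duplex.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may only contain A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical sequences, for illustration only.
hairpin_stem = "GCGGCCGCATGGCCGCGG"    # stands in for the hairpin duplex
protective_duplex = "ATTCTAAAGTTCTAA"  # stands in for the 15bp protective duplex

hairpin_is_more_stable = wallace_tm(hairpin_stem) > wallace_tm(protective_duplex)
print(wallace_tm(hairpin_stem), wallace_tm(protective_duplex), hairpin_is_more_stable)
```

For real designs, a nearest-neighbor thermodynamic model with the actual salt and strand concentrations would be a more reliable basis for the comparison.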

In some embodiments of the present invention, the length of the first nucleic acid fragment is 5 to 200 nt, preferably 10 to 50 nt; the length of the composite nucleic acid fragment is 80 to 2000 nt, preferably 50 to 200 nt.

In some embodiments of the present invention, the length of the second nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt.

In some embodiments of the present invention, the length of the third nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt.

In some embodiments of the present invention, the length of the fourth nucleic acid fragment is 20 to 500 nt, preferably 25 to 100 nt.
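The stated length ranges lend themselves to a simple design check. The sketch below encodes only the broad ranges given above; the dictionary keys, the function name and the example lengths are hypothetical, except that 54 nt corresponds to positions 2 to 55 of a 55-position sequence.

```python
# Broad length ranges (nt) as stated in the embodiments above.
RANGES_NT = {
    "first": (5, 200),
    "composite": (80, 2000),
    "second": (40, 1000),
    "third": (40, 1000),
    "fourth": (20, 500),
}


def check_lengths(lengths: dict) -> dict:
    """Map each fragment name to True if its length lies within the stated range."""
    result = {}
    for name, length in lengths.items():
        lo, hi = RANGES_NT[name]
        result[name] = lo <= length <= hi
    return result


# Hypothetical design; only the 54 nt value is derived from the text.
design = {"first": 25, "second": 54, "third": 60, "fourth": 30}
print(check_lengths(design))
```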

本发明一些实施方案中,所述第一核酸片段、复合核酸片段、第二核酸片段、第三核酸和/或第四核酸片段为DNA或核酸类似物。In some embodiments of the present invention, the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment is DNA or a nucleic acid analog.

本发明一些优选的实施方案中,所述核酸类似物为LNA或PNA。In some preferred embodiments of the present invention, the nucleic acid analog is LNA or PNA.

本发明另一些优选的实施方案中,所述第一核酸片段、第二核酸片段、第三核酸和/或第四核酸片段为单链DNA。In other preferred embodiments of the present invention, the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment are single-stranded DNA.

本发明一些实施方案中,所述锚定部分选自脂质、脂肪酸、甾醇、碳纳米管、多肽、蛋白质和/或氨基酸的任意其一或组合;优选地为胆固醇、棕榈酸酯或生育酚。In some embodiments of the present invention, the anchoring moiety is selected from any one or a combination of lipids, fatty acids, sterols, carbon nanotubes, polypeptides, proteins and/or amino acids; preferably cholesterol, palmitate or tocopherol.

本发明一些具体实施方案中,所述第一核酸片段的序列如SEQ ID NO:2所示。In some specific embodiments of the present invention, the sequence of the first nucleic acid fragment is shown as SEQ ID NO:2.

本发明一些具体实施方案中,所述第二核酸片段的序列如SEQ ID NO:1的第2位~第55位所示。In some specific embodiments of the present invention, the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO:1.

本发明一些具体实施方案中,所述第三核酸片段的序列如SEQ ID NO:4所示。In some specific embodiments of the present invention, the sequence of the third nucleic acid fragment is as shown in SEQ ID NO:4.

本发明一些具体实施方案中,所述第四核酸片段的序列如SEQ ID NO:5所示。In some specific embodiments of the present invention, the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO:5.

本发明的第二方面提供一种制备如第一方面所述的核酸多肽复合物的方法,所述方法包括使5’端经马来酰胺修饰的所述复合核酸片段或第二核酸片段与待测多肽通过巯基-马来酰胺加成反应获得所述核酸多肽复合物的步骤;The second aspect of the present invention provides a method for preparing the nucleic acid-polypeptide complex as described in the first aspect, the method comprising the steps of subjecting the composite nucleic acid fragment or the second nucleic acid fragment modified with maleamide at the 5' end to a test polypeptide through a thiol-maleamide addition reaction to obtain the nucleic acid-polypeptide complex;

和/或,所述方法包括使3’端经DBCO修饰的第一核酸片段与待测多肽通过叠氮-DBCO点击化学反应获得所述核酸多肽复合物的步骤。And/or, the method includes the step of obtaining the nucleic acid-polypeptide complex by reacting the first nucleic acid fragment whose 3' end is modified with DBCO with the polypeptide to be tested through an azide-DBCO click chemistry reaction.

A third aspect of the present invention provides a nucleic acid-protein complex comprising a polymerase and the nucleic acid-polypeptide complex according to the first aspect;

wherein the polymerase binds to the single-stranded nucleic acid at the junction between the single-stranded nucleic acid and the double-stranded nucleic acid in the nucleic acid-polypeptide complex.

In some embodiments of the present invention, the polymerase is one capable of synthesizing double-stranded DNA using the single-stranded DNA to be amplified as a template.

In some preferred embodiments of the present invention, the polymerase is a DNA polymerase.

In the present invention, the polymerase is, for example, Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, Klenow Fragment DNA polymerase, or any combination thereof.

When the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment, the fourth nucleic acid fragment and/or the composite nucleic acid fragment is not DNA, the polymerase should be changed accordingly to a polymerase suited to that type of nucleic acid.

A fourth aspect of the present invention provides a sequencing complex comprising a nanopore and the nucleic acid-polypeptide complex according to the first aspect or the nucleic acid-protein complex according to the third aspect; the nucleic acid-polypeptide complex or the nucleic acid-protein complex is capable of moving axially relative to the nanopore;

wherein, in the nucleic acid-polypeptide complex or the nucleic acid-protein complex, the first nucleic acid fragment and the polypeptide to be tested pass through the nanopore and move axially relative to it.

In some embodiments of the present invention, the nanopore is embedded in an electrically insulating membrane.

In some embodiments of the present invention, the nanopore is a transmembrane protein pore or a solid-state pore.

In some preferred embodiments of the present invention, the transmembrane protein pore is selected from hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, Phi29 connector protein, InvG, GspD, or any combination thereof.

In some embodiments of the present invention, the nanopore is a modified nanopore.

In some preferred embodiments of the present invention, the modified nanopore is a polypeptide-modified nanopore.

In some embodiments of the present invention, the polypeptide is a tag, an enzyme cleavage site, a signal peptide or leader peptide, a detectable label, or any combination thereof.

In some embodiments of the present invention, the electrically insulating membrane is an amphiphilic membrane, a polymer membrane, or any combination thereof.

In some preferred embodiments of the present invention, the electrically insulating membrane is a phospholipid bilayer, a diblock copolymer or a triblock copolymer.

A fifth aspect of the present invention provides a method for polypeptide sequencing, the method comprising the following steps:

(1) applying an electric field force to the sequencing complex according to the fourth aspect, so that the first nucleic acid fragment is captured by the nanopore and pulls the polypeptide to be tested at least partially through the nanopore from the first electric field end to the second electric field end;

(2) allowing the polymerase to synthesize double-stranded nucleic acid using, as a template, the single-stranded nucleic acid of the composite nucleic acid fragment or the second nucleic acid fragment that extends from the end of the double-stranded nucleic acid, thereby pulling the polypeptide to be tested back through the nanopore from the second electric field end toward the first electric field end, and recording the resulting current changes in the nanopore.

In the present invention, the first nucleic acid fragment supplies negative charge and, under the electric field force, provides the pulling force for passage through the nanopore.

In the present invention, the hairpin structure of the composite nucleic acid fragment gives the nucleic acid-polypeptide complex a double-stranded region and an exposed single-stranded region, which helps the polymerase correctly recognize the start site of double-strand synthesis, carry out amplification and thus enable sequencing.

In the present invention, the fourth nucleic acid fragment can serve as a switch for automated sequencing: when dNTPs have been added in advance, it prevents the polymerase from starting double-stranded DNA synthesis prematurely. While the fourth nucleic acid fragment is bound to the composite nucleic acid fragment or the second nucleic acid fragment, the polymerase cannot recognize the amplification start site and amplification cannot proceed; once the fourth nucleic acid fragment separates from the composite nucleic acid fragment or the second nucleic acid fragment, the polymerase recognizes the exposed amplification start site and amplification begins.

In addition, the anchoring moiety of the fourth nucleic acid fragment can be used to concentrate the complex around the pore.

In the present invention, the third nucleic acid fragment is not directly joined to the second nucleic acid fragment by a chemical bond or by an enzyme; it is bound to the second nucleic acid fragment only through the hairpin structure formed by annealing. The third nucleic acid fragment provides an annealing site, so that the second and third nucleic acid fragments are held together without chemical or enzymatic ligation.

In the present invention, the spacer, despite its small size, produces a relatively distinct high peak on the current trace, which helps to locate where the polypeptide signal begins. Examples of the spacer include an abasic deoxynucleoside, isp6, isp18 and polyT.
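The use of the spacer peak as a marker can be pictured with a minimal sketch. It is illustrative only: the function name `find_spacer_peak`, the median/MAD threshold and the synthetic trace are assumptions made for this sketch and are not part of the disclosed method.

```python
from statistics import median

def find_spacer_peak(current_pa, k=6.0):
    """Return the index of the first sample above median + k * MAD,
    or None if no sample exceeds that threshold.

    current_pa: list of current samples in pA.
    k: hypothetical threshold factor (an assumption of this sketch).
    """
    base = median(current_pa)
    mad = median(abs(x - base) for x in current_pa)
    # Guard against a perfectly flat trace, where the MAD is zero.
    threshold = base + k * max(mad, 1e-9)
    for i, x in enumerate(current_pa):
        if x > threshold:
            return i
    return None

# Synthetic trace with made-up values: a flat level, a short high peak
# standing in for the spacer, then a different level standing in for
# the polypeptide region.
trace = [50.0, 51.0, 49.5, 50.5, 50.0, 49.0, 50.2,
         95.0, 96.0, 50.8, 40.0, 41.0, 39.5]
start = find_spacer_peak(trace)
print("spacer peak starts at sample", start)
```

In a real recording the threshold and any smoothing would have to be tuned to the pore, buffer and sampling rate; the sketch only shows that a distinct peak gives a simple landmark from which the polypeptide signal can be counted.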

Because a polymerase amplifies relatively slowly, and its amplification and translocation rates fluctuate little, it has a considerable advantage over a helicase.

In some embodiments of the present invention, the first electric field end is the end of the nanopore on the negative side of the electric field, and the second electric field end is the end of the nanopore on the positive side of the electric field.

In some embodiments of the present invention, after the first nucleic acid fragment is captured by the nanopore, the fourth nucleic acid fragment is separated from the composite nucleic acid fragment or the second nucleic acid fragment under the action of the electric field force.

In some embodiments of the present invention, the electric field force is generated by applying a voltage, and the voltage is 50 mV or higher.

In some preferred embodiments of the present invention, the voltage is 100 mV to 300 mV.

In some embodiments of the present invention, the method further comprises step (3): analyzing the current changes to identify the amino acid sequence of the polypeptide to be tested.

A sixth aspect of the present invention provides a polypeptide sequencing device comprising a sample processing module, a nanopore sequencing module, a power module and a detection and analysis module;

wherein the sample processing module processes the polypeptide to be tested to form the nucleic acid-protein complex according to the third aspect and transfers it to the nanopore sequencing module; the nanopore sequencing module sequences the polypeptide to be tested under the action of the power module; and the detection and analysis module detects and analyzes the current changes during sequencing;

or, the sample processing module processes the polypeptide to be tested to form the nucleic acid-polypeptide complex according to the first aspect and transfers it to the nanopore sequencing module; the nanopore sequencing module contains a polymerase and sequences the polypeptide to be tested under the action of the power module; and the detection and analysis module detects and analyzes the current changes during sequencing.

A seventh aspect of the present invention provides a kit comprising a first nucleic acid fragment and a composite nucleic acid fragment, wherein the composite nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid being connected to the double-stranded nucleic acid and the other end being connected to the polypeptide to be tested;

after modification, the first nucleic acid fragment, the polypeptide to be tested and the composite nucleic acid fragment can be linked in that order.

In some preferred embodiments of the present invention, the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be tested.

In some embodiments of the present invention, the composite nucleic acid fragment consists of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence; the third nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the third nucleic acid fragment is complementary to one part of the second nucleic acid fragment and forms a double-stranded nucleic acid with it, while the other part of the second nucleic acid fragment remains single-stranded.

In other embodiments of the present invention, the composite nucleic acid fragment consists of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence; the third nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other and form a double-stranded nucleic acid; the second nucleic acid fragment is a single-stranded nucleic acid, one end of which is connected to the double-stranded nucleic acid and the other end of which is connected to the polypeptide to be tested.

In some embodiments of the present invention, the kit further comprises a fourth nucleic acid fragment; the 5'-end sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment and forms a duplex with it, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring moiety.

In some preferred embodiments of the present invention, after the hairpin structure is formed, the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is more stable than the double-stranded nucleic acid formed when the 5'-end sequence of the fourth nucleic acid fragment pairs with at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.

In some preferred embodiments of the present invention, the Tm of the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is higher than the Tm of the double-stranded nucleic acid formed when the 5'-end sequence of the fourth nucleic acid fragment pairs with at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
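As a rough illustration of the Tm condition, the sketch below compares duplex regions taken from the example sequences listed later in this description: the 14 nt stem arm of the hairpin in SEQ ID NO:4, the 3' region of SEQ ID NO:4 that pairs with DNA1, and the 15 nt 5' region of SEQ ID NO:5. It uses the Wallace rule (2 °C per A/T pair, 4 °C per G/C pair), which is a crude estimate chosen here for brevity and is not stated in the application; a nearest-neighbor model with salt correction would be needed for real design work. The function name `wallace_tm` and the slicing positions are assumptions of this sketch.

```python
def wallace_tm(seq):
    """Wallace-rule melting temperature estimate in degrees Celsius.

    Counts 2 degrees per A/T and 4 degrees per G/C in one strand of
    the duplex. Intended only as a coarse ranking of short duplexes.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# Sequences from the examples (SEQ ID NO:4 and SEQ ID NO:5),
# written 5' to 3' without the 3' cholesterol modification.
DNA4 = "GCGTACGCCTACGGTTTTCCGTAGGCGTACGCGGCTACGACCTGCATGAGAATGC"
DNA5 = "GATAGTGAGATCTGATTTCCCAAATTTAAA"

hairpin_stem = DNA4[:14]   # 5' arm of the self-complementary stem
template_arm = DNA4[32:]   # 3' region that pairs with DNA1
blocker_arm = DNA5[:15]    # 5' region that pairs with DNA1

for name, s in [("hairpin stem", hairpin_stem),
                ("DNA4 3' arm", template_arm),
                ("DNA5 5' arm", blocker_arm)]:
    print(f"{name}: {len(s)} nt, Wallace Tm estimate {wallace_tm(s)} C")
```

Under this simple rule both duplex regions of the hairpin fragment rank above the duplex formed by the fourth fragment, which is the ordering the preferred embodiment asks for.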

In some preferred embodiments of the present invention, the anchoring moiety is any one or a combination of a lipid, a fatty acid, a sterol, a polymer material, a carbon nanotube, a polypeptide, a protein and an amino acid.

In some specific embodiments of the present invention, the anchoring moiety is cholesterol, palmitate or tocopherol.

In some embodiments of the present invention, the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment is DNA or a nucleic acid analog.

In some preferred embodiments of the present invention, the nucleic acid analog is LNA or PNA.

In other preferred embodiments of the present invention, the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid fragment and/or the fourth nucleic acid fragment is preferably single-stranded DNA.

In some specific embodiments of the present invention, the sequence of the first nucleic acid fragment is as shown in SEQ ID NO:2.

In some specific embodiments of the present invention, the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO:1.

In some specific embodiments of the present invention, the sequence of the third nucleic acid fragment is as shown in SEQ ID NO:4.

In some specific embodiments of the present invention, the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO:5.

In some embodiments of the present invention, the kit further comprises any one or a combination of a polymerase, a nanopore, an electrically insulating membrane and a reaction buffer.

An eighth aspect of the present invention provides use of the nucleic acid-polypeptide complex according to the first aspect, the nucleic acid-protein complex according to the third aspect, or the sequencing complex according to the fourth aspect in nanopore sequencing.

Beneficial effects of the technical solution of the present invention:

By constructing a DNA-polypeptide-DNA ternary complex, the present invention uses negatively charged DNA as a leader sequence to pull polypeptides of different charge into the nanopore, thereby enabling detection of any polypeptide sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a protein sequencing technique with speed control based on Hel308.

FIG. 2 is a schematic diagram of a polypeptide sequencing technique with speed control by Phi29 DNA polymerase.

FIG. 3 is a schematic diagram of a protein sequencing technique in which the protein translocation speed is controlled by an oligonucleotide.

FIG. 4 is a schematic diagram of a protein sequencing technique with speed control based on MTA helicase.

FIG. 5 is a schematic diagram of the protein sequencing workflow of one embodiment of the present invention, in which A shows the sequencing complex and B shows the sequencing workflow.

FIG. 6 shows the chemical reactions of Example 1 for linking DNA1 to the polypeptide: DNA1 carrying a 5' maleimide modification is linked to the N-terminal cysteine residue of the polypeptide by a thiol-maleimide addition reaction to give the DP complex, which is then linked, through the azide-modified lysine side chain at the C-terminus of the polypeptide, to DNA2 carrying a 3' DBCO modification by a click chemistry reaction to give the DPD complex.

FIG. 7 shows the chromatograms of Example 1 for purification of (a) DNA1-Peptide1, (b) DNA1-Peptide2 and (c) DNA1-Peptide3 by high-performance liquid chromatography (HPLC).

FIG. 8 shows the chromatograms of Example 2 for purification of (a) DNA1-Peptide1-DNA2, (b) DNA1-Peptide2-DNA2 and (c) DNA1-Peptide3-DNA2 by HPLC.

FIG. 9 shows the HPLC purity analysis chromatograms and mass spectra of DNA1-Peptide-DNA2 from Example 2, in which (a) and (b) correspond to DNA1-Peptide1-DNA2, (c) and (d) to DNA1-Peptide2-DNA2, and (e) and (f) to DNA1-Peptide3-DNA2.

FIG. 10 shows the basic principle of detecting the DPD complex with a nanopore according to Example 3.

FIG. 11 shows nanopore test results for DNA3 according to Example 3.

FIG. 12 shows nanopore test results for DNA3 according to Example 3.

FIG. 13 shows nanopore test results for DNA1-Peptide1-DNA2 according to Example 4.

FIG. 14 shows nanopore test results for DNA1-Peptide1-DNA2 according to Example 4.

FIG. 15 shows nanopore test results for DNA1-Peptide2-DNA2 according to Example 5.

FIG. 16 shows nanopore test results for DNA1-Peptide2-DNA2 according to Example 5.

FIG. 17 shows nanopore test results for DNA1-Peptide3-DNA2 according to Example 6.

FIG. 18 shows nanopore test results for DNA1-Peptide3-DNA2 according to Example 6.

DETAILED DESCRIPTION

Definitions

The term "nucleotide sequence" or "nucleic acid sequence" as used herein refers to a polymeric form of nucleotides of any length, whether ribonucleotides or deoxyribonucleotides. The term refers only to the primary structure of the molecule and therefore covers double-stranded and single-stranded DNA as well as RNA. The term "nucleic acid" as used herein is a single-stranded nucleotide sequence in which the 3' and 5' ends of adjacent nucleotides are joined by phosphodiester bonds. A polynucleotide may be composed of deoxyribonucleotide bases or ribonucleotide bases. Nucleic acids may be synthesized in vitro or isolated from natural sources. Nucleic acids may further include modified DNA or RNA, for example DNA or RNA that has been methylated, or RNA that has undergone post-transcriptional modification such as 5' capping with 7-methylguanosine, 3' processing such as cleavage and polyadenylation, and splicing. Nucleic acids may also include synthetic nucleic acids (XNA), such as hexitol nucleic acid (HNA), cyclohexene nucleic acid (CeNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), locked nucleic acid (LNA) and peptide nucleic acid (PNA). For a single-stranded polynucleotide, the size of a nucleic acid is usually expressed as the number of nucleotides (nt).

The terms "polypeptide" and "peptide" are used interchangeably herein to refer to polymers of amino acid residues and to their variants and synthetic analogs. The terms therefore apply to amino acid polymers in which one or more amino acid residues are synthetic, non-naturally occurring amino acids, such as chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. Polypeptides may also undergo maturation or post-translational modification, which may include, but is not limited to, glycosylation, proteolytic cleavage, lipidation, signal peptide cleavage, propeptide cleavage and phosphorylation. Peptides may be prepared by recombinant techniques, for example by expressing recombinant or synthetic polynucleotides.

The term "protein" is used to describe a folded polypeptide having secondary or tertiary structure. A protein may consist of a single polypeptide or may include several polypeptides assembled into a multimer. The multimer may be a homo-oligomer or a hetero-oligomer. A protein may be a naturally occurring or wild-type protein, or a modified or non-naturally occurring protein. A protein may differ from the wild-type protein, for example, by the addition, substitution or deletion of one or more amino acids.

The term "anchoring" refers to fixing a substance, in a relatively stable manner, within a structure or within a certain range of its surface by means of polarity, intermolecular forces, chemical bonds and the like.

In some embodiments, the present invention constructs a DNA-peptide-DNA ternary complex by a thiol-maleimide addition reaction and an azide-DBCO click chemistry reaction, and purifies it by HPLC to obtain the pure product, which serves as the substrate for polypeptide nanopore sequencing. DNA1 is the DNA fragment that Phi29 DNA polymerase uses for amplification, and DNA2 is the DNA fragment that guides the polypeptide into the pore.

The synthesized DNA-peptide-DNA complex is combined, by annealing and co-incubation, with the hairpin DNA fragment, the protective DNA fragment and Phi29 DNA polymerase to form a sequencing complex. This complex is mixed with dNTP solution and added to the solution chamber of the sequencing chip. Because DNA5 carries a sterol structure at its end, the whole annealed library is anchored to the membrane and kept close to the nanopore. Under the electric field force, the strand containing DNA-peptide-DNA moves toward the trans side, which pulls apart the duplex between DNA5 and DNA1 and exposes the 3' end of DNA4 so that double-stranded DNA synthesis can begin. As double-stranded DNA is synthesized continuously, the polypeptide is pulled steadily through the nanopore and the electrical signal is acquired (the process is shown schematically in FIG. 5).

Comparing the nanopore electrical signals of the DNA-peptide-DNA samples with those of the DNA control shows that the scheme proposed in the present invention is feasible for nanopore detection of polypeptides of different charge, so that Phi29 DNA polymerase can be used for speed control in nanopore sequencing of any polypeptide.

The DNA and polypeptide sequences used in the following examples were all synthesized by BGI Liuhe. The sequences are as follows (from left to right, 5' end to 3' end):

DNA1 (SEQ ID NO:1):

/Maleimide/NAGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC, wherein /Maleimide/ can be linked to the cysteine residue at the N-terminus of the polypeptide.

DNA2 (SEQ ID NO:2):

TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT/DBCO/, wherein /DBCO/ can be linked to the C-terminus of the polypeptide (i.e., to LYS(N3)).

DNA3 (SEQ ID NO:3):

TTTTTTTTTTTTTTTTTTTTTTTTT/NAGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC

DNA4 (SEQ ID NO:4):

GCGTACGCCTACGGTTTTCCGTAGGCGTACGCGGCTACGACCTGCATGAGAATGC

DNA5 (SEQ ID NO:5):

GATAGTGAGATCTGATTTCCCAAATTTAAA/cholesterol/

Peptide1 (SEQ ID NO:6): CGSGDDGSG{LYS(N3)}

Peptide2 (SEQ ID NO:7): CGSGYYGSG{LYS(N3)}

Peptide3 (SEQ ID NO:8): CGSGRRGSG{LYS(N3)}

The structure of Maleimide is as follows:

N (abasic spacer) is a spacer consisting of an abasic deoxynucleoside; its structure is as follows:

LYS(N3) is lysine with an azide-modified side chain (an azide group replaces the original amino group); its structural formula is as follows:

DNA3 is the control sequence used to validate the present invention. It mimics the combined DNA2+DNA1 sequence: a 25-mer polyT stands in for DNA2, and the complete DNA1 sequence, without the maleimide modification, follows at its 3' end.

DNA4 is the fragment that, after annealing with DNA3 or with the DNA1 portion of DNA1-Peptide-DNA2, forms a hairpin-shaped double-stranded structure.

DNA5 is a protective fragment that anneals with DNA3 or with the DNA1 portion of DNA1-Peptide-DNA2 to form a 15 bp double-stranded structure; its role is to prevent the polymerase from starting double-stranded DNA synthesis prematurely.
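The annealing relationships described above can be checked directly from the listed sequences. The following is a minimal sketch, not part of the application: the helper `revcomp`, the variable names and the slicing positions are assumptions made for illustration, and the abasic spacer N and the terminal modifications are left out of the strings.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string containing only A/C/G/T."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences copied from the listing (5' to 3'), without the abasic
# spacer N, the 5' maleimide or the 3' cholesterol.
DNA1 = "AGAACTTTAGAACTTTTCAGATCTCACTATCGCATTCTCATGCAGGTCGTAGCC"
DNA4 = "GCGTACGCCTACGGTTTTCCGTAGGCGTACGCGGCTACGACCTGCATGAGAATGC"
DNA5 = "GATAGTGAGATCTGATTTCCCAAATTTAAA"
DNA3 = "T" * 25 + DNA1  # control: 25-mer polyT, then DNA1 (spacer omitted)

# DNA5: its first 15 nt should pair with a 15 nt stretch of DNA1.
site5 = revcomp(DNA5[:15])
start5 = DNA1.find(site5)  # 0-based index in the spacer-less DNA1

# DNA4: two 14 nt stem arms around a 4 nt loop, then a 3' arm
# that should pair with the 3' end of DNA1.
stem_5p, loop, stem_3p, arm = DNA4[:14], DNA4[14:18], DNA4[18:32], DNA4[32:]
site4 = revcomp(arm)
start4 = DNA1.find(site4)

print("DNA5 site on DNA1:", start5, "to", start5 + 15)
print("DNA4 stem arms are complementary:", stem_3p == revcomp(stem_5p))
print("DNA4 3' arm site on DNA1:", start4, "to", start4 + len(arm))
print("DNA5 duplex abuts the DNA4 3' end:", start5 + 15 == start4)
```

With these sequences, the site paired with the 3' arm of DNA4 begins immediately after the 15 nt site paired with DNA5, so the 3' end of DNA4 sits directly against the DNA5 duplex. This is consistent with the stated role of DNA5 as a protective fragment that must be stripped off before extension can start.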

Example 1: Construction of nucleic acid-polypeptide complexes (DP conjugates)

(1) Experimental method

Three DP conjugates were prepared: (a) DNA1-Peptide1, (b) DNA1-Peptide2 and (c) DNA1-Peptide3. DNA1 is the amplification DNA sequence shown in FIG. 5, and its sequence is SEQ ID NO:1. Peptide1, Peptide2 and Peptide3 are the polypeptides to be tested shown in FIG. 5, and their sequences are SEQ ID NO:6 to SEQ ID NO:8, respectively. Each DP conjugate was prepared as follows:

DNA1 powder was fully dissolved in pure water and its concentration was quantified with a Qubit ssDNA assay kit (ThermoFisher). The polypeptide powder was fully dissolved in pure water to 20 mg/mL. To an EP tube were added 1 μL of 10× conjugation reaction buffer (1 M HEPES (pH = 7.2), 50 mM EDTA), 80 nmol of polypeptide and 200 mM TCEP solution at a polypeptide-to-TCEP molar ratio of 1:1; the reaction mixture was then diluted with pure water, vortexed to mix, and incubated in a metal bath at 25 °C for 10 minutes. After the 10 minutes, 5 nmol of DNA1 was added to give a final reaction volume of 10 μL; the mixture was vortexed and incubated in a metal bath at 25 °C for 4 hours. After the reaction, the mixture was purified on a 1260 Infinity II high-performance liquid chromatograph (Agilent) and the fraction containing the DNA1-Peptide (DP) conjugate was collected; the fraction was lyophilized overnight for use in the subsequent conjugation reaction.
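The quantities stated for this reaction imply a few derived values that are useful when setting it up. The sketch below works them out from the stated amounts only; the function and constant names are hypothetical, and the volumes of the polypeptide and DNA1 stocks are not computed because they depend on molecular weights and measured concentrations that are not given here.

```python
def volume_ul(amount_nmol, stock_mM):
    """Volume in microliters of a stock that contains amount_nmol.

    1 mM equals 1 nmol per microliter, so volume = amount / concentration.
    """
    return amount_nmol / stock_mM

def final_mM(amount_nmol, total_ul):
    """Final concentration in mM of amount_nmol dissolved in total_ul."""
    return amount_nmol / total_ul

TOTAL_UL = 10.0          # final reaction volume
PEPTIDE_NMOL = 80.0      # polypeptide input
DNA1_NMOL = 5.0          # DNA1 input
TCEP_STOCK_MM = 200.0    # TCEP stock concentration
BUFFER_UL = 1.0          # 10x buffer: 1 M HEPES, 50 mM EDTA

tcep_nmol = PEPTIDE_NMOL * 1.0                 # 1:1 with the polypeptide
tcep_ul = volume_ul(tcep_nmol, TCEP_STOCK_MM)  # TCEP stock to pipette
dilution = TOTAL_UL / BUFFER_UL                # buffer dilution factor
hepes_final_mM = 1000.0 / dilution
edta_final_mM = 50.0 / dilution

print("TCEP stock volume (uL):", tcep_ul)
print("polypeptide final (mM):", final_mM(PEPTIDE_NMOL, TOTAL_UL))
print("DNA1 final (mM):", final_mM(DNA1_NMOL, TOTAL_UL))
print("polypeptide : DNA1 molar ratio:", PEPTIDE_NMOL / DNA1_NMOL)
print("HEPES final (mM):", hepes_final_mM, "EDTA final (mM):", edta_final_mM)
```

The polypeptide is therefore in a 16-fold molar excess over DNA1, which is the usual way to drive a thiol-maleimide conjugation toward complete conversion of the oligonucleotide.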

(2) Experimental results

FIG. 6 shows the chemical reactions for linking DNA1 to the polypeptide. FIG. 7 shows the HPLC purification chromatograms of (a) DNA1-Peptide1, (b) DNA1-Peptide2 and (c) DNA1-Peptide3, in which the highest sharp peak between 10 and 17 minutes corresponds to the collected DP complex fraction.

实施例2:制备DPD连接物Example 2: Preparation of DPD linker

(1)实验方法(1) Experimental methods

制备三种DPD连接物:(a)DNA1-Peptide1-DNA2,(b)DNA1-Peptide2-DNA2和(c)DNA1-Peptide3-DNA2。DNA2为图5所示的引导DNA序列,其序列为SEQ ID NO:2。每一种的制备方法均参考如下:Three DPD conjugates were prepared: (a) DNA1-Peptide1-DNA2, (b) DNA1-Peptide2-DNA2, and (c) DNA1-Peptide3-DNA2. DNA2 is the guide DNA sequence shown in FIG5 , and its sequence is SEQ ID NO: 2. The preparation method of each is as follows:

DNA2 powder was fully dissolved in pure water and quantified with the Qubit ssDNA assay kit (ThermoFisher). DNA2 solution was added to the lyophilized DP ligation product obtained in Example 1 at a DP:DNA2 molar ratio of 1:7.5, followed by 10 μL of 10× ligation reaction buffer (1 M HEPES, pH 7.2; 50 mM EDTA), and the final reaction volume was adjusted to 100 μL with pure water. The mixture was vortexed and incubated in a metal bath at 25 °C for 16 hours. After the reaction, the mixture was purified on a 1260 Infinity II high-performance liquid chromatograph (Agilent), and the DNA1-Peptide-DNA2 (DPD) ligation product fraction was collected. The fraction was quantified with the Qubit ssDNA assay kit (ThermoFisher). A small amount of the purified product was set aside in an EP tube for HPLC purity analysis and mass spectrometry characterization; the remainder was aliquoted, lyophilized overnight, and stored at -80 °C.
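The 1:7.5 molar ratio and the 100 μL reaction volume fix how much DNA2 solution and water are needed once the amount of DP product is known. The sketch below illustrates that calculation. The DP amount and the DNA2 stock concentration in the usage note are hypothetical inputs (the text does not state them), and the function name is an assumption.

```python
# Volume planning for the DPD ligation: DP:DNA2 = 1:7.5, 10 uL of 10x buffer,
# 100 uL total. Input amounts are supplied by the user; none are from the source.


def dpd_reaction_volumes(
    dp_nmol: float,
    dna2_stock_um: float,
    dna2_per_dp: float = 7.5,
    buffer_ul: float = 10.0,
    total_ul: float = 100.0,
) -> dict:
    """Return the DNA2 amount and the DNA2, buffer and water volumes for one reaction."""
    if dp_nmol <= 0 or dna2_stock_um <= 0:
        raise ValueError("amounts and concentrations must be positive")
    dna2_nmol = dp_nmol * dna2_per_dp
    dna2_ul = dna2_nmol / (dna2_stock_um / 1000.0)  # uM -> nmol/uL
    water_ul = total_ul - buffer_ul - dna2_ul
    if water_ul < 0:
        raise ValueError("DNA2 stock is too dilute for the requested total volume")
    return {
        "dna2_nmol": dna2_nmol,
        "dna2_ul": dna2_ul,
        "buffer_ul": buffer_ul,
        "water_ul": water_ul,
    }
```

For example, with a hypothetical 2 nmol of DP product and a 500 μM DNA2 stock, `dpd_reaction_volumes(2.0, 500.0)` calls for 15 nmol of DNA2 in 30 μL, leaving 60 μL of water.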

(2) Experimental results

Figure 8 shows the HPLC purification chromatograms of (a) DNA1-Peptide1-DNA2, (b) DNA1-Peptide2-DNA2, and (c) DNA1-Peptide3-DNA2; the two highest sharp peaks between 14 and 18 minutes correspond to the collected DPD complex fractions. Figure 9 shows (a) the HPLC purity analysis chromatogram and (b) the mass spectrum of DNA1-Peptide1-DNA2; (c) the HPLC purity analysis chromatogram and (d) the mass spectrum of DNA1-Peptide2-DNA2; and (e) the HPLC purity analysis chromatogram and (f) the mass spectrum of DNA1-Peptide3-DNA2. HPLC purity analysis shows that all three DPD complexes have a purity above 85%, with the highest exceeding 95%. The experimental molecular weights in the mass spectra also match the theoretical molecular weights of the three DPD complexes, confirming that the three DPD conjugates were obtained in this example.

Example 3: Nanopore sequencing of the DNA3-5 annealed complex

(1) Experimental methods

DNA3 has the sequence of SEQ ID NO: 3; DNA4 is the hairpin fragment shown in Figure 5 (SEQ ID NO: 4); DNA5 is the protective fragment shown in Figure 5 (SEQ ID NO: 5).

DNA3, DNA4, and DNA5 powders were each fully dissolved in pure water and quantified with the Qubit ssDNA assay kit (ThermoFisher). Aqueous solutions of DNA3, DNA4, and DNA5 were combined in an EP tube at a molar ratio of 1:1:2 and left to anneal at room temperature for 20 minutes. The annealed complex was quantified with the Qubit dsDNA HS assay kit (ThermoFisher), and an appropriate amount was then co-incubated with 1 μL of Phi29 DNA polymerase (NEB) in 1× PBS in a total volume of 8 μL for 30 minutes; the final concentration of the annealed complex in the co-incubation mixture was 75 nM. As shown in Figure 10, current signals were acquired with a patch-clamp amplifier. A Teflon membrane with a micrometer-scale aperture (50-200 μm in diameter) divided the electrolytic cell into two chambers, a cis chamber and a trans chamber, and a pair of Ag/AgCl electrodes was placed in each. After a phospholipid bilayer had formed at the aperture between the two chambers, the nanopore protein MspA was added. Once a single MspA nanopore had inserted into the phospholipid membrane, the co-incubation mixture and 30 μL of 10 mM dNTP (NEB) were added, 180 mV was applied, and the current data were recorded.
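The annealing and co-incubation steps above reduce to two small calculations: the volumes that give a 1:1:2 molar mix, and the volume of annealed complex that yields 75 nM in 8 μL. The sketch below shows both. The function names and the stock concentrations in the usage note are illustrative assumptions, and the check that leaves room for 1 μL of polymerase is a convenience added here, not a requirement stated in the text.

```python
# Volume planning for the DNA3:DNA4:DNA5 = 1:1:2 annealing mix and for the
# 75 nM co-incubation in 8 uL. Stock concentrations are user-supplied.

ANNEAL_RATIOS = {"DNA3": 1.0, "DNA4": 1.0, "DNA5": 2.0}


def anneal_volumes_ul(dna3_pmol: float, stocks_um: dict) -> dict:
    """Volumes (uL) of each strand for a 1:1:2 mix; 1 uM equals 1 pmol/uL."""
    return {
        name: dna3_pmol * ratio / stocks_um[name]
        for name, ratio in ANNEAL_RATIOS.items()
    }


def complex_volume_ul(
    stock_nm: float,
    final_nm: float = 75.0,
    total_ul: float = 8.0,
    polymerase_ul: float = 1.0,
) -> float:
    """Volume of annealed complex (C1*V1 = C2*V2) that fits next to the polymerase."""
    volume = final_nm * total_ul / stock_nm
    if volume > total_ul - polymerase_ul:
        raise ValueError("complex stock is too dilute for the 8 uL mixture")
    return volume
```

With hypothetical 10 μM stocks and 100 pmol of DNA3, `anneal_volumes_ul(100, {"DNA3": 10, "DNA4": 10, "DNA5": 10})` gives 10, 10, and 20 μL. The 75 nM target in 8 μL corresponds to 0.6 pmol of complex, so a hypothetical 1 μM complex stock would contribute 0.6 μL.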

(2) Experimental results

Figures 11 and 12 show the signal data from two replicate DNA3 nanopore experiments, where (a) shows the overall translocation signal from the open-pore current to the 25-mer polyT signal at the 5' end of DNA3. At a fixed voltage of 180 mV, the nanopore current first drops from the open-pore level of about 150 pA to about 50 pA, then recovers to around 70-80 pA and oscillates for a period. This signal corresponds to entry of the annealed complex into the pore; at this point the protective fragment DNA5 has not yet dissociated from the complex. Once DNA5 is pulled off by the electric field force, the signal immediately shows three valleys of 40-50 pA, as shown in (b), followed by a peak of about 100 pA. This characteristic signal interval corresponds to the abasic spacer in the DNA3 sequence and a stretch of relatively continuous repeat sequence at its 3' end. After the characteristic signal has been read, the nanopore current drops to a stable plateau of about 60 pA, corresponding to the 25-mer polyT at the 5' end of DNA3.
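The characteristic signal interval described above (three valleys of 40-50 pA followed by a peak of about 100 pA) can be located programmatically once a trace has been reduced to a sequence of mean current levels. The following is a minimal sketch under stated assumptions: the input is an already segmented list of levels in pA, the 90 pA peak threshold is an assumed tolerance around the reported 100 pA, and the trace in the usage note is synthetic. It is not the analysis method used in the experiments.

```python
# Locate "three valleys then a peak" in a list of segmented current levels (pA).
from typing import List, Optional

VALLEY_BAND_PA = (40.0, 50.0)  # valley range reported in the text
PEAK_MIN_PA = 90.0             # assumed threshold for the ~100 pA peak


def local_minima(levels: List[float]) -> List[int]:
    """Indices of levels strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(levels) - 1)
        if levels[i] < levels[i - 1] and levels[i] < levels[i + 1]
    ]


def find_signature(levels: List[float]) -> Optional[int]:
    """Index of the third valley of the first signature found, or None."""
    minima = local_minima(levels)
    low, high = VALLEY_BAND_PA
    for k in range(len(minima) - 2):
        trio = minima[k:k + 3]
        if not all(low <= levels[i] <= high for i in trio):
            continue
        # The peak must appear after the third valley and before the next minimum.
        end = minima[k + 3] if k + 3 < len(minima) else len(levels)
        if any(level >= PEAK_MIN_PA for level in levels[trio[2] + 1:end]):
            return trio[2]
    return None
```

On the synthetic level sequence `[150, 50, 75, 78, 45, 70, 42, 68, 47, 100, 60, 60]`, the function returns index 8, the third valley that directly precedes the 100 pA peak. A flat trace such as `[150, 60, 60, 60]` returns `None`.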

The above results demonstrate that the complex formed by DNA3-5 can pass through the nanopore, and further confirm that annealed complexes constructed with DNA4 and DNA5 allow signal readout according to the scheme designed in the present invention.

Example 4: Nanopore sequencing of DNA1-Peptide1-DNA2

(1) Experimental methods

DNA4 is the hairpin fragment shown in Figure 5 (SEQ ID NO: 4); DNA5 is the protective fragment shown in Figure 5 (SEQ ID NO: 5).

The aliquoted, lyophilized DNA1-Peptide1-DNA2 obtained in Example 2 was taken and, according to the aliquot size, the DNA1-Peptide1-DNA2 dry powder was mixed with DNA4 and DNA5 solutions in an EP tube at a molar ratio of 1:1:2 and left to anneal at room temperature for 20 minutes, forming the annealed complex shown in Figure 5. The annealed complex was quantified with the Qubit dsDNA HS assay kit (ThermoFisher), and an appropriate amount was then co-incubated with 1 μL of Phi29 DNA polymerase (NEB) in 1× PBS in a total volume of 8 μL for 30 minutes; the final concentration of the annealed complex in the co-incubation mixture was 75 nM. Current signals were acquired with a patch-clamp amplifier. A Teflon membrane with a micrometer-scale aperture (50-200 μm in diameter) divided the electrolytic cell into two chambers, a cis chamber and a trans chamber, and a pair of Ag/AgCl electrodes was placed in each. After a phospholipid bilayer had formed at the aperture between the two chambers, the nanopore protein was added. Once a single nanopore protein had inserted into the phospholipid membrane, the co-incubation mixture and 30 μL of 10 mM dNTP (NEB) were added, 180 mV was applied, and the current data were recorded.

(2) Experimental results

Figures 13 and 14 show the signal data from two replicate DNA1-Peptide1-DNA2 nanopore experiments, where (a) shows the nanopore current from the open-pore current until the signal fluctuates stably in the 80-100 pA range. As in Figures 11 and 12, both the pore-entry signal of the annealed complex and the characteristic signal interval with three current valleys were identified. Because the DNA1 sequence in DNA1-Peptide1-DNA2 is identical to the 3'-end sequence of DNA3 in Example 3, it produces the same "three consecutive signal valleys" observed when DNA3 passes through the nanopore in Example 3. This feature is referred to as the characteristic signal interval, and its third valley can be taken as a rough estimate of the start of the polypeptide signal. Next, because Peptide1 is read immediately after DNA1 in DNA1-Peptide1-DNA2, whereas the 25-mer polyT is read in DNA3 in Example 3, the current trends after the characteristic signal interval differ markedly: DNA1-Peptide1-DNA2 produces a current of "40 pA → 70 pA → 80 pA → 70 pA → 80 pA", as shown in Figure 13(b) and Figure 14(b), whereas the polyT of DNA3 produces a relatively stable current of about 60 pA. The characteristic signal interval with three valleys produced by DNA1-Peptide1-DNA2, similar to that shown in Example 3, together with the fluctuating signal produced by Peptide1, which differs from that of the 25-mer polyT, demonstrates that negatively charged polypeptides can be captured by the nanopore and detected under the rate control of Phi29 DNA polymerase.
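The contrast drawn above between the Peptide1 levels and the polyT plateau can be expressed as a simple numeric check. The sketch below measures what fraction of the levels after the characteristic signal interval lie away from the roughly 60 pA plateau. The 8 pA tolerance and the function name are assumptions made for illustration, and the polyT values in the usage note are synthetic; only the Peptide1 level sequence and the plateau value come from the text.

```python
# Fraction of post-signature current levels that depart from the polyT plateau.
from typing import Sequence

POLY_T_PLATEAU_PA = 60.0  # plateau reported for the 25-mer polyT
TOLERANCE_PA = 8.0        # assumed band around the plateau


def fraction_off_plateau(
    levels: Sequence[float],
    plateau: float = POLY_T_PLATEAU_PA,
    tolerance: float = TOLERANCE_PA,
) -> float:
    """Share of levels farther than `tolerance` from `plateau`."""
    if not levels:
        raise ValueError("levels must not be empty")
    off = sum(1 for level in levels if abs(level - plateau) > tolerance)
    return off / len(levels)


# Approximate levels reported for DNA1-Peptide1-DNA2 in the text.
PEPTIDE1_LEVELS_PA = [40.0, 70.0, 80.0, 70.0, 80.0]
```

Every level in `PEPTIDE1_LEVELS_PA` lies at least 10 pA from the plateau, so `fraction_off_plateau(PEPTIDE1_LEVELS_PA)` is 1.0, while a synthetic polyT-like sequence such as `[59, 61, 60, 62, 58]` gives 0.0.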

Example 5: Nanopore sequencing of DNA1-Peptide2-DNA2

(1) Experimental methods

The experimental method of Example 4 was followed, with DNA1-Peptide1-DNA2 replaced by DNA1-Peptide2-DNA2.

(2) Experimental results

Figures 15 and 16 show the signal data from two replicate DNA1-Peptide2-DNA2 nanopore experiments, where (a) shows the nanopore current from the open-pore current until the signal fluctuates stably in the 80-100 pA range. As in Figures 11 and 12, both the pore-entry signal of the annealed complex and the characteristic signal interval with three current valleys were identified. As in Figures 13 and 14, because the DNA1 sequence in DNA1-Peptide2-DNA2 is identical to the 3'-end sequence of DNA3, it produces the characteristic signal interval of "three consecutive signal valleys" observed when DNA3 passes through the nanopore in Example 3, and the third valley of this interval can be taken as a rough estimate of the start of the polypeptide signal. Next, because Peptide2 is read immediately after DNA1 in DNA1-Peptide2-DNA2, whereas the 25-mer polyT is read in DNA3 in Example 3, the current trends after the characteristic signal interval differ markedly: DNA1-Peptide2-DNA2 produces a current of "40 pA → 60 pA → 40 pA → 70 pA → 85 pA", as shown in Figure 15(b) and Figure 16(b), whereas the polyT of DNA3 produces a relatively stable current of about 60 pA. The characteristic signal interval with three valleys produced by DNA1-Peptide2-DNA2, similar to that shown in Example 3, together with the fluctuating signal produced by Peptide2, which differs from that of the 25-mer polyT, demonstrates that electrically neutral polypeptides can be captured by the nanopore and detected under the rate control of Phi29 DNA polymerase.

Example 6: Nanopore sequencing of DNA1-Peptide3-DNA2

(1) Experimental methods

The experimental method of Example 4 was followed, with DNA1-Peptide1-DNA2 replaced by DNA1-Peptide3-DNA2.

(2) Experimental results

Figures 17 and 18 show the signal data from two replicate DNA1-Peptide3-DNA2 nanopore experiments, where (a) shows the nanopore current from the open-pore current until the signal fluctuates stably around 30 pA. As in Figures 11 and 12, both the pore-entry signal of the annealed complex and the characteristic signal interval with three current valleys were identified. As in Figures 13-16, because the DNA1 sequence in DNA1-Peptide3-DNA2 is identical to the 3'-end sequence of DNA3 in Example 3, it produces the characteristic signal interval of "three consecutive signal valleys" observed when DNA3 passes through the nanopore in Example 3, and the third valley of this interval can be taken as a rough estimate of the start of the polypeptide signal. Next, because Peptide3 is read immediately after DNA1 in DNA1-Peptide3-DNA2, whereas the 25-mer polyT is read in DNA3 in Example 3, the current trends after the characteristic signal interval differ markedly: DNA1-Peptide3-DNA2 produces a current of "50 pA → 85 pA → 95 pA → 120 pA → 30 pA", as shown in Figure 17(b) and Figure 18(b), whereas the polyT of DNA3 produces a relatively stable current of about 60 pA. The characteristic signal interval with three valleys produced by DNA1-Peptide3-DNA2, similar to that shown in Example 3, together with the fluctuating signal produced by Peptide3, which differs from that of the 25-mer polyT, demonstrates that positively charged polypeptides can be captured by the nanopore and detected under the rate control of Phi29 DNA polymerase.
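The three level sequences reported in Examples 4 to 6 differ enough to be told apart by a simple distance measure. The sketch below stores the approximate levels quoted in the text as templates and assigns an observed five-level sequence to the nearest template by mean absolute difference. This is an illustration only: the function names are assumptions, the observed sequence in the usage note is synthetic, and nearest-template matching on five approximate levels is not presented in the source as a sequencing or identification method.

```python
# Nearest-template comparison of the approximate level sequences from Examples 4-6.
from typing import Dict, Sequence

TEMPLATES_PA: Dict[str, Sequence[float]] = {
    "Peptide1": [40.0, 70.0, 80.0, 70.0, 80.0],   # negatively charged (Example 4)
    "Peptide2": [40.0, 60.0, 40.0, 70.0, 85.0],   # electrically neutral (Example 5)
    "Peptide3": [50.0, 85.0, 95.0, 120.0, 30.0],  # positively charged (Example 6)
}


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference (pA) between two equal-length level sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def nearest_template(observed: Sequence[float]) -> str:
    """Name of the template with the smallest mean absolute difference."""
    return min(TEMPLATES_PA, key=lambda name: mean_abs_diff(observed, TEMPLATES_PA[name]))
```

The pairwise distances between templates are 11 pA (Peptide1 vs Peptide2), 28 pA (Peptide1 vs Peptide3), and 39 pA (Peptide2 vs Peptide3). A synthetic observation such as `[41, 62, 43, 68, 84]` is assigned to `Peptide2`.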

Although specific embodiments of the present invention have been described above, those skilled in the art will understand that these are merely illustrative, and that various changes or modifications may be made to these embodiments without departing from the principles and essence of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

Claims (31)

一种核酸多肽复合物,其特征在于,所述核酸多肽复合物包括依次连接的第一核酸片段、待测多肽和复合核酸片段;A nucleic acid-polypeptide complex, characterized in that the nucleic acid-polypeptide complex comprises a first nucleic acid fragment, a polypeptide to be detected and a composite nucleic acid fragment connected in sequence; 其中,所述复合核酸片段包含可形成发卡式结构的核酸序列,形成发卡式结构后复合核酸片段包含双链核酸和单链核酸,所述单链核酸的一端与所述双链核酸连接,所述单链核酸的另一端与所述待测多肽连接;Wherein, the composite nucleic acid fragment comprises a nucleic acid sequence that can form a hairpin structure, and after forming the hairpin structure, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be detected; 优选地,所述单链核酸的长度大于或等于所述待测多肽的长度。Preferably, the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be detected. 如权利要求1所述的核酸多肽复合物,其特征在于,所述复合核酸片段由依次连接的第二核酸片段和第三核酸片段组成,所述第三核酸片段包含可形成发卡式结构的核酸序列,形成发卡式结构后第三核酸片段与所述第二核酸片段中的一部分互补形成双链核酸,所述第二核酸片段中的另一部分为单链核酸。The nucleic acid-polypeptide complex as described in claim 1 is characterized in that the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure, and after the hairpin structure is formed, the third nucleic acid fragment is complementary to a part of the second nucleic acid fragment to form a double-stranded nucleic acid, and the other part of the second nucleic acid fragment is a single-stranded nucleic acid. 
如权利要求1所述的核酸多肽复合物,其特征在于,所述复合核酸片段由依次连接的第二核酸片段和第三核酸片段组成,所述第三核酸片段包含可形成发卡式结构的核酸序列,形成发卡式结构后第三核酸片段的两个末端彼此互补形成双链核酸,所述第二核酸片段为单链核酸,且所述第二核酸片段的一端与所述双链核酸连接,所述第二核酸片段的另一端与所述待测多肽连接。The nucleic acid-polypeptide complex as described in claim 1 is characterized in that the composite nucleic acid fragment is composed of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence, the third nucleic acid fragment contains a nucleic acid sequence that can form a hairpin structure, and after the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other to form a double-stranded nucleic acid, the second nucleic acid fragment is a single-stranded nucleic acid, and one end of the second nucleic acid fragment is connected to the double-stranded nucleic acid, and the other end of the second nucleic acid fragment is connected to the polypeptide to be tested. 如权利要求1-3任一项所述的核酸多肽复合物,其特征在于,所述核酸多肽复合物还包括第四核酸片段,所述第四核酸片段的5’端核酸序列与所述复合核酸片段或所述第二核酸片段中的单链核酸的至少一部分互补形成双链,所述第四核酸片段的3’端可选地连接锚定部分;The nucleic acid-polypeptide complex according to any one of claims 1 to 3, characterized in that the nucleic acid-polypeptide complex further comprises a fourth nucleic acid fragment, the 5' end nucleic acid sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment to form a double strand, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring portion; 优选地,形成发卡式结构后,所述复合核酸片段的发卡式结构的双链核酸的稳定性高于所述第四核酸片段的5’端核酸序列与所述复合核酸片段或所述第二核酸片段中的单链核酸的至少一部分互补形成的双链核酸的稳定性;Preferably, after the hairpin structure is formed, the stability of the double-stranded nucleic acid in the hairpin structure of the composite nucleic acid fragment is higher than the stability of the double-stranded nucleic acid formed by the complementarity of the 5'-end nucleic acid sequence of the fourth nucleic acid fragment 
and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment; 更优选地,所述复合核酸片段的发卡式结构的双链核酸的Tm值高于所述第四核酸片段的5’端核酸序列与所述复合核酸片段或所述第二核酸片段中的单链核酸的至少一部分互补形成的双链核酸的Tm值。More preferably, the Tm value of the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is higher than the Tm value of the double-stranded nucleic acid formed by the complementarity of the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment. 如权利要求1-4任一项所述的核酸多肽复合物,其特征在于,所述第一核酸片段的长度为5~200nt,优选地为10~50nt;所述复合核酸片段的长度为80~2000nt,优选地为50~200nt;所述第二核酸片段的长度为40~1000nt,优选地为50~200nt;所述第三核酸片段的长度为40~1000nt,优选地为50~200nt;所述第四核酸片段的长度为20-500nt,优选地为25-100nt。 The nucleic acid-polypeptide complex according to any one of claims 1 to 4 is characterized in that the length of the first nucleic acid fragment is 5 to 200 nt, preferably 10 to 50 nt; the length of the composite nucleic acid fragment is 80 to 2000 nt, preferably 50 to 200 nt; the length of the second nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt; the length of the third nucleic acid fragment is 40 to 1000 nt, preferably 50 to 200 nt; the length of the fourth nucleic acid fragment is 20-500 nt, preferably 25-100 nt. 
如权利要求1-5任一项所述的核酸多肽复合物,其特征在于,所述第一核酸片段、复合核酸片段、第二核酸片段、第三核酸和/或第四核酸片段为DNA或核酸类似物;所述核酸类似物优选地为LNA或PNA;所述第一核酸片段、第二核酸片段、第三核酸和/或第四核酸片段优选地为单链DNA。The nucleic acid-polypeptide complex according to any one of claims 1 to 5, characterized in that the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment are DNA or nucleic acid analogs; the nucleic acid analogs are preferably LNA or PNA; the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment are preferably single-stranded DNA. 如权利要求4所述的核酸多肽复合物,其特征在于,所述锚定部分选自脂质、脂肪酸、甾醇、碳纳米管、多肽、蛋白质和/或氨基酸的任意其一或组合;优选地为胆固醇、棕榈酸酯或生育酚。The nucleic acid-polypeptide complex according to claim 4, characterized in that the anchoring moiety is selected from any one or a combination of lipids, fatty acids, sterols, carbon nanotubes, polypeptides, proteins and/or amino acids; preferably cholesterol, palmitate or tocopherol. 如权利要求1~7任一项所述的核酸多肽复合物,其特征在于,所述第一核酸片段的序列如SEQ ID NO:2所示;The nucleic acid-polypeptide complex according to any one of claims 1 to 7, wherein the sequence of the first nucleic acid fragment is as shown in SEQ ID NO: 2; 和/或,所述第二核酸片段的序列如SEQ ID NO:1的第2位~第55位所示;And/or, the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO:1; 和/或,所述第三核酸片段的序列如SEQ ID NO:4所示;And/or, the sequence of the third nucleic acid fragment is shown in SEQ ID NO:4; 和/或,所述第四核酸片段的序列如SEQ ID NO:5所示。And/or, the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO:5. 
一种制备如权利要求1~8任一项所述的核酸多肽复合物的方法,其特征在于,所述方法包括使5’端经马来酰胺修饰的所述复合核酸片段或第二核酸片段与待测多肽通过巯基-马来酰胺加成反应获得所述核酸多肽复合物的步骤;A method for preparing a nucleic acid-polypeptide complex according to any one of claims 1 to 8, characterized in that the method comprises the step of subjecting the composite nucleic acid fragment or the second nucleic acid fragment modified with maleamide at the 5' end to a test polypeptide through a thiol-maleamide addition reaction to obtain the nucleic acid-polypeptide complex; 和/或,所述方法包括使3’端经DBCO修饰的第一核酸片段与待测多肽通过叠氮-DBCO点击化学反应获得所述核酸多肽复合物的步骤。And/or, the method includes the step of obtaining the nucleic acid-polypeptide complex by reacting the first nucleic acid fragment whose 3' end is modified with DBCO with the polypeptide to be tested through an azide-DBCO click chemistry reaction. 一种核酸蛋白复合物,其特征在于,所述核酸蛋白复合物包括聚合酶和如权利要求1~8任一项所述的核酸多肽复合物;A nucleic acid-protein complex, characterized in that the nucleic acid-protein complex comprises a polymerase and the nucleic acid-polypeptide complex according to any one of claims 1 to 8; 其中,所述聚合酶结合于所述核酸多肽复合物中单链核酸与双链核酸交界处。Wherein, the polymerase is bound to the junction of the single-stranded nucleic acid and the double-stranded nucleic acid in the nucleic acid-polypeptide complex. 如权利要求10所述的核酸蛋白复合物,其特征在于,所述聚合酶为具有以扩增DNA单链为模板进行DNA双链合成的聚合酶;The nucleic acid-protein complex according to claim 10, characterized in that the polymerase is a polymerase capable of synthesizing double-stranded DNA using the amplified single-stranded DNA as a template; 较佳地,所述聚合酶为DNA聚合酶;Preferably, the polymerase is a DNA polymerase; 更佳地,所述聚合酶为:Bst DNA聚合酶、SD DNA聚合酶、phi29 DNA聚合酶、Bsu Large Fragment DNA聚合酶、Klenow Fragment DNA聚合酶或其任意组合。More preferably, the polymerase is: Bst DNA polymerase, SD DNA polymerase, phi29 DNA polymerase, Bsu Large Fragment DNA polymerase, Klenow Fragment DNA polymerase or any combination thereof. 
一种测序复合物,其特征在于,所述测序复合物包含纳米孔和如权利要求1~8任一项所述的核酸多肽复合物或者如权利要求10或11所述的核酸蛋白复合物;所述核酸多肽复合物或所述核酸蛋白复合物能相对于所述纳米孔轴向移动;A sequencing complex, characterized in that the sequencing complex comprises a nanopore and the nucleic acid-polypeptide complex according to any one of claims 1 to 8 or the nucleic acid-protein complex according to claim 10 or 11; the nucleic acid-polypeptide complex or the nucleic acid-protein complex can move axially relative to the nanopore; 其中,所述核酸多肽复合物或所述核酸蛋白复合物中,所述第一核酸片段和所述待测多肽序列穿过纳米孔并相对纳米孔轴向移动。Wherein, in the nucleic acid-polypeptide complex or the nucleic acid-protein complex, the first nucleic acid fragment and the polypeptide sequence to be detected pass through the nanopore and move axially relative to the nanopore. 如权利要求12所述的测序复合物,其特征在于,所述纳米孔嵌在电绝缘膜中。The sequencing complex of claim 12, wherein the nanopore is embedded in an electrically insulating membrane. 如权利要求13所述的测序复合物,其特征在于,所述纳米孔为跨膜蛋白孔或固态 孔;较佳地,所述跨膜蛋白孔选自溶血素、MspA、MspB、MspC、MspD、FraC、ClyA、PA63、CsgG、CsgD、XcpQ、SP1、Phi29连接器蛋白、InvG、GspD或其任意组合。The sequencing complex according to claim 13, wherein the nanopore is a transmembrane protein pore or a solid Preferably, the transmembrane protein pore is selected from hemolysin, MspA, MspB, MspC, MspD, FraC, ClyA, PA63, CsgG, CsgD, XcpQ, SP1, Phi29 connector protein, InvG, GspD or any combination thereof. 如权利要求13或14所述的测序复合物,其特征在于,所述纳米孔为经修饰的纳米孔;所述经修饰的纳米孔优选为多肽修饰的纳米孔;较佳地,所述多肽为标签、酶切位点、信号肽或导肽、可检测的标记或其任意组合。The sequencing complex as described in claim 13 or 14 is characterized in that the nanopore is a modified nanopore; the modified nanopore is preferably a polypeptide-modified nanopore; preferably, the polypeptide is a tag, an enzyme cleavage site, a signal peptide or a leader peptide, a detectable marker or any combination thereof. 
如权利要求13所述的测序复合物,其特征在于,所述电绝缘膜为两亲性膜、高分子聚合物膜或其任意组合;较佳地,所述电绝缘膜为磷脂双分子层、两嵌段共聚物或三嵌段共聚物。The sequencing complex as described in claim 13 is characterized in that the electrically insulating membrane is an amphiphilic membrane, a high molecular polymer membrane or any combination thereof; preferably, the electrically insulating membrane is a phospholipid bilayer, a diblock copolymer or a triblock copolymer. 一种多肽测序的方法,其特征在于,所述方法包括以下步骤:A method for polypeptide sequencing, characterized in that the method comprises the following steps: (1)向如权利要求12-16任一项所述的测序复合物施加电场力,使第一核酸片段被纳米孔捕获,第一核酸片段牵引待测多肽使所述待测多肽至少部分从第一电场端穿过纳米孔到达第二电场端;(1) applying an electric field force to the sequencing complex as described in any one of claims 12 to 16, so that the first nucleic acid fragment is captured by the nanopore, and the first nucleic acid fragment pulls the polypeptide to be tested so that the polypeptide to be tested at least partially passes through the nanopore from the first electric field end to the second electric field end; (2)聚合酶以复合核酸片段或第二核酸片段自双链核酸末端起的单链核酸为模板合成双链核酸,并将待测多肽从第二电场端经纳米孔拉向第一电场端,产生并记录纳米孔中的电流变化。(2) The polymerase synthesizes a double-stranded nucleic acid using the composite nucleic acid fragment or the single-stranded nucleic acid of the second nucleic acid fragment from the end of the double-stranded nucleic acid as a template, and pulls the polypeptide to be tested from the second electric field end through the nanopore to the first electric field end, generating and recording the current change in the nanopore. 如权利要求17所述的方法,其特征在于,所述第一电场端为处于负电场的纳米孔端,所述第二电场端为处于正电场的纳米孔端。The method of claim 17, wherein the first electric field end is the nanopore end in a negative electric field, and the second electric field end is the nanopore end in a positive electric field. 
如权利要求17或18所述的方法,其特征在于,所述第一核酸片段被纳米孔捕获后,所述第四核酸片段在电场力作用下与复合核酸片段或第二核酸片段分离。The method according to claim 17 or 18 is characterized in that after the first nucleic acid fragment is captured by the nanopore, the fourth nucleic acid fragment is separated from the composite nucleic acid fragment or the second nucleic acid fragment under the action of the electric field force. 如权利要求17-19任一项所述的方法,其特征在于,通过施加电压形成所述电场力,且所述电压是50mV以上;较佳地,所述电压是100mV-300mV。The method according to any one of claims 17 to 19 is characterized in that the electric field force is formed by applying a voltage, and the voltage is above 50 mV; preferably, the voltage is 100 mV-300 mV. 如权利要求17-20任一项所述的方法,其特征在于,所述方法进一步包括步骤The method according to any one of claims 17 to 20, characterized in that the method further comprises the steps of (3):分析电流变化以识别所述待测多肽的氨基酸序列。(3): Analyze the current changes to identify the amino acid sequence of the polypeptide to be tested. 一种多肽测序装置,其特征在于,所述多肽测序装置包括:样品处理模块、纳米孔测序模块、电力模块和检测分析模块;A polypeptide sequencing device, characterized in that the polypeptide sequencing device comprises: a sample processing module, a nanopore sequencing module, a power module and a detection and analysis module; 其中,所述样品处理模块对待测多肽进行处理,形成如权利要求10或11所述的核酸蛋白复合物,并转移至纳米孔测序模块中;所述纳米孔测序模块在所述电力模块的作用下对所述待测多肽进行测序,所述检测分析模块对测序过程中的电流变化进行检测和分析;Wherein, the sample processing module processes the polypeptide to be tested to form the nucleic acid-protein complex as claimed in claim 10 or 11, and transfers it to the nanopore sequencing module; the nanopore sequencing module sequences the polypeptide to be tested under the action of the power module, and the detection and analysis module detects and analyzes the current changes during the sequencing process; 或者,所述样品处理模块对待测多肽进行处理,形成如权利要求1~8任一项所述的核酸多肽复合物,并转移至纳米孔测序模块中;所述纳米孔测序模块包括聚合酶,并在所述电力模块的作用下对所述待测多肽进行测序,所述检测分析模块对测序过程中的电流 变化进行检测和分析。Alternatively, the sample processing module processes the polypeptide to be tested to form the nucleic acid-polypeptide complex as described in 
any one of claims 1 to 8, and transfers it to the nanopore sequencing module; the nanopore sequencing module includes a polymerase, and sequences the polypeptide to be tested under the action of the power module, and the detection and analysis module processes the current during the sequencing process. Detect and analyze changes. 一种试剂盒,其特征在于,所述试剂盒包括第一核酸片段和复合核酸片段,A kit, characterized in that the kit comprises a first nucleic acid fragment and a composite nucleic acid fragment, 其中,所述复合核酸片段包含可形成发卡式结构的核酸序列,形成发卡式结构后复合核酸片段包含双链核酸和单链核酸,所述单链核酸的一端与所述双链核酸连接,所述单链核酸的另一端与待测多肽连接;Wherein, the composite nucleic acid fragment comprises a nucleic acid sequence that can form a hairpin structure, and after forming the hairpin structure, the composite nucleic acid fragment comprises a double-stranded nucleic acid and a single-stranded nucleic acid, one end of the single-stranded nucleic acid is connected to the double-stranded nucleic acid, and the other end of the single-stranded nucleic acid is connected to the polypeptide to be detected; 所述第一核酸片段经修饰后能够与待测多肽、所述复合核酸片段依次连接;The first nucleic acid fragment can be sequentially connected to the polypeptide to be tested and the composite nucleic acid fragment after modification; 优选地,所述单链核酸的长度大于或等于所述待测多肽的长度。Preferably, the length of the single-stranded nucleic acid is greater than or equal to the length of the polypeptide to be detected. 
The kit according to claim 23, characterized in that the composite nucleic acid fragment consists of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence; the third nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the third nucleic acid fragment is complementary to one part of the second nucleic acid fragment to form a double-stranded nucleic acid, and the other part of the second nucleic acid fragment is a single-stranded nucleic acid.
The kit according to claim 23, characterized in that the composite nucleic acid fragment consists of a second nucleic acid fragment and a third nucleic acid fragment connected in sequence; the third nucleic acid fragment comprises a nucleic acid sequence capable of forming a hairpin structure; after the hairpin structure is formed, the two ends of the third nucleic acid fragment are complementary to each other to form a double-stranded nucleic acid; the second nucleic acid fragment is a single-stranded nucleic acid, one end of the second nucleic acid fragment is connected to the double-stranded nucleic acid, and the other end of the second nucleic acid fragment is connected to the polypeptide to be tested.
The kit according to claim 24 or 25, characterized in that the kit further comprises a fourth nucleic acid fragment, wherein the 5'-end nucleic acid sequence of the fourth nucleic acid fragment is complementary to at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment so as to form a double strand, and the 3' end of the fourth nucleic acid fragment is optionally connected to an anchoring moiety;
preferably, after the hairpin structure is formed, the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is more stable than the double-stranded nucleic acid formed by complementarity between the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment;
more preferably, the Tm value of the double-stranded nucleic acid of the hairpin structure of the composite nucleic acid fragment is higher than the Tm value of the double-stranded nucleic acid formed by complementarity between the 5'-end nucleic acid sequence of the fourth nucleic acid fragment and at least a portion of the single-stranded nucleic acid in the composite nucleic acid fragment or the second nucleic acid fragment.
The kit according to claim 26, characterized in that the anchoring moiety is selected from any one or a combination of lipids, fatty acids, sterols, polymer materials, carbon nanotubes, polypeptides, proteins and/or amino acids; preferably, the anchoring moiety is cholesterol, palmitate or tocopherol.
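The "more preferably" condition above compares the Tm of the hairpin duplex with the Tm of the duplex formed by the fourth nucleic acid fragment. As a rough illustration only, the comparison can be sketched with the Wallace rule (Tm ≈ 2 °C per A/T plus 4 °C per G/C), a crude estimate meant for short oligonucleotides; it is not the method used in the application, and it ignores the unimolecular nature of a hairpin. The two sequences below are hypothetical placeholders, not the SEQ ID NO sequences, and the function names are assumptions.

```python
def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


def hairpin_more_stable(hairpin_stem: str, fourth_fragment_duplex: str) -> bool:
    """True if the hairpin stem has the higher estimated Tm of the two duplexes."""
    return wallace_tm(hairpin_stem) > wallace_tm(fourth_fragment_duplex)


if __name__ == "__main__":
    stem = "GCGCGGCCGCGC"      # hypothetical GC-rich hairpin stem (one strand)
    blocker = "ATTAGCATTA"     # hypothetical region paired with the fourth fragment
    print(wallace_tm(stem), wallace_tm(blocker), hairpin_more_stable(stem, blocker))
```

For real designs a nearest-neighbor thermodynamic model with the actual salt and strand concentrations would be the more appropriate tool; the sketch only shows the direction of the comparison the claim asks for.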
The kit according to any one of claims 23 to 27, characterized in that the first nucleic acid fragment, the composite nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment are DNA or nucleic acid analogs; the nucleic acid analogs are preferably LNA or PNA; the first nucleic acid fragment, the second nucleic acid fragment, the third nucleic acid and/or the fourth nucleic acid fragment are preferably single-stranded DNA.
The kit according to any one of claims 23 to 28, characterized in that the sequence of the first nucleic acid fragment is as shown in SEQ ID NO: 2;
and/or, the sequence of the second nucleic acid fragment is as shown in positions 2 to 55 of SEQ ID NO: 1;
and/or, the sequence of the third nucleic acid fragment is as shown in SEQ ID NO: 4;
and/or, the sequence of the fourth nucleic acid fragment is as shown in SEQ ID NO: 5.
The kit according to any one of claims 23 to 29, characterized in that the kit further comprises any one or a combination of a polymerase, a nanopore, an electrically insulating membrane and a reaction buffer.
Use of the nucleic acid-polypeptide complex according to any one of claims 1 to 8, the nucleic acid-protein complex according to claim 10 or 11, or the sequencing complex according to any one of claims 12 to 16 in nanopore sequencing.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/107311 WO2025010736A1 (en) 2023-07-13 2023-07-13 Nucleic acid polypeptide complex and use thereof in peptide sequencing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/107311 WO2025010736A1 (en) 2023-07-13 2023-07-13 Nucleic acid polypeptide complex and use thereof in peptide sequencing

Publications (1)

Publication Number Publication Date
WO2025010736A1 (en) 2025-01-16

Family

ID=94214503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/107311 Pending WO2025010736A1 (en) 2023-07-13 2023-07-13 Nucleic acid polypeptide complex and use thereof in peptide sequencing

Country Status (1)

Country Link
WO (1) WO2025010736A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109295187A (en) * 2018-10-31 2019-02-01 Nanjing University A dislocation sequencing method for direct sequencing of non-natural nucleic acids based on nanopores
CN112147185A (en) * 2019-06-29 2020-12-29 Tsinghua University Method for controlling speed of polypeptide passing through nanopore and application of method
CN114761799A (en) * 2019-12-02 2022-07-15 Oxford Nanopore Technologies PLC Methods of characterizing target polypeptides using nanopores
CN115135772A (en) * 2019-12-24 2022-09-30 Delft University of Technology Protein and peptide fingerprinting and sequencing by nanopore translocation of peptide-oligonucleotide complexes
WO2023118891A1 (en) * 2021-12-23 2023-06-29 Oxford Nanopore Technologies Plc Method of characterising polypeptides using a nanopore

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BRINKERHOFF HENRY, KANG ALBERT S. W., LIU JINGQIAN, AKSIMENTIEV ALEKSEI, DEKKER CEES: "Infinite re-reading of single proteins at single-amino-acid resolution using nanopore sequencing", BIORXIV, 14 July 2021 (2021-07-14), pages 1 - 16, XP093033988, Retrieved from the Internet <URL:https://www.biorxiv.org/content/10.1101/2021.07.13.452225v1.full.pdf> [retrieved on 20230322], DOI: 10.1101/2021.07.13.452225 *
YING LUN, HU ZHENG-LI, ZHANG SHENGLI, QING YUJIA, FRAGASSO ALESSIO: "Nanopore-based technologies beyond DNA sequencing", NATURE NANOTECHNOLOGY, 1 November 2022 (2022-11-01), pages 1136 - 1146, XP055981713, DOI: 10.1038/s41565-022-01193-2 *

Similar Documents

Publication Publication Date Title
US12473595B2 (en) Coupling method
US11649490B2 (en) Method of target molecule characterisation using a molecular pore
EP2836506B1 (en) Mutant lysenin pores
CN104220874B (en) aptamer method
JP7523470B2 (en) Sensing interactions between molecular entities and nanopores
US20240240248A1 (en) Methods for complement strand sequencing
WO2025010736A1 (en) Nucleic acid polypeptide complex and use thereof in peptide sequencing
WO2025129587A1 (en) Nucleic acid-polypeptide-nucleic acid ternary complex, and use thereof in polypeptide nanopore sequencing
US20190093157A1 (en) Rna nanotubes for single molecule sensing and dna/rna/protein sequencing
AU2023251234A1 (en) Method
CN117587110A (en) Enrichment method, method for characterizing analytes and device thereof
CN119753110A (en) Adapter for characterizing analytes, characterization method and use thereof

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23944749

Country of ref document: EP

Kind code of ref document: A1