
WO2024138472A1 - Porin monomer, porin, mutant thereof and use of same - Google Patents

Publication number
WO2024138472A1
WO2024138472A1 PCT/CN2022/143054 CN2022143054W WO2024138472A1 WO 2024138472 A1 WO2024138472 A1 WO 2024138472A1 CN 2022143054 W CN2022143054 W CN 2022143054W WO 2024138472 A1 WO2024138472 A1 WO 2024138472A1
Authority
WO
WIPO (PCT)
Prior art keywords
porin
protein
nanopore
sequencing
pore
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/143054
Other languages
French (fr)
Chinese (zh)
Inventor
刘姗姗
李登辉
孟亮
姬倩悦
蔡重阳
王乐乐
郭斐
曾涛
黎宇翔
董宇亮
章文蔚
徐讯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bgi Qingdao
BGI Shenzhen Co Ltd
Original Assignee
Bgi Qingdao
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bgi Qingdao, BGI Shenzhen Co Ltd
Priority to PCT/CN2022/143054 (WO2024138472A1)
Priority to CN202280102619.6A (CN120359234A)
Publication of WO2024138472A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • the present invention relates to the field of single molecule sequencing, and in particular to a porin monomer, a porin and a mutant thereof, and applications thereof.
  • Nanopore sequencing requires that the sensing region inside the pore protein is sharp enough to have high spatial resolution in both the horizontal and vertical directions. So far, only a few natural proteins such as Mycobacterium smegmatis porin A (MspA) and curli-specific transporter (CsgG) can meet the requirements of becoming pore proteins for single-molecule detectors in the industry. Finding more excellent pore proteins that can be used for single-molecule sequencing through gene mining methods is still an unresolved problem.
  • MspA: Mycobacterium smegmatis porin A
  • CsgG: curli-specific transporter
  • the main purpose of the present invention is to provide a porin monomer, a porin and a mutant thereof and applications thereof, so as to provide a new porin and a nanopore sensor that can be used for nanopore sequencing.
  • the porin monomer includes a protein having any amino acid sequence of SEQ ID NO: 2-SEQ ID NO: 4.
  • a protein construct is provided.
  • the protein construct is composed of two or more porin monomers mentioned above, connected by covalent or non-covalent bonding.
  • a porin is provided, wherein the porin is composed of 7 to 11 porin monomers mentioned above, preferably 9, connected by covalent or non-covalent bonding.
  • porin protein is composed of 9 porin monomers connected non-covalently, and the porin monomers include proteins having any amino acid sequence in SEQ ID NO: 1-SEQ ID NO: 4.
  • the pore diameter of the porin is 0.5 to 3 nm.
  • a kit which comprises the above porin monomer, or the above protein construct, or the above porin.
  • an isolated DNA molecule which has: a nucleotide sequence encoding the above protein monomer; or a nucleotide sequence encoding the above protein construct, or a nucleotide sequence encoding the above porin.
  • DNA molecule that has more than 70%, preferably more than 80%, more preferably more than 90%, further preferably more than 99%, and most preferably more than 99% identity with the nucleotide sequence shown in SEQ ID NO: 5 and encodes a protein with the same function.
  • a recombinant vector which comprises the above DNA molecule.
  • a nanopore sensor which comprises: a membrane layer; and a pore protein inserted into the membrane layer and forming a pore, and when a voltage is applied across the membrane layer, the pore generates current; wherein the pore protein comprises the above-mentioned pore protein.
  • the membrane layer includes a lipid layer or an artificial polymer membrane; preferably, the lipid layer includes amphiphilic lipids; preferably, the amphiphilic lipids contain a phospholipid bilayer; preferably, the lipid layer includes a planar membrane layer or a liposome; preferably, the liposome includes a multilamellar liposome or a unilamellar liposome; preferably, the lipid layer includes a phospholipid bilayer composed of diphytanoylphosphatidylcholine.
  • when a voltage is applied across the membrane layer, the biological molecules to be detected translocate through the pore in the nanopore sensor, and the pore generates a changing current; preferably, the biological molecules to be detected include DNA, RNA or polypeptides; preferably, the DNA and/or RNA include any one or more of the following modified bases: 5-methylcytosine, 6-methyladenine, 7-methylguanine, pseudouracil.
  • a nanopore sequencing device which comprises the above nanopore sensor.
  • the nanopore sequencing device includes: an electrolytic cell, the electrolytic cell contains a sequencing buffer; a nanopore sensor, the nanopore sensor is located in the center of the electrolytic cell and divides the electrolytic cell and the sequencing buffer into a positive electrolyte area and a negative electrolyte area; a first electrode and a second electrode, the first electrode and the second electrode are respectively arranged in the positive electrolyte area and the negative electrolyte area, and the first electrode and the second electrode are connected to the signal processing chip; preferably, the first electrode and the second electrode include metal or composite electrode materials; preferably, the first electrode and the second electrode are different, silver and silver chloride, respectively; or the first electrode and the second electrode are the same, including gold, platinum, graphene or titanium nitride.
  • a sequencing method which utilizes the above-mentioned pore protein, or the above-mentioned nanopore sensor, or the above-mentioned nanopore sequencing device to determine the sequence of the biological molecule to be tested by detecting and analyzing the electrical signal generated when the biological molecule to be tested passes through the pore of the pore protein.
  • biomolecules to be detected include modified or unmodified DNA, RNA or polypeptide; preferably, the electrical signal includes electric current.
  • the biological molecule to be detected is a target nucleic acid sequence
  • the sequencing method includes: (a) contacting the nucleic acid sequence with the above-mentioned pore protein and nucleic acid binding protein, so that the nucleic acid binding protein controls the movement speed of the target nucleic acid sequence through the pore of the pore protein, wherein the nucleic acid binding protein is selected from any one or more of nucleases, polymerases, topoisomerases, ligases, helicases or single-stranded binding proteins; (b) when a voltage is applied across the pore, when the nucleic acid sequence moves through the pore, measuring the electrical signal passing through the pore, wherein different types of nucleotides generate different electrical signals when passing through the pore, thereby determining the sequence information of the nucleic acid based on the electrical signal.
  • the present invention provides a new pore protein and its mutants that can be used for nanopore sequencing.
  • the nanopore sensor composed of the protein or its mutants has good stability, can meet the needs of single-molecule nanopore sequencing, and can detect small biological molecules such as nucleotides, amino acids, sugars and vitamins. It can also be used for sequencing modified or unmodified DNA, RNA or polypeptides.
  • FIG. 1 shows a side view of the three-dimensional structure of BCP35 predicted according to Example 1 of the present invention.
  • FIG. 2 shows a top view of the three-dimensional structure of BCP35 predicted according to Example 1 of the present invention.
  • FIG. 3 shows a schematic diagram of the predicted structure of key amino acids in the gating region of BCP35 according to Example 1 of the present invention and the distances between key amino acids.
  • FIG. 4 shows an enlarged view of the structure of key amino acids in the gating region of BCP35 according to Example 1 of the present invention.
  • FIG. 5 shows a schematic diagram of key amino acids at the entrance of the predicted structure of BCP35 according to Example 1 of the present invention.
  • FIG. 6 shows a schematic diagram of key amino acids at the inner wall and the exit of the pore in the predicted structure of BCP35 according to Example 1 of the present invention.
  • FIG. 8 shows an SDS-PAGE image obtained by purification of BCP35 protein according to Example 4 of the present invention.
  • FIG. 11 shows a graph of the pore opening current of BCP35 according to Example 6 of the present invention at different voltages in a phospholipid bilayer.
  • FIG. 12 shows a graph showing the current changes when the DNA to be tested passes through the nanopore BCP35 according to Example 7 of the present invention.
  • a porin monomer which comprises: (a) a protein having an amino acid sequence as shown in SEQ ID NO: 1; or (b) a protein mutant whose amino acid sequence has one or more amino acids substituted, deleted and/or added at at least one of the following positions of SEQ ID NO: 1: 91, 98, 99, 125, 126, 127, 131, 134, 136, 140, 152, 155, 174, 177, 183, 185, 188, 219, 222, 223, 230, 232, 238, 239, the protein mutant having the function of forming a pore structure by polymerization; or (c) a protein having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% identity with the protein in (a) or (b), and having the function of forming a pore structure by polymerization.
  • the porin monomer defined in (a) above can polymerize to form the porin BCP35 having a pore structure, and when applied to nanopore sequencing, it can allow the biomolecules to be tested to pass through the pore one by one, generating a current signal.
  • if the protein is mutated, for example at the mutation sites disclosed in (b), the pore structure and function of the porin can still be obtained after substitution and/or deletion and/or addition of one or several amino acids.
  • Mutating the porin monomer may affect the stability of the protein and aggregates, the inner diameter of the pore, and the amino acid residues on the inner wall of the pore, thereby affecting its physicochemical properties and the passing performance of the biomolecules to be tested, but the conventional operation mode of mutation, and the method of screening and obtaining proteins with nanopore structure and functional activity are well known to those skilled in the art.
  • Identity in this specification refers to the "identity" between amino acid sequences or nucleic acid sequences, that is, the total ratio of the same type of amino acid residues or nucleotides in the amino acid sequence or nucleic acid sequence.
  • the identity of amino acid sequences or nucleic acid sequences can be determined using alignment programs such as BLAST (Basic Local Alignment Search Tool) and FASTA.
  • Proteins with 70%, 75%, 80%, 85%, 90%, 95%, 99% or more (e.g. 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even 99.9% or more) identity and the same function have a high probability of having the same active site, active pocket, mechanism of action, protein structure, etc. as the protein provided by sequence (a).
  • The nanopore sequencing device contains an electrolytic cell holding an electrolyte, a nanopore sensor, a first electrode, and a second electrode.
  • the nanopore sensor is placed in the center of the electrolytic cell containing the electrolyte, and divides the electrolytic cell into a positive electrolyte area and a negative electrolyte area.
  • the two areas are each provided with an electrode, and the electrodes are used to form an electric field applied to the nanopore sensor.
  • the biological molecules to be tested pass through the nanopore protein on the membrane, generating a current amplitude, which is received and transmitted to a signal processing chip connected to the electrodes. Based on the differences in current amplitude, the signal processing chip (that is, the nanopore sequencing device including the signal processing chip) can analyze the data and determine the sequence of the biological molecules to be tested.
  • Figure 12 shows the current changes generated when the library DNA passes through the nanopore protein BCP35 under the action of an applied voltage of 0.14V. It can be seen that the library DNA can pass through the wild-type BCP35 pore protein to generate current, and the current value fluctuates as different nucleotides penetrate the pore. Similarly, when the library DNA passed through the porin BCP35 mutants (P91A, S98N, and N99S), current changes were also generated, indicating that the porin BCP35 and its mutants can be used for nanopore sequencing.
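
The definition of identity given above (the ratio of identical residues between two amino acid or nucleic acid sequences) can be illustrated with a short calculation. The following is a minimal sketch, not the patent's method: it assumes the two sequences have already been aligned (for example by BLAST or FASTA) and are supplied as equal-length strings with "-" marking gaps. The function names and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions made for illustration; alignment programs may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are ignored, and
    the denominator is the number of remaining aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those that are gap-versus-gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the identity reaches the given threshold (70% is the lowest tier in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("MKTA", "MKTG")` compares four columns, three of which match, so the pair would satisfy the 70% tier but not the 80% tier.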

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Plant Pathology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Provided in the present invention are a porin monomer, a porin, a mutant thereof and the use of same. The porin monomer comprises: (a) a protein having an amino acid sequence as shown in SEQ ID NO: 1; or (b) a protein mutant the amino acid sequence of which is obtained by means of substitution, deletion and/or addition of one or several amino acids at at least one of the following sites in SEQ ID NO: 1: the 91st site, the 98th site, the 99th site, etc., the protein mutant having the function of forming pore structures by means of polymerization; or (c) a protein having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% identity to the protein of (a) or (b), and having the function of forming pore structures by means of polymerization. Provided is a novel nanopore sensor capable of being used for nanopore sequencing, which is applicable to the field of single molecule sequencing.

Description

Porin monomer, porin, mutant thereof and application thereof

Technical Field

The present invention relates to the field of single-molecule sequencing, and in particular to a porin monomer, a porin and mutants thereof, and applications thereof.

Background Art

At present, several commercial nanopore sequencers are on the market, such as the MinION, GridION and PromethION from Oxford Nanopore Technologies (UK) and the QNome-3841 from Qitan Technology. However, they still have major shortcomings in sequencing accuracy, throughput, chip stability and range of applicable scenarios, and cannot meet the ultimate needs of molecular biology research. There is therefore an urgent need for a single-molecule sequencer with high accuracy, high integration and high stability. A nanopore-based single-molecule sequencer is a detection system that tightly integrates multiple disciplines and technologies. Developing such an instrument requires deep interdisciplinary collaboration across physics, biology, chemistry, semiconductors and computer science, building a high-precision single-molecule nanopore sequencing system up from its core modules.

Nanopore sequencing requires the sensing region inside the pore protein to be sharp enough to give high spatial resolution in both the lateral and longitudinal directions. So far, only a few natural proteins, such as Mycobacterium smegmatis porin A (MspA) and the curli-specific transport channel (CsgG), meet the industrial requirements for a pore protein in a single-molecule detector. Finding more high-performing pore proteins for single-molecule sequencing through gene mining remains an unsolved problem.

Summary of the Invention

The main purpose of the present invention is to provide a porin monomer, a porin and mutants thereof, and applications thereof, so as to provide a new porin and a nanopore sensor that can be used for nanopore sequencing.

To achieve the above purpose, according to a first aspect of the present invention, a porin monomer is provided, which comprises: (a) a protein having the amino acid sequence shown in SEQ ID NO: 1; or (b) a protein mutant whose amino acid sequence has one or several amino acids substituted, deleted and/or added at at least one of the following positions of SEQ ID NO: 1: 91, 98, 99, 125, 126, 127, 131, 134, 136, 140, 152, 155, 174, 177, 183, 185, 188, 219, 222, 223, 230, 232, 238, 239, the protein mutant having the function of forming a pore structure by polymerization; or (c) a protein having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% identity with the protein in (a) or (b), and having the function of forming a pore structure by polymerization.

Further, in (b), the types of substituted amino acids are each independently selected from the following: P91 is mutated to P91G, P91A or P91T; S98 is mutated to S98G, S98A, S98T, S98N or S98Q; N99 is mutated to N99G, N99A, N99S, N99T or N99Q; E125 is mutated to E125K, E125R, E125G, E125A, E125S, E125T, E125N or E125Q; R126 is mutated to R126K, R126G, R126A, R126S, R126T, R126N or R126Q; K127 is mutated to K127R, K127G, K127A, K127S, K127T, K127N or K127Q; D131 is mutated to D131K, D131R, D131G, D131A, D131S, D131T, D131N or D131Q; K134 is mutated to K134R, K134G, K134A, K134S, K134T, K134N or K134Q; R136 is mutated to R136K, R136G, R136A, R136S, R136T, R136N or R136Q; R140 is mutated to R140K, R140G, R140A, R140S, R140T, R140N or R140Q; K152 is mutated to K152R, K152G, K152A, K152S, K152T, K152N or K152Q; D155 is mutated to D155R, D155K, D155G, D155A, D155S, D155T, D155N or D155Q; T174 is mutated to T174A, T174G, T174V, T174L, T174I, T174Y, T174F or T174W; E177 is mutated to E177A, E177G, E177S, E177T, E177N or E177Q; H183 is mutated to H183A, H183G, H183V, H183L, H183I, H183Y, H183F or H183W; K185 is mutated to K185A, K185G, K185V, K185L, K185I, K185Y, K185F or K185W; R188 is mutated to R188A, R188G, R188S, R188T, R188N or R188Q; K219 is mutated to K219A, K219G, K219V, K219L, K219I, K219Y, K219F or K219W; R222 is mutated to R222A, R222G, R222S, R222T, R222N or R222Q; S223 is mutated to S223A, S223G, S223V, S223L, S223I, S223Y, S223F or S223W; R230 is mutated to R230A, R230G, R230S, R230T, R230N or R230Q; K232 is mutated to K232A, K232G, K232S, K232T, K232N or K232Q; E238 is mutated to E238A, E238G, E238S, E238T, E238N or E238Q; R239 is mutated to R239A, R239G, R239V, R239L, R239I, R239Y, R239F or R239W.
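
The substitution labels above follow the common notation of wild-type residue, 1-based position, then replacement residue (for example, P91A). The sketch below shows one way such labels could be parsed, checked against the listed options, and applied to a sequence. It is illustrative only: the table is transcribed from the list above, the function names are hypothetical, and `TOY_SEQUENCE` is an invented placeholder, since SEQ ID NO: 1 is not reproduced in this section.

```python
import re

# position -> (wild-type residue, allowed replacement residues), transcribed
# from the substitution list in the text above.
ALLOWED = {
    91: ("P", "GAT"), 98: ("S", "GATNQ"), 99: ("N", "GASTQ"),
    125: ("E", "KRGASTNQ"), 126: ("R", "KGASTNQ"), 127: ("K", "RGASTNQ"),
    131: ("D", "KRGASTNQ"), 134: ("K", "RGASTNQ"), 136: ("R", "KGASTNQ"),
    140: ("R", "KGASTNQ"), 152: ("K", "RGASTNQ"), 155: ("D", "RKGASTNQ"),
    174: ("T", "AGVLIYFW"), 177: ("E", "AGSTNQ"), 183: ("H", "AGVLIYFW"),
    185: ("K", "AGVLIYFW"), 188: ("R", "AGSTNQ"), 219: ("K", "AGVLIYFW"),
    222: ("R", "AGSTNQ"), 223: ("S", "AGVLIYFW"), 230: ("R", "AGSTNQ"),
    232: ("K", "AGSTNQ"), 238: ("E", "AGSTNQ"), 239: ("R", "AGVLIYFW"),
}

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Placeholder only (NOT SEQ ID NO: 1): alanines with P, S and N placed at
# positions 91, 98 and 99 so that the examples below can run.
TOY_SEQUENCE = "A" * 90 + "P" + "A" * 6 + "SN"


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'P91A' into (wild type, 1-based position, replacement)."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def is_listed(label: str) -> bool:
    """True if the substitution appears among the options listed in the text."""
    wild_type, position, replacement = parse_mutation(label)
    entry = ALLOWED.get(position)
    return entry is not None and entry[0] == wild_type and replacement in entry[1]


def apply_mutation(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution described by `label`."""
    wild_type, position, replacement = parse_mutation(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when positions are counted from 1 in the text but from 0 in code.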

Furthermore, the porin monomer includes a protein having any one of the amino acid sequences of SEQ ID NO: 2 to SEQ ID NO: 4.

To achieve the above purpose, according to a second aspect of the present invention, a protein construct is provided. The protein construct is formed from two or more of the above porin monomers, linked covalently or non-covalently.

To achieve the above purpose, according to a third aspect of the present invention, a porin is provided. The porin is formed from 7 to 11, preferably 9, of the above porin monomers, linked covalently or non-covalently.

Furthermore, the porin is formed from 9 porin monomers linked non-covalently, and the porin monomers include proteins having any one of the amino acid sequences of SEQ ID NO: 1 to SEQ ID NO: 4.

Furthermore, the pore diameter of the porin is 0.5 to 3 nm.

To achieve the above purpose, according to a fourth aspect of the present invention, a kit is provided, which comprises the above porin monomer, or the above protein construct, or the above porin.

Furthermore, the kit also includes a membrane layer, and the membrane layer includes a lipid layer or an artificial polymer membrane.

Furthermore, the kit also includes at least one of the following: a sequencing buffer, a nuclease, a polymerase, a topoisomerase, a ligase, a helicase, and cholesterol-linked single-stranded DNA.

To achieve the above purpose, according to a fifth aspect of the present invention, an isolated DNA molecule is provided, which has: a nucleotide sequence encoding the above protein monomer; or a nucleotide sequence encoding the above protein construct; or a nucleotide sequence encoding the above porin.

Furthermore, the DNA molecule has more than 70%, preferably more than 80%, more preferably more than 90%, further preferably more than 99%, and most preferably more than 99% identity with the nucleotide sequence shown in SEQ ID NO: 5 and encodes a protein with the same function.

To achieve the above purpose, according to a sixth aspect of the present invention, a recombinant vector is provided, which comprises the above DNA molecule.

To achieve the above purpose, according to a seventh aspect of the present invention, a host cell is provided, which is transformed with the above recombinant vector.

To achieve the above purpose, according to an eighth aspect of the present invention, a nanopore sensor is provided, which comprises: a membrane layer; and a porin inserted into the membrane layer and forming a pore, the pore generating a current when a voltage is applied across the membrane layer; wherein the porin comprises the above porin.

Furthermore, the membrane layer includes a lipid layer or an artificial polymer membrane; preferably, the lipid layer includes amphiphilic lipids; preferably, the amphiphilic lipids contain a phospholipid bilayer; preferably, the lipid layer includes a planar membrane layer or a liposome; preferably, the liposome includes a multilamellar liposome or a unilamellar liposome; preferably, the lipid layer includes a phospholipid bilayer composed of diphytanoylphosphatidylcholine.

Furthermore, when a voltage is applied across the membrane layer, the biological molecules to be detected translocate through the pore in the nanopore sensor, and the pore generates a changing current; preferably, the biological molecules to be detected include DNA, RNA or polypeptides; preferably, the DNA and/or RNA include any one or more of the following modified bases: 5-methylcytosine, 6-methyladenine, 7-methylguanine, pseudouracil.

To achieve the above purpose, according to a ninth aspect of the present invention, a nanopore sequencing device is provided, which comprises the above nanopore sensor.

Furthermore, the nanopore sequencing device includes: an electrolytic cell containing a sequencing buffer; a nanopore sensor located in the center of the electrolytic cell and dividing the electrolytic cell and the sequencing buffer into a positive electrolyte area and a negative electrolyte area; and a first electrode and a second electrode, arranged in the positive electrolyte area and the negative electrolyte area respectively and connected to a signal processing chip; preferably, the first electrode and the second electrode include metal or composite electrode materials; preferably, the first electrode and the second electrode are different, being silver and silver chloride respectively; or the first electrode and the second electrode are the same, including gold, platinum, graphene or titanium nitride.

To achieve the above purpose, according to a tenth aspect of the present invention, a sequencing method is provided, which uses the above porin, or the above nanopore sensor, or the above nanopore sequencing device to determine the sequence of the biological molecule to be tested by detecting and analyzing the electrical signal generated when that molecule passes through the pore of the porin.

Furthermore, the biological molecules to be detected include modified or unmodified DNA, RNA or polypeptides; preferably, the electrical signal includes an electric current.

Furthermore, the biological molecule to be detected is a target nucleic acid sequence, and the sequencing method includes: (a) contacting the nucleic acid sequence with the above porin and a nucleic acid binding protein, so that the nucleic acid binding protein controls the speed at which the target nucleic acid sequence moves through the pore of the porin, wherein the nucleic acid binding protein is selected from any one or more of nucleases, polymerases, topoisomerases, ligases, helicases or single-stranded binding proteins; (b) with a voltage applied across the pore, measuring the electrical signal passing through the pore as the nucleic acid sequence moves through it, wherein different types of nucleotides generate different electrical signals when passing through the pore, so that the sequence information of the nucleic acid is determined from the electrical signal.
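
Step (b) rests on the statement that different nucleotide types produce different electrical signals. The toy sketch below illustrates only that principle, by assigning each measured current level to the nearest of four reference levels. Every number and name in it is hypothetical: the reference currents are invented, not measured values for BCP35 or any other pore, and the trace is assumed to have already been split into one mean current per nucleotide. In practice, the signal from a pore typically reflects several neighbouring nucleotides at once, and base calling relies on statistical or neural-network models rather than a lookup of this kind.

```python
# Hypothetical reference current levels in picoamperes (invented for
# illustration; not measured values for any real pore).
REFERENCE_LEVELS_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}


def call_base(mean_current_pa: float, levels: dict[str, float] = REFERENCE_LEVELS_PA) -> str:
    """Return the nucleotide whose reference level is closest to the measurement."""
    return min(levels, key=lambda base: abs(levels[base] - mean_current_pa))


def call_sequence(segment_means_pa: list[float]) -> str:
    """Call one base per pre-segmented current level and join them into a sequence."""
    return "".join(call_base(value) for value in segment_means_pa)
```

With these invented levels, `call_sequence([49.2, 41.0, 58.7, 31.5])` assigns the four segments to A, C, G and T in turn.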

为了实现上述目的,根据本发明的第十一个方面,提供了一种上述孔蛋白单体,或者上述孔蛋白,或者上述试剂盒,或者上述DNA分子,或者上述重组载体,或者上述宿主细胞,或者上述纳米孔传感器,或者上述纳米孔测序装置,或者上述测序方法在生物小分子检测、核酸测序或多肽测序中的应用。In order to achieve the above-mentioned purpose, according to the eleventh aspect of the present invention, there is provided the above-mentioned porin monomer, or the above-mentioned porin, or the above-mentioned kit, or the above-mentioned DNA molecule, or the above-mentioned recombinant vector, or the above-mentioned host cell, or the above-mentioned nanopore sensor, or the above-mentioned nanopore sequencing device, or the above-mentioned sequencing method for use in biological small molecule detection, nucleic acid sequencing or polypeptide sequencing.

The present invention provides a new porin, and mutants thereof, that can be used for nanopore sequencing. A nanopore sensor composed of the protein or its mutants has good stability, can meet the needs of single-molecule nanopore sequencing, enables the detection of small biomolecules such as nucleotides, amino acids, sugars and vitamins, and can also be used for sequencing modified or unmodified DNA, RNA or polypeptides.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

FIG. 1 shows a side view of the predicted three-dimensional structure of BCP35 according to Example 1 of the present invention.

FIG. 2 shows a top view of the predicted three-dimensional structure of BCP35 according to Example 1 of the present invention.

FIG. 3 shows a schematic diagram of the predicted structure of the key amino acids in the gating region of BCP35 according to Example 1 of the present invention, together with the distances between those amino acids.

FIG. 4 shows an enlarged view of the structure of the key amino acids in the gating region of BCP35 according to Example 1 of the present invention.

FIG. 5 shows a schematic diagram of the key amino acids at the entrance in the predicted structure of BCP35 according to Example 1 of the present invention.

FIG. 6 shows a schematic diagram of the key amino acids on the inner wall of the pore and at the exit in the predicted structure of BCP35 according to Example 1 of the present invention.

FIG. 7 shows a schematic diagram of the key amino acids in the transmembrane region in the predicted structure of BCP35 according to Example 1 of the present invention.

FIG. 8 shows the SDS-PAGE image obtained from the purification of the BCP35 protein according to Example 4 of the present invention.

FIG. 9 shows schematic diagrams of the sequencing library structure according to Example 5 of the present invention, wherein A is the sequencing library and B is the sequencing library bound to cholesterol-tagged single-stranded DNA; a: sense strand; b: antisense strand; c: double-stranded target fragment to be tested; d: helicase BCH105; e: cholesterol-tagged single-stranded DNA.

FIG. 10 shows the specific structure of iSp18 according to Example 5 of the present invention.

FIG. 11 shows the open-pore current of BCP35 in a phospholipid bilayer at different voltages according to Example 6 of the present invention.

FIG. 12 shows the current changes as the DNA to be tested passes through the BCP35 nanopore according to Example 7 of the present invention.

DETAILED DESCRIPTION

It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments can be combined with each other. The present invention is described in detail below with reference to the embodiments.

As mentioned in the Background, very few types of porin can currently be used for single-molecule nanopore sequencing, so the choice is very limited. In this application, the inventors therefore used gene mining guided by computer-assisted structure prediction to mine a new nanopore protein from a deep-sea metagenome (derived from samples taken at a depth of 11,000 meters in the Mariana Trench). The protein, named BCP35, is assembled from nine monomers of the same type. Because it has a pore structure, it can serve as a detection protein for small biomolecules such as nucleotides, amino acids, sugars and vitamins, or for nanopore-based DNA, RNA or polypeptide sequencing. On this basis, the series of technical solutions of this application is proposed.

In a first typical embodiment of this application, a porin monomer is provided, which comprises: (a) a protein having the amino acid sequence shown in SEQ ID NO: 1; or (b) a protein mutant whose amino acid sequence has one or several amino acids substituted, deleted and/or added at at least one of the following positions of SEQ ID NO: 1: 91, 98, 99, 125, 126, 127, 131, 134, 136, 140, 152, 155, 174, 177, 183, 185, 188, 219, 222, 223, 230, 232, 238 and 239, wherein the protein mutant retains the function of forming a pore structure by polymerization; or (c) a protein that has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% identity with the protein in (a) or (b) and has the function of forming a pore structure by polymerization.

SEQ ID NO: 1:

Figure PCTCN2022143054-appb-000001

The porin monomer defined in (a) above can polymerize to form the porin BCP35, which has a pore structure. When applied to nanopore sequencing, it allows the biomolecules to be tested to pass through the pore one by one, generating a current signal. When the protein is mutated on the basis of the sequence in (a), for example by substituting, deleting and/or adding one or several amino acids at the mutation sites disclosed in (b), a protein that retains the pore structure and function of the above porin can still be obtained. Mutating the porin monomer may affect the stability of the protein and its assemblies, the inner diameter of the pore, and the amino acid residues on the inner wall of the pore, and thereby its physicochemical properties and the passage of the biomolecules to be tested. However, the routine procedures for mutation, and the methods for screening proteins that have a nanopore structure and functional activity, are well known to those skilled in the art.

In this specification, identity refers to the "identity" between amino acid sequences or between nucleic acid sequences, that is, the proportion of positions at which the compared sequences carry the same amino acid residue or nucleotide. The identity of amino acid sequences or nucleic acid sequences can be determined using alignment programs such as BLAST (Basic Local Alignment Search Tool) and FASTA.
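As an illustration of the definition above, the following minimal sketch computes percent identity for two sequences that have already been aligned (for example by BLAST or FASTA). The function name, the gap character and the choice of denominator (all aligned columns, excluding columns where both sequences have a gap) are assumptions made for this example; alignment programs may report identity using a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: both inputs come from the same pairwise alignment and
    therefore have equal length, with `gap` marking alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(percent_identity("ACDE", "ACDF"))
```

A value returned by such a function can then be compared against thresholds such as 70% or 99%.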

A protein that has 70%, 75%, 80%, 85%, 90%, 95%, 99% or more (for example 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even 99.9% or more) identity with the protein provided by sequence (a), and has the same function, will with high probability have the same active site, active pocket, mechanism of action and protein structure as that protein.

As used herein, amino acid residues are abbreviated as follows: alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y) and valine (Val; V).

In general, under the usual substitution rules, replacing an amino acid with one of similar properties has a similar effect. For example, conservative amino acid substitutions may be made in the above protein. "Conservative amino acid substitutions" include, but are not limited to:

a hydrophobic amino acid (Ala, Cys, Gly, Pro, Met, Val, Ile, Leu) replaced by another hydrophobic amino acid;

a hydrophobic amino acid with a bulky side chain (Phe, Tyr, Trp) replaced by another hydrophobic amino acid with a bulky side chain;

an amino acid with a positively charged side chain (Arg, His, Lys) replaced by another amino acid with a positively charged side chain;

an amino acid with a polar, uncharged side chain (Ser, Thr, Asn, Gln) replaced by another amino acid with a polar, uncharged side chain.

Those skilled in the art may also make conservative amino acid substitutions according to well-known substitution rules from the prior art, such as the BLOSUM62 scoring matrix.
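The four substitution groups listed above can be written down as a small lookup, sketched below. The group names, the dictionary and the function name are illustrative assumptions; the sketch covers only the groups named in the text and does not implement BLOSUM62 scoring.

```python
# One-letter codes for the four groups named in the text.
CONSERVATIVE_GROUPS = {
    "hydrophobic": set("ACGPMVIL"),      # Ala, Cys, Gly, Pro, Met, Val, Ile, Leu
    "bulky_hydrophobic": set("FYW"),     # Phe, Tyr, Trp
    "positive": set("RHK"),              # Arg, His, Lys
    "polar_uncharged": set("STNQ"),      # Ser, Thr, Asn, Gln
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same group listed above.

    Residues outside the listed groups (for example Asp and Glu) are
    never reported as conservative by this sketch.
    """
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())


if __name__ == "__main__":
    print(is_conservative("K", "R"))  # both in the positively charged group
    print(is_conservative("E", "K"))  # Glu is in none of the listed groups
```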

AlphaFold2-Multimer, as used in this application, is a publicly available artificial intelligence model for predicting the conformation of protein complexes. Its predictions of three-dimensional protein structure can come very close to what is observed experimentally with equipment such as cryo-electron microscopes, yielding a reasonably realistic protein structure that can guide the study of protein structure and activity.

In a preferred embodiment, in (b), the substituted amino acids are each independently selected as follows: P91 is mutated to P91G, P91A or P91T; S98 is mutated to S98G, S98A, S98T, S98N or S98Q; N99 is mutated to N99G, N99A, N99S, N99T or N99Q; E125 is mutated to E125K, E125R, E125G, E125A, E125S, E125T, E125N or E125Q; R126 is mutated to R126K, R126G, R126A, R126S, R126T, R126N or R126Q; K127 is mutated to K127R, K127G, K127A, K127S, K127T, K127N or K127Q; D131 is mutated to D131K, D131R, D131G, D131A, D131S, D131T, D131N or D131Q; K134 is mutated to K134R, K134G, K134A, K134S, K134T, K134N or K134Q; R136 is mutated to R136K, R136G, R136A, R136S, R136T, R136N or R136Q; R140 is mutated to R140K, R140G, R140A, R140S, R140T, R140N or R140Q; K152 is mutated to K152R, K152G, K152A, K152S, K152T, K152N or K152Q; D155 is mutated to D155R, D155K, D155G, D155A, D155S, D155T, D155N or D155Q; T174 is mutated to T174A, T174G, T174V, T174L, T174I, T174Y, T174F or T174W; E177 is mutated to E177A, E177G, E177S, E177T, E177N or E177Q; H183 is mutated to H183A, H183G, H183V, H183L, H183I, H183Y, H183F or H183W; K185 is mutated to K185A, K185G, K185V, K185L, K185I, K185Y, K185F or K185W; R188 is mutated to R188A, R188G, R188S, R188T, R188N or R188Q; K219 is mutated to K219A, K219G, K219V, K219L, K219I, K219Y, K219F or K219W; R222 is mutated to R222A, R222G, R222S, R222T, R222N or R222Q; S223 is mutated to S223A, S223G, S223V, S223L, S223I, S223Y, S223F or S223W; R230 is mutated to R230A, R230G, R230S, R230T, R230N or R230Q; K232 is mutated to K232A, K232G, K232S, K232T, K232N or K232Q; E238 is mutated to E238A, E238G, E238S, E238T, E238N or E238Q; R239 is mutated to R239A, R239G, R239V, R239L, R239I, R239Y, R239F or R239W.
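Each label in the list above follows the usual point-mutation notation: the wild-type residue, its 1-based position in SEQ ID NO: 1, and the replacement residue. The sketch below parses such labels and applies them to a sequence. Because SEQ ID NO: 1 is given only as a figure, the demonstration uses a short made-up sequence; the function names are likewise assumptions for illustration.

```python
import re

# Wild-type residue, 1-based position, replacement residue (e.g. "P91G").
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'P91G' into ('P', 91, 'G')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(sequence: str, labels: list[str]) -> str:
    """Apply point substitutions, checking the expected wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: sequence has {residues[position - 1]} "
                             f"at position {position}, not {wild_type}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy = "MPSNE"  # hypothetical sequence, NOT SEQ ID NO: 1
    print(apply_mutations(toy, ["P2G", "E5K"]))
```

The wild-type check is a deliberate design choice: it catches numbering mistakes before a mutant sequence is ordered or expressed.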

Most of the above amino acid sites are located at the entrance, inner wall and exit of the porin, and are important for capturing the library and for the smooth passage of library DNA through the porin. Because nucleic acids are negatively charged, increasing or decreasing the amount of positive charge at the entrance can tune the library capture ability of the porin.

The above mutation sites fall mainly into three classes: sites in the gating region (sensor region), sites at the entrance of the porin, and sites in the transmembrane region of the porin. The sites in the sensor region mainly determine how open the pore is, and therefore directly determine the open-pore current. Because the nucleic acid to be tested is negatively charged, changing the amino acids at the entrance sites, including but not limited to mutating uncharged or negatively charged amino acids to positively charged ones, or mutating positively charged amino acids to other types, can tune the capture rate of the library. Mutations in the transmembrane region can improve the stability with which the porin inserts into a lipid or polymer membrane. In addition, charged amino acids on the inner wall or at the exit of the pore structure can also affect the translocation of the sample to be tested.
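The effect of an entrance mutation on charge can be estimated with simple bookkeeping, sketched below. The sketch assumes nominal side-chain charges near neutral pH (Lys and Arg +1, Asp and Glu -1, all other residues, including His, treated as 0) and assumes that the same substitution is present in all nine monomers of the pore; both are simplifications introduced for this example and are not stated in this form in the text.

```python
# Nominal side-chain charges near neutral pH (simplifying assumption;
# His is treated as uncharged here).
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

MONOMERS_PER_PORE = 9  # BCP35 is described as a nonamer


def charge_change(wild_type: str, replacement: str) -> int:
    """Change in nominal charge per monomer for one substitution."""
    return (SIDE_CHAIN_CHARGE.get(replacement, 0)
            - SIDE_CHAIN_CHARGE.get(wild_type, 0))


def pore_charge_change(wild_type: str, replacement: str) -> int:
    """Change per assembled pore if every monomer carries the substitution."""
    return MONOMERS_PER_PORE * charge_change(wild_type, replacement)


if __name__ == "__main__":
    # E125K swaps a negative residue for a positive one.
    print(charge_change("E", "K"), pore_charge_change("E", "K"))
```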

Among the above mutation sites, P91, S98 and N99 are located in the sensor region. E125, R126, K127, D131, K134, R136, R140, K152 and D155 are located at the entrance of the porin, and mutations that change their charge play an important role in tuning the library capture rate and the sequencing noise. T174, H183, K185, K219, S223 and R239 are located on the outer wall of the transmembrane region, and mutating them to hydrophobic amino acids can improve the stability of membrane insertion of the porin. E177, R188, R222 and E238 are charged amino acids on the inner wall of the porin barrel, and mutations that change their charge can help DNA or other molecules to be tested pass smoothly through the porin. R230 and K232 are located in the exit loop region of the pore structure, and mutating them can help the nucleic acid strand leave the porin after sequencing.
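The site-to-region assignment described above can be captured in a lookup table, sketched below. The positions and regions are taken from the paragraph above; the dictionary layout, region labels and function name are assumptions made for illustration.

```python
# Positions in SEQ ID NO: 1 grouped by the region named in the text.
SITE_REGIONS = {
    "sensor (gating) region": [91, 98, 99],
    "pore entrance": [125, 126, 127, 131, 134, 136, 140, 152, 155],
    "transmembrane outer wall": [174, 183, 185, 219, 223, 239],
    "pore inner wall": [177, 188, 222, 238],
    "exit loop": [230, 232],
}

# Inverted index: position -> region.
REGION_BY_POSITION = {
    position: region
    for region, positions in SITE_REGIONS.items()
    for position in positions
}


def region_of(position: int) -> str:
    """Region of a listed mutation site, or 'not listed' otherwise."""
    return REGION_BY_POSITION.get(position, "not listed")


if __name__ == "__main__":
    for site in (91, 125, 174, 177, 230):
        print(site, region_of(site))
```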

In a preferred embodiment, the porin monomer includes a protein having the amino acid sequence of any one of SEQ ID NO: 2 to SEQ ID NO: 4.

SEQ ID NO: 2:

Figure PCTCN2022143054-appb-000002

SEQ ID NO: 3:

Figure PCTCN2022143054-appb-000003

Figure PCTCN2022143054-appb-000004

SEQ ID NO: 4:

Figure PCTCN2022143054-appb-000005

In a second typical embodiment of this application, a protein construct is provided, which is formed from two or more of the above porin monomers linked covalently or non-covalently.

In a third typical embodiment of this application, a porin is provided, which is formed from 7 to 11, preferably 9, of the above porin monomers linked covalently or non-covalently.

Porin monomers can spontaneously associate through forces such as hydrogen bonds, ionic bonds and hydrophobic interactions, and thereby polymerize into the porin. Porin monomers obtained by expression and purification therefore exist as multimers, in particular nonamers, under non-denaturing conditions, whereas the denatured protein exists as porin monomers.

In a preferred embodiment, the porin is formed from 9 porin monomers linked non-covalently, and the porin monomers include a protein having the amino acid sequence of any one of SEQ ID NO: 1 to SEQ ID NO: 4.

In a preferred embodiment, the pore diameter of the porin is 0.5 to 3 nm.

Within a certain range, the smaller the pore diameter of a nanopore protein, the higher its accuracy in sequencing. If the pore diameter is too large, more than one molecule may pass through the pore at a time, which makes it difficult to meet the needs of single-molecule sequencing: when the biomolecule to be tested passes through an overly large pore, the current signal may be missed or erroneous, resulting in low sequencing accuracy. In single-molecule sequencing, accurate results are obtained by sequencing the same molecule multiple times, so the higher the accuracy of each read, the fewer the reads and the less time required. Porins with pore diameters in the above range all have relatively high sequencing accuracy. Sequencing with such a nanopore protein can greatly shorten sequencing time and reduce cost, an advantage that is particularly evident in high-throughput sequencing.

In a fourth typical embodiment of this application, a kit is provided, which includes the above porin monomer, the above protein construct, or the above porin.

To further improve ease of use, in a preferred embodiment the kit also includes a membrane layer, and the membrane layer includes a lipid layer or an artificial polymer membrane.

Preferably, the lipid layer includes amphiphilic lipids; preferably, the amphiphilic lipids include a phospholipid bilayer; preferably, the lipid layer includes a planar membrane layer or a liposome; preferably, the liposome includes a multilamellar liposome or a unilamellar liposome; preferably, the lipid layer includes a phospholipid bilayer composed of diphytanoylphosphatidylcholine.

The artificial polymer membrane includes, but is not limited to, one or more of: polysiloxane, polyolefin, perfluoropolyether, perfluoroalkyl polyether, polystyrene, polyoxypropylene, polyvinyl acetate, polyoxybutylene, polyisoprene, polybutadiene, polyvinyl chloride, polyalkyl acrylate, polyalkyl methacrylate, polyacrylonitrile, polypropylene, PTHF, polymethacrylate, polyacrylate, polysulfone, polyvinyl ether, poly(propylene oxide) and copolymers thereof, substituted C1-C6 alkyl acrylates and methacrylates, acrylamide, methacrylamide, (C1-C6 alkyl) acrylamides and methacrylamides, N,N-dialkyl acrylamides, ethoxylated acrylates and methacrylates, polyethylene glycol monomethacrylate and polyethylene glycol monomethyl ether methacrylate, hydroxy-substituted (C1-C6 alkyl) acrylamides and methacrylamides, hydroxy-substituted C1-C6 alkyl vinyl ethers, sodium vinyl sulfonate, sodium styrene sulfonate, 2-acrylamido-2-methylpropanesulfonic acid, N-vinylpyrrole, N-vinyl-2-pyrrolidone, 2-vinyloxazoline, 2-vinyl-4,4′-dialkyloxazolin-5-one, 2- and 4-vinylpyridine, ethylenically unsaturated carboxylic acids having a total of 3 to 5 carbon atoms, amino(C1-C6 alkyl)-, mono(C1-C6 alkylamino)(C1-C6 alkyl)- and di(C1-C6 alkylamino)(C1-C6 alkyl)-acrylates and methacrylates, allyl alcohol, 3-trimethylammonium-2-hydroxypropyl methacrylate chloride, dimethylaminoethyl methacrylate (DMAEMA), dimethylaminoethyl methacrylamide, glycerol methacrylate, N-(1,1-dimethyl-3-oxobutyl)acrylamide, cyclic imino ethers, vinyl ethers, cyclic ethers including epoxy derivatives, cyclic unsaturated ethers, N-substituted aziridines, β-lactones and β-lactams, ketene acetals, vinyl acetals and phosphoranes.

In a preferred embodiment, the kit further includes at least one of the following: a sequencing buffer, a nuclease, a polymerase, a topoisomerase, a ligase, a helicase, and cholesterol-linked single-stranded DNA.

Using the nanopore protein, membrane layer and sequencing buffer in the above kit, one or more nanopore proteins can be inserted into the membrane layer in the sequencing buffer to form a nanopore sensor. The nanopore experiment buffer provides a neutral environment that keeps the nanopore protein and the membrane layer stable, and the metal ions it contains give it good conductivity. Various membrane layers can be chosen: the nanopore protein can insert into a planar membrane layer or a spherical liposome, and into lipid layers of different compositions, to form a nanopore sensor.

The cholesterol carried on single-stranded DNA or other molecules to be tested can bind to the above lipid layer or artificial polymer membrane, which helps the nanopore capture the sequencing library and reduces the amount of library that has to be loaded. In practice, the cholesterol in the above kit can first be bound to the molecule to be tested, which is then added to the compartment containing the nanopore sensor for sequencing.

In a fifth typical embodiment of this application, an isolated DNA molecule is provided, which has: a nucleotide sequence encoding the above porin monomer; or a nucleotide sequence encoding the above protein construct; or a nucleotide sequence encoding the above porin.

In a preferred embodiment, the DNA molecule has 70% or more, preferably 80% or more, more preferably 90% or more, and most preferably 99% or more (for example 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even 99.9% or more) identity with the nucleotide sequence shown in SEQ ID NO: 5 and encodes a protein having the same function.

SEQ ID NO: 5:

Figure PCTCN2022143054-appb-000006

The above DNA molecule can encode a nanopore protein monomer having the structure and function described in this application. The nucleotides may be mutated on the basis of sequence (a), provided that the resulting molecule hybridizes with the DNA molecule defined in (a) under stringent conditions and no frameshift mutation occurs. If a mutation falls in nucleotides encoding the pore of the nanopore protein, the encoded protein may have an altered pore, which affects the pore diameter and the properties of the amino acid residues on the inner wall of the pore. If a mutation falls in nucleotides encoding a non-pore part of the protein, it may affect the folding, three-dimensional structure and other properties of the encoded protein, and thereby its physicochemical properties and stability. The nucleotide sequences defined in (b) that hybridize with the DNA molecule under stringent conditions include nucleotide sequences that are 80%, 85%, 90%, 95%, 98%, 99%, 99.9% or 100% complementary to the DNA molecule defined in (a).

In this application, "isolated" means altered "by the hand of man" from the natural state; that is, if the material occurs in nature, it has been changed and/or removed from its original environment. For example, a polynucleotide or polypeptide naturally present in a living organism is not "isolated", whereas the same polynucleotide or polypeptide separated from the materials with which it coexists in its natural state is "isolated", as the term is used herein.

In a sixth typical embodiment of this application, a recombinant vector is provided, which contains the above DNA molecule.

The above DNA molecule, that is, the nanopore protein expression gene, is inserted into a recombinant vector, and the vector's capacity for extensive self-replication is used to replicate the gene in large quantities. "Recombinant" here refers to genetically engineered DNA prepared by transplanting or splicing a gene from one species into the cells of a host organism of a different species. Such DNA becomes part of the host's genetic makeup and is replicated.

In a seventh typical embodiment of this application, a host cell is provided, which is transformed with the above recombinant vector.

The above recombinant vector is transformed into a host cell, and the host cell replicates, transcribes and translates the nanopore protein expression gene on the vector, so that the nanopore protein can be produced in large quantities. Host cells include commonly used hosts such as Escherichia coli, yeast, mammalian cells and insect cells. The host cell folds the nanopore protein into the correct three-dimensional structure, yielding a nanopore protein with normal structure and function.

In an eighth typical embodiment of this application, a nanopore sensor is provided, which includes: a membrane layer; and a porin that is inserted into the membrane layer and forms a pore, wherein a current flows through the pore when a voltage is applied across the membrane layer, and wherein the porin includes the above porin.

The nanopore sensor in this application refers specifically to a membrane layer with a nanopore protein inserted in it. Inserting a nanopore protein with a pore into the membrane layer fixes the orientation of the pore. When an electric field is applied across the membrane layer, the pore diameter is perpendicular to the direction of the field, and ions on either side of the membrane layer are driven through the pore of the porin by the field, generating a current.

In a preferred embodiment, the membrane layer includes a lipid layer or an artificial polymer membrane; preferably, the lipid layer includes amphiphilic lipids; preferably, the amphiphilic lipids include a phospholipid bilayer; preferably, the lipid layer includes a planar membrane layer or a liposome; preferably, the liposome includes a multilamellar liposome or a unilamellar liposome; preferably, the lipid layer includes a phospholipid bilayer composed of diphytanoylphosphatidylcholine (DPhPC, 1,2-diphytanoyl-sn-glycero-3-phosphocholine).

Many choices of lipid layer are possible. The nanopore protein can insert into planar membrane layers or spherical liposomes, and into lipid layers formed from different components, to form a nanopore sensor.

In a preferred embodiment, when a voltage is applied across the membrane layer, the biomolecule to be tested translocates through the channel of the nanopore sensor and the channel produces a changing current; preferably, the biomolecule to be tested includes DNA, RNA or a polypeptide; preferably, the DNA and/or RNA includes any one or more of the following modified bases: 5-methylcytosine (5mC), 6-methyladenine (m6A), 7-methylguanine (m7G) and pseudouridine (Ψ).

When an electric field is applied across the membrane layer, the field drives the biomolecule to be tested through the channel of the nanopore protein. Biomolecules to be tested include biological macromolecules that carry genetic information, such as DNA, RNA, polypeptides or proteins. The biomolecule to be tested may carry modifying groups, including but not limited to cholesterol, polyethylene glycol of various degrees of polymerization, biotin or fluorescent groups.

In a ninth exemplary embodiment of the present application, a nanopore sequencing device is provided, the nanopore sequencing device comprising the nanopore sensor described above.

In a preferred embodiment, the nanopore sequencing device comprises: an electrolytic cell containing a sequencing buffer; a nanopore sensor located at the center of the electrolytic cell, dividing the cell and the sequencing buffer into a positive electrolyte region and a negative electrolyte region; and a first electrode and a second electrode, arranged in the positive and negative electrolyte regions respectively and both connected to a signal processing chip. Preferably, the first and second electrodes comprise a metal or a composite electrode material. Preferably, the first and second electrodes are different, being silver and silver chloride respectively; or the first and second electrodes are the same and comprise gold, platinum, graphene or titanium nitride.

The nanopore sequencing device comprises an electrolytic cell containing an electrolyte, a nanopore sensor, a first electrode and a second electrode. The nanopore sensor is placed at the center of the electrolytic cell and divides it into a positive electrolyte region and a negative electrolyte region. Each region is provided with an electrode, and the electrodes establish the electric field applied to the nanopore sensor. As a biomolecule to be tested passes through the nanopore protein in the membrane, it changes the current amplitude. This current amplitude is picked up and passed to the signal processing chip connected to the electrodes. From the differences in current amplitude, the signal processing chip, and hence the nanopore sequencing device containing it, can analyze the data and determine the sequence of the biomolecule to be tested.
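As an illustration of how a current trace might be examined for translocation events, the following is a minimal sketch only. It is not the signal processing performed by the chip described in this application; the function name `find_blockades`, the 70% threshold and the synthetic trace are all assumptions made for the example.

```python
from statistics import median


def find_blockades(trace, fraction=0.7):
    """Return (start, end) index pairs where the current falls below
    `fraction` times the open-pore level.

    The open-pore level is estimated as the median of the trace, which
    assumes the pore is unoccupied for most of the recording.
    """
    baseline = median(trace)
    threshold = fraction * baseline
    events = []
    start = None
    for i, current in enumerate(trace):
        if current < threshold and start is None:
            start = i                      # blockade begins
        elif current >= threshold and start is not None:
            events.append((start, i))      # blockade ends (end is exclusive)
            start = None
    if start is not None:                  # trace ended during a blockade
        events.append((start, len(trace)))
    return events


# Synthetic trace in pA (hypothetical values, not measured data).
trace = [100.0] * 5 + [40.0, 42.0, 38.0] + [100.0] * 6 + [55.0, 50.0] + [100.0] * 4
events = find_blockades(trace)
print(events)  # [(5, 8), (14, 16)]
```

A practical pipeline would additionally filter noise and resolve the sub-levels within each event; the sketch only marks where the current departs from the open-pore level.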

In a tenth exemplary embodiment of the present application, a sequencing method is provided. The method uses the porin described above, or the nanopore sensor described above, or the nanopore sequencing device described above, and determines the sequence of a biomolecule to be tested by detecting and analyzing the electrical signal produced as the biomolecule passes through the channel of the porin.

In a preferred embodiment, the biomolecule to be tested includes modified or unmodified DNA, RNA or a polypeptide; preferably, the electrical signal includes a current.

In a preferred embodiment, the biomolecule to be tested is a target nucleic acid sequence, and the sequencing method comprises: (a) contacting the nucleic acid sequence with the porin described above and a nucleic acid binding protein, such that the nucleic acid binding protein controls the speed at which the target nucleic acid sequence moves through the channel of the porin, wherein the nucleic acid binding protein is selected from any one or more of a nuclease, a polymerase, a topoisomerase, a ligase, a helicase and a single-strand binding protein; and (b) with a voltage applied across the pore, measuring the electrical signal through the channel as the nucleic acid sequence moves through it, wherein different types of nucleotides produce different electrical signals as they pass through the channel, so that the sequence information of the nucleic acid is determined from the electrical signal.

In an eleventh exemplary embodiment of the present application, provided is the use of the porin monomer, or the porin described above, or the kit described above, or the DNA molecule described above, or the recombinant vector described above, or the host cell described above, or the nanopore sensor described above, or the nanopore sequencing device described above, or the sequencing method described above, in the detection of small biological molecules, in nucleic acid sequencing or in polypeptide sequencing.

In the present application, "small molecules" include but are not limited to small-molecule substances such as nucleotides, amino acids, polysaccharides or vitamins.

A major advantage of nanopore sequencing over traditional sequencing is that accuracy is not degraded by accumulated errors, so extremely long read lengths can be achieved. Long reads can in turn close the gaps that are unavoidable when short reads from traditional sequencing are assembled, reveal whether long-fragment deletions, duplications, inversions or translocations have occurred in a chromosome, and cover full-length transcripts, which are typically several kb long. Nanopore sequencing thus offers a new solution for research on genome assembly, structural variation and alternative splicing.

Because nanopore sequencing does not require PCR amplification, the original base modification information on the nucleic acid molecule to be tested is retained, and the type, position and abundance of modified bases can be obtained directly in a single sequencing run. The nanopore protein of the present application can therefore also detect nucleic acid molecules carrying several kinds of modified DNA/RNA bases, including 5-methylcytosine (5mC), 6-methyladenine (m6A), 7-methylguanine (m7G) and pseudouridine (Ψ). With model training and algorithm development specific to each modified base, nanopore sequencing can identify and locate further modified bases and so build a more complete map of genome and transcriptome modifications.

In addition, from the perspective of clinical application, nanopore sequencing offers long read lengths, high portability, fast sequencing and real-time readout. It is therefore well suited to time-critical surveillance of major outbreaks and rapid pathogen detection, for example in the response to large-scale epidemics of Zika virus, Ebola virus, Dengue virus and the novel coronavirus. Beyond viruses, nanopore sequencing can also be used for the rapid detection of other pathogens such as bacteria and fungi.

Because proteins and nucleic acid molecules share common compositional features, the nanopore sequencing platform also has great potential in protein sequencing. Exploratory work to date has shown, for example, that with a protein unfoldase as the speed-control tool, characteristic protein signals can be observed and protein type and modification state can be preliminarily identified, which demonstrates the feasibility of nanopore protein sequencing. With further optimization of the speed-control system and the development of suitable nanopore proteins and signal analysis algorithms, fingerprint identification and even sequence determination of proteins at the single-molecule level may ultimately be achieved.

Besides sequencing, the nanopore platform can serve as a basic detection platform that, combined with sensing techniques, performs metabolomic detection of a variety of small molecules and macromolecules. By bringing together genomics, proteomics and metabolomics, the nanopore platform may eventually develop into a general-purpose measurement platform that meets the needs of multi-omics analysis, providing a powerful research tool for a deeper understanding of the principles of life and the mechanisms of disease.

The beneficial effects of the present application are explained in further detail below with reference to specific examples. Those skilled in the art will understand, however, that the following examples serve only to illustrate the present invention and should not be regarded as limiting its scope. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products. Unless otherwise specified, the experimental methods used are conventional methods.

Example 1 AlphaFold2-Multimer predicted structure of wild-type BCP35

Nine porin monomers (SEQ ID NO: 1) assemble non-covalently into a nonamer to give the porin BCP35. The structure of BCP35 was predicted with AlphaFold2-Multimer. The prediction results are shown in Figures 1 to 7. Figure 1 is a side view of the predicted structure of BCP35, and Figure 2 is a top view of the predicted structure of BCP35.

Because the amino acid composition of the sensor region (gating region) plays a decisive role in generating the current signal, the first BCP35 mutants to be tried were mutations in the sensor region. Figures 3 and 4 show the side-chain structures of the important amino acids in the sensor region of the predicted BCP35 structure; the three amino acids whose side chains are displayed are P91, S98 and N99.

Because nucleic acids are negatively charged, positively charged amino acids at the entrance of the porin can increase the library capture efficiency of nanopore sequencing, whereas negatively charged amino acids reduce it. The strength of binding between the library and the porin also has some influence on the sequencing speed and the sequencing current signal. Figure 5 shows some important amino acids in the entrance region of BCP35: E125, R126, K127, D131, K134, R136, R140, K152 and D155.
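The charge argument above can be made concrete with a rough tally of formal side-chain charges for the listed entrance residues. This is a simplified sketch, not an electrostatics calculation: it assumes K and R count as +1 and D and E as -1 at near-neutral pH, and it ignores pKa shifts, histidine and the local environment. The helper name `net_charge` is introduced for illustration only.

```python
# Formal side-chain charges at near-neutral pH (simplifying assumption).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(labels):
    """Sum formal charges for residue labels such as 'E125' or 'K127'."""
    return sum(CHARGE.get(label[0], 0) for label in labels)


# Entrance residues of BCP35 as listed for Figure 5.
entrance = ["E125", "R126", "K127", "D131", "K134", "R136", "R140", "K152", "D155"]

per_monomer = net_charge(entrance)     # 6 positive, 3 negative
per_nonamer = 9 * per_monomer          # nine monomers per pore

# Effect of a single charge-reversing substitution such as E125K.
after_e125k = net_charge(["K125"] + entrance[1:])

print(per_monomer, per_nonamer, after_e125k)  # 3 27 5
```

Under these assumptions a substitution such as E125K raises the formal entrance charge of each monomer by two units, which is the direction expected to favor capture of negatively charged nucleic acid.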

Likewise, some amino acids on the inner wall and at the exit of the barrel affect how smoothly the analyte passes through and leaves the nanopore. Figure 6 shows amino acids on the inner barrel wall and at the exit of BCP35 that may affect passage of the analyte, in particular the charged amino acids E177, R188, R222, R230, K232 and E238.

In the transmembrane region of a porin, membrane-facing positions favor hydrophobic amino acids, and charged or polar amino acids at these positions may reduce the efficiency of pore insertion. Figure 7 shows several charged and polar amino acids in the transmembrane region of BCP35 that face the membrane: T174, H183, K185, K219, S223 and R239.

Example 2 Construction of expression vectors for porin BCP35 and its mutants

Using the In-Fusion method, the DNA sequence encoding the porin monomer (SEQ ID NO: 5) was inserted into the multiple cloning site of vector pET24a after digestion with NdeI and XhoI. A Strep-tag II amino acid sequence was added to the C-terminus of the porin monomer amino acid sequence (SEQ ID NO: 1) as a purification tag, and kanamycin resistance served as the selection marker. The resulting vector was named pET24a-BCP35. The corresponding mutants were constructed by site-directed mutagenesis with the Agilent site-directed mutagenesis kit, using the BCP35 expression vector as the template. Three mutants, P91A, S98N and N99S, were constructed in this example. The resulting mutant vectors were named pET24a-BCP35-P91A, pET24a-BCP35-S98N and pET24a-BCP35-N99S, respectively.
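Mutant names such as P91A follow the usual convention of wild-type residue, 1-based position and replacement residue. The sketch below shows one way such a label might be checked against a sequence and applied in silico. The function `apply_mutation` and the five-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 1.

```python
import re


def apply_mutation(sequence, label):
    """Apply a substitution label such as 'P91A' to a protein sequence.

    Raises ValueError if the label is malformed, the position is out of
    range, or the wild-type residue does not match the sequence.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"malformed mutation label: {label!r}")
    wild_type, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1        # labels are 1-based
    if index < 0 or index >= len(sequence):
        raise ValueError(f"position out of range in {label!r}")
    if sequence[index] != wild_type:
        raise ValueError(
            f"{label!r} expects {wild_type} but sequence has {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy sequence for illustration only (NOT SEQ ID NO: 1).
toy = "MAPSN"
mutant = apply_mutation(toy, "P3A")
print(mutant)  # MAASN
```

Checking the wild-type residue before substituting catches off-by-one numbering errors, for example when a tag or signal peptide shifts the residue numbering.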

Example 3 Culture and induction of porin monomer strains

The expression plasmids for the porin monomer and for each mutant were transformed separately into the Escherichia coli expression strain E.coli BL21(DE3). The transformed cells were spread evenly on plates containing 50 μg/mL kanamycin and incubated overnight at 37°C. The next day, a single colony was picked into 5 mL of LB medium containing 50 μg/mL kanamycin and cultured overnight at 37°C and 200 rpm. This culture was inoculated at 1:100 into 50 mL of LB containing 50 μg/mL kanamycin and grown at 37°C and 200 rpm for 4 h. The expanded culture was then inoculated at 1:100 into 2 L of LB containing 50 μg/mL kanamycin and grown at 37°C and 200 rpm. When the OD600 reached about 0.6 to 0.8, IPTG was added to a final concentration of 0.5 mM and culture was continued at 16°C and 200 rpm for about 16 to 18 h. The cells were collected by centrifugation at 8000 rpm and stored frozen at -20°C until use.
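The additions in the 2 L induction step follow from the dilution relation C1·V1 = C2·V2. The sketch below works them out under stated assumptions: the stock concentrations (1 M IPTG, 50 mg/mL kanamycin) are common laboratory values assumed for illustration and are not given in this example, and the small volume change from the additions is ignored.

```python
def stock_volume_ml(final_conc, final_volume_ml, stock_conc):
    """Volume of stock (mL) needed so that final_volume_ml reaches final_conc.

    final_conc and stock_conc must be in the same units (C1*V1 = C2*V2).
    """
    return final_conc * final_volume_ml / stock_conc


culture_ml = 2000.0                      # 2 L LB culture

# IPTG: 0.5 mM final from an ASSUMED 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(0.5, culture_ml, 1000.0)

# Kanamycin: 50 ug/mL final from an ASSUMED 50 mg/mL (50000 ug/mL) stock.
kan_ml = stock_volume_ml(50.0, culture_ml, 50000.0)

# 1:100 inoculation of the 2 L culture.
inoculum_ml = culture_ml / 100.0

print(iptg_ml, kan_ml, inoculum_ml)  # 1.0 2.0 20.0
```

On these assumptions the 50 mL expanded culture provides more than the 20 mL needed to inoculate 2 L at 1:100.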

Example 4 Extraction and purification of recombinant porin monomers

(1) Buffer preparation:

Buffer A: 20 mM Tris-HCl, 250 mM NaCl, 1% DDM, pH 8.0.

Buffer B: 20 mM Tris-HCl, 250 mM NaCl, 0.05% DDM, pH 8.0.

Buffer C: 20 mM Tris-HCl, 250 mM NaCl, 0.05% DDM, 5 mM desthiobiotin, pH 8.0.
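For reference, the component masses for 1 L of each buffer can be estimated as follows. This is a sketch under assumptions not stated in the example: the DDM percentages are taken as weight per volume, Tris is weighed as Tris base (121.14 g/mol) and adjusted to pH 8.0 with HCl, and the molecular weights used are standard values for NaCl (58.44 g/mol) and desthiobiotin (214.26 g/mol).

```python
# Molecular weights in g/mol (standard values; Tris taken as Tris base).
MW = {"Tris": 121.14, "NaCl": 58.44, "desthiobiotin": 214.26}


def grams_for_molarity(millimolar, mw, litres=1.0):
    """Mass in grams for a given mM concentration and volume."""
    return millimolar / 1000.0 * mw * litres


def grams_for_percent_wv(percent, litres=1.0):
    """Mass in grams for a % (w/v) concentration: 1% = 10 g per litre."""
    return percent * 10.0 * litres


buffer_a = {
    "Tris": grams_for_molarity(20, MW["Tris"]),
    "NaCl": grams_for_molarity(250, MW["NaCl"]),
    "DDM": grams_for_percent_wv(1.0),       # ASSUMES % means w/v
}
# Buffer B differs from Buffer A only in the DDM concentration.
buffer_b = dict(buffer_a, DDM=grams_for_percent_wv(0.05))
# Buffer C is Buffer B plus 5 mM desthiobiotin.
buffer_c = dict(buffer_b, desthiobiotin=grams_for_molarity(5, MW["desthiobiotin"]))

for name, recipe in (("A", buffer_a), ("B", buffer_b), ("C", buffer_c)):
    print(name, {k: round(v, 3) for k, v in recipe.items()})
```

Per litre this gives roughly 2.42 g Tris base and 14.61 g NaCl in all three buffers, 10 g DDM in Buffer A, 0.5 g DDM in Buffers B and C, and about 1.07 g desthiobiotin in Buffer C.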

(2) Purification steps:

The cells were thoroughly resuspended at a ratio of 10 mL Buffer A per 1 g of cells and disrupted by sonication until the suspension became clear. The lysate was then placed on a rotator at 4°C overnight. The next day it was centrifuged at 18000 rpm and 4°C for 1 h, and the supernatant was collected, passed through a 0.22 μm filter and kept at 4°C until use.

On an AKTA pure chromatography system, a Strep-Tactin beads (IBA Lifesciences) column was equilibrated with 5 column volumes (CV) of Buffer A and the sample was loaded at 2 mL/min. After loading, the column was washed with 20 CV of Buffer B and eluted with Buffer C, and the target protein was collected.

The protein obtained was concentrated to 1 mL and passed through a Superdex 6 Increase 10/300 GL (Cytiva) column equilibrated with Buffer B. The target protein was collected and then stored at -80°C. The purified target protein was analyzed by SDS-PAGE; the result for the wild-type protein is shown in Figure 8. Lanes 2 and 3 show the unboiled sample (porin nonamer BCP35), and lanes 4 and 5 show the sample boiled at 95°C (denatured to porin monomer). The results show that the target protein is oligomeric when unboiled and monomeric after boiling. The SDS-PAGE results for the mutants were consistent with those for the wild-type protein and are not shown here.

Example 5 Library construction

The sense strand and the antisense strand (SEQ ID NO: 8) of a pair of partially complementary DNA strands were ligated to the double-stranded target fragment pUC57 (SEQ ID NO: 9) with T4 DNA ligase at room temperature and purified to prepare a sequencing library. The sequencing library was then incubated with the helicase BCH105 (SEQ ID NO: 10) at 25°C for 1 h (molar ratio 1:8) to form a sequencing library carrying the BCH105 motor protein, with the structure shown in Figure 9A. During sequencing, this library can further hybridize to a cholesterol-bearing single-stranded DNA (SEQ ID NO: 11, with cholesterol attached to the 5′ end of the DNA) to form the structure shown in Figure 9B.

Adapter sequence, sense strand: S1-(iSp18)4-S2.

The sequence of S1 is shown in SEQ ID NO: 6, the sequence of S2 is shown in SEQ ID NO: 7, and the structure of iSp18 is shown in Figure 10.

SEQ ID NO: 6: tttttttttttttttttttttttttttttttttttttttt.

SEQ ID NO: 7: ggttgtttctgttggtgctgatattgct.

SEQ ID NO: 8 (adapter sequence, antisense strand):

Figure PCTCN2022143054-appb-000007

SEQ ID NO: 9:

Figure PCTCN2022143054-appb-000008

Figure PCTCN2022143054-appb-000009

SEQ ID NO: 10:

Figure PCTCN2022143054-appb-000010

Example 6 Construction of a nanopore biosensor using porin BCP35 and its mutants

Current signals were acquired with a patch clamp amplifier. Ag/AgCl electrodes were immersed in the sequencing buffer, one in the cis region and one in the trans region of the electrolytic cell. The porin (that is, the porin BCP35 purified in Example 4) was diluted with 1x PBS buffer, and a single BCP35 nanopore protein was inserted, under an applied electric field, into a phospholipid bilayer composed of diphytanoylphosphatidylcholine (DPhPC, 1,2-diphytanoyl-sn-glycero-3-phosphocholine) to form a nanopore biosensor. The dilution factor is chosen according to whether the porin inserts into the membrane (that is, whether a pore is obtained). In general, protein at 0.1 mg/mL is diluted 100-fold, 50-fold or by another factor with PBS for a first attempt. If no insertion occurs at a given dilution, the dilution factor is lowered (that is, a higher protein concentration is used) and the attempt is repeated until a nanopore protein inserts into the membrane layer. A voltage is then applied to obtain the current amplitude of the single porin. Figure 11 shows the biosensing current through the channel of nanopore protein BCP35 at applied voltages of 0.02 V, 0.04 V, 0.10 V, 0.14 V and 0.18 V. The open-pore current of porin BCP35 is stable at the different voltages, with very little noise. The BCP35 mutants (P91A, S98N and N99S) likewise produced stable biosensing currents.
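Open-pore currents recorded at several voltages are often summarized as a single conductance value. The sketch below shows one simple way this might be done, a least-squares line through the origin. It uses the voltages listed above, but the currents are placeholder values chosen to correspond to 1 nS; they are not readings from Figure 11, and the function name is an assumption of this example.

```python
def conductance_ns(voltages_v, currents_pa):
    """Least-squares slope of I = G * V through the origin, in nS.

    Voltages are in volts and currents in picoamperes, so the raw slope
    is in pA/V (= 1e-12 S) and is divided by 1000 to give nanosiemens.
    """
    numerator = sum(v * i for v, i in zip(voltages_v, currents_pa))
    denominator = sum(v * v for v in voltages_v)
    return numerator / denominator / 1000.0


voltages = [0.02, 0.04, 0.10, 0.14, 0.18]            # V, as applied in this example
placeholder_currents = [20.0, 40.0, 100.0, 140.0, 180.0]  # pA, HYPOTHETICAL values

g = conductance_ns(voltages, placeholder_currents)
print(round(g, 6))  # 1.0
```

A fit through the origin assumes an ohmic pore with no offset current; rectifying pores would call for separate fits at positive and negative voltages.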

Example 7 Use of porin BCP35 and its mutants for DNA sequencing

The sequencing library containing the pUC57 sequence prepared in Example 5 and the cholesterol-bearing single-stranded DNA (SEQ ID NO: 11, with cholesterol attached to the 5′ end of the DNA) were mixed with sequencing buffer and added to the nanopore biosensor. After a voltage of 0.14 V or 0.18 V was applied, DNA was observed to be captured by the nanopore, producing a characteristic blockade current amplitude. As the DNA moved through the nanopore, the current amplitude changed, with different DNA sequences producing different blockade current amplitudes. The cholesterol-bearing single-stranded DNA can bind to the phospholipid bilayer, which helps the nanopore capture the sequencing library and reduces the amount of library that must be loaded. Figure 12 shows the current changes produced as library DNA passes through nanopore protein BCP35 at an applied voltage of 0.14 V. Library DNA can pass through the wild-type BCP35 porin and produce a current, and the current fluctuates as different nucleotides pass through the pore. Current changes were likewise produced when library DNA passed through the BCP35 mutants (P91A, S98N and N99S). This shows that porin BCP35 and its mutants can be used for nanopore sequencing.

Cholesterol-bearing single-stranded DNA sequence (SEQ ID NO: 11):

cholesterol-ttgaccgctcgcctc.

Sequencing buffer: 0.47 M KCl, 25 mM HEPES, 1 mM EDTA, 5 mM ATP, 25 mM MgCl2, pH 7.6.

From the above description it can be seen that the examples of the present invention achieve the following technical effects. The present invention has discovered a new nanopore protein monomer, which polymerizes to form the porin BCP35. The porin and its mutants have good stability and can meet the requirements of single-molecule nanopore sequencing. The nanopore protein and its mutants can be used to form a nanopore sensor and, further, a nanopore sequencing device, enabling the detection of small molecules such as nucleotides, amino acids, sugars and vitamins as well as the sequencing of samples such as DNA, RNA and polypeptides.

The above are merely preferred examples of the present invention and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (23)

1. A porin monomer, characterized in that the porin monomer comprises:
(a) a protein having the amino acid sequence shown in SEQ ID NO: 1; or
(b) a protein mutant whose amino acid sequence has a substitution, deletion and/or addition of one or several amino acids at at least one of the following positions of SEQ ID NO: 1: 91, 98, 99, 125, 126, 127, 131, 134, 136, 140, 152, 155, 174, 177, 183, 185, 188, 219, 222, 223, 230, 232, 238 and 239, the protein mutant having the function of polymerizing to form a channel structure; or
(c) a protein having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% identity with the protein described in (a) or (b) and having the function of polymerizing to form a channel structure.
2. The porin monomer according to claim 1, characterized in that, in (b), the types of substituted amino acid are each independently selected from the following:
P91 mutated to P91G, P91A or P91T;
S98 mutated to S98G, S98A, S98T, S98N or S98Q;
N99 mutated to N99G, N99A, N99S, N99T or N99Q;
E125 mutated to E125K, E125R, E125G, E125A, E125S, E125T, E125N or E125Q;
R126 mutated to R126K, R126G, R126A, R126S, R126T, R126N or R126Q;
K127 mutated to K127R, K127G, K127A, K127S, K127T, K127N or K127Q;
D131 mutated to D131K, D131R, D131G, D131A, D131S, D131T, D131N or D131Q;
K134 mutated to K134R, K134G, K134A, K134S, K134T, K134N or K134Q;
R136 mutated to R136K, R136G, R136A, R136S, R136T, R136N or R136Q;
R140 mutated to R140K, R140G, R140A, R140S, R140T, R140N or R140Q;
K152 mutated to K152R, K152G, K152A, K152S, K152T, K152N or K152Q;
D155 mutated to D155R, D155K, D155G, D155A, D155S, D155T, D155N or D155Q;
T174 mutated to T174A, T174G, T174V, T174L, T174I, T174Y, T174F or T174W;
E177 mutated to E177A, E177G, E177S, E177T, E177N or E177Q;
H183 mutated to H183A, H183G, H183V, H183L, H183I, H183Y, H183F or H183W;
K185 mutated to K185A, K185G, K185V, K185L, K185I, K185Y, K185F or K185W;
R188 mutated to R188A, R188G, R188S, R188T, R188N or R188Q;
K219 mutated to K219A, K219G, K219V, K219L, K219I, K219Y, K219F or K219W;
R222 mutated to R222A, R222G, R222S, R222T, R222N or R222Q;
S223 mutated to S223A, S223G, S223V, S223L, S223I, S223Y, S223F or S223W;
R230 mutated to R230A, R230G, R230S, R230T, R230N or R230Q;
K232 mutated to K232A, K232G, K232S, K232T, K232N or K232Q;
E238 mutated to E238A, E238G, E238S, E238T, E238N or E238Q;
R239 mutated to R239A, R239G, R239V, R239L, R239I, R239Y, R239F or R239W.

3. The porin monomer according to claim 1 or 2, characterized in that the porin monomer comprises a protein having any one of the amino acid sequences of SEQ ID NO: 2 to SEQ ID NO: 4.

4. A protein construct, characterized in that the protein construct is formed from two or more porin monomers according to any one of claims 1 to 3, linked covalently or non-covalently.

5. A porin, characterized in that the porin is formed from 7 to 11, preferably 9, porin monomers according to any one of claims 1 to 3, linked covalently or non-covalently.

6. The porin according to claim 5, characterized in that the porin is formed from 9 porin monomers linked non-covalently, the porin monomers comprising a protein having any one of the amino acid sequences of SEQ ID NO: 1 to SEQ ID NO: 4.
The porin according to claim 5 or 6, characterized in that the pore diameter of the porin is 0.5 to 3 nm.

A kit, characterized in that the kit comprises the porin monomer according to any one of claims 1 to 3, or the protein construct according to claim 4, or the porin according to any one of claims 5 to 7.

The kit according to claim 8, characterized in that the kit further comprises a membrane layer, and the membrane layer comprises a lipid layer or an artificial polymer membrane.

The kit according to claim 9, characterized in that the kit further comprises at least one of the following: a sequencing buffer, a nuclease, a polymerase, a topoisomerase, a ligase, a helicase, and a cholesterol-linked single-stranded DNA.

An isolated DNA molecule, characterized in that the DNA molecule has: a nucleotide sequence encoding the porin monomer according to any one of claims 1 to 3; or a nucleotide sequence encoding the protein construct according to claim 4; or a nucleotide sequence encoding the porin according to any one of claims 5 to 7.

The DNA molecule according to claim 11, characterized in that the DNA molecule has at least 70%, preferably at least 80%, more preferably at least 90%, further preferably at least 99%, and most preferably at least 99% identity with the nucleotide sequence shown in SEQ ID NO: 5, and encodes a protein having the same function.

A recombinant vector, characterized in that the recombinant vector comprises the DNA molecule according to claim 11 or 12.
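
The identity thresholds in the DNA-molecule claim above (70%, 80%, 90%, and so on) presuppose some way of scoring two sequences against each other. The claims do not say how identity is computed, so the following is only a sketch under stated assumptions: the two sequences are already aligned to equal length with `-` marking gaps, identity is counted over all aligned columns, and the function name `percent_identity` and the example sequences are hypothetical (they are not SEQ ID NO: 5).

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap, columns where both sequences have a gap are
    ignored, and a column with a gap on one side counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-nt fragments differing at one position.
identity = percent_identity("ATGGCTAAAG", "ATGGCAAAAG")
meets_90_percent = identity >= 90.0
```

In practice the alignment itself would come from a dedicated alignment tool, and different tools normalise identity differently (for example by alignment length versus shorter-sequence length), so a threshold is only meaningful once that convention is fixed.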
A host cell, characterized in that the host cell is transformed with the recombinant vector according to claim 13.

A nanopore sensor, characterized in that the nanopore sensor comprises:
a membrane layer; and
a porin inserted into the membrane layer and forming a pore, wherein a current flows through the pore when a voltage is applied across the membrane layer;
wherein the porin comprises the porin according to any one of claims 5 to 7.

The nanopore sensor according to claim 15, characterized in that the membrane layer comprises a lipid layer or an artificial polymer membrane;
preferably, the lipid layer comprises amphiphilic lipids;
preferably, the amphiphilic lipids comprise a phospholipid bilayer;
preferably, the lipid layer comprises a planar membrane layer or a liposome;
preferably, the liposome comprises a multilamellar liposome or a unilamellar liposome;
preferably, the lipid layer comprises a phospholipid bilayer composed of diphytanoylphosphatidylcholine.

The nanopore sensor according to claim 15, characterized in that, when a voltage is applied across the membrane layer, the biomolecule to be detected translocates through the pore of the nanopore sensor, and the current through the pore changes;
preferably, the biomolecule to be detected comprises DNA, RNA, or a polypeptide;
preferably, the DNA and/or RNA comprises any one or more of the following modified bases: 5-methylcytosine, 6-methyladenine, 7-methylguanine, and pseudouracil.
A nanopore sequencing device, characterized in that the nanopore sequencing device comprises the nanopore sensor according to any one of claims 15 to 17.

The nanopore sequencing device according to claim 18, characterized in that the nanopore sequencing device comprises:
an electrolytic cell containing a sequencing buffer;
a nanopore sensor located in the center of the electrolytic cell and dividing the electrolytic cell and the sequencing buffer into a positive electrolyte region and a negative electrolyte region;
a first electrode and a second electrode, arranged in the positive electrolyte region and the negative electrolyte region respectively, both electrodes being connected to a signal processing chip;
preferably, the first electrode and the second electrode comprise a metal or a composite electrode material;
preferably, the first electrode and the second electrode are different, being silver and silver chloride respectively; or the first electrode and the second electrode are the same and comprise gold, platinum, graphene, or titanium nitride.
A sequencing method, characterized in that the sequencing method uses the porin according to any one of claims 5 to 7, or the nanopore sensor according to any one of claims 15 to 17, or the nanopore sequencing device according to claim 18 or 19, and determines the sequence of a biomolecule to be detected by detecting and analyzing the electrical signal generated when the biomolecule passes through the pore of the porin.

The sequencing method according to claim 20, characterized in that the biomolecule to be detected comprises modified or unmodified DNA, RNA, or a polypeptide;
preferably, the electrical signal comprises a current.

The sequencing method according to claim 20, characterized in that the biomolecule to be detected is a target nucleic acid sequence, and the sequencing method comprises:
(a) contacting the nucleic acid sequence with the porin according to any one of claims 5 to 7 and a nucleic acid binding protein, so that the nucleic acid binding protein controls the speed at which the target nucleic acid sequence moves through the pore of the porin, wherein the nucleic acid binding protein is selected from any one or more of a nuclease, a polymerase, a topoisomerase, a ligase, a helicase, and a single-stranded binding protein;
(b) applying a voltage across the pore and measuring the electrical signal through the pore as the nucleic acid sequence moves through it, wherein different types of nucleotides generate different electrical signals when passing through the pore, so that the sequence information of the nucleic acid is determined from the electrical signal.

Use of the porin monomer according to any one of claims 1 to 3, or the porin according to any one of claims 5 to 7, or the kit according to any one of claims 8 to 10, or the DNA molecule according to claim 11 or 12, or the recombinant vector according to claim 13, or the host cell according to claim 14, or the nanopore sensor according to any one of claims 15 to 17, or the nanopore sequencing device according to claim 18 or 19, or the sequencing method according to any one of claims 20 to 22, in small biomolecule detection, nucleic acid sequencing, or polypeptide sequencing.
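
Step (b) of the sequencing method above relies on different nucleotides producing different electrical signals. As a deliberately simplified illustration of that idea, the sketch below assigns each measured current level to the nearest reference level. Everything in it is an assumption made for illustration: the reference currents in `LEVELS_PA`, the function name `call_bases`, and the premise that each segment of signal corresponds to a single nucleotide. Real nanopore signals typically reflect several nucleotides at once, and the patent does not disclose a decoding algorithm.

```python
# Hypothetical reference current levels in picoamperes, chosen only for
# illustration; they are not measurements from the patent.
LEVELS_PA = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}


def call_bases(segment_means_pa):
    """Map each segment's mean current to the base with the nearest level.

    Assumes the raw trace has already been segmented into one mean current
    value per nucleotide, which is a strong simplification.
    """
    called = []
    for level in segment_means_pa:
        nearest = min(LEVELS_PA, key=lambda base: abs(LEVELS_PA[base] - level))
        called.append(nearest)
    return "".join(called)


# Four hypothetical segment means, each close to one reference level.
read = call_bases([51.2, 78.9, 69.5, 61.0])
```

A nearest-level lookup is the simplest possible decoder; it is shown only to make the "different signal per nucleotide" principle concrete, not as a workable basecaller.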
PCT/CN2022/143054 2022-12-28 2022-12-28 Porin monomer, porin, mutant thereof and use of same Ceased WO2024138472A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/143054 WO2024138472A1 (en) 2022-12-28 2022-12-28 Porin monomer, porin, mutant thereof and use of same
CN202280102619.6A CN120359234A (en) 2022-12-28 2022-12-28 Porin monomer, porin and mutant and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/143054 WO2024138472A1 (en) 2022-12-28 2022-12-28 Porin monomer, porin, mutant thereof and use of same

Publications (1)

Publication Number Publication Date
WO2024138472A1 (en) 2024-07-04

Family

ID=91715862

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/143054 Ceased WO2024138472A1 (en) 2022-12-28 2022-12-28 Porin monomer, porin, mutant thereof and use of same

Country Status (2)

Country Link
CN (1) CN120359234A (en)
WO (1) WO2024138472A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110914290A (en) * 2017-06-30 2020-03-24 弗拉芒区生物技术研究所 Novel protein pores
WO2022074397A1 (en) * 2020-10-08 2022-04-14 Oxford Nanopore Technologies Limited Modification of a nanopore forming protein oligomer
CN113735948A (en) * 2021-09-28 2021-12-03 成都齐碳科技有限公司 Mutant of porin monomer, protein pore and application thereof
CN114957412A (en) * 2022-04-28 2022-08-30 清华大学 Novel porin monomer and application thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DATABASE Protein 22 February 2021 (2021-02-22), ANONYMOUS: "MAG: Curli production assembly/transport component CsgG [uncultured Thiotrichaceae bacterium]", XP093185518, retrieved from NCBI Database accession no. CAA6799946.1 *

Also Published As

Publication number Publication date
CN120359234A (en) 2025-07-22

Similar Documents

Publication Publication Date Title
Afshar Bakshloo et al. Nanopore-based protein identification
Huang et al. Electro-osmotic capture and ionic discrimination of peptide and protein biomarkers with FraC nanopores
JP7499761B2 (en) pore
CN109072295A (en) The nano-pore of modification, composition and its application comprising it
JP2022500074A (en) Biological nanopores with adjustable pore diameter and their use as analytical tools
Krishnan R et al. Assembly of transmembrane pores from mirror-image peptides
Wei et al. Narrowing signal distribution by adamantane derivatization for amino acid identification using an α-hemolysin nanopore
JP2025528144A (en) DE NOVO Pore
Shestakova et al. Conductometric and potentiometric titration of carboxyl groups in polymer microspheres
Gonzalez Solveyra et al. Orientational Pathways during Protein Translocation through Polymer-Modified Nanopores
WO2024138472A1 (en) Porin monomer, porin, mutant thereof and use of same
WO2021212561A1 (en) Method for constructing nanopore with dual recognition sites
WO2024138473A1 (en) Porin monomer, porin, mutant thereof, and use thereof
WO2024138382A1 (en) Porin monomer, porin, mutant thereof and use thereof
WO2024138470A1 (en) Nanopore sensor and use thereof in sequencing
CN120265979A (en) Bionanopore sensor, preparation method and application thereof
CN121013859A (en) Large conical nanopores and their use in analyte sensing
EP4644407A1 (en) Novel porin bcp34, and mutant thereof and use thereof
WO2024138425A1 (en) Novel nanopore protein and use thereof
WO2024138424A1 (en) Nanopore protein and application thereof
JP2025500472A (en) pore
CN118234741A (en) Nanoporous proteins and related applications in sequencing
WO2024138565A1 (en) Nanopore protein, and mutant and use thereof
WO2020152563A1 (en) Method and device for nanopore-based optical recognition of molecules
WO2025076825A1 (en) Method for embedding nanopore protein into biomimetic membrane and use thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22969623

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202280102619.6

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 202280102619.6

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE