WO2016164530A1 - Compositions and methods for high-throughput protein sequencing - Google Patents
Compositions and methods for high-throughput protein sequencing
- Publication number
- WO2016164530A1 (PCT/US2016/026354)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- amino acid
- polypeptides
- sequence
- terminal amino
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/12—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by hydrolysis, i.e. solvolysis in general
- C07K1/128—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by hydrolysis, i.e. solvolysis in general sequencing
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/12—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by hydrolysis, i.e. solvolysis in general
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/13—Labelling of peptides
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/68—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
- G01N33/6803—General methods of protein analysis not limited to specific proteins or families of proteins
- G01N33/6818—Sequencing of polypeptides
- G01N33/6824—Sequencing of polypeptides involving N-terminal degradation, e.g. Edman degradation
Definitions
- the present disclosure relates generally to protein analysis and more specifically to high-throughput protein sequence determination.
- MS Mass spectrometry
- the present disclosure is related to improved compositions and methods for elucidating the amino acid sequences of a plurality of polypeptides. It is suitable for a variety of high-throughput approaches, and can be used to analyze polypeptides from a wide diversity of sources, and for detecting proteins of low abundance from any particular source.
- the compositions comprise, among other components, novel binding partners that have specificity for N-terminal amino acids. In various aspects the N-terminal amino acid binding agent is detectably labeled.
- Methods provided herein generally comprise obtaining a plurality of polypeptides, binding detectably labeled N-terminal amino acid binding agents to the N-terminal amino acid of the polypeptides, detecting the N-terminal amino acid binding agents to identify the N-terminal amino acid for some or all of the polypeptides, removing the N-terminal amino acid binding agents, liberating the N-terminal amino acid to reveal the next amino acid in the polypeptides in the N->C terminal direction, and repeating the process to determine some or all of the amino acid sequence of the polypeptides.
- At least one N-terminal amino acid binding agent used in the methods, compositions, and/or kits of this disclosure comprises a sequence selected from SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:8, or the sequence of SEQ ID NO:1 comprising any one or any combination of changes selected from the group consisting of L47N, V53D, F61C, D115V, Y301C, or the sequence of SEQ ID NO:2, comprising any one or any combination of changes selected from the group consisting of M178I, D229V, and L227R, or the sequence of SEQ ID NO:9, comprising an
- the binding agents may comprise binding partners such as antibodies or specific N-terminal amino-acid binding fragments thereof.
- the disclosure comprises using a binding partner having a complementarity determining region (CDR) selected from (L1) SGDALPKKYAY (SEQ ID NO:3); (L2) EDVKRLS (SEQ ID NO:4); (L3) YSNSKTGNYNV (SEQ ID NO:5); (H1) GYTFTDYWIS (SEQ ID NO:6); (H2) QIAMTNSATVYGPSFQG (SEQ ID NO:7); (H3) DYSDNYYNDTYS (SEQ ID NO:8).
- CDR complementarity determining region
- the disclosure includes but is not limited to using combinations of L1 and H1, L1 and H2, L1 and H3, L2 and H1, L2 and H2, L2 and H3, L3 and H1, L3 and H2, and L3 and H3.
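The nine light/heavy pairings listed above are simply the Cartesian product of the three light-chain CDRs (L1 to L3) and the three heavy-chain CDRs (H1 to H3). The following is a minimal sketch that enumerates them; the dictionary and function names are illustrative assumptions, and only the sequences and SEQ ID numbers come from the disclosure.

```python
from itertools import product

# CDR sequences as listed in the disclosure (SEQ ID NO:3 to SEQ ID NO:8).
# The dictionary names are hypothetical and used only for this illustration.
LIGHT_CDRS = {
    "L1": "SGDALPKKYAY",        # SEQ ID NO:3
    "L2": "EDVKRLS",            # SEQ ID NO:4
    "L3": "YSNSKTGNYNV",        # SEQ ID NO:5
}
HEAVY_CDRS = {
    "H1": "GYTFTDYWIS",         # SEQ ID NO:6
    "H2": "QIAMTNSATVYGPSFQG",  # SEQ ID NO:7
    "H3": "DYSDNYYNDTYS",       # SEQ ID NO:8
}


def cdr_pairs():
    """Return every (light, heavy) CDR label pair, light chain first."""
    return list(product(LIGHT_CDRS, HEAVY_CDRS))


pairs = cdr_pairs()
for light, heavy in pairs:
    print(f"{light} ({LIGHT_CDRS[light]}) + {heavy} ({HEAVY_CDRS[heavy]})")
```

The loop prints the pairings in the same order as the list in the text, from L1 with H1 through L3 with H3.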
- Methods of the disclosure are suitable for determining the sequence of all or a portion of the polypeptides in or derived from, for example, any biological sample.
- the disclosure includes generating a report, including but not necessarily limited to a printed report, or a digitized report that includes some or all of the peptide sequence information that is generated using the compositions and methods described herein.
- the disclosure includes a complex comprising a polypeptide and an N-terminal amino acid binding agent, wherein the binding agent comprises a sequence selected from SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:8, the sequence of SEQ ID NO:1 comprising any one or any combination of changes selected from the group consisting of L47N, V53D, F61C, D115V, Y301C, or the sequence of SEQ ID NO:2, comprising any one or any combination of changes selected from the group consisting of M178I, D229V, and L227R, or the sequence of SEQ ID NO:9, comprising an I496F change.
- the polypeptides may be provided in physical association with a solid substrate.
- the polypeptides may be present in an array of polypeptides.
- kits that are useful for performing methods of the disclosure.
- the kits comprise at least one sealed container comprising at least one of the N-terminal amino acid binding agents that are described herein, and may comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 distinct binding partners.
- the kits can further comprise printed material describing use of N-terminal amino acid binding agents in a process for determining amino acid sequences of a plurality of polypeptides.
DESCRIPTION OF THE FIGURES
- Figure 1 depicts an illustration of single molecule peptide sequencing according to one embodiment of this disclosure.
- Figure 2 depicts the crystal structure (top) and a depiction (bottom) of an N-terminus Met occupying the active site of a methionyl-tRNA synthetase.
- Figure 3 provides a flowchart showing affinity selection cycle using a phage display protein library.
- Figure 4A provides a depiction of one illustrative embodiment of the disclosure showing covalent immobilization of trypsin-digested peptides onto a glass surface coated with methylisourea through a C-terminus lysine.
- the surface is coated with a compound, such as polyethylene glycol, to inhibit nonspecific binding.
- Figure 4B provides a graphical summary of data obtained from determining expression of distinct tRNA synthetases expressed on M13 phage as Flag-aaRS-pIII fusion proteins. A higher OD450 represents better expression.
- FIG. 5A Representative scheme depicting labeling and purification of NAA-binding proteins with organic dyes, using bio-orthogonal click chemistry as an illustrative example. As a non-limiting example of an embodiment of this disclosure, the excitation and emission maxima of four organic dyes are indicated.
- FIG. 5B Graphical depiction of data showing the specificity of PheRS for N-terminus phenylalanine over leucine.
- a 96-well plate was first coated with NeutroAvidin (100 µL of 10 µg/mL for each well), then incubated with a peptide containing N-terminal Phe and C-terminal biotin connected with 11 polyethylene glycol (PEG) units. 100 µL of serially diluted M13 phage expressing wild-type PheRS-pIII on the surface was bound to the plate. The bound phages were detected with the HRP-conjugated anti-M13 antibody and TMB substrate.
- PEG polyethylene glycol
- TyrRS mutants resulting from three cycles of panning.
- the phage-displayed TyrRS library was generated by error-prone PCR. The library was incubated with immobilized peptide containing N-terminus tyrosine and C-terminal biotin connected with 11 PEG units. After three cycles, individual clones were isolated and sequenced. Their binding to immobilized N-terminal tyrosine and leucine (a control) peptides was measured in the same way as described for Figure 5B.
- the cartoon demonstrates an aspect of library screening. The table provides an indication of amino acid changes in the TyrRS and demonstrates distinct substrate differences that correlate with the amino acid changes.
- FIG. 7 Graphical depiction of data obtained from binding assays of scFv towards N- and C-terminal tyrosine.
- a naive phage-display scFv library was selected for binding with an immobilized peptide containing N-terminal tyrosine. Three cycles were performed. The enriched clones were tested with ELISA as described herein. Clone p807/C2 shows specific binding to N-terminal tyrosine.
- Figure 8 Graphical depiction of results obtained from PheRS screening performed generally as outlined in Figure 5B and its description. Two clones (A1 with M178I and D229V mutations; C1 with an L227R mutation) show binding preference to N-terminus Phe peptides over control N-terminus Leu peptides (A1, 1.33 fold; C1, 1.96 fold).
- Figure 9 Graphical depiction of results obtained from LeuRS Screening performed generally as outlined in Figure 5B and its description, but with LeuRS.
- One clone (D2 with an I496F mutation) shows binding preference to N-terminus Leu peptides over control N-terminus Phe peptides (1.54 fold).
- the disclosure includes all polynucleotide and amino acid sequences described herein, and every polynucleotide sequence referred to herein includes its complementary DNA sequence, and also includes the RNA equivalents thereof to the extent an RNA sequence is not given. Every DNA and RNA sequence encoding polypeptides disclosed herein is encompassed by this disclosure.
- compositions and reagents designed for determining amino acid sequences of a plurality of polypeptides comprise novel binding partners that have specificity for N-terminal amino acids.
- the method comprises obtaining a plurality of polypeptides, binding detectably labeled N-terminal amino acid binding agents to the N-terminal amino acid of the polypeptides, detecting the N-terminal amino acid binding agents to identify the N-terminal amino acid, removing the N-terminal amino acid binding agents, liberating the N-terminal amino acid to reveal the next amino acid in the polypeptides in the N->C terminal direction, and repeating the process to determine some or all of the amino acid sequence of the polypeptides. This process, and reagents suitable for use in performing it, are described further below, in part with reference to Figure 1.
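The bind, detect, remove, and liberate cycle described above can be pictured as a simple loop over immobilized peptides. The sketch below is a toy simulation under strong assumptions: detection is treated as error-free, every cleavage step is complete, and the function name, spot identifiers, and example sequences are hypothetical rather than taken from the disclosure.

```python
def sequence_peptides(peptides, n_cycles):
    """Simulate the iterative N-terminal read-out for immobilized peptides.

    `peptides` maps a spot identifier to the peptide's true sequence (N->C).
    Each cycle reads the current N-terminal residue (standing in for binding
    and detection of a labeled binding agent), then removes that residue
    (standing in for a cleavage step such as Edman degradation).
    """
    remaining = dict(peptides)               # residues still attached per spot
    reads = {spot: "" for spot in peptides}  # sequence called so far per spot
    for _ in range(n_cycles):
        for spot, seq in list(remaining.items()):
            if not seq:                      # peptide fully consumed
                continue
            reads[spot] += seq[0]            # detect the N-terminal amino acid
            remaining[spot] = seq[1:]        # liberate it to expose the next one
    return reads


# Hypothetical example: two spots, three sequencing cycles.
example_reads = sequence_peptides({"spot1": "FLYK", "spot2": "MAK"}, 3)
print(example_reads)
```

Running more cycles than a peptide has residues simply stops extending that peptide's read, which mirrors a spot going dark once the peptide is consumed.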
- proteins are separated from a sample using any suitable approaches, and denatured using conventional techniques.
- they may be modified, such as by being alkylated on cysteine residues, and digested by any one or a combination of known proteases, including but not necessarily limited to trypsin. If trypsin is used, the average peptide length will be from 7-25 residues.
- peptides analyzed according to the present disclosure can in certain embodiments comprise or consist of between 7-25 amino acid residues, inclusive, and including all integers and ranges there between.
- sequencing 7-10 residues is expected to be sufficient to identify a unique peptide.
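To illustrate why a short N-terminal read can be enough to tell peptides apart, the sketch below digests a sequence in silico and measures what fraction of peptides have a unique prefix of a given length. It is an illustration only: the cleavage rule (after K or R, but not before P) is a common simplification of trypsin specificity, and the function names and toy sequences are assumptions, not data from the disclosure.

```python
import re
from collections import Counter


def trypsin_digest(protein):
    """Split after K or R unless the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def unique_prefix_fraction(peptides, read_length):
    """Fraction of peptides whose first `read_length` residues are unique."""
    prefixes = [p[:read_length] for p in peptides]
    counts = Counter(prefixes)
    unique = sum(1 for prefix in prefixes if counts[prefix] == 1)
    return unique / len(prefixes) if prefixes else 0.0


# Hypothetical toy inputs.
fragments = trypsin_digest("MKAARPLLKGG")
toy_peptides = ["AAGLK", "AAGTR", "MFLK"]
print(fragments)
print(unique_prefix_fraction(toy_peptides, 3))
print(unique_prefix_fraction(toy_peptides, 4))
```

In the toy set, two peptides share their first three residues, so reading one residue further is what separates them; a real proteome would need a real sequence database to estimate the required read length.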
- the compositions and methods are adapted for the identification and quantification of peptides/proteins regardless of their relative abundance in any particular sample or set of samples.
- the compositions and methods are adapted for concurrent or sequential sequencing of a plurality of proteins ranging from two proteins, up to and including a billion proteins, or more.
- by determining amino acid sequences the amount of one or more distinct peptides in a plurality of proteins is determined.
- the disclosure includes but is not necessarily limited to using amino acid sequence determination to measure the abundance of one or more distinct proteins relative to other proteins in a sample, and/or absolute peptide quantification, such as by determining the mass, molarity, or number of any one or more distinct polypeptides in a sample, or from more than one sample.
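Because each immobilized molecule yields its own read, relative quantification can in principle be reduced to counting identical reads. The following is a minimal sketch of that idea, assuming error-free reads; the function name and example reads are hypothetical.

```python
from collections import Counter


def relative_abundance(reads):
    """Return each distinct sequence's share of all single-molecule reads."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


# Hypothetical example: four reads from four immobilized molecules.
shares = relative_abundance(["FLYK", "FLYK", "FLYK", "MAK"])
print(shares)
```

Absolute quantification would additionally require knowing what fraction of the sample was immobilized and read, which this sketch does not model.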
- one or more low abundance proteins can be determined to have been present in a sample that is analyzed according to methods of the present disclosure.
- Polypeptides analyzed using the compositions and methods of this disclosure can be from any source that contains or is expected to contain polypeptides.
- Embodiments of the disclosure are adaptable for use in, for example, large-scale, massively parallel peptide sequencing.
- the present disclosure has broad applicability for use in, for example, identification of low-abundance polypeptides from a diversity of sources, wherein the polypeptides may have significance in a wide variety of areas, including but not necessarily limited to the human and veterinary health areas, diagnostics and therapeutics, forensics, agricultural products and processes, food science-based technologies, microbiomics, proteomics-based analysis of whole organisms, organs, systems, tissues, microorganisms and viruses, including but not limited to pathogens, biofilms, and cell populations, including but not limited to cultured cells or cells obtained or derived from an organism, or populations of cells enriched for one or more cell types, and biological fluids which include but are not limited to mucosa, serum, blood, lymph, urine, cerebrospinal fluid, semen, saliva, tears, and in any other composition of matter in which the identification of proteins would be desirable.
- the disclosure includes analyzing the sequence of a plurality of proteins from a test sample, and comparing one or more proteins analyzed in the first sample to any suitable reference. By comparing results obtained by analyzing one or more proteins in the first sample and comparing the results to a reference, a difference in the test sample can be identified and used to characterize the test sample.
- the disclosure includes determining a plurality of protein sequences from a biological sample obtained or derived from an individual. By comparing the analysis of one or more proteins from the test sample to the reference, such as a suitable control sample or a standardized reference, one or more differences in the test sample relative to the reference can be identified.
- Such differences if present may be used for a variety of purposes, such as phenotyping, or diagnosing or aiding in a physician's diagnosis of a condition or disorder, for staging a disease and/or making a prognosis, for making a treatment recommendation, for monitoring the progress of a treatment, or for identification of a source of one or more of the polypeptides identified by performance of the method.
- the method is used for analysis of a sample obtained or derived from an individual who is at risk for, is suspected of having, or has been diagnosed with a condition or disorder, non-limiting examples of which include any form of cancer, an infectious disease, an auto-immune disorder, a muscular or neuromuscular disorder, a non-cancer blood disorder, a disorder confined to a particular organ or tissue, or any other condition wherein a protein biomarker may be present.
- the compositions and methods are used to identify novel polypeptides that are, for example, previously unknown proteins, isoforms, splice variants, and the like.
- the binding partners that have specificity for N-terminal amino acids can be any suitable moiety, compound or composition of matter that can distinguish one of the 20 naturally occurring amino acids from the others.
- the disclosure includes single and combinations of novel N-terminal amino acid binding partners, methods of using them, as well as kits that comprise them.
- the novel N-terminal binding partners may be combined with, for example, previously available or naturally occurring N-terminal amino acid binding partners that, given the benefit of this disclosure, will be recognized by those skilled in the art as being adaptable for certain aspects of the protein sequencing approaches described further herein.
- selective evolution of binding agents is employed to provide improved N-terminal amino acid binding reagents.
- the N-terminal amino acid binding agents are abbreviated as "NAA" from time to time in this disclosure.
- the disclosure includes novel N-terminal amino acid binding agents that are modified or otherwise engineered proteins.
- protein-based binding agents are developed by modifying naturally occurring proteins that bind amino acids with specificity, such as aminoacyl-tRNA synthetases (referred to herein from time to time as "RS"), and/or amino-acid binding fragments thereof.
- the RS fragments comprise or consist of the amino acid binding pocket of the RS, with modifications thereof. Any RS of any origin can be adapted for use in the present disclosure.
- the RS is a prokaryotic RS, or a eukaryotic RS.
- the RS is a class I or a class II RS.
- a combination of an RS and a tRNA can be used.
- the immunological N-terminal binding agent can comprise antibody Fab fragments, Fab' fragments, F(ab')2 fragments, Fv fragments, scFv fragments, antibody-based aptamers, nanobodies, llama bodies, diabodies, or any other N-terminal amino acid binding portion of an immunological molecule so long as adequate complementarity determining regions (CDRs) are included to achieve the requisite specificity.
- CDRs complementarity determining regions
- the disclosure provides N-terminal amino acid binding agents in the form of peptide aptamers which generally comprise a variable peptide loop attached at both ends to a protein scaffold, wherein the combination of the loop and the scaffold imparts N-terminal amino acid binding specificity.
- nucleic acid-based novel N-terminal amino acid binding agents include but are not limited to RNA, DNA, hybrid RNA/DNA molecules, or XNA as a modified form of a polynucleotide.
- these agents comprise single-stranded DNA or RNA (ssDNA or ssRNA) molecules provided as aptamers.
- the modified nucleic acids are altered to, for example, be resistant to degradation, or for enhanced amino acid recognition, or a combination thereof.
- Modified nucleic acids include but are not necessarily limited to polynucleotides which comprise modified nucleotides, and/or modified phosphodiester linkages.
- polynucleotide-based N-terminal amino acid binding agents can comprise, in non-limiting embodiments, an inter-nucleoside linkage that is an alkylphosphonate, phosphorothioate, phosphorodithioate, phosphate ester, alkylphosphonothioate, phosphoramidate, carbamate, carbonate, morpholino, phosphate triester, acetamidate, and/or carboxymethyl ester, or combinations thereof.
- the polynucleotide based N-terminal amino acid binding agents can comprise modified nucleotides.
- the modifications will generally be at the 2' position of the ribose and include but are not limited to 2'-O-methyl, 2'-O-(2-methoxyethyl), and 2'-O-(2-aminopropyl) modifications, and combinations thereof.
- the nucleic acid-based novel N-terminal amino acid binding agents are selected from aptamers, ribozymes, or modified riboswitches.
- a modified lysine riboswitch which is a lysine-binding RNA molecule that regulates lysine biosynthesis is used.
- a combination of a peptide and polynucleotide is used as an N-terminal amino acid binding agent.
- the N-terminal amino acid binding agent comprises a polynucleotide
- the presence of the particular amino acid at the N-terminus can be established by determining all or part of the sequence of the polynucleotide.
- polynucleotides that are disclosed herein, all amino acid sequences, and polynucleotides encoding the peptide sequences are encompassed in this disclosure.
- all methods of making the N-terminal amino acid binding agents, polynucleotides encoding the protein-based N-terminal amino acid binding agents, expression systems comprising those polynucleotides, and cell cultures comprising those expression vectors are also included in this disclosure.
- the disclosure includes expressing a recombinant, modified N-terminal amino acid binding agent in a cell culture, and separating the modified N-terminal amino acid binding agent from the cell culture to obtain an isolated and/or purified modified N-terminal amino acid binding agent.
- kits for use in performing one or more methods or steps of this disclosure comprise one or more containers, such as glass or plastic containers that can be sealed to contain, for example, one or more of the modified N-terminal amino acid binding agents.
- the kits can comprise up to 20 modified N-terminal amino acid binding agents provided in one or more of the containers.
- the binding agents can be provided in solution, such as in a stabilized buffer, or as a frozen solution, or they can be provided in a dry form to be reconstituted for use in the method.
- the kits can further include a solid substrate, such as planar surface, or a spherical surface such as a plurality of beads, for use in the method.
- the solid substrate can be modified such that it is suitable for binding the C-termini of a plurality of polypeptides, as will be further outlined below.
- the kits can also include reagents for use in fixing polypeptides to the substrate, and/or for labeling and/or detecting the binding agents. Further, the binding agents may be provided pre-labeled for use in the method, and if desired segregated into groups of binding agents having distinct labels for stepwise determination of polypeptide sequences.
- the kits can include printed material providing instructions for using the modified N-terminal amino acid binding agents in a method of this disclosure.
- Figure 1 The principle behind the single molecule polypeptide sequencing of this disclosure is illustrated in Figure 1, which provides a non-limiting and illustrative example of one embodiment.
- a plurality of peptides from enzymatic digestion of a sample can be randomly immobilized on a glass surface through their C-terminus amino acids.
- the N-terminus amino acids of these peptides are bound to detectably labeled proteins, each of which recognizes one of the 20 NAAs with high affinity and specificity.
- the proteins bind to the NAA with a nanomolar affinity.
- an exemplary minimum off-rate is less than 10^-1 to 10^[…] sec^-1.
- 20 distinct N-terminus detectably labeled amino acid binding partners are used which allows for detection of the N-terminus of every bound peptide.
- fewer than 20 distinct N-terminus detectably labeled amino acid binding partners are used, and sequential detection steps are employed.
- the binding partners are fluorescently labeled, and one or more fluorescent images of single-molecule resolution are acquired by a CCD camera (illustrative Step 1).
- alternative detection methods can be used, such as fluorometers, scanning lasers, and microfluidic based imaging devices.
- NAAs of peptides are removed by employing any suitable technique, such as the well-known Edman degradation chemistry to expose a new set of NAAs (illustrative Step 2). This procedure is repeated for multiple cycles. Finally, all acquired fluorescent images (or other signals) are assembled by, for example, a computer to identify the sequences of all peptides. As will be recognized by those skilled in the art, the iterative sequence approach of this disclosure can be integrated with a variety of devices for detection of the NAAs, and for separating the labeled NAAs from the bound peptides.
- the peptides to be sequenced are placed in physical association with a substrate, illustrated as glass in Figure 1.
- the peptides may be covalently or non-covalently attached to the substrate, and in either case the attachment may be reversible or irreversible.
- the peptides are attached via their C-terminus, and are either attached directly to the substrate, or attached by an intermediate, such as a linking group, a functionalized group, or any other suitable composition of matter that will keep the peptide in place to perform the method of the invention.
- a linking group may be monofunctional, specifically binding a peptide to the surface, or bifunctional to include an additional moiety that could be used in proximity ligation-type assays that would increase the discrimination of a true NAA binding signal over non-specific background noise.
- the substrate, the peptide, or both may be functionalized to facilitate peptide attachment. The substrate may be blocked prior to attaching the peptides to reduce artifacts.
- proteins can be extracted from cells and digested with proteases such as trypsin into short peptides.
- cysteine-containing peptides can be specifically conjugated onto glass coated with a bromoacetyl group.
- the C-terminus lysine of trypsin-digested peptides can be selectively immobilized on glass derivatized with a methylisourea functional group (see, e.g., Figure 4A).
- it is possible for peptides to adsorb nonspecifically to an untreated glass surface, leaving their N-termini inaccessible. This can be reduced by, for example, using well-established surface derivatization methods to passivate the surface against non-specific protein adsorption.
- coating a glass surface with polyethylene glycol can significantly reduce nonspecific binding and has been widely used in microarray-based technologies.
- the use of the high-affinity NAA-binding proteins described herein will also allow stringent washing steps to remove non-specifically bound NAA-binding proteins, if desired.
- peptides will be immobilized at an average density of 4 peptides per 1 µm² by controlling the density of available functional groups on the glass surface. The distance between two peptides (500 nm) is large enough for visualization by a conventional optical system, while a 2 cm x 2 cm surface can hold up to 1.6 billion peptides.
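The capacity figure above follows from simple geometry: a 500 nm spacing on a square grid gives 4 peptides per square micrometer, and a 2 cm x 2 cm surface covers 4 x 10^8 square micrometers. The sketch below reproduces that arithmetic; the square-grid layout and the function names are assumptions made for illustration.

```python
def peptides_per_um2(spacing_nm):
    """Density of a square grid with the given spacing between neighbors."""
    spacing_um = spacing_nm / 1000.0     # 1 µm = 1000 nm
    return 1.0 / spacing_um ** 2


def surface_capacity(side_cm, density_per_um2):
    """Number of peptides on a square surface of the given side length."""
    side_um = side_cm * 1e4              # 1 cm = 10,000 µm
    return side_um ** 2 * density_per_um2


density = peptides_per_um2(500)          # peptides per square micrometer
capacity = surface_capacity(2, density)  # peptides on a 2 cm x 2 cm surface
print(density, capacity)
```

The two printed values are 4.0 and 1600000000.0, matching the 4 peptides per square micrometer and 1.6 billion peptides stated in the text.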
- the N-terminal amino acid binding agents of this disclosure may be modified so that they are detectably labeled. Any moiety or other compound or composition that is capable of producing a detectable signal is included in this disclosure.
- the detectable label produces a fluorescent signal as generally outlined in Figure 1.
- the N-terminal amino acid binding agents are adapted for detecting by way of proton or photon release assay, or a fluorescence detection assay such as Fluorescence Resonance Energy Transfer (FRET).
- FRET Fluorescence Resonance Energy Transfer
- NAA-binding partners are fused with fluorescent proteins, e.g., green fluorescent protein and its derivatives, red fluorescent protein and its derivatives, yellow fluorescent proteins, and others known in the art.
- the labels comprise bright fluorophores that have adequate photostability, high quantum yields, and narrow emission spectra so that they can be used for certain embodiments that employ multicolor detection.
- the detectable label will comprise one or more organic dyes.
- the four fluorescent dyes used in well-known Sanger DNA sequencing methods can be used. These labels display high quantum yields, reasonable photostability, and well-resolved emission spectra for four-color detection. In this non-limiting approach, 20 NAA-binding proteins will be equally divided into five groups and four proteins in each group will be labeled by these organic dyes respectively, thus meeting both the practical limit in number of available fluorophores as well as channels for single molecule detection.
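The five-groups-of-four scheme above means that a residue is identified by two pieces of information: which group of binding proteins was applied, and which of the four colors was seen. The sketch below shows one way such a lookup could work. The assignment of amino acids to groups and dyes is arbitrary, and the dye names, function names, and the rule of requiring exactly one signal are hypothetical choices for illustration, not the scheme of the disclosure.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"        # the 20 standard residues
DYES = ["dye1", "dye2", "dye3", "dye4"]     # placeholder names for four dyes


def build_panel(amino_acids=AMINO_ACIDS, dyes=DYES):
    """Map (group index, dye) -> amino acid, with one dye per binder in a group."""
    panel = {}
    for i, aa in enumerate(amino_acids):
        group, dye_idx = divmod(i, len(dyes))
        panel[(group, dyes[dye_idx])] = aa
    return panel


def call_residue(panel, observations):
    """Identify the N-terminal residue at one spot for one sequencing cycle.

    `observations` lists, for each group applied in order, the dye detected
    at the spot, or None when no binder from that group bound.
    """
    hits = [(g, dye) for g, dye in enumerate(observations) if dye is not None]
    if len(hits) != 1:
        return None                         # nothing bound, or ambiguous signal
    return panel[hits[0]]


panel = build_panel()
called = call_residue(panel, [None, "dye2", None, None, None])
print(len(panel), called)
```

With this arbitrary assignment, a "dye2" signal in the second group corresponds to glycine; returning None for zero or multiple signals is one conservative way to flag cycles that need review.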
- embodiments of the disclosure include detecting up to 10 NAA-binding proteins in a single cycle.
- suitable detectable fluorescent moieties include use of one or a combination of Acridine dyes, Cyanine dyes, Fluorone dyes, Oxazine dyes, Phenanthridine dyes, or Rhodamine dyes.
- the detectable labels are Xanthene derivatives, including but not limited to fluorescein, rhodamine, Oregon green, eosin, and Texas red; Cyanine derivatives, including but not limited to cyanine,
- in order to label NAA-binding proteins that are RS or modified RS, the proteins can be expressed so that they contain an unnatural azide group at a position away from the NAA-binding site.
- This approach can genetically incorporate an unnatural amino acid bearing the azide into the NAA-binding protein site-specifically. This azide will allow the NAA-binding proteins to be labeled with organic dyes containing a terminal alkyne group through bio-orthogonal "click" chemistry (the reaction between azide and alkyne).
- the labeled proteins can be purified using any suitable techniques, such as on immobilized antibodies against fluorescent dyes.
- the N-terminal amino acid binding agents can be detected by imaging a signal that can be interpreted by a machine, such as a camera or other detecting machine that is configured to function with, for example, a charge-coupled device (CCD), i.e., a CCD camera.
- CCD charge-coupled device
- a microfluidic fluorescence imager can be used.
- a confocal microscope coupled with a microfluidic liquid handling system can be adapted for use with the compositions and methods of this disclosure. Confocal imaging can detect fluorescence from a thin focal plane, so fluorescent signals of all peptides immobilized on the inner wall of a glass tube can be recorded by CCD cameras from different angles.
- a suitable microfluidic fluorescence imager can be adapted from, as one example, the single molecule DNA sequencer available from Helicos (www.helicosbio.com).
- The Helicos DNA sequencer uses a single laser (635 nm) for the detection of four nucleotides one by one.
- multiple lasers e.g.
- the disclosure includes a system for determining a plurality of polypeptide sequences.
- the system can comprise the NAAs, the substrate for immobilizing polypeptides, and/or one or more devices for detecting the NAAs when bound to the N-termini of immobilized peptides.
- the device(s) can be integrated with a digital processor and/or software to interpret the position, amount, frequency, etc.
- the system can include a microfluidic component, and/or a device for capturing images.
- the present disclosure also contemplates addressing sequencing errors at the single molecule level caused by the combination of several factors, such as incomplete Edman degradation and possible posttranslational modifications. For small posttranslational
- NAA-binding proteins can be evolved for their recognition and used in the approaches of this disclosure.
- the present disclosure comprises recording of the amino acid present at the N-terminus of a plurality of peptides over successive rounds of sequencing as described above, and fixing such amino acid sequences in a tangible medium of expression, such as a digitized file.
- the disclosure includes generating a report that comprises such amino acid sequences.
- the disclosure comprises conveying the report to a third party, for example, a party from which a protein-containing sample is received and analyzed according to the method of this disclosure.
- the disclosure includes receiving a protein-containing sample, analyzing the sample to determine amino acid sequences in the sample, generating a report describing the amino acid sequences, and communicating the amino acid sequences and/or the report to a party.
- representative examples of NAA-binding proteins are described below.
- additional agents will be derived from the catalytic domain of other RS by affinity selection.
- a representative example of an RS is shown in Figure 2, where the active site of a methionyl-tRNA synthetase is occupied by a Met residue. Both the α-amine and the side chain of Met are buried deeply, while its carboxylate is more exposed at the surface.
- these enzymes specifically interact with the NAA of a peptide, but not with the same residue embedded inside the peptide, because only the NAA has a free α-amine and therefore avoids a steric clash.
- a phage display library based on M13 bacteriophage, a bacterial virus composed of a circular single-strand DNA
- (Figure 3) when an exogenous gene is inserted into the M13 DNA, it can be expressed as a fusion protein on the surface of the M13 phage.
- when a library of genes is engineered into M13 DNA, a library of corresponding proteins can be produced, each of which is present on an individual phage surface as multiple copies and can be identified by sequencing the insertion in the phage DNA.
- the gene encoding the catalytic domain of the tyrosyl-tRNA synthetase, and its mutants at random positions, can be inserted into M13 DNA to make a library of tyrosyl-tRNA synthetases carrying random point mutations.
- This library can be incubated with isoleucine immobilized on solid support beads through its carboxylate. After unbound phages are washed away, the remaining phages are stripped from the beads, amplified by infecting bacterial cells, and subjected to the next round of selection.
- Free phenylalanine, which is quite similar to tyrosine, can be added into the incubation solution as a competing agent to remove phages capable of binding with N-terminus
- Flag-tagged (sequence: DYKDDDDK (SEQ ID NO: 10)) TyrRS-pIII, PheRS-pIII or LeuRS-pIII were serially diluted to several concentrations as indicated (pfu/mL, plaque-forming units/mL). Then 100 µL of the diluted phage solutions for each aaRS was used to coat individual wells of a 96-well plate, which were washed and then incubated with HRP-conjugated anti-Flag antibody. M13KO7 helper phage was included as a negative control. TMB substrate was added and the absorbance at 450 nm (OD450) was monitored to detect the level of Flag-RS-pIII fusion proteins on the M13 phage surface.
- wild-type enzymes may exhibit different degrees of specificity and affinity towards their respective N-terminal amino acids.
- when TyrRS, PheRS, and LeuRS expressed on the M13 phage surface were tested for their ability to interact with immobilized peptides with cognate N-terminal amino acids, they indeed displayed such variation.
- wild-type PheRS can specifically bind to N-terminal phenylalanine, but not leucine (Figure 5A), while neither TyrRS nor LeuRS showed such specific binding.
- the disclosure includes a modified PheRS comprising any one or any combination of these three mutations, or a Phe binding fragment thereof.
- LeuRS mutants The E. coli wild type LeuRS sequence is: MNNPGnSTSSARKAVLTRAFGLCYADLK HINATFVAVLKTGPLAAMQEQYRPEEIES K VQLHWDEKRTFE VTEDE SKEK Y YCL SMLP YP S GREHMGHVRN YTIGD VI AR YQRML GKNVLQPIGWDAFGLPAEGAAVKNNTAPAPWTYDNIAYMKNQLKMLGFGYDWSREL ATCTPEYYRWEQKFFTELYKKGLVYKKTSAVNWCP DQTVLA EQVIDGCCWRCDTK VERKEffQWFIKITAYADELL DLDKLDHWPDTVKTMQRNWIGRSEGVEITFNV DYD NTLTVYTTRPDTFMGCTYLAVAAGHPLAQKAAEN PELAAFIDECRNTKVAEAEMAT MEKKGVDTGFKAVHPLTGEEIPVWAA FVLMEYGT
- mutant D2 comprising the I496F change has enhanced Leu binding properties.
- the disclosure includes a modified LeuRS comprising this mutation, or a Leu binding fragment thereof.
- the scFv antibody, an engineered form of the native IgG antibody, has broad applications owing to its small size and ease of affinity selection.
- One of the clones, p807/C2, indeed exhibited preferential binding to N-tyrosine over C-tyrosine (Figure 7), indicating that phage-displayed scFv antibodies may complement aaRS as NAA-binding reagents.
- the CDR sequences of clone 807/C2 are: (L1) SGDALPKKYAY (SEQ ID NO:3); (L2) EDVKRLS (SEQ ID NO:4); (L3) YSNSKTGNYNV (SEQ ID NO:5); (H1) GYTFTDYWIS (SEQ ID NO:6); (H2) QIAMTNSATVYGPSFQG (SEQ ID NO:7); (H3) DYSDNYYNDTYS (SEQ ID NO:8).
- Binding partners including any one or any combination of these distinct amino acid sequences are encompassed in this disclosure, including but not limited to binding partners that include L1 and H1, L1 and H2, L1 and H3, L2 and H1, L2 and H2, L2 and H3, L3 and H1, L3 and H2, and L3 and H3.
- While the invention has been described through specific embodiments, routine modifications will be apparent to those skilled in the art and such modifications are intended to be within the scope of the present invention.
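The description above mentions recording the amino acid found at the N-terminus of each immobilized peptide over successive rounds and fixing the resulting sequences in a digitized file. A minimal Python sketch of that bookkeeping follows. It is illustrative only: the function names, the `spot_###` peptide identifiers, the JSON layout, and the use of `X` for a round with no detected binding agent are all assumptions, not part of the disclosure.

```python
import json

# Assumption: "X" marks a round in which no labeled binding agent was detected.
NO_CALL = "X"


def assemble_sequences(rounds):
    """Join per-round N-terminal calls into one N->C sequence per peptide.

    `rounds` is a list with one dict per sequencing round, mapping a
    hypothetical peptide identifier to the one-letter residue called
    at the N-terminus in that round.
    """
    peptide_ids = sorted({pid for calls in rounds for pid in calls})
    return {
        pid: "".join(calls.get(pid, NO_CALL) for calls in rounds)
        for pid in peptide_ids
    }


def write_report(sequences, path):
    """Fix the assembled sequences in a digitized file (JSON here)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"sequences": sequences}, fh, indent=2, sort_keys=True)


# Made-up calls for two immobilized peptides over three rounds.
rounds = [
    {"spot_001": "M", "spot_002": "F"},
    {"spot_001": "L", "spot_002": "Y"},
    {"spot_001": "F"},  # spot_002 produced no call in round 3
]
sequences = assemble_sequences(rounds)
```

Keeping an explicit no-call symbol preserves the round index of every residue, so a missed detection or an incomplete degradation step shows up as a gap rather than silently shifting the rest of the sequence.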
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Medicinal Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Biochemistry (AREA)
- Genetics & Genomics (AREA)
- Analytical Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Hematology (AREA)
- Physics & Mathematics (AREA)
- Urology & Nephrology (AREA)
- Biomedical Technology (AREA)
- Immunology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Cell Biology (AREA)
- Biotechnology (AREA)
- Microbiology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Food Science & Technology (AREA)
- General Physics & Mathematics (AREA)
- Pathology (AREA)
- Gastroenterology & Hepatology (AREA)
- Peptides Or Proteins (AREA)
Abstract
The present invention relates to compositions and methods for elucidating the amino acid sequences of a plurality of polypeptides in high-throughput approaches. The compositions comprise unique binding partners that have specificity for N-terminal amino acids. The methods involve using the unique binding partners in processes that comprise providing a plurality of polypeptides, binding detectably labeled N-terminal amino acid binding agents to the N-terminal amino acid of the polypeptides, detecting the N-terminal amino acid binding agents to identify the N-terminal amino acid for some or all of the polypeptides, removing the N-terminal amino acid binding agents, releasing the N-terminal amino acid to reveal the next amino acid in the polypeptides in the N->C terminal direction, and repeating the process to determine some or all of the amino acid sequence of the polypeptides. The approaches can be used to analyze polypeptides from a wide variety of sources, and to detect low-abundance proteins from any particular source.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201562144204P | 2015-04-07 | 2015-04-07 | |
| US62/144,204 | 2015-04-07 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2016164530A1 (fr) | 2016-10-13 |
Family
ID=57072895
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2016/026354 Ceased WO2016164530A1 (fr) | 2015-04-07 | 2016-04-07 | Compositions and methods for high throughput protein sequencing |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2016164530A1 (fr) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5807748A (en) * | 1994-07-08 | 1998-09-15 | City Of Hope | N-terminal protein sequencing reagents and methods which form amino acid detectable by a variety of techniques |
| US20030059937A1 (en) * | 2000-06-16 | 2003-03-27 | Ruben Steven M. | Antibodies that immunospecifically bind BLyS |
| US20100273178A1 (en) * | 2000-05-05 | 2010-10-28 | Purdue Research Foundation | Affinity selected signature peptides for protein identification and quantification |
| US20120070457A1 (en) * | 2010-09-10 | 2012-03-22 | J. Craig Venter Institute, Inc. | Polypeptides from neisseria meningitidis |
| US20120087861A1 (en) * | 2010-10-11 | 2012-04-12 | Roger Nitsch | Human Anti-Tau Antibodies |
Cited By (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11105812B2 (en) | 2011-06-23 | 2021-08-31 | Board Of Regents, The University Of Texas System | Identifying peptides at the single molecule level |
| US12379381B2 (en) | 2011-06-23 | 2025-08-05 | Board Of Regents, The University Of Texas System | Single molecule peptide sequencing |
| US11435358B2 (en) | 2011-06-23 | 2022-09-06 | Board Of Regents, The University Of Texas System | Single molecule peptide sequencing |
| US11162952B2 (en) | 2014-09-15 | 2021-11-02 | Board Of Regents, The University Of Texas System | Single molecule peptide sequencing |
| US12235276B2 (en) | 2016-05-02 | 2025-02-25 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US12019077B2 (en) | 2016-05-02 | 2024-06-25 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US11959922B2 (en) | 2016-05-02 | 2024-04-16 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US12320813B2 (en) | 2016-05-02 | 2025-06-03 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US12123878B2 (en) | 2016-05-02 | 2024-10-22 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US12019078B2 (en) | 2016-05-02 | 2024-06-25 | Encodia, Inc. | Macromolecule analysis employing nucleic acid encoding |
| US12130291B2 (en) | 2017-10-31 | 2024-10-29 | Encodia, Inc. | Kits for analysis using nucleic acid encoding and/or label |
| US11782062B2 (en) | 2017-10-31 | 2023-10-10 | Encodia, Inc. | Kits for analysis using nucleic acid encoding and/or label |
| US12467928B2 (en) | 2017-10-31 | 2025-11-11 | Encodia, Inc. | N-terminal modifier agents and binders for treating and analyzing peptides |
| US12292446B2 (en) | 2017-10-31 | 2025-05-06 | Encodia, Inc. | Kits for analysis using nucleic acid encoding and/or label |
| US12196760B2 (en) | 2018-07-12 | 2025-01-14 | Board Of Regents, The University Of Texas System | Molecular neighborhood detection by oligonucleotides |
| US12000835B2 (en) * | 2018-11-15 | 2024-06-04 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US12174196B2 (en) | 2018-11-15 | 2024-12-24 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US12259391B2 (en) * | 2018-11-15 | 2025-03-25 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US11959920B2 (en) | 2018-11-15 | 2024-04-16 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US12360114B2 (en) | 2018-11-15 | 2025-07-15 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US12055548B2 (en) | 2018-11-15 | 2024-08-06 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| US11634709B2 (en) | 2019-04-30 | 2023-04-25 | Encodia, Inc. | Methods for preparing analytes and related kits |
| US11169157B2 (en) | 2020-01-07 | 2021-11-09 | Encodia, Inc. | Methods for stable complex formation and related kits |
| US12065466B2 (en) | 2020-05-20 | 2024-08-20 | Quantum-Si Incorporated | Methods and compositions for protein sequencing |
| WO2024150012A1 (fr) * | 2023-01-12 | 2024-07-18 | SMI Drug Discovery Limited | Detection and analysis of analytes |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2016164530A1 (fr) | | Compositions and methods for high throughput protein sequencing |
| US20240003892A1 (en) | | Heterogeneous single cell profiling using molecular barcoding |
| JP7256860B2 (ja) | | Improved assay methods |
| ES2901952T3 (es) | | Improved assay methods |
| JP7578294B2 (ja) | | Methods for spatial analysis and related kits |
| WO2003062402A2 (fr) | | Use of collections of binding sites for sample profiling and other applications |
| AU2020261334A1 (en) | | Methods for spatial analysis of proteins and related kits |
| JP2018154627A (ja) | | Screening methods and uses thereof |
| Wang et al. | | Highly paralleled emulsion droplets for efficient isolation, amplification, and screening of cancer biomarker binding phages |
| CN101105493B (zh) | | Method for detecting protein interactions by protein-chip-based co-immunoprecipitation, and a protein interaction detection kit |
| US20210349080A1 (en) | | Multi-faceted method for detecting and analyzing target molecule by molecular aptamer beacon (mab) |
| US20220120745A1 (en) | | Cell-free biofragment compositions and related systems, devices, and methods |
| CN101539573A (zh) | | High-throughput visual chip detection method for protein-protein interactions, and a protein-protein interaction detection kit |
| Lim et al. | | Correlated matrix‐assisted laser desorption/ionization mass spectrometry and fluorescent imaging of photocleavable peptide‐coded random bead‐arrays |
| WO2017059239A1 (fr) | | Methods for processing biopolymeric arrays |
| Haraszti et al. | | Comparative colocalization single-molecule spectroscopy (CoSMoS) with multiple RNA species |
| CN100587076C (zh) | | Method for preparing a visualizable biochip |
| Soloviev et al. | | Combinatorial peptidomics: a generic approach for protein expression profiling |
| US20250306015A1 (en) | | Single molecule-resolved characterization of affinity reagent kinetics and thermodynamics |
| US20230193245A1 (en) | | Methods and compositions for making and using peptide arrays |
| WO2025160469A1 (fr) | | Methods for profiling immunoglobulin repertoires |
| HK1214862B (en) | | Cell-free biofragment compositions and related systems, devices, and methods |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16777245; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16777245; Country of ref document: EP; Kind code of ref document: A1 |