WO2024235913A1 - Calibration protein - Google Patents
Calibration protein
- Publication number
- WO2024235913A1 (PCT/EP2024/063101)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- citrate synthase
- functional fragment
- protein
- synthase protein
- acid sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1025—Acyltransferases (2.3)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y203/00—Acyltransferases (2.3)
- C12Y203/03—Acyl groups converted into alkyl on transfer (2.3.3)
- C12Y203/03001—Citrate (Si)-synthase (2.3.3.1)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/20—Fusion polypeptide containing a tag with affinity for a non-protein ligand
- C07K2319/21—Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
Definitions
- the present invention concerns the provision of a molecular weight standard for various apparatus and techniques that determine, inter alia, molecular weight.
- molecular weight standards contain a diverse range of unknown proteins, ranging in molecular weight from a few thousand to hundreds of thousands of Daltons.
- the present invention seeks to simplify the provision of molecular weight standards, by supplying a single protein species, that has the ability to form multiple distinct complexation states (i.e., dimers, trimers and/or tetramers), even at relatively low concentrations in solution.
- Such a simplified provision enables calibration of apparatus to be performed, reduces costs, and provides a dedicated, appropriate and robust standard for newer techniques such as mass photometry.
- the invention relates to ancestral citrate synthase proteins which are capable of forming at least 2, at least 3, at least 4 or at least 5 different complexation states simultaneously.
- the invention further relates to compositions comprising the ancestral citrate synthase proteins, use of the ancestral citrate synthase proteins or compositions, nucleic acids encoding the ancestral citrate synthase proteins, and methods of purifying the ancestral citrate synthase proteins.
- Mass photometry is an example of an interferometric scattering microscopy technique (WO2018/011591) which can be used to determine the molecular mass of proteins, protein complexes and other biomolecules in solution in a single molecule approach; it is a novel technique that has only recently been established commercially.
- the measured signal is the interferometric contrast created from the light scattered by the biomolecule of interest and the light reflected by the measurement surface.
- the measured contrast directly correlates with the molecular mass (Young, G., et al.).
- Translation of the signal into the molecular weight of a biomolecule requires a calibration step that has to be performed with samples of varying and known masses.
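The calibration step described above can be sketched as a straight-line fit. This is a minimal illustration, assuming the contrast-to-mass relation is linear; the calibrant masses and contrast values are hypothetical placeholders rather than measured data, and the function names are introduced here for illustration only.

```python
def fit_linear_calibration(masses_kda, contrasts):
    """Least-squares fit of contrast = slope * mass + intercept (linearity is assumed)."""
    n = len(masses_kda)
    if n < 2 or n != len(contrasts):
        raise ValueError("need at least two (mass, contrast) pairs of equal length")
    mean_m = sum(masses_kda) / n
    mean_c = sum(contrasts) / n
    sxx = sum((m - mean_m) ** 2 for m in masses_kda)
    if sxx == 0:
        raise ValueError("calibrant masses must not all be identical")
    sxy = sum((m - mean_m) * (c - mean_c) for m, c in zip(masses_kda, contrasts))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_m
    return slope, intercept


def contrast_to_mass(contrast, slope, intercept):
    """Invert the fitted line to estimate the mass (kDa) behind a measured contrast."""
    return (contrast - intercept) / slope


# Hypothetical calibrant masses (kDa) and contrasts, for illustration only.
calibrant_masses = [100.0, 200.0, 300.0, 400.0]
calibrant_contrasts = [0.004, 0.008, 0.012, 0.016]

slope, intercept = fit_linear_calibration(calibrant_masses, calibrant_contrasts)
estimated_mass = contrast_to_mass(0.010, slope, intercept)
```

Once the line is fitted to calibrants of known mass, any later contrast reading can be converted to a mass estimate with `contrast_to_mass`.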
- the currently used calibration standard is not specifically dedicated or optimised for mass photometry, but reuses a commercially available protein mixture used for gel electrophoresis, such as NativeMark™ Unstained Protein Standard (Invitrogen™ LC0725).
- This standard contains a mixture of multiple undisclosed proteins of different sizes at undisclosed concentrations, and its use is fairly cost-intensive.
- the present invention concerns a single protein that forms multiple distinct complexation states in solution. The resulting protein populations (populations of complexation states) are stable and have discrete molecular weight differences, which can be detected and distinguished by mass photometry.
- the invention enables reproducible and convenient mass calibration with a single protein amenable to large scale production in heterologous host bacteria.
- the protein itself is a citrate synthase, but its amino acid sequence is not naturally found in an extant organism. It is an ancient representative of this enzyme family: an amino acid sequence that has been inferred by ancestral sequence reconstruction, a method that uses related amino acid sequences to resurrect ancestral proteins.
- Natural citrate synthases are known to form complexes. For example, eukaryotes (in their mitochondria) and Gram-positive bacteria use Type I citrate synthases, which form dimers, whereas Gram-negative bacteria use Type II citrate synthases, which form hexamers.
- the monomers form only their designated multimer, and not a variety of stable multimers.
- In Schmidtmann et al., Arabidopsis citrate synthase was analysed. When all mitochondrial proteins were studied together, Blue-Native PAGE showed the presence of a dimer (100 kDa) and a higher molecular weight complex of up to 1000 kDa.
- the mitochondrial extract was then further analysed via gel-filtration chromatography, followed by Western blotting.
- recombinant CS4 was analysed. Whilst the concentration of the proteins was not confirmed, the amounts used were 100 µg (recombinant CS4) and 180 µg (mitochondrial extract) and the maximum loading volume for the column used is 500 µL; thus the concentrations far exceed those used for mass standard purposes. Looking at the isolated protein, the authors concluded that the recombinant protein could form dimers (noted as the most abundant form) and oligomeric complexes.
- citrate synthase proteins of the invention produce a series of defined and stable complexation states even at low protein concentrations, such as 10 nM.
- the individual complexation states each occur in adequate abundance and are sufficiently different in size to be detected and distinguished in mass photometry measurements.
- the difference in mass may be at least 25 kDa - the difference being measured between complexation states that differ by one monomeric unit (i.e., monomer to dimer).
- the resulting mass calibrations reproducibly yield a mass error of about 1 %, which is at least as good as the commercial protein standard.
- the citrate synthase protein provided herein is also comparatively more stable for longer times at -20 °C and after repeated freeze-thaw cycles. It has even been shown to be sufficiently stable to be stored at 4 °C for several weeks, which is not the case for the commercial protein mixture. The abundance of the different mass species and their mass differences are significantly better optimised for calibration of mass photometers than those of the commercial protein mixture.
- the protein can also be used as a molecular weight standard for other, non-denaturing techniques such as Native polyacrylamide gel electrophoresis (Native PAGE).
- the invention comprises a two-step purification protocol for the protein comprising heterologous production in E. coli which results in high yields, purity and batch-to-batch consistency.
- the present invention provides a protein that can form multiple distinct complexation states simultaneously, and once formed, these complexation states are maintained, providing a heterologous mixture of complexes, each with a distinct molecular mass. Such is ideal for the calibration of instruments, apparatus, and techniques that determine mass of biomolecules and the like.
- the invention provided herein is a citrate synthase protein or functional fragment thereof, encoded by a nucleic acid sequence, wherein the nucleic acid is an ancestral gene and wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, having an amino acid sequence at least 75%, 80%, 85%, 90% or 95% identical to a sequence selected from SEQ ID NOs: 1-7 wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, having an amino acid sequence with at least 75%, 80%, 85%, 90% or 95% sequence similarity to a sequence selected from SEQ ID NOs: 1-7 wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, comprising a sequence selected from any of SEQ ID NOs: 1-7.
- the invention provided herein is a citrate synthase protein comprising a sequence, or functional fragment thereof, comprising greater than 75%, 80%, 85%, 90% or 95% sequence identity to a sequence selected from SEQ ID No. 15-21 wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, comprising a sequence comprising greater than 75%, 80%, 85%, 90% or 95% sequence similarity to a sequence selected from SEQ ID No. 15-21 wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, comprising a sequence selected from any of SEQ ID NOs: 15-21.
- the invention provided herein is a citrate synthase protein, or functional fragment thereof, comprising at least 1 ancestral mutation wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention provided herein is a composition comprising a citrate synthase protein, or functional fragment thereof, according to any previous aspect wherein the concentration of the citrate synthase protein, or functional fragment thereof, is at least 5 nM, preferably between 5 nM and 100 µM.
- the citrate synthase according to any definition of the invention is capable of forming at least 3 complexation states simultaneously.
- Each of these three complexation states is present in the solution in a sufficient amount to permit detection.
- each of the three complexation states may comprise at least 15% of the total citrate synthase protein complexes present.
- the citrate synthase according to any definition of the invention is capable of forming at least 4 complexation states simultaneously.
- Each of these four complexation states is present in the solution in a sufficient amount to permit detection.
- each of the four complexation states may comprise at least 5% of the total citrate synthase protein complexes present.
- the most abundant complexation state may preferably comprise less than 50% of the total citrate synthase present.
- the most abundant (most common, predominant) complexation state may preferably comprise less than 45%, less than 40% or less than 35% of the total citrate synthase present.
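The abundance thresholds listed above can be expressed as a simple check. This is a sketch only: the function name and the example distributions are hypothetical, while the thresholds (at least 15% each for three states, at least 5% each for four states, and a most abundant state below 50%) are taken from the text.

```python
def meets_abundance_criteria(fractions, n_states, min_fraction, max_dominant=0.5):
    """Check a distribution of complexation states against abundance thresholds.

    fractions    -- fraction (by number) of each complexation state
    n_states     -- how many states must reach min_fraction
    min_fraction -- minimum fraction each of those states must reach
    max_dominant -- the most abundant state must stay below this fraction
    """
    if not fractions:
        raise ValueError("fractions must not be empty")
    qualifying = [f for f in fractions if f >= min_fraction]
    return len(qualifying) >= n_states and max(fractions) < max_dominant


# Hypothetical distributions, for illustration only.
example = [0.30, 0.28, 0.22, 0.12, 0.08]
three_state_ok = meets_abundance_criteria(example, n_states=3, min_fraction=0.15)
four_state_ok = meets_abundance_criteria(example, n_states=4, min_fraction=0.05)
dominated = meets_abundance_criteria([0.70, 0.16, 0.14], n_states=3, min_fraction=0.15)
```

The last call fails on both counts: only two states reach 15%, and one state exceeds 50% of the total.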
- At least 3 distinct complexation states are formed in solution, and these are sufficiently distinct to enable the use of the protein as a mass standard. This can be achieved with a difference of at least 25 kDa between each of the complexation states.
- a monomer could be 50 kDa, a dimer 100 kDa, a trimer 150 kDa and so on.
- the difference may be measured between complexation states that differ by one monomeric unit (i.e., monomer to dimer, tetramer to pentamer and so on).
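The mass ladder in the example above (50 kDa monomer, 100 kDa dimer, 150 kDa trimer) and the 25 kDa spacing requirement can be sketched as follows. The helper names are hypothetical, and the 50 kDa monomer is the illustrative figure from the text rather than the mass of any specific sequence.

```python
def oligomer_masses(monomer_kda, max_units):
    """Map the number of monomeric units to the expected complex mass in kDa."""
    return {n: n * monomer_kda for n in range(1, max_units + 1)}


def min_adjacent_spacing(masses):
    """Smallest mass difference between neighbouring complexation states."""
    ordered = sorted(masses.values())
    if len(ordered) < 2:
        raise ValueError("need at least two complexation states")
    return min(b - a for a, b in zip(ordered, ordered[1:]))


# Illustrative 50 kDa monomer from the text: monomer, dimer, trimer, tetramer.
ladder = oligomer_masses(50.0, 4)
spacing_ok = min_adjacent_spacing(ladder) >= 25.0
```

For states that differ by one monomeric unit, the spacing equals the monomer mass, so any monomer of at least 25 kDa satisfies the criterion.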
- the invention provided herein is the use of a citrate synthase protein, or functional fragment thereof, or composition according to any previous definition of the invention as a molecular weight standard.
- the invention provided herein is the use of a composition according to any previous definition of the invention in the calibration or standardisation of a biochemical technique.
- the invention provided herein is a method of calibrating a mass photometry device comprising the step of detecting the mass of particles in the composition of any previous definition of the invention on a mass photometry device.
- the invention provided herein is a nucleic acid sequence encoding a citrate synthase, or functional fragment thereof, of any of the previous definitions of the invention.
- the invention provided herein is a vector comprising the nucleic acid of a previous definition of the invention.
- the invention provided herein is a cell transformed with a vector comprising a nucleic acid according to a previous definition of the invention.
- the invention provided herein is a method of producing a citrate synthase protein, or functional fragment thereof, according to a previous definition of the invention comprising culturing a cell transformed with a vector according to a previous definition of the invention and carrying out at least one purification step.
- Figure 1 shows the overall profiles of Phylogenetic Trees 1 and 2 with general taxonomic descriptors for each branch.
- Figures 2a-f show expanded views of Tree 1 used for ancestral sequence reconstruction.
- Amino acid sequences of 418 extant citrate synthase genes (from bacteria, archaea and eukaryotes) were collected from the NCBI Reference Sequence Database and aligned via MUSCLE v3.8.31 (ref. 28). The maximum likelihood phylogeny was inferred from the multiple sequence alignment using RAxML v8.2.10 (ref. 29).
- Nodes corresponding to resurrected proteins are labelled CS1, CS5, CS6, and CS7.
- Labels a-j indicate how the branches of the phylogeny connect across the Figure.
- Figures 3a-g show expanded views of Tree 2 used for ancestral sequence reconstruction.
- Amino acid sequences of 418 extant citrate synthase genes were collected from the NCBI Reference Sequence Database and aligned via MUSCLE v3.8.31 (ref. 28). The maximum likelihood phylogeny was inferred from the multiple sequence alignment using RAxML v8.2.10 (ref. 29). Nodes corresponding to resurrected proteins are labelled CS2, CS3, and CS4. Labels a-i indicate how the branches of the phylogeny connect across the Figure.
- Figure 4 shows an image of an SDS-PAGE gel of CS1-His to CS7-His proteins.
- Figure 5 shows an image of a Native PAGE gel of CS1-His, CS2-His and CS3-His proteins.
- Figures 6a to d show mass photometry histograms taken for samples of CS1-His from four different purifications, showing batch-to-batch reproducibility.
- Figure 7 shows a comparison of proteins of known molecular weight against the mass obtained through mass photometry when using CS1-His as a calibrant.
- Figures 8a and b show a comparison of a mass photometry histogram using NativeMark™ versus CS1-His HMW.
- Citrate synthase is a ubiquitous enzyme involved in the citric acid cycle, where it catalyses the condensation of acetate (from acetyl-CoA) and oxaloacetate to form citrate. All citrate synthases likely originated from the same common ancestral protein. However, whilst eukaryotes (in their mitochondria) and Gram-positive bacteria use Type I citrate synthases, which form dimers, Gram-negative bacteria use Type II citrate synthases, which form hexamers. To better understand the factors leading to this divergence in structure, ancestral protein sequences were inferred using phylogenetic methods, and these proteins were expressed in and purified from E. coli.
- each of the resulting "ancestral" citrate synthase proteins exhibited a range of complexation states simultaneously in solution.
- Different ancestral citrate synthase proteins exhibit different distributions of complexation states, with most forming dimers, tetramers, hexamers, octamers and decamers, providing a set of detection targets spanning a range of molecular weights from about 85 kDa to about 430 kDa with regular molecular weight spacing between each state.
- the ancestral citrate synthase proteins are also highly stable, being able to survive multiple rounds of freezing and thawing and extended incubation under refrigeration with minimal change to their complexation properties.
- citrate synthase protein, or functional fragment thereof encoded by a nucleic acid sequence wherein the nucleic acid is an ancestral gene and wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- the invention may relate simply to the citrate synthase protein as defined herein.
- the term "functional fragment" as used herein means a portion or part of the protein sequence which retains the capability of forming multiple complexation states.
- the functional fragment may be produced in any appropriate way but can involve the deletion of sequences not critical for interaction with other protein sequences. At least 3%, 5%, 7%, 10%, 12%, 15% or 20%, or more of the protein sequence may be removed in the functional fragment.
- the term “functional fragment” does not refer to any enzymatic function in this instance.
- an ancestral gene refers to a common gene from which a family of genes is proposed to have descended.
- An ancestral gene may be derived from phylogenetic techniques such as ancestral gene restoration wherein an ancestral protein sequence is inferred (Harms & Thornton 2010, Harms & Thornton 2013, Selberg et al. 2021) as described above and a DNA molecule coding for that protein is synthesized.
- the ancestral gene is typically an artificial DNA sequence inferred from an artificial amino acid sequence.
- the amino acid sequence may be inferred using ancestral sequence reconstruction as described below.
- citrate synthase protein originating from an ancestral gene may be referred to as "ancestral citrate synthase" for brevity.
- the present invention may utilise ancestral sequence reconstruction as starting point for engineering citrate synthase proteins that could be used as mass calibration standards.
- the ancestral gene may be inferred by ancestral sequence reconstruction.
- ancestral sequence reconstruction uses the immense and ever-expanding amount of sequence data available in sequence databases to create an alignment of present-day amino acid sequences of a citrate synthase protein.
- Phylogenetic and statistical analyses under appropriate models of evolution are then used to define amino acid sequences at the branch points or "nodes" of the phylogenetic tree generated in the ancestral sequence reconstruction.
- the amino acid sequences at the tree nodes are candidates for ancestral amino acid sequences of the citrate synthase, which have given rise to the amino acid sequences of the extant citrate synthase proteins.
- a phylogenetic tree also known as a phylogeny, is a diagram that depicts the lines of evolutionary descent of different genes from a common ancestor.
- the citrate synthase proteins resulting from production of the ancestral sequences reconstructed according to the invention are found to have unique properties not observed in extant citrate synthase proteins. Accordingly, these ancestral sequences benefit from inherently different self-interaction specificities.
- the ancestral sequences have been found to form different multimeric forms simultaneously in a solution.
- the ancestral sequences may simultaneously form two or more, preferably three or more of dimers, trimers, tetramers, pentamers, hexamers, and even higher multimeric forms, including octamers and decamers.
- MUSCLE (Multiple Sequence Comparison by Log-Expectation), as described in Edgar, R.C., BMC Bioinformatics 5, 113 (2004), doi.org/10.1186/1471-2105-5-113, was used for sequence alignment.
- Other available alignment software includes Clustal, T-Coffee, Beauty-Phy or Phylo.
- any one of the various appropriate models for tree reconstruction can be utilised.
- There are several widely used methods for estimating phylogenetic trees (Neighbour Joining, Maximum Parsimony, Bayesian Inference, and Maximum Likelihood).
- RAxML (Randomized Axelerated Maximum Likelihood), as described in Stamatakis A., Bioinformatics
- PhyML, as described in Guindon S. & Gascuel O., Syst. Biol
- Other available software to infer phylogenetic trees includes PhyML, MrBayes, FastTree or IQ-Tree.
- complexation state refers to a protein complex comprising a specific number of monomeric units of the same protein.
- a complexation state may also be described as a macromolecular complex or multimer; these are generally formed of monomers non-covalently bound to each other.
- non-covalent interactions between the hydrophobic and hydrophilic regions on the monomer units help to stabilise the quaternary structure of the complexation state.
- In multimers it is usual that non-covalent interactions between the monomeric units arrange the monomers into a particular structure. Usually, the monomeric units are therefore assembled in a regular, predictable structure.
- the number of monomeric units in a complexation state may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or greater than 10.
- the complexation state may be a monomer (1), dimer (2), trimer (3), tetramer (4), pentamer (5), hexamer (6) and larger multimers (usually 7+).
- the term may therefore be considered to comprise both monomers and homo-oligomers, wherein a monomer is a single protein with the capability of forming complexes and a homo-oligomer is a complex comprising a specific number of the same monomer.
- the monomeric units are present in solution, and form various complexation states, which are discrete from each other and stable such that environmental conditions such as freeze/thaw cycles do not impact the proportion of each multimer present.
- the solution may be any appropriate solution for a protein.
- the solution may contain any appropriate buffers and/or salts, for example.
- the concentration of the protein in solution may be 5 nM to 1000 nM (1 µM).
- an agglomeration refers to the clustering of proteins that are not coherently structured and are based on random interactions.
- the ancestral citrate synthase sequences identified by the present inventors have the surprising ability to form more than one complexation state when present in solution. Indeed, the inventors have observed that the ancestral citrate synthase proteins were able to form more than 3 different complexation states simultaneously in solution. In Figures 6a to d, where the number of each citrate synthase protein complex was counted via mass photometry, at least 5 discrete complexation states were identifiable. Figure 6d, for example, has 7 identifiable complexation states. Further, the monomeric unit appears to be capable of forming multiple complexation states, without one complexation state being overly predominant (greater than 50% (by number) of the total citrate synthase protein complexes present).
- Mass photometry has the ability to detect the mass of single particles in solution without the need for labels. In general, it is usual to detect the mass of all particles present (for example on the coverslip) and count the number of particles assigned to each mass; for example, see Figures 6a-6d, where the "count" of each particle is given. Thus, the total number of particles present is known, as well as the number of counts for each species of particle, in this case each protein complex. It will be understood that the monomer is included in the term "protein complex", as it is one possible species formed by the citrate synthase. Thus, to determine the percentage of each citrate synthase protein complex, the number of that complex present is determined, together with the total number of complexes. Thus, the percentages listed here are based on number.
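The percentage-by-number calculation described above reduces to dividing each species count by the total count. A minimal sketch follows; the counts are hypothetical and are not taken from Figures 6a-6d.

```python
def percent_by_number(counts):
    """Convert particle counts per complexation state into percentages by number."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no particles counted")
    return {state: 100.0 * n / total for state, n in counts.items()}


# Hypothetical particle counts per complexation state, for illustration only.
counts = {"dimer": 300, "tetramer": 250, "hexamer": 200, "octamer": 150, "decamer": 100}
percentages = percent_by_number(counts)
```

Because the result is based on particle number, a large complex contributes the same single count as a small one, regardless of its mass.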
- the ancestral citrate synthase can form at least 3 complexation states simultaneously, and each of those three complexation states provides at least 10%, 15%, 20%, 21%, 22%, 23%, 24% or 25% (by number) of the total citrate synthase protein complexes present in the solution.
- the 3 major (or most common) complexation states are each present between 15 and 40% (by number), optionally 18% to 35% (by number) of the total citrate synthase protein complexes in solution.
- In a solution of the citrate synthase of the invention, different complexation states are present which form at the same time, for example monomers, dimers, trimers and tetramers. Once formed, these complexation states are stable and discrete, i.e., there is little, if any, dissociation and reassociation.
- At least 3 complexation states of the citrate synthase of the invention are each present at at least 10%, 15%, 20%, 21%, 22%, 23%, 24% or 25% (by number) of the total citrate synthase protein complexes present in the solution.
- each of the three most common/most abundant complexation states is present at a sufficient level to permit accurate detection and thus calibration of the apparatus.
- the citrate synthase protein, or functional fragment thereof, of the present invention may preferably have an amino acid sequence at least 75%, 80%, 85%, 90% or 95% identical to a sequence selected from SEQ ID NOs: 1-7 wherein the citrate synthase protein, or functional fragment thereof, is also capable of forming at least 3 complexation states simultaneously.
- the citrate synthase protein, or functional fragment thereof, of the present invention may preferably have an amino acid sequence with at least 75%, 80%, 85%, 90% or 95% sequence similarity to a sequence selected from SEQ ID NOs: 1-7 wherein the citrate synthase protein, or functional fragment thereof, is also capable of forming at least 3 complexation states simultaneously.
- Sequence identity refers to the comparison of aligned sequences. If both sequences contain the same amino acid in the same position after alignment, then the sequences are considered identical at this position. Tools to calculate sequence identity are readily available and widely utilised in many disciplines of the life sciences.
- sequence identity refers to the identity output from the publicly available Emboss Needle program https://www.ebi.ac.uk/Tools/psa/emboss_needle/ (Needleman and Wunsch (https://www.sciencedirect.com/science/article/abs/pii/0022283670900574?via%3Dihub)).
- sequence similarity is related to sequence identity, however it makes allowance for the similar properties of some amino acids. For instance both glutamate and aspartate have similar chemical, structural and electrostatic properties. Therefore, if one sequence contains glutamate in a position where the other contains aspartate after alignment the sequences are considered similar at this position, but not identical. Similar relationships between other amino acids are taken into account.
- sequence similarity refers to the similarity output from the publicly available Emboss Needle program https://www.ebi.ac.uk/Tools/psa/emboss_needle/ (Needleman and Wunsch)
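The distinction between identity and similarity drawn above can be illustrated on a pair of sequences that have already been aligned. This is a simplified sketch and not the Emboss Needle implementation: the grouping of similar residues is a hypothetical stand-in for a substitution matrix, the toy sequences are invented, and percentages are assumed to be taken over the full alignment length including gap columns.

```python
# Hypothetical groups of residues treated as "similar" (assumption, not a real matrix).
SIMILAR_GROUPS = [set("DE"), set("KR"), set("ILVM"), set("FYW"), set("ST"), set("NQ")]


def residues_similar(a, b):
    """True if two residues are identical or fall in the same illustrative group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two pre-aligned sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns count towards the length but never match
        if a == b:
            identical += 1
        if residues_similar(a, b):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy aligned sequences: E/D, T/S and V/I are similar but not identical.
identity, similarity = identity_and_similarity("MKTE-LV", "MKSDALI")
```

Here three of the seven alignment columns are identical and a further three are similar, so similarity is always at least as high as identity.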
- the present invention relates to a citrate synthase protein, or functional fragment thereof, comprising a sequence selected from any of SEQ ID NOs: 1-7. Further, the citrate synthase protein may comprise or consist of the sequence of SEQ ID No. 1.
- the citrate synthase protein, or functional fragment thereof, of the present invention may comprise at least 1 ancestral mutation.
- the citrate synthase protein, or functional fragment thereof, of the present invention may comprise at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 ancestral mutations.
- an ancestral mutation refers to an alteration of a first amino acid sequence of a protein such as citrate synthase, wherein the alteration to the first amino acid sequence increases the similarity and/or identity of the first amino acid sequence to a second amino acid sequence of the same protein wherein the second amino acid sequence is an ancestral sequence which has been generated by phylogenetic methods such as ancestral sequence reconstruction. In such techniques an ancestral sequence is inferred by comparison of the known sequences of extant organisms together with the knowledge of their evolutionary history as described above.
- E. coli citrate synthase (Accession number WP_166726827) is given below (SEQ ID No. 29) as aligned to SEQ ID No. 1, which is an ancestral sequence inferred by comparison of the known sequences of extant organisms (including E. coli, though this is not a requirement) together with the knowledge of their evolutionary history. These sequences have an identity of 37.1% and a similarity of 52.7% as determined by Emboss Needle.
- Examples of ancestral mutations include: a W396I mutation in SEQ ID No. 29, which increases the similarity with SEQ ID No. 1; removal of any/all of residues 297-299 from SEQ ID No. 29 to increase the overlap with SEQ ID No. 1; and addition of suitable residues between N349 and D350 of SEQ ID No. 29 to increase the overlap, and optionally the identity or similarity, with SEQ ID No. 1.
- the citrate synthase protein, or functional fragment thereof of the present invention may preferably have at least 2, at least 3, at least 4, at least 5, at least 10, at least 15 or at least 20 ancestral mutations.
- the skilled person can readily determine by trial and error the type, position and number of ancestral mutations required to achieve a citrate synthase protein which is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
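The substitution notation used above (for example W396I: residue W at position 396 replaced by I) can be applied programmatically, and its effect on identity to an ancestral sequence checked. The sketch below uses invented five-residue toy sequences, not SEQ ID No. 1 or SEQ ID No. 29, and the helper names are hypothetical.

```python
import re


def apply_substitution(sequence, mutation):
    """Apply a point substitution written as <old><1-based position><new>, e.g. 'W3I'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation format: {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError("position outside the sequence")
    if sequence[position - 1] != old:
        raise ValueError(f"expected {old} at position {position}")
    return sequence[: position - 1] + new + sequence[position:]


def count_identical(seq_a, seq_b):
    """Number of positions at which two equal-length sequences carry the same residue."""
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Toy sequences for illustration only.
extant = "MKWLD"
ancestral = "MKILE"
mutant = apply_substitution(extant, "W3I")

before = count_identical(extant, ancestral)
after = count_identical(mutant, ancestral)
```

A substitution qualifies as an ancestral mutation in the sense defined above when `after` exceeds `before`, that is, when it moves the first sequence closer to the ancestral one.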
- any of the above referenced citrate synthase proteins, or functional fragments thereof, may be joined to another amino acid sequence and thus may form part of a fusion protein.
- any of the citrate synthase proteins provided herein may further comprise a purification tag which may be a His-tag, Strep-tag or any other suitable affinity tag.
- the purification tag may be positioned at the N terminus or the C terminus of the citrate synthase protein.
- Such purification tags simplify the purification of the target protein.
- the spacer may be 1 to 5 amino acids.
- the spacer may increase the flexibility of the purification tag in solution and thus increase its availability for binding its affinity partner.
- the spacer may comprise a target for cleavage such that the purification tag can be removed in a post-processing step.
- a citrate synthase protein, or functional fragment thereof comprising a sequence comprising greater than 75%, 80%, 85%, 90% or 95% sequence identity to a sequence selected from SEQ ID No. 15-21 wherein the citrate synthase protein, or functional fragment thereof is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
- a citrate synthase protein, or functional fragment thereof comprising a sequence comprising greater than 75%, 80%, 85%, 90% or 95% sequence similarity to a sequence selected from SEQ ID No. 15-21 wherein the citrate synthase protein, or functional fragment thereof, is capable of forming at least 2, at least 3, at least 4 or at least 5 complexation states simultaneously.
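As a non-limiting sketch, a percentage identity between a candidate sequence and a reference sequence can be computed from a pairwise global alignment as shown below. The sketch assumes that the two sequences have already been aligned (for example with EMBOSS Needle) and that identity is expressed as identical positions divided by the full alignment length including gaps, which is our understanding of how Needle reports it. Similarity would additionally require a substitution matrix and is not shown. The function names and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def exceeds_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the identity is greater than the threshold (e.g. 75, 80, 85, 90 or 95)."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Hypothetical toy alignment: five identical columns out of six
print(round(percent_identity("MKT-LV", "MKTALV"), 1))
print(exceeds_identity_threshold("MKT-LV", "MKTALV", 75.0))
```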
- a citrate synthase protein, or functional fragment thereof comprising a sequence selected from any of SEQ ID 15-21.
- the citrate synthase protein, or functional fragment thereof comprises the sequence SEQ. ID No. 15.
- the citrate synthase protein, or functional fragment thereof is not expressed by an extant organism.
- Extant organisms means organisms living today and includes any known living organism but does not include organisms that have been genetically modified/transformed/edited for the purpose of expressing the citrate synthase proteins, or functional fragments thereof, described herein.
- the citrate synthase proteins described herein are inferred ancestral proteins and are not believed to be produced by extant organisms.
- the citrate synthase proteins provided herein have an artificially constructed amino acid sequence; they are not known to currently exist in nature.
- the citrate synthase protein, or functional fragment thereof may be an artificial citrate synthase protein, or functional fragment thereof.
- artificial means produced from a non-natural gene sequence rather than from a gene sequence found in an extant organism.
- citrate synthase proteins, or functional fragments thereof, of the present invention can be readily screened for their capacity to form multiple complexation states using mass photometry or other native mass analysis techniques.
- composition comprising a citrate synthase protein, or functional fragment thereof, according to any description of the invention wherein the concentration of the citrate synthase protein, or functional fragment thereof, is at least 5 nM, preferably the concentration is between 5 nM and 100 µM.
- concentration of the protein in solution may be 5 nM to 1000 nM (1 µM), or 1 µM to 100 µM, and any value within these ranges.
- concentration of the proteins in the solution may be varied according to the parameters of the apparatus or technique it is used to calibrate.
- the composition may further comprise: buffering reagents, suitable to maintain a pH between 5 and 10 during routine use; and/or any appropriate salts which may optionally be selected from a list including sodium chloride and potassium chloride; wherein the total salt concentration is at least 50 mM, preferably the salt concentration is in the range 50 to 500 mM.
- Mass photometers require regular calibration, usually at least once per day or measuring session and sometimes even more frequently, since the interferometric contrast fluctuates depending on temperature, runtime of the laser and other changing parameters.
- the complexation states of the citrate synthase protein, or functional fragment thereof, are dependent on the structure of the protein which determines their protein-protein interactions.
- the citrate synthase protein is therefore not suitable as a molecular weight standard in denaturing techniques which reduce proteins to their primary structure, such as sodium dodecyl sulphate PAGE (SDS-PAGE). If the complexation states of the protein were fixed by, for example, chemical cross-linking, then they would be resistant to dissociation and the protein could be applied to a wider range of techniques.
- compositions according to any description of the invention in the calibration or standardisation of apparatus and/or a biochemical technique.
- the composition may be used in the calibration of a mass photometry device.
- the composition may be used as a molecular weight standard in Native PAGE.
- a method of calibrating a mass photometry device comprising detecting the mass of particles in the composition as described herein.
- the detected contrast values are converted to molecular masses using the instrument's software, which is calibrated using molecular weight standards (as described herein). Calibration should be regularly validated by measuring the composition of the present invention and verifying the molecular masses of the protein complexes, which are known. Thus, the molecular masses of the calibrant/molecular weight standard are known, and the instrument's software can be adjusted based on the detected masses of the calibrant/molecular weight standard. As good laboratory practice, it is recommended to include the calibration measurement with every series of experiments.
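Purely as an illustrative sketch, the conversion of contrast values to molecular masses can be modelled as a straight-line fit of the known masses of the calibrant's complexation states against their measured contrasts. The linear model, the function names and all numerical values below are assumptions made for illustration. The instrument's own software may proceed differently, and the masses shown are not those of any sequence disclosed herein.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("contrast values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def contrast_to_mass(contrast, slope, intercept):
    """Convert a measured contrast into a mass using the fitted calibration."""
    return slope * contrast + intercept


# Hypothetical calibrant peaks: contrast (dimensionless) against known mass (kDa)
calibrant_contrasts = [-0.004, -0.008, -0.012, -0.016]
calibrant_masses_kda = [100.0, 200.0, 300.0, 400.0]

slope, intercept = fit_line(calibrant_contrasts, calibrant_masses_kda)
print(contrast_to_mass(-0.010, slope, intercept))  # mass estimate for an unknown particle
```

Because the calibrant presents several complexation states in one measurement, a single acquisition supplies all the points needed for the fit.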
- nucleic acid sequence encoding a citrate synthase of any of the aspects above.
- the citrate synthase protein may be encoded by an ancestral gene.
- the nucleic acid is codon optimised for production of the protein in an organism, preferably the sequence has been codon optimised for expression in E. coli.
- Codon optimization refers to the redundant alteration of the nucleic acid encoding a protein to take advantage of the known tRNA distribution of an organism and thereby enhance expression of the protein.
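The principle of codon optimisation can be sketched, without limitation, as the replacement of each codon by a synonymous codon preferred by the host. The table below is deliberately partial, and the choice of the first codon in each list as the "preferred" one is an assumption made for illustration. A practical workflow would use a complete codon-usage table for the chosen expression host. The function names are hypothetical.

```python
# Partial table of synonymous codons (standard genetic code).
# The first codon in each list is treated as the preferred one: an assumption.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "L": ["CTG", "CTT", "CTC", "CTA", "TTA", "TTG"],
    "H": ["CAT", "CAC"],
    "A": ["GCG", "GCC", "GCA", "GCT"],
    "*": ["TAA", "TAG", "TGA"],
}

# Reverse lookup: codon -> amino acid
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons}


def split_codons(dna: str):
    """Split a coding sequence into codons, checking the reading frame."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(dna.upper()))


def codon_optimise(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(SYNONYMOUS_CODONS[aa][0] for aa in translate(dna))


original = "ATGAAGTTACACGCTTGA"
optimised = codon_optimise(original)
print(optimised)
print(translate(original) == translate(optimised))  # the encoded protein is unchanged
```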
- nucleic acid as described here.
- a vector as used herein refers to any particle used as a vehicle to artificially carry a foreign nucleic sequence into a cell, where it can be replicated and/or expressed.
- Vectors include but are not limited to plasmids, cosmids, viruses and phages.
- the vector is a plasmid, preferably the vector is a pET plasmid.
- a cell transformed with a vector comprising a nucleic acid as described herein is an E. coli cell.
- the transformed E. coli cell expresses a citrate synthase as described herein.
- the invention comprises a method of producing a citrate synthase protein as described herein comprising culturing a cell as described herein and carrying out at least one purification step.
- the purification step comprises affinity chromatography.
- the purification step comprises size exclusion chromatography (SEC).
- Whilst the aspects above use the term "citrate synthase protein" to describe the proteins of the invention, there is no requirement that the protein have any enzymatic activity in order to meet this definition.
- the protein merely needs to comprise a sequence originally deriving from, or be similar to, a citrate synthase protein as provided herein; and may also have been modified in any of the above indicated ways.
- the proteins provided herein may be isolated or purified.
- ancestral proteins and ancestral genes are a product of algorithmic analysis which determines the probability of each amino acid being in any position within the protein sequence for a particular node of the phylogenetic tree.
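As an illustrative sketch only, the selection of a high-probability ancestral sequence from per-site probabilities can be expressed as choosing, at each position, the amino acid with the highest probability. The data structure (one dictionary of amino acid probabilities per position), the threshold value and the numbers below are hypothetical. In practice such probabilities would come from an ancestral sequence reconstruction program.

```python
def most_probable_sequence(site_posteriors):
    """Return the sequence formed by the most probable residue at every site."""
    residues = []
    for site in site_posteriors:
        if not site:
            raise ValueError("every site needs at least one candidate residue")
        residues.append(max(site, key=site.get))
    return "".join(residues)


def ambiguous_sites(site_posteriors, threshold=0.7):
    """Zero-based indices of sites whose best residue falls below the threshold."""
    return [i for i, site in enumerate(site_posteriors) if max(site.values()) < threshold]


# Hypothetical probabilities for three positions of one node of a tree
node_posteriors = [
    {"M": 0.99, "L": 0.01},
    {"K": 0.60, "R": 0.40},
    {"I": 0.55, "W": 0.45},
]
print(most_probable_sequence(node_posteriors))
print(ambiguous_sites(node_posteriors))
```

Sites flagged as ambiguous are natural candidates for alternative variants when several reconstructed sequences are to be tested.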
- the ancestral sequences disclosed herein are high probability solutions to the ancestral sequence reconstruction algorithm. We will likely never know for certain the actual sequence of the ancestor proteins or the genes encoding them since too much time has passed for biological samples to survive intact. This in no way diminishes the utility of the invention. Examples:
- Example 1 Ancestral protein/gene reconstruction of citrate synthase proteins:
- Table 1 Sequence names and corresponding SEQ. ID No. italics indicates the nucleic acid sequence encoding the indicated protein
- Table 2 RefSeq Accession codes for extant citrate synthase proteins used to construct Trees 1 and 2 as shown in Figures 1-3. Plain text accession codes (first three columns) are common to both Trees. Accession codes in bold are found only in Tree 1 ( Figure 2). Accession codes in italics are found only in Tree 2 ( Figure 3).
- Table 3 Sequence identity and similarity of CS2-7 when compared to CS1. Results generated using EMBOSS NEEDLE program https://www.ebi.ac.uk/Tools/psa/emboss_needle/
- Example 2 Expression and purification of citrate synthase proteins: pET vectors comprising SEQ ID Nos. 22-28 were transformed into BL21(DE3) cells, which were cultured at 30 °C with shaking in 2 L baffled flasks containing 500 mL LB medium supplemented with 6.25 g lactose and 100 µg/mL carbenicillin. Cells were cultured for approximately 16 hours, after which they were harvested by centrifugation for 15 min at 4500 × g.
- Cells were resuspended in 20 mM Tris:HCl, 300 mM NaCl, 20 mM imidazole pH 8.0 buffer (Buffer A) at a ratio of 30 mL per litre of original cell culture. Cells were lysed using a microfluidizer with 3 cycles at 15000 psi. Lysed cells were ultracentrifuged at 30000 × g for 30 min to remove membranes and cell debris. The clarified lysate was then filtered through a 0.45 micron syringe filter.
- the lysate was loaded onto a Nickel-NTA prepacked column pre-equilibrated with Buffer A.
- the bound protein was washed with 7 column volumes of Buffer A and 7 column volumes of 20 mM Tris:HCl, 300 mM NaCl, 68 mM imidazole pH 8.0 (Buffer B).
- Bound protein was then eluted in 20 mM Tris:HCl, 300 mM NaCl, 500 mM imidazole pH 8.0 (Buffer C).
- Eluted protein was exchanged into 20 mM Tris:HCl, 200 mM NaCl, pH 7.5 (Buffer X) or phosphate buffered saline (PBS) using a PD-10 column.
- Protein is sufficiently pure to use as a molecular weight standard/mass photometry calibrant and can be frozen in liquid nitrogen and stored for extended periods at -20 °C. SDS-PAGE (Fig. 4) and Native PAGE (Fig. 5) were performed on the samples.
- the resulting protein can be enriched for higher molecular weight complexation states using SEC.
- This step can be performed on the protein after elution from the Nickel-NTA column, or after the step of exchanging into Buffer X.
- An analytical SEC column (Enrich 650, Biorad) was equilibrated with Buffer X. The protein sample was concentrated to 5-25 mg/mL protein and 250 µL was loaded onto the column. The sample was eluted at 1 mL/min.
- the sample eluted as a first, broader peak encompassing higher molecular weight (HMW) complexation states (trimers and above) and a second, sharp peak attributed to lower molecular weight (LMW) complexation states (dimers and below); however, the two peaks could not be completely resolved from each other.
- Reusable silicone gaskets (CultureWell™, CW-50R-1.0, 50-3mm diameter x 1 mm depth) were set up on a cleaned microscope cover slip (1.5 H, 24 x 60 mm, Carl Roth) and mounted on the stage of the mass photometer (Refeyn Ltd., UK) using immersion oil (Immersol 518F, Zeiss).
- the gasket was filled with 19 µl buffer (PBS or Buffer C) to focus the instrument.
- the protein was prediluted to a concentration of approx. 400 nM in the same buffer. Then 1 µl of prediluted protein solution was added to the buffer droplet and mixed thoroughly. Final concentration of the proteins during measurement was 20 nM. Data was acquired for 60 s at 100 frames per second.
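The final concentration stated above follows from simple dilution arithmetic, which can be sketched as below. The function name is hypothetical, and the only assumption is that volumes are additive on mixing.

```python
def final_concentration(stock_conc_nm: float, sample_volume_ul: float,
                        buffer_volume_ul: float) -> float:
    """Concentration after adding a sample volume to a buffer droplet."""
    total_volume = sample_volume_ul + buffer_volume_ul
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc_nm * sample_volume_ul / total_volume


# 1 µl of approx. 400 nM protein added to the 19 µl buffer droplet
print(final_concentration(400.0, 1.0, 19.0))  # concentration in nM
```

The same function gives the predilution needed for another target concentration: for example, a 200 nM predilution would give 10 nM in the droplet.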
- CS1-His was used to calibrate a Two MP mass photometer as described above and proteins of known molecular weight were analysed by mass photometry. Sample concentration was 10-40 nM depending on the protein being analysed.
- Figure 7 shows the proteins' masses as determined by mass photometry when calibrated using CS1-His, next to the masses expected from the proteins' primary sequences. The results show that calibration with CS1-His leads to calculated masses that are within 2% of the expected masses, meaning the calibration is effective.
- NativeMark™ unstained protein standard (Invitrogen™) is designed for use as a standard for native PAGE. NativeMark™ is advertised as containing 8 proteins (1236, 1048, 720, 480, 242, 146, 66 and 20 kDa).
- NativeMark™ was compared against CS1-His HMW for the usability and accuracy of the calibration.
- NativeMark™ was diluted 250-fold in PBS whilst CS1-His HMW was diluted to 40 nM (monomer concentration).
- 10 µL of the diluted protein was added to the well and a 1 minute movie was recorded.
- the data was then analysed using Discover MP software (Refeyn™). The comparison between NativeMark™ and CS1-His HMW was performed on the same day.
- Figures 8a and 8b show a comparison of the results for the two standards.
- the peaks obtained with NativeMark™ are heavily weighted towards a single species which accounts for 67% of the binding events.
- the low intensity peaks have a lower signal to noise ratio and are less accurate.
- the error for the mass/ratiometric contrast gradient for NativeMark™ is 5.8%, whereas for the data set shown for CS1-His HMW the error is only 0.7%.
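The description does not state how the error of the mass/contrast gradient is calculated. One common choice, shown here purely as an assumption, is the standard error of the slope of a least-squares line through the calibration points, expressed as a percentage of the slope. The function name and the example values are hypothetical.

```python
import math


def slope_relative_error_percent(contrasts, masses):
    """Standard error of the fitted slope as a percentage of the slope."""
    n = len(contrasts)
    if n < 3 or n != len(masses):
        raise ValueError("need at least three paired points")
    mean_x = sum(contrasts) / n
    mean_y = sum(masses) / n
    sxx = sum((x - mean_x) ** 2 for x in contrasts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(contrasts, masses))
    if sxx == 0 or sxy == 0:
        raise ValueError("cannot estimate a non-zero slope from these points")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(contrasts, masses))
    slope_std_error = math.sqrt(rss / (n - 2) / sxx)
    return abs(slope_std_error / slope) * 100.0


# Hypothetical points: a noise-free set and a slightly scattered set
print(slope_relative_error_percent([1, 2, 3, 4], [10, 20, 30, 40]))
print(slope_relative_error_percent([1, 2, 3, 4], [10, 21, 29, 40]))
```

Under this assumption, well-populated peaks spread evenly across the mass range reduce the scatter of the fitted points and therefore the relative error of the gradient.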
- a more concentrated sample of NativeMark™ could be used to achieve higher intensity peaks, and thus more accurate data, for the higher molecular weight species but this adds complexity, time and cost.
- NativeMark™ is not designed and formulated for use as a mass photometry standard, instead being designed to achieve roughly similar staining intensity when stained after running in native PAGE.
- a cell comprising the nucleic acid sequence or vector according to any of clause K, clause N or clause O.
- a method of producing a citrate synthase protein of any of clauses A-I or clause L comprising culturing a bacterium according to clause R and carrying out at least one purification step.
- composition comprising a citrate synthase protein according to any of clauses A-I or clause L wherein the concentration of the citrate synthase protein is at least 5 nM, preferably the concentration is between 5 nM and 100 µM.
- composition according to clause U wherein the composition further comprises at least one component selected from a buffering reagent and a salt.
- a method of calibrating a mass photometry device comprising the step of assaying the composition of clause W or clause X.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB2307091.5A GB202307091D0 (en) | 2023-05-12 | 2023-05-12 | Calibrant protein |
| GB2307091.5 | 2023-05-12 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024235913A1 (fr) | 2024-11-21 |
Family
ID=86872367
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2024/063101 Pending WO2024235913A1 (fr) | Calibrant protein | 2023-05-12 | 2024-05-13 |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB202307091D0 (fr) |
| WO (1) | WO2024235913A1 (fr) |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018011591A1 (fr) | 2016-07-13 | 2018-01-18 | Oxford University Innovation Limited | Interferometric scattering microscopy |
-
2023
- 2023-05-12 GB GBGB2307091.5A patent/GB202307091D0/en not_active Ceased
-
2024
- 2024-05-13 WO PCT/EP2024/063101 patent/WO2024235913A1/fr active Pending
Non-Patent Citations (11)
| Title |
|---|
| DATABASE UniProt [online] 31 May 2011 (2011-05-31), "RecName: Full=Citrate synthase {ECO:0000256|PIRNR:PIRNR001369};", XP093185680, retrieved from EBI accession no. UNIPROT:F2NMC3 Database accession no. F2NMC3 * |
| DATABASE UniProt [online] 8 March 2011 (2011-03-08), "RecName: Full=Citrate synthase {ECO:0000256|PIRNR:PIRNR001369};", XP093185669, retrieved from EBI accession no. UNIPROT:E6SLW3 Database accession no. E6SLW3 * |
| GUINDON S, GASCUEL O: "PhyML: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood", SYST. BIOL., vol. 52, no. 5, 2003, pages 696 - 704, XP002525185, DOI: 10.1080/10635150390235520 |
| HARMS, M.J., THORNTON, J.W.: "Analyzing protein structure and function using ancestral gene reconstruction", CURR. OPIN. STRUCT. BIOL., vol. 20, 2010, pages 360 - 366, XP027067347, Retrieved from the Internet <URL:https://doi.org/10.1016/j.sbi.2010.03.005> DOI: 10.1016/j.sbi.2010.03.005 |
| HARMS, M., THORNTON, J.: "Evolutionary biochemistry: revealing the historical and physical causes of protein properties", NAT REV GENET, vol. 14, 2013, pages 559 - 571, Retrieved from the Internet <URL:https://doi.org/10.1038/nrg3540> |
| NEEDLEMAN, S.B., WUNSCH, C.D.: "A general method applicable to the search for similarities in the amino acid sequence of two proteins", JMB, vol. 48, 1970, pages 443 - 453, XP024011703, Retrieved from the Internet <URL:https://doi.org/10.1016/0022-2836(70)90057-4> DOI: 10.1016/0022-2836(70)90057-4 |
| NN: "Instructions: Supersignal Molecular weight protein ladder 84785", 31 December 2011 (2011-12-31), pages 1 - 3, XP093185655, Retrieved from the Internet <URL:https://www.thermofisher.com/document-connect/document-connect.html?url=https://assets.thermofisher.com/TFS-Assets%2FLSG%2Fmanuals%2FMAN0011723_SupSig_Molec_Wght_Protein_Lad_UG.pdf> [retrieved on 20240715] * |
| SCHMIDTMANN ET AL.: "Redox Regulation of Arabidopsis Mitochondrial Citrate Synthase", MOL. PLANT, vol. 7, 2014, pages 156 - 159 |
| SELBERG, A.G.A., GAUCHER, E.A., LIBERLES, D.A.: "Ancestral Sequence Reconstruction: From Chemical Paleogenetics to Maximum Likelihood Algorithms and Beyond", J MOL EVOL, vol. 89, 2021, pages 157 - 164, XP037403769, Retrieved from the Internet <URL:https://doi.org/10.1007/s00239-021-09993-1> DOI: 10.1007/s00239-021-09993-1 |
| STAMATAKIS A: "Randomized Axelerated Maximum Likelihood", BIOINFORMATICS, vol. 30, no. 9, 2014, pages 1312 - 3 |
| YOUNG G, HUNDT N, COLE D: "Quantitative mass imaging of single biological macromolecules", SCIENCE, vol. 360, 2018, pages 423 - 427, XP055807752, DOI: 10.1126/science.aar5839 |
Also Published As
| Publication number | Publication date |
|---|---|
| GB202307091D0 (en) | 2023-06-28 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24726207; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 11202507038S; Country of ref document: SG |
| | WWP | Wipo information: published in national office | Ref document number: 11202507038S; Country of ref document: SG |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024726207; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2024726207; Country of ref document: EP; Effective date: 20251212 |