
WO2005114458A1 - Computational probing of proteins to identify binding sites - Google Patents

Computational probing of proteins to identify binding sites

Info

Publication number
WO2005114458A1
WO2005114458A1 PCT/US2004/014069 US2004014069W WO2005114458A1 WO 2005114458 A1 WO2005114458 A1 WO 2005114458A1 US 2004014069 W US2004014069 W US 2004014069W WO 2005114458 A1 WO2005114458 A1 WO 2005114458A1
Authority
WO
WIPO (PCT)
Prior art keywords
sites
macromolecule
binding sites
binding
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2004/014069
Other languages
English (en)
Inventor
Frank Guarnieri
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sarnoff Corp
Original Assignee
Sarnoff Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sarnoff Corp filed Critical Sarnoff Corp
Priority to JP2007511330A priority Critical patent/JP2007536618A/ja
Priority to PCT/US2004/014069 priority patent/WO2005114458A1/fr
Priority to EP04822039A priority patent/EP1751669A4/fr
Publication of WO2005114458A1 publication Critical patent/WO2005114458A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 - Molecular design, e.g. of drugs
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 - Drug targeting using structural data; Docking or binding prediction
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30 - Detection of binding sites or motifs
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 - ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 - Mutagenesis

Definitions

  • the present invention relates to methods of identifying binding sites on proteins, methods for identifying classes of compounds suitable for binding a protein, and methods of conducting experiments to identify compounds that interact with a protein to affect a biological process.
  • Determinations of protein structures have to date been conducted by isolating crystals of the protein of interest, and analyzing structure by X-ray crystallography.
  • the protein has been co-crystallized with a heavy metal component, or subjected to multiple co-crystallizations, with the heavy metal providing a reference for solving the crystallographic data.
  • the method does not require or typically use information on the function of the macromolecule, as the method avoids subjective biases and instead depends purely on physical parameters. Further, the method can be refined further to narrow the possible choices of binding sites and identify the functionalities, i.e., organic fragments or "ORFs," that effectively interact with the binding site(s).
  • the data obtained for ORFs further identifies the orientations of the functionalities useful in a candidate binding agent, thereby providing a tool for searching chemical databases to identify candidate binding agents.
  • where the methods described herein identify more than one potential binding site, the data generated through these methods can be used to energetically rank the binding sites, and thereby quantitatively determine which site has the potential to bind molecules more strongly.
  • the computational method described here generates maps of binding site preferences that are nearly identical to maps produced by compiling data generated by traditional methods, but with one important difference: the experimentally produced data took many years to produce, while the data produced as described herein can be produced in no more than a few weeks.
  • the invention provides an important development in unbiased simulation methods for predicting the character of agents that bind to biological macromolecules to affect the function of the macromolecules.
  • a method of identifying binding sites on a macromolecule comprising: (a) for at least one organic fragment (ORF), conducting, at separate values of parameter B, two or more simulated annealing of chemical potential calculations using the ORF as the inserted solvent; and (b) comparing converged solutions from step (a) to identify first locations at which the relevant ORF is strongly bound, thereby identifying candidate sites for binding ligand molecules.
  • the method further comprises: (c) identifying clusters of sites that strongly bind an ORF.
  • the method further comprises: (d) conducting steps (a) and (b) for each of two or more ORFs and identifying clusters where two or more distinct ORFs bind.
  • a cluster that binds three or more distinct ORFs is identified.
  • the method can identify further functionalities that contribute to the binding of bioactive agents by reducing the binding stringency in the vicinity of a cluster to further identify elements that would contribute to the binding of a bioactive agent.
  • the method further comprises: (e) conducting, at separate values of a measure of chemical potential, two or more simulated annealing of chemical potential calculations using water as the inserted solvent; (f) comparing converged solutions from step (e) to identify locations at which water is strongly bound, thereby identifying locations on the protein which are not candidate sites for binding ligand molecules; and (g) identifying first locations that are not water locations.
  • the simulated annealing of chemical potential calculations comprise multiple steps of sampling, wherein in a number of steps of the sampling the ORF's position is changed by a small amount and the resulting new position is accepted or rejected based on the change in energy that results from the attempted change.
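The accept-or-reject step described above can be sketched as follows. This is a minimal illustration that assumes a plain Metropolis criterion rather than the forced bias scheme the description mentions elsewhere; the names `metropolis_accept`, `try_small_move`, `energy_fn` and `max_step` are hypothetical and do not come from the source.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng=random):
    """Return True if a trial move with energy change delta_e is accepted."""
    if delta_e <= 0.0:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-delta_e / kT)

def try_small_move(position, energy_fn, kT, max_step=0.2, rng=random):
    """Displace a 3D position by at most max_step per axis, then apply the test.

    energy_fn is a hypothetical callable returning the interaction energy of
    the fragment at a given position, in the same units as kT.
    """
    trial = tuple(x + rng.uniform(-max_step, max_step) for x in position)
    delta_e = energy_fn(trial) - energy_fn(position)
    if metropolis_accept(delta_e, kT, rng):
        return trial, True
    return position, False
```

A caller would loop over `try_small_move` many times per fragment; rotations could be handled the same way with a small random change in orientation.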
  • a method of identifying the chemical characteristics of compounds that bind a macromolecule comprising examining the functionalities and relative orientations of the ORFs found in a cluster pursuant to the binding site identifying method outlined above.
  • Also provided is a method of conducting combinatorial chemistry to identify compounds that interact with a macromolecule comprising: (a) identifying classes of reactants that are modeled by the functionalities of the ORFs found in a cluster pursuant to the binding site identifying method of macromolecule; (b) designing a combinatorial synthetic protocol that calls for two or more synthetic procedures that react reagents of at least two of the classes identified in step (a); and (c) conducting the combinatorial synthetic protocol to create candidate binding molecules.
  • a method of conducting a bioactive agent discovery process comprising: (a) from a group of established combinatorial synthetic protocols, collections of chemical compounds or pools of chemical compounds, identifying those members of the group that provide a high density of compounds that meet, for a macromolecule, selection criteria identified by the binding site identifying method; and (b) conducting binding or functional assays to identify compounds obtained from the identified collections or protocols which bind or affect the function of the macromolecule.
  • Figure 1A illustrates a solved crystal structure
  • Figure 1B displays the structure with a grid imposed
  • Figures 2A-2D display the method of the invention applied to the crystallographic solution of elastase; the method can be exemplified using methanol as the ORF.
  • Figures 3A and 3B show the combined results for several ORFs bound to elastase after simulations at relatively low B values, with the results in Figure 3B filtered to identify clusters of these bound ORFs.
  • Figure 3C shows the two clusters of Figure 3B which remain after excluding strong water binding sites, and Figure 3D shows the one cluster that remains after extending the analysis to another ORF;
  • Figure 3E shows the analysis extended to still a further ORF.
  • Figure 3F compares the simulation results to a co-crystallography result.
  • Illustrated in Figure 4A are the amide binding sites extracted from the data of six co-crystallization experiments with elastase and known ligands; and illustrated in Figure 4B is a cluster of the highest affinity amide binding sites determined by the simulation method of the invention.
  • Illustrated in Figure 4C are the amide ORFs of Figure 4B plus amides which are in the vicinity of the cluster but which appear in the simulation at second highest affinity binding values.
  • solutions obtained with co-crystals of elastase inhibitors are compared with data obtained by the methods herein described.
  • Figures 6A and 6B show the surfaces of elastase involved in binding ligands as indicated by the crystallographic data, Figure 6A, and as indicated by the solutions obtained using the method described herein, Figure 6B.
  • Figure 7 shows a schematic illustration of the type of titrations for water binding to a macromolecule that can be used to help identify a level of relatively strong water binding.
  • Bioactive agent refers to a substance such as a chemical that can act on a cell, virus, organ or organism, including but not limited to drugs (i.e., pharmaceuticals) to create a change in the functioning of the cell, virus, organ or organism.
  • the method of identifying bioactive agents of the invention is applied to organic molecules having molecular weight of about 600 or less or to polymeric species such as peptides, proteins, nucleic acids, proteoglycans and the like.
  • a bioactive agent can be a medicament, i.e. a substance used in therapy of an animal, preferably a human.
  • Cluster of free grid points refers to free grid points that are within a "cluster" in that, relative to a given ORF, there is a sufficient number of nearby or adjacent free grid points to allow a reasonable probability that the ORF could be inserted at the cluster.
  • the cluster of free grid points for H2O must be defined to identify all volumes at the surface or interior of a macromolecule that could accommodate H2O, though the selection criteria should err toward identifying some volumes that do not accommodate H2O, as needed to assure that all appropriate volumes are sampled in the simulation process.
  • a cluster of free grid points is defined differently depending on the size of the ORFs (e.g., compare H2O and benzene) and the spacing of the grid.
  • a "cluster of ORF binding sites" typically refers to a pattern of closely located or superimposed sites that bind ORFs with sufficient affinity to merit further consideration.
  • Collection of chemical compounds refers to any collection of compounds collected or organized with the intention that they can be examined to identify bioactive agents (e.g., having a biological activity measured directly or through a surrogate for biological activity such as binding to a macromolecule or interfering with a function of a macromolecule).
  • the collection can be prepared from a collection of simpler molecules (which can be bound to a support) by a chemical scheme designed to generate a diversity of chemicals. Collections of this latter type are often referred to as “combinatorial libraries.”
  • Free grid points refers to grid points (which are discussed below) which are, for a given accepted definition of atomic radius, "free" in that they do not fall within the atomic radii of the mapped atoms of the relevant macromolecule.
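As an illustration of this definition, the sketch below keeps the grid points that lie outside every atomic radius. It assumes a regular cubic grid and atoms given as a centre plus radius; the function name `free_grid_points` and its parameters are hypothetical, not taken from the source.

```python
import itertools
import math

def free_grid_points(atoms, spacing, lower, upper):
    """Return grid points that fall outside the radius of every atom.

    atoms: list of ((x, y, z), radius) pairs for the mapped atoms.
    lower, upper: opposite corners of the box on which the grid is imposed.
    """
    def axis(lo, hi):
        # Number of grid lines along one axis, including both ends.
        n = int(math.floor((hi - lo) / spacing + 1e-9)) + 1
        return [lo + i * spacing for i in range(n)]

    free = []
    for point in itertools.product(*(axis(lo, hi) for lo, hi in zip(lower, upper))):
        # A point is "free" only if it is strictly outside every atomic sphere.
        if all(math.dist(point, centre) > radius for centre, radius in atoms):
            free.append(point)
    return free
```

For a production-size protein a spatial index (for example a cell list) would replace the all-pairs distance check used here for clarity.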
  • Macromolecule refers to a molecule or collection of molecules which has a time-averaged tertiary structure.
  • Macromolecules are used in the method described herein with reference to maps of their tertiary structure. Such maps are typically generated by X-ray diffraction studies, which have generated maps for thousands of macromolecules. However, maps can be produced by other methods such as computational methods or computational methods supplemented by other data such as NMR data. While computational methods have been difficult to apply, recent studies appear to have achieved some successes.
  • Organic fragments or “ORFs” are molecules or molecular fragments that can be used to model one or more modes of interaction with a macromolecule, such as the interactions of carbonyls, hydroxyls, amides, hydrocarbons, and the like.
  • Water locations are locations at which water is strongly bound, meaning, in one embodiment, for example locations where the simulation indicates water remains bound when the simulation is run at values of B that are equal to or less than the B value for the transition point indicating those water molecules that are strongly influenced by the macromolecule. Illustrated in Figure 7 is a conceptualization of the titration of simulated bound water molecules with decreasing values of B, a parameter described further below. A transition point indicates water molecules that are strongly influenced by the macromolecule.
  • a B value less than or equal to that at the transition point can be designated as defining water binding of sufficient strength to render competitive binding by another molecule unlikely, as illustrated by point SB in the illustration.
  • this point SB is selected so that about 50 to about 100 water molecules remain bound for a 50 kDa protein.
  • the simulation process of the present invention works by artificially inserting a given ORF at an unbiased sampling of all the sites on or within a macromolecule structure where such ORF can, as a practical matter, reside. These sites can be termed the "sampling sites."
  • a schedule of simulations is run for each of a number of ORFs, with each simulation run at a separate value of a parameter B, which is related to the excess chemical potential.
  • the schedule provides for simulations conducted at each of a number of B values, typically ranging from 10 to about -15.
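A toy version of such a schedule is sketched below. It is a deliberately simplified model, assuming independent binding sites with fixed energies and a two-state (occupied or empty) Metropolis test at each B value; the real method works on a full molecular structure. The names `anneal`, `strongly_bound`, `site_energies` and `b_cut` are hypothetical.

```python
import math
import random

def accept(log_ratio, rng):
    """Metropolis test on the log of the weight ratio new/old."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)

def anneal(site_energies, b_schedule, kT=0.6, sweeps=200, seed=0):
    """Run one toy simulation per B value, from the first B to the last.

    Returns {B: set of site indices occupied in more than half of the
    samples taken in the second half of the run at that B}.
    """
    rng = random.Random(seed)
    occupied = set()
    converged = {}
    for b in b_schedule:
        counts = [0] * len(site_energies)
        for sweep in range(sweeps):
            for i, energy in enumerate(site_energies):
                # log of weight(occupied) / weight(empty) for this site
                log_w = b - energy / kT
                if i in occupied:
                    if accept(-log_w, rng):   # trial deletion
                        occupied.discard(i)
                elif accept(log_w, rng):      # trial insertion
                    occupied.add(i)
            if sweep >= sweeps // 2:          # discard the first half as equilibration
                for i in occupied:
                    counts[i] += 1
        kept = sweeps - sweeps // 2
        converged[b] = {i for i, c in enumerate(counts) if c > kept / 2}
    return converged

def strongly_bound(converged, b_cut):
    """Sites that stay occupied in every run with B <= b_cut."""
    sets = [sites for b, sites in converged.items() if b <= b_cut]
    return set.intersection(*sets) if sets else set()
```

For example, `anneal([-12.0, -3.0, 3.0], [10.0, 0.0, -10.0, -15.0])` runs four simulations over the B range quoted above, and `strongly_bound` then compares the converged solutions to pick out the sites that survive at low B.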
  • the simulation assesses at each step of the simulation whether the insertion of the ORF at a given site shall be accepted or rejected, with the assessment based on a grand canonical ensemble probability density function.
  • the algorithm models the insertion of the ORF at the site.
  • a forced bias canonical probability density function is used to translate and rotate the ORF in small steps (e.g., on the order of 0.2 Å and 30°) to identify an energy minimized insertion given the simulation parameters in place at the time of the simulation step.
  • the probability of the insertion is then determined from the grand canonical ensemble probability density function, and the ORF can be represented as resident at the site by a random number generating protocol weighed to the probability value.
  • Figure 1 illustrates a solved crystal structure (Figure 1A) on which a grid is imposed (Figure 1B).
  • the grid can have a spacing from a fraction of an ångström to about 1 Å, with the grid intersection points defining the candidates for sampling sites.
  • the spacing of the grid is preferably selected to be less than the smallest cross-section of the ORF.
  • the spacing is typically selected to be small enough in relation to the size of the ORF that free volumes which could define free grid point clusters are likely to contain enough free grid points to allow useful sampling as described below. Such relatively small spacing minimizes the chance that the selection of how to orient the grid will bias the algorithm against identifying certain
  • sampling sites are selected from sites that are unoccupied by the macromolecule (Figure 1B). A final elimination of "grid bias" is achieved by varying the test insertion points away from strict initial insertion at grid points, as described below.
  • the sampling sites are limited to those sites having enough adjacent volume free of the macromolecule to allow the ORF to be inserted.
  • the grid points can be selected for those free grid points that are within a cluster of free grid points, such as, for example, a cluster of 3, 4, 5, 6, 7, 8 or more free grid points, depending on the size of the ORF and the spacings of the grid.
  • the ORF is not necessarily initially inserted exactly at the grid points, but instead at a random sampling of insertion points within a short distance of the grid points, such as points within a sphere shape centered at the grid point and having a diameter of about some percentage, such as 10%, of the grid spacing, or within a box shape centered at the grid point having width, length and height of about such a percentage of the grid distance.
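The two ideas above, restricting sampling to clustered free grid points and jittering the insertion point around each grid point, can be sketched as follows. The neighbour rule used for a "cluster" (free points within one grid spacing) is an assumption made for illustration, and the names `clustered_points` and `jittered_insertion_point` are hypothetical.

```python
import math
import random

def clustered_points(free_points, spacing, min_cluster=4):
    """Keep free grid points that, counting themselves, have at least
    min_cluster free points within one grid spacing."""
    cutoff = spacing * 1.001  # small tolerance for floating-point coordinates
    kept = []
    for i, p in enumerate(free_points):
        neighbours = sum(
            1 for j, q in enumerate(free_points)
            if j != i and math.dist(p, q) <= cutoff
        )
        if neighbours + 1 >= min_cluster:
            kept.append(p)
    return kept

def jittered_insertion_point(grid_point, spacing, fraction=0.10, rng=random):
    """Random point in a box centred on grid_point whose edge length is
    fraction * spacing (10% of the grid spacing by default)."""
    half_edge = 0.5 * fraction * spacing
    return tuple(x + rng.uniform(-half_edge, half_edge) for x in grid_point)
```

The `min_cluster` threshold would be chosen per fragment, since a larger ORF needs more adjacent free volume than water does.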
  • the next step of the process is to conduct simulations with additional ORFs and identify clusters of relatively high affinity ORF binding sites.
  • simulations can be conducted to determine binding for ORFs for ammonia, methanol, ketone and amide.
  • Figure 3A Clusters of ORF binding sites are identified in Figure 3B.
  • the method of the present invention seeks to identify clusters of ORF binding sites, where the clusters can be made up solely of one type of ORF. Preferably, however, the cluster will include binding sites for 2, 3, 4, 5, 6, 7 or more distinct ORFs. Examples of useful ORFs include:
  • the ORFs selected are representative of chemical features that have proven useful in the design of pharmaceuticals or other bioactive chemicals.
  • an important part of the process is to run the simulations with several ORFs, identifying clusters of sites that bind multiple ORFs with relatively high affinity. These clusters are strong candidate sites for ligand binding sites.
  • the relative positioning of the ORFs is instructive of the features of good binding agents.
  • a cluster having two benzene rings with an amide interposed between them models some of the strongest elastase inhibitors derived from an extensive research program, which inhibitors have a sulfonamide in place of the carbon-based amide of the simulation.
  • Tables XXIII and XXV of Edwards et al., "Synthetic Inhibitors of Elastase," Medicinal Research Reviews 14:127-194 (1994).
  • clusters of ORF binding sites alone will identify, or substantially narrow the range of choices for, the sites at which ligands interact with a given protein.
  • the sites that bind water strongly are identified, and the clusters that intersect with strong water binding sites are discounted.
  • the candidate ligand binding sites of Figure 3B are narrowed by excluding water binding sites, as illustrated in Figure 3C. If the analysis is extended to five ORFs as illustrated in Figure 3D, a single candidate site remains.
  • Figure 3E shows a slightly different perspective of the same site illustrated in Figure 3D, with the analysis extended to six ORFs.
  • Figure 3F shows how well the candidate site (left panel) matches up with the structure of a co-crystal containing the ligand trifluoroacetyl-lysyl-prolyl-p-isopropylanilide.
  • an optional step in the process is to narrow the choices for ligand binding sites by excluding ORF clusters that intersect with relatively strong water binding sites.
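The filtering described above, finding clusters that bind several distinct ORFs and discarding those that intersect strong water sites, can be sketched as follows. The single-linkage clustering rule and the distance cutoffs are assumptions chosen for illustration; `find_clusters`, `candidate_sites` and their parameters are hypothetical names.

```python
import math

def find_clusters(sites, radius=2.0):
    """Group ORF binding sites by single-linkage distance clustering.

    sites: list of (orf_name, (x, y, z)) tuples.
    """
    clusters = []
    unassigned = list(range(len(sites)))
    while unassigned:
        stack = [unassigned.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            near = [j for j in unassigned
                    if math.dist(sites[i][1], sites[j][1]) <= radius]
            for j in near:
                unassigned.remove(j)
            stack.extend(near)
        clusters.append([sites[i] for i in members])
    return clusters

def candidate_sites(sites, water_sites, radius=2.0, min_distinct=2, water_cutoff=1.5):
    """Keep clusters with enough distinct ORF types and no strong water nearby."""
    keep = []
    for cluster in find_clusters(sites, radius):
        distinct = {name for name, _ in cluster}
        wet = any(math.dist(xyz, w) <= water_cutoff
                  for _, xyz in cluster for w in water_sites)
        if len(distinct) >= min_distinct and not wet:
            keep.append(cluster)
    return keep
```

Raising `min_distinct` mirrors the step of extending the analysis to more ORFs, which narrows the list of candidate sites.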
  • clusters of ORFs are typically identified at relatively low B values, thereby helping to identify prospective binding sites for ligands.
  • further information about prospective binding sites can be gleaned by looking, in the vicinity of a prospective binding site, at more weakly binding ORFs. This information value flows from the prospect of more weakly binding ORFs modeling a ligand interaction which, while weak in isolation, models a real contribution to ligand binding affinity of a bioactive agent as a whole.
  • Illustrated in Figure 4A are the amide binding sites extracted from the data of six co-crystallization experiments with elastase and known ligands.
  • Illustrated in Figure 4B is a cluster of the highest affinity amide binding sites determined by simulation.
  • Illustrated in Figure 4C are the amide ORFs of Figure 4B plus amides which are in the vicinity of the cluster but which appear in the simulation at the second highest affinity values. As illustrated, this last step of expanding the results by looking at neighboring lower affinity ORF binding sites helps to better model the results seen in co-crystallography.
  • the cluster results identify the site at which the majority of amide binding sites are seen in crystallography, but the expansion extends the results to another cleft in elastase where amides have been experimentally located. Additionally, the expansion identifies part of another cleft at which ligand interactions are seen (as will be illustrated in other Figures).
  • the features of ligand binding sites indicated by other modes of analysis are expanded upon by looking to less stringent simulation results in the vicinity of ORF clusters.
  • the above illustration focused on a cluster of one type of ORF, but is applicable with clusters of many types of ORFs, where the expansions can be limited to one type of ORF or multiple types of ORFs.
  • the data in Figures 4A-4C illustrate an important concept. Both in actual ligand bindings and in the simulations, multiple effective binding locations and orientations for a given type of moiety can be found to overlap.
  • In Figure 5B, the solutions for approximately 10 ORFs, each in its high affinity protein binding state, are overlaid. Both methods identify a region which favors the binding of aromatic moieties.
  • the simulation process achieves approximately 90% 3D geometric identity with the crystallography results.
  • Figures 6A and 6B show the regions of elastase involved in binding ligands as indicated by the crystallographic data, Figure 6A, and as indicated by the solutions obtained from the computational method described herein, Figure 6B.
  • the simulations of the invention utilize a Monte Carlo algorithm. The form of Monte Carlo simulation useful in the present invention is described in
  • the simulation method can comprise:
      • Locate a numeric representation of the macromolecule in a periodic cell.
      • Optimize the position of the macromolecule in the cell.
      • Locate all the cavities in the macromolecule, whether interior or surface cavities.
      • Insert and delete the ORFs (including water) in these cavities.
      • Compute the probabilities of occupation of the ORFs using a grand canonical ensemble probability density function.
      • Vary the chemical potential, yielding relative free energies of binding.
  • the methodology, grand-canonical ensemble simulation, can be introduced as follows:
  • the distinguishing feature of simulations in the grand-canonical ensemble is the change in the number of molecules (ORFs) in the system during the simulation.
  • the sampling is not restricted to the configuration space of a given dimension but it has to be extended to a set of configuration spaces.
  • Applicant has found, unexpectedly, that despite the complexity of allowing for these changing numbers of molecules and the resulting changing mass, the simulation is far more computationally efficient.
  • the change in the number of molecules corresponds to the fact that the grand-canonical partition function $\Xi$ is a linear combination of the corresponding canonical partition functions $Q$ for different numbers, $N$, of molecules: $\Xi(T, V, \mu) = \sum_{N} e^{\mu N / kT} \, Q(T, V, N)$ (1)
  • the insertion site will be chosen with probability 1/V and the molecule (ORF) to be deleted will be chosen with probability 1/N.
  • the simulation proceeds by alternating attempts to move, insert and delete molecules (ORFs) and accepting them with the corresponding acceptance probabilities.
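For orientation, the insertion and deletion tests can be written in the widely used Adams form of grand canonical Monte Carlo, in which B plays the role of the reduced chemical potential. This is the standard textbook form and is offered as an assumption; it is not a reconstruction of the patent's own numbered equations, and it omits the extra cavity-probability factor that a cavity-biased scheme carries. The function names are hypothetical.

```python
import math

def _prob_from_log(log_p):
    """Convert a log acceptance ratio to min(1, exp(log_p)) without overflow."""
    return 1.0 if log_p >= 0.0 else math.exp(log_p)

def insertion_acceptance(b, delta_u, n, kT):
    """Probability of accepting a trial insertion taking N -> N + 1.

    delta_u is U(new) - U(old); n is the molecule count before the move.
    Adams form: min(1, exp(B - delta_u / kT) / (N + 1)).
    """
    return _prob_from_log(b - delta_u / kT - math.log(n + 1))

def deletion_acceptance(b, delta_u, n, kT):
    """Probability of accepting a trial deletion taking N -> N - 1.

    Adams form: min(1, N * exp(-B - delta_u / kT)).
    """
    if n == 0:
        return 0.0  # nothing to delete
    return _prob_from_log(-b - delta_u / kT + math.log(n))
```

Lowering B makes insertions harder and deletions easier, which is why only the more strongly bound molecules remain as B is reduced along the schedule.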
  • Equations 8 and 9 l ⁇ H represents the volume of the regions of the system that contain cavities of suitable size.
  • the efficiency of the cavity-biased method follows from the fact that the algorithm searching for cavities also yields TM V N without extra steps. Calculations on a variety of fluids
  • Grand canonical ensemble simulations are generally performed by placing a molecule in a periodic simulation cell, setting a parameter B, which is representative of free energy, in such a way as to achieve an experimentally determined density, sampling potential hydration positions around the molecule by inserting and deleting water molecules from the simulation cell using a technique such as cavity bias [2, 3], and accepting or rejecting the attempt based on a Metropolis Monte Carlo [4] criterion using a grand canonical ensemble probability function.
  • the most salient feature of this progression is the differential hydration of the major and minor groove of the DNA.
  • the B - 6 simulation shows the DNA essentially uniformly solvated.
  • the first hydration shell (defined by the position of the first minimum of the radial distribution function) of the major and minor groove has a comparable density (0.012 and 0.013, respectively), while the second hydration shell of the minor groove has twice the density of the major groove.
  • Illustrating the differential hydration propensities of the major and minor grooves of DNA is computationally undemanding (3 days of CPU time to run one annealing schedule and 3 days of CPU time to run one proximity analysis [7] on an SGI Power Challenge) using simulated annealing of chemical potential because only a coarse "cooling" schedule of the chemical potential is required. Since the chemical potential is a free energy, a very fine cooling schedule may be used to estimate quantitatively the hydration free energy difference of two different functional groups or even two different atoms of the DNA. Two atoms that desolvate at the same B-value have similar solvation free energy, or alternatively, require a finer cooling schedule to resolve the differences.
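The idea of reading a free energy difference off the schedule can be illustrated with a first-order estimate. The sketch assumes that B differs from the chemical potential divided by kT only by a term that cancels between the two sites, so that the difference in desolvation B values times kT approximates the free energy difference; the resolution is limited by the step size of the schedule. The function name and default temperature are hypothetical.

```python
GAS_CONSTANT_KCAL_PER_MOL_K = 0.0019872041  # R in kcal/(mol*K)

def relative_free_energy(b_site_1, b_site_2, temperature=300.0):
    """Approximate free energy of site 1 relative to site 2, in kcal/mol.

    b_site_1, b_site_2: B values at which each site loses its bound molecule.
    A site that stays occupied down to a lower B comes out more negative,
    i.e. more strongly bound.
    """
    return GAS_CONSTANT_KCAL_PER_MOL_K * temperature * (b_site_1 - b_site_2)
```

For instance, a site that desolvates at B = -10 compared with one that desolvates at B = -4 gives roughly -3.6 kcal/mol at 300 K under these assumptions.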
  • the model system used here consisted of ionic DNA with 22 negative charges and no sodium counterions.
  • the findings presented herein about the preferential hydration of the minor groove correspond very well to results from X-ray crystallographic and NMR studies. Possible reasons for the stronger binding of water molecules in the minor groove may include the following: the high density of the charged rows of phosphate groups, steric constraints, and specific water-water and water-DNA interactions. The regions where water binds tightly on a protein are regions which are precluded from ORF binding. Thus, the remaining sites on the protein unoccupied by water are candidates for good ORF binding.
  • Candidate bioactive agents identified by the methods of the invention can be tested to assess their binding to the macromolecule in question. Where the macromolecules are responsible for many biological functions, including disease states, it is therefore desirable to devise screening methods to identify compounds which stimulate or which inhibit the function of the macromolecule. Accordingly, in a further aspect, the present invention provides for a method of screening compounds to identify those which stimulate or which inhibit the function of such a macromolecule.
  • agonists or antagonists can be employed for therapeutic and prophylactic purposes for diseases.
  • Compounds can be identified from a variety of sources, for example, cells, cell-free preparations, chemical libraries, and natural product mixtures.
  • the screening methods can simply measure the binding of a candidate compound to the macromolecule, or to cells or membranes bearing the macromolecule.
  • the macromolecule can be a variant of the macromolecule used in the simulation method, such as a fragment retaining the binding site identified in the simulation or a fusion protein used to make recombinant synthetic methods more practical.
  • the screening method can involve competition with a labeled competitor. Further, these screening methods can test whether the candidate compound results in a signal generated by activation or inhibition of the macromolecule, using detection systems appropriate to the cells comprising the macromolecule. Inhibitors of activation are generally assayed in the presence of a known agonist and the effect on activation by the agonist by the presence of the candidate compound is observed.
  • the screening methods can simply comprise the steps of mixing a candidate compound with a solution containing a macromolecule, measuring macromolecule activity in the mixture, and comparing the activity of the mixture to a standard.
  • the invention also provides a method of screening compounds to identify those which enhance (agonist) or block (antagonist) the action of macromolecules, including association of the macromolecule with itself or another macromolecule.
  • the method of screening can involve high-throughput techniques.
  • for example, a synthetic reaction mix, a cellular compartment, such as a membrane, cell envelope or cell wall, or a preparation of any thereof, comprising the macromolecule and a labeled substrate or ligand of such polypeptide is incubated in the absence or the presence of a candidate molecule that can be an agonist or antagonist.
  • the ability of the candidate molecule to agonize or antagonize the macromolecule is reflected in decreased binding of the labeled ligand or decreased production of product from a substrate.
  • Molecules that bind gratuitously, i.e., without inducing the effects of the macromolecule, are most likely to be good antagonists.
  • Molecules that bind well and, as the case may be, increase the rate of product production from substrate, increase signal transduction, or increase chemical channel activity are agonists. Detection of the rate or level of, as the case may be, production of product from substrate, signal transduction, or chemical channel activity can be enhanced by using a reporter system. Reporter systems that can be useful in this regard include, but are not limited to, colorimetric, labeled substrate converted into product, a reporter gene that is responsive to changes in macromolecule activity, and binding assays known in the art.
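
The mix, measure, and compare steps in the bullets above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the procedure of the patent: the `AssayReading` fields, the 50% displacement cutoff, and the 10% activity tolerance are hypothetical choices made here for the example.

```python
from dataclasses import dataclass


@dataclass
class AssayReading:
    """Hypothetical readings from one competition and activity assay."""
    ligand_bound_control: float    # labeled-ligand signal without the candidate
    ligand_bound_candidate: float  # labeled-ligand signal with the candidate
    activity_standard: float       # macromolecule activity of the standard
    activity_candidate: float      # activity measured in the candidate mixture


def percent_displacement(r: AssayReading) -> float:
    """Percentage of labeled ligand displaced by the candidate."""
    displaced = r.ligand_bound_control - r.ligand_bound_candidate
    return 100.0 * displaced / r.ligand_bound_control


def classify(r: AssayReading,
             min_displacement: float = 50.0,  # assumed cutoff for "binds"
             tolerance: float = 0.10) -> str:  # assumed activity tolerance
    """Label a candidate from decreased ligand binding and relative activity."""
    if percent_displacement(r) < min_displacement:
        return "non-binder"
    # Binds and raises activity above the standard: treated as an agonist.
    if r.activity_candidate > r.activity_standard * (1.0 + tolerance):
        return "candidate agonist"
    # Binds without raising activity: treated as an antagonist candidate.
    return "candidate antagonist"


reading = AssayReading(1000.0, 200.0, 1.0, 1.02)
print(percent_displacement(reading), classify(reading))
```

A real assay would replace the fixed cutoffs with values calibrated against controls, and would test inhibitors in the presence of a known agonist as the text describes.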

Landscapes

  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Computing Systems (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to a method of identifying binding sites on a macromolecule, the method comprising: (a) for at least one organic fragment, conducting, at separate values of the parameter B, multiple simulated annealing of chemical potential calculations using the organic fragment as solvent; and (b) comparing converged solutions from step (a) to identify first locations at which the relevant organic fragment is strongly bound, thereby identifying candidate sites for binding ligand molecules.
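
The two steps of the abstract can be sketched with a toy model. This is a simplified illustration and not the patented method: it assumes independent lattice sites, treats B as a dimensionless chemical-potential-like parameter in units of kT, and uses made-up site names and energies. The functions `anneal_chemical_potential` and `strongly_bound` are hypothetical names introduced for this example.

```python
import math
import random


def anneal_chemical_potential(site_energies, b_schedule,
                              sweeps=2000, burn_in=500, seed=0):
    """Toy lattice-gas grand canonical Monte Carlo, annealed over B.

    site_energies: site name -> fragment-site energy in kT units (assumed).
    b_schedule: decreasing B values; the configuration is carried from one
    B value to the next, which is the annealing step.
    Returns {B: {site: mean occupancy after burn-in}}.
    """
    rng = random.Random(seed)
    sites = list(site_energies)
    occupied = {s: True for s in sites}  # start saturated with fragment
    results = {}
    for b in b_schedule:
        counts = {s: 0 for s in sites}
        for sweep in range(sweeps):
            for s in sites:
                eps = site_energies[s]
                # Metropolis rule: an occupied site has weight exp(B - eps).
                log_ratio = (eps - b) if occupied[s] else (b - eps)
                if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                    occupied[s] = not occupied[s]
            if sweep >= burn_in:
                for s in sites:
                    counts[s] += occupied[s]
        results[b] = {s: counts[s] / (sweeps - burn_in) for s in sites}
    return results


def strongly_bound(results, threshold=0.5):
    """Step (b): sites that stay occupied at every sampled B value."""
    per_b = [{s for s, occ in by_site.items() if occ > threshold}
             for by_site in results.values()]
    return set.intersection(*per_b)


# Hypothetical energies: one deep pocket, one moderate pocket, one weak site.
energies = {"pocket_A": -12.0, "pocket_B": -6.0, "surface_C": -2.0}
results = anneal_chemical_potential(energies, [0.0, -4.0, -8.0])
print(strongly_bound(results))
```

In this toy model the weakly bound sites empty out as B is lowered, so only the site that survives the lowest B value is reported as a candidate binding site. A realistic calculation would use an all-atom macromolecule, explicit fragment insertion and deletion moves, and fragment-fragment interactions.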
PCT/US2004/014069 2004-05-06 2004-05-06 Computational protein probing to identify binding sites Ceased WO2005114458A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2007511330A JP2007536618A (ja) 2004-05-06 2004-05-06 Computational protein probing method to identify binding sites
PCT/US2004/014069 WO2005114458A1 (fr) 2004-05-06 2004-05-06 Computational protein probing to identify binding sites
EP04822039A EP1751669A4 (fr) 2004-05-06 2004-05-06 Computational protein probing to identify binding sites

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2004/014069 WO2005114458A1 (fr) 2004-05-06 2004-05-06 Computational protein probing to identify binding sites

Publications (1)

Publication Number Publication Date
WO2005114458A1 (fr) 2005-12-01

Family

ID=35428554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/014069 Ceased WO2005114458A1 (fr) Computational protein probing to identify binding sites 2004-05-06 2004-05-06

Country Status (3)

Country Link
EP (1) EP1751669A4 (fr)
JP (1) JP2007536618A (fr)
WO (1) WO2005114458A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013163348A1 (fr) * 2012-04-24 2013-10-31 Laboratory Corporation Of America Holdings Methods and systems for identification of a protein binding site
EP3100023A4 (fr) * 2014-01-29 2017-08-16 University of Maryland, Baltimore Methods and systems for organic solute sampling in aqueous and heterogeneous environments
US11270098B2 (en) 2017-11-16 2022-03-08 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Clustering methods using a grand canonical ensemble

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6426205B1 (en) * 1997-10-24 2002-07-30 Mount Sinai Hospital Corporation Methods and compositions for modulating ubiquitin dependent proteolysis
US6716614B1 (en) * 1999-09-02 2004-04-06 Lexicon Genetics Incorporated Human calcium dependent proteases, polynucleotides encoding the same, and uses thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6735530B1 (en) * 1998-09-23 2004-05-11 Sarnoff Corporation Computational protein probing to identify binding sites
JP3843260B2 (ja) * 2001-01-19 2006-11-08 株式会社インシリコサイエンス Method for constructing three-dimensional protein structures including induced fit, and use thereof
US20040267509A1 (en) * 2003-06-27 2004-12-30 Stephan Brunner Method and computer program product for drug discovery using weighted Grand Canonical Metropolis Monte Carlo sampling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1751669A4 *

Also Published As

Publication number Publication date
EP1751669A1 (fr) 2007-02-14
JP2007536618A (ja) 2007-12-13
EP1751669A4 (fr) 2008-11-05

Similar Documents

Publication Publication Date Title
US6735530B1 (en) Computational protein probing to identify binding sites
Bonneau et al. Ab initio protein structure prediction: progress and prospects
Zhang Protein structure prediction: when is it useful?
Boja et al. Proteogenomic convergence for understanding cancer pathways and networks
Melquiond et al. Next challenges in protein–protein docking: from proteome to interactome and beyond
EA005286B1 (ru) Method of operating a computer system for performing discrete substructural analysis
US9218460B2 (en) Defining and mining a joint pharmacophoric space through geometric features
Park et al. Comparing expression profiles of genes with similar promoter regions
JP2003524831A (ja) System and method for searching a combinatorial space
Lauria et al. Drugs polypharmacology by in silico methods: new opportunities in drug discovery
WO2002065119A9 (fr) Method for predicting molecular interactions in networks
US20040267456A1 (en) Method and computer program product for drug discovery using weighted grand canonical metropolis Monte Carlo sampling
EP1751669A1 (fr) Computational protein probing to identify binding sites
US20090094012A1 (en) Methods and systems for grand canonical competitive simulation of molecular fragments
Hunjan et al. The size of the intermolecular energy funnel in protein–protein interactions
EP1912145A2 (fr) Computational probing of a protein to identify binding sites
EP1604319A1 (fr) Computational protein elaboration test to identify binding sites
Marti-Renom et al. Structure comparison and alignment
Raman et al. Prediction report
Mann et al. Classifying proteinlike sequences in arbitrary lattice protein models using LatPack
US20040267509A1 (en) Method and computer program product for drug discovery using weighted Grand Canonical Metropolis Monte Carlo sampling
Zhou et al. Characterizing DNA recognition preferences of transcription factors using global couplings and high-throughput sequencing
Alber et al. Integrative structure determination of protein assemblies by satisfaction of spatial restraints
Poole et al. Accelerating fragment-based drug discovery using grand canonical nonequilibrium candidate Monte Carlo
Dariusz et al. Ab Initio server prototype for prediction of phosphorylation sites in proteins

Legal Events

Date Code Title Description
AK Designated states. Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents. Kind code of ref document: A1. Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 EP: the EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase. Ref document number: 2007511330. Country of ref document: JP
NENP Non-entry into the national phase. Ref country code: DE
WWW WIPO information: withdrawn in national office. Country of ref document: DE
WWE WIPO information: entry into national phase. Ref document number: 2004822039. Country of ref document: EP
WWP WIPO information: published in national office. Ref document number: 2004822039. Country of ref document: EP