WO2005038689A1 - Apparatus and method for chemical library design - Google Patents
Apparatus and method for chemical library design
- Publication number
- WO2005038689A1 (PCT/EP2004/052438)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reagents
- reagent
- library
- filtering
- criteria
Classifications
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B60/00—Apparatus specially adapted for use in combinatorial chemistry or with libraries
- C40B60/02—Integrated apparatus specially adapted for creating libraries, screening libraries and for identifying library members
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
- G16C20/62—Design of libraries
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01J—CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
- B01J2219/00—Chemical, physical or physico-chemical processes in general; Their relevant apparatus
- B01J2219/00274—Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
- B01J2219/00277—Apparatus
- B01J2219/00351—Means for dispensing and evacuation of reagents
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01J—CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
- B01J2219/00—Chemical, physical or physico-chemical processes in general; Their relevant apparatus
- B01J2219/00274—Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
- B01J2219/0068—Means for controlling the apparatus of the process
- B01J2219/00695—Synthesis control routines, e.g. using computer programs
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01J—CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
- B01J2219/00—Chemical, physical or physico-chemical processes in general; Their relevant apparatus
- B01J2219/00274—Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
- B01J2219/0068—Means for controlling the apparatus of the process
- B01J2219/007—Simulation or vitual synthesis
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01J—CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
- B01J2219/00—Chemical, physical or physico-chemical processes in general; Their relevant apparatus
- B01J2219/00274—Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
- B01J2219/00718—Type of compounds synthesised
- B01J2219/0072—Organic compounds
Definitions
- The present invention relates to apparatus and methods for designing chemical libraries.
- The invention relates to an integrated computer software tool for selecting reagents, calculating the chemical products which may be produced by combining the reagents (library enumeration) and assessing the chemical characteristics, such as diversity, of the anticipated chemical products (library profiling).
- The members of such chemical libraries, once synthesised in the laboratory, are screened for useful biochemical activity, for example against cancer cells or bacteria.
- The selection of chemical reagents for combination to produce a chemical library is one of the most critical steps in synthesis planning and chemical library design. Availability, synthesis feasibility, structural or property similarity/dissimilarity, and cost are among the criteria applied by chemists in selecting reagents for a particular chemical library or for synthesis of a single compound. Application of these criteria often determines the success or failure of lead generation or optimization programs. In selecting reagents, chemists often spend many hours classifying and selecting from in-house or commercial reagent lists.
- WO 02/25504 discloses a system for encoding and building products of a virtual combinatorial library. Particular data structures and software methods are used to aid efficient enumeration (calculation of expected chemical products from various reagent combinations) of large combinatorial libraries.
- US 6377895 discloses a chemical library design tool for automatically and intelligently selecting reagents appropriate for synthesising a chemical library having desired characteristics.
- A computer implemented expert system is used, including a knowledge base containing rules based on information obtained from experts in the appropriate subject matter. The expert system also contains an inference engine which applies the rules to select a set of reagents based on the desired characteristics.
- The characteristics of the resulting virtual chemical library are assessed and compared against the desired characteristics.
- The results of the comparison may be used to update the knowledge base and the set of reagents for enumeration.
- The invention provides a chemical library design apparatus comprising: a reagent search element adapted to extract from one or more reagent databases, according to user defined search criteria, data defining a set of reagents; one or more reagent filtering elements each adapted to form from data defining a set or subset of reagents, according to user defined filtering criteria, data defining a filtered subset of reagents; and a library enumeration element adapted to construct from one or more of said sets or filtered subsets of reagents, according to user defined reaction criteria, data defining a set of chemical products.
- The apparatus may be put into effect on a single computer or on a group of suitably linked computers.
- The reagent databases may be available locally or over the Internet, and these databases will define a large number of chemicals available for use in the laboratory. These chemicals will typically be defined in standard ways using known formats, such as the SDF, MOL, RXN, SMILES and SMIRKS formats, which are all familiar to the skilled person.
- The design apparatus may also include one or more library reduction elements.
- Each library reduction element is adapted to assist a user in selecting a subset of chemical products from a larger set, according to user defined reduction criteria.
- Such library reduction elements may be provided by means of re-use of some or all of the above-mentioned reagent filtering elements, supplemented as necessary by specialist product-oriented elements.
- The reagent filtering and library reduction elements may be controlled, in part, by suitable pre-defined datasets of molecules, molecular fragments, metals, salts, isotopes and so on.
- The user can preferably edit these data sets and/or provide his own corresponding data sets tailored to individual preference and need.
- Various visualisation and/or profiling tools may also be provided to enable the user to study distributions of properties of reagents and products defined in the various data sets, as well as to study individual molecules and their properties.
- A reagent listing element is preferably also provided, to generate a list or other data grouping defining which reagents are needed to synthesize the products of the reduced library. Further facilities may be provided to assist in ordering reagents from various sources.
- Each of the elements mentioned above may be implemented and accessible from a common graphical user interface using common file or data handling facilities to assist in their integration.
- A reagent set prepared using the reagent search element is easily passed to any of the reagent filtering elements. Filtered subsets of reagents are easily passed to the library enumeration element. Chemical products from the library enumeration element are easily passed to any of the library reduction elements.
- The reagent search element preferably allows the user to define search criteria including at least an inclusion criterion and an exclusion criterion, these criteria defining, by structure and/or composition, classes of molecules to include in or exclude from the set of reagents to be formed.
- A molecular weight criterion may also be used.
- The reagent filtering elements may include a variety of tools, including chemistry cleaning, physico-chemical filtering, manual picking, diversity analysis and duplicate checking tools.
- The library enumeration element or elements may be implemented to use any of several known techniques for enumeration of virtual chemical libraries.
- The enumeration elements provide facilities to enable multi-step enumeration, and to provide combinatorial, non-combinatorial, transform and other types of enumeration, including manual selection of particular reagents to combine, as well as automatic selection.
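- The difference between combinatorial and non-combinatorial enumeration can be illustrated with a minimal Python sketch. It is not taken from the described tool: the function names and reagent identifiers are hypothetical, each product is represented only by the tuple of reagent identifiers that form it, and "non-combinatorial" is assumed here to mean pairing reagents position by position.

```python
from itertools import product


def enumerate_combinatorial(reagents_a, reagents_b):
    """Pair every reagent in the first list with every reagent in the second."""
    return list(product(reagents_a, reagents_b))


def enumerate_non_combinatorial(reagents_a, reagents_b):
    """Pair reagents position by position (an assumed reading of the term)."""
    return list(zip(reagents_a, reagents_b))


# Hypothetical reagent identifiers, for illustration only.
reagents_a = ["A1", "A2", "A3"]
reagents_b = ["B1", "B2", "B3"]

combinatorial = enumerate_combinatorial(reagents_a, reagents_b)
paired = enumerate_non_combinatorial(reagents_a, reagents_b)
```

- Under these assumptions the combinatorial list grows as the product of the reagent list sizes, which is why a later reduction step is useful.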
- The library reduction elements enable the user to cut down the size of the product library obtained in the enumeration step, and to this end similar tools to those provided for reagent filtering are provided. Further tools may include the application of models such as ADME/TOX models, protein docking models and pharmacophore models. Preferably, all of the above elements are available from the common graphical user interface.
- The various sets and subsets of reagents and products are preferably contained in a single database, such as an Oracle database.
- The invention also provides computer software providing each of the elements and facilities disclosed, and one or more computer readable media containing the software in a suitable computer readable form.
- The invention also provides an integrated workflow tool for assisting a chemist in designing a combinatorial or non-combinatorial chemical library, comprising: a reagent search element for finding suitable reagents according to user defined search criteria; a reagent filtering element for optimising the set of reagents found using the search element, according to user defined filtering criteria; and a library enumeration element for calculating chemical products obtainable from said optimised set of reagents according to user defined reaction criteria.
- Such a workflow tool preferably also comprises a library reduction element for reducing the number of chemical products obtained, according to user defined reduction criteria, and a reagent listing element for calculating which reagents are required to form the members of the reduced set of chemical products.
- The invention also provides a method of providing a workflow tool for assisting a chemist in designing a chemical library, comprising: providing a reagent search facility for finding suitable reagents according to user defined search criteria; providing a reagent filtering facility for optimizing the set of reagents found, according to user defined filtering criteria; providing a library enumeration facility for calculating chemical products obtainable from said optimized set of reagents, according to user defined reaction criteria; and providing a graphical user interface giving direct access to the reagent search facility, reagent filtering facility and library enumeration facility.
- Figure 1 illustrates a chemical library design workflow scheme implemented by embodiments of the invention
- Figure 2 illustrates, schematically, a library design tool embodying the invention
- Figure 3 shows a main graphical user interface of the tool of figure 2
- Figure 4 illustrates a reagent search element of the tool of figure 2
- Figure 5 illustrates reagent filtering elements of the tool of figure 2
- Figure 6 illustrates a library enumeration element of the tool of figure 2
- Figure 7 illustrates library profiling/reduction elements of the tool of figure 2
- Figure 8 illustrates a reagent listing element of the tool of figure 2
- Figure 9 shows a first computer system architecture for implementing the tool of figure 2
- Figure 10 shows a second computer system architecture for implementing the tool of figure 2.
- Referring to figure 1, there is shown, schematically, a chemical library design workflow scheme effected and enabled by embodiments of the invention subsequently described.
- The scheme comprises a plurality of workflow steps which implement the complete workflow of the chemist, from reagent retrieval to reagent ordering and purchasing, when designing a whole chemical library or just a small set of compounds.
- In a reagent searching step (10), the user can search for and retrieve data defining reagents from suitable reagent databases (5).
- The reagent search may identify a huge list of reagents, some with undesirable features such as particular atoms (especially selected isotopes and metals) and toxic fragments. It is important that these reagents are removed from the list before library enumeration and reagent purchasing.
- The set of reagents is therefore filtered in the reagent filtering step (12), using one or more filtering tools to remove reagents with undesirable properties.
- The data defining the filtered set of reagents is passed to a library enumeration step (14), in which the chemical products which may be obtained from the reagents, using one or more user defined reactions, are calculated.
- The number of chemical products can be reduced in the library reduction step (16) using a variety of tools, many of which may be similar to or the same as those used in the reagent filtering step (12).
- The data defining the reduced library of chemical products is then used in a reagent listing step (18) to calculate which reagents are required to synthesise the library.
- The reagent list may be stored in a reagent list database (20), and used to prepare spreadsheets and, ultimately, business orders to obtain the reagents from suppliers (step 22). Reagents obtained may then be used to synthesise the chemical library (step 24).
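- The reagent listing step (18) can be pictured with the following minimal Python sketch. It assumes, purely for illustration, that each product record remembers the identifiers of the reagents it was enumerated from; the `Product` class, the function name and all identifiers are hypothetical and are not taken from the described tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_id: str
    reagent_ids: tuple  # identifiers of the reagents combined to form this product


def list_required_reagents(reduced_library):
    """Return the sorted, de-duplicated reagent identifiers needed for the library."""
    required = set()
    for item in reduced_library:
        required.update(item.reagent_ids)
    return sorted(required)


# Hypothetical reduced library of three products.
reduced_library = [
    Product("P1", ("amine-1", "chloride-1")),
    Product("P2", ("amine-1", "chloride-2")),
    Product("P3", ("amine-3", "chloride-2")),
]
required = list_required_reagents(reduced_library)
```

- A reagent shared by several products appears only once in the resulting list, which is the form needed for an order.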
- Figure 2 illustrates, in general terms, apparatus in the form of an integrated workflow tool, for putting the workflow of figure 1 into effect.
- The apparatus may be implemented on a single computer or, more usually, on a plurality of interlinked computers, suitably programmed to provide the functional elements shown.
- A control/GUI (graphical user interface) element 30 provides a user with a central point of control to implement the desired chemical library design.
- The control/GUI element 30 provides access to a plurality of workflow elements (32, 38, 40, 42, 44). Each illustrated workflow element may be implemented using multiple elements providing different functions as desired.
- A reagent search element 32 provides facilities to search one or more reagent databases 5, which may be local or accessed over networks of various types, including the public Internet 34, to effect the reagent searching step 10 of figure 1.
- Data identifying reagents found is stored in a local database 34, as illustrated by data group 36.
- One or more reagent filtering elements 38 then provide facilities to enable the user to process the data group 36 according to the reagent filtering step 12 illustrated in figure 1.
- Library enumeration elements 40, library reduction elements 42 and a reagent listing element 44 are similarly provided. Facilities for assisting the user to prepare reagent summaries and order forms may also be provided.
- Figure 3 illustrates aspects of an embodiment of the control/GUI element 30 of figure 2.
- A GUI window 50 is provided by a conventional operating system.
- A menu bar 52 provides menu access to the various workflow elements illustrated in figure 2, as will be seen from the menu headings "Search", "Filtering", "Enumerate" and "Profiling" (including library reduction facilities). Workflow elements falling under a particular heading but having various different detailed functionalities are accessed using detailed menus which drop down from the main menu bar.
- A data file area 54 presents a tree of folders and files. Many of the files contain data sets output by a particular invocation of a workflow element. For example, a first file may contain data defining reagents found using the reagent search element on a particular occasion, while a second file might contain data defining products calculated using a library enumeration element. In general, any data file defining a set of reagents or products may be used as input to any appropriate workflow element.
- Some files contain ancillary data for use in applying user-defined constraints during execution of workflow elements.
- Such files may contain, for example, definitions of classes of molecules for defining inclusion and exclusion constraints when executing the reagent search element 32 and reagent filtering elements 38, reaction schemes for application to reagent sets by a library enumeration element 40, and parameters defining toxicity models for application to product sets by library reduction elements 42.
- The data files accessible using the data file area 54 are logically grouped, for example into a number of projects, each project relating to the design of a particular molecular library.
- File handling facilities provided by the interface enable the user to organise data largely according to individual preference.
- A main workspace area 56 is used to display GUI sub windows for interaction with particular workflow elements.
- A reagent search GUI 58 is shown including chemical structure drawings defining a first class of molecules to include (left) and a second class of molecules to exclude (right) from the reagent list to be constructed.
- A pending requests list 60 (shown as empty) lists any workflow element tasks executing in the background. Such tasks may include, for example, reagent search and library enumeration tasks which may take minutes to hours to complete, depending on the size of the task, the bandwidth of data channels and the computing power available.
- A button bar 62 provides quick access to workflow elements, file management tools and other functions which are mostly also provided through the menu bar 52.
- Figure 4 illustrates, in more detail, an embodiment of a reagent search workflow element 32 and its operation.
- The user is enabled to search and retrieve reagents (e.g. monomers) from databases 5 which may include both locally available, in-house databases and commercially available databases such as ACD (Available Chemical Directory, MDL Information Systems, Inc.).
- Multiple instances of a searching engine element 70 can be used to carry out multi-threaded searching, for example to search more than one reagent database simultaneously and merge the results in a single unfiltered reagent list file 72 within the dataspace 74 for a project.
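- Multi-threaded searching of several databases with merging of the results might be sketched in Python as follows. This is an illustration under stated assumptions rather than the searching engine element 70 itself: each database is modelled as an in-memory list of records, the only search constraint is a maximum molecular weight, and the names `search_one`, `search_all`, `mol_weight` and `source` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_one(name, records, max_weight):
    """Search a single database and tag each hit with the database it came from."""
    return [dict(r, source=name) for r in records if r["mol_weight"] <= max_weight]


def search_all(databases, max_weight):
    """Search every database in its own thread and merge the hits into one list."""
    with ThreadPoolExecutor(max_workers=max(1, len(databases))) as pool:
        futures = [
            pool.submit(search_one, name, records, max_weight)
            for name, records in databases.items()
        ]
        merged = []
        for future in futures:  # collect in submission order for a stable result
            merged.extend(future.result())
    return merged


# Hypothetical databases and records, for illustration only.
databases = {
    "in_house": [
        {"id": "R1", "mol_weight": 120.0},
        {"id": "R2", "mol_weight": 410.0},
    ],
    "commercial": [{"id": "R3", "mol_weight": 95.5}],
}
hits = search_all(databases, max_weight=300.0)
```

- The merged list is deliberately left unfiltered and may contain duplicates, since duplicate checking is described as a separate filtering element.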
- User-defined constraints on the reagent search are typically defined using a molecule drawing tool 76.
- "Marvin sketch", available from ChemAxon Ltd, is used, but other well known drawing tools, such as Isis-Draw from MDL Information Systems, Inc., could equally be used.
- A reagent searching GUI 78 is used to control the reagent searching process, and the drawing tool 76 may be incorporated into this GUI.
- The user can define both include and exclude constraints defining classes of molecules to be included and excluded from the resulting set of reagents. Other constraints may also be imposed, for example by defining minimum and/or maximum molecular weights as illustrated in the GUI 78 shown in figure 4.
- The GUI 78 also enables the user to import, save, copy and paste search constraints prepared using external molecule drawing software. These constraints may vary from simple fragment queries (e.g. primary amine, acid chloride, halide, etc.) to multiple disconnected structures (e.g. primary amine including an aromatic ring not necessarily connected to the nitrogen atom). More sophisticated constraints may be built using the SMARTS language of Daylight Chemical Information Systems Ltd.
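- The combination of include, exclude and molecular weight constraints can be sketched in Python as below. The sketch makes a strong simplifying assumption: instead of matching drawn structures or SMARTS queries against molecules, each reagent carries a pre-computed set of functional group labels. The classes `Reagent` and `SearchCriteria`, the labels and the molecular weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    name: str
    mol_weight: float
    groups: frozenset  # functional group labels assumed to be pre-computed


@dataclass(frozen=True)
class SearchCriteria:
    include: frozenset                  # every listed group must be present
    exclude: frozenset = frozenset()    # none of the listed groups may be present
    min_weight: float = 0.0
    max_weight: float = float("inf")

    def matches(self, reagent):
        return (
            self.include <= reagent.groups
            and not (self.exclude & reagent.groups)
            and self.min_weight <= reagent.mol_weight <= self.max_weight
        )


# Hypothetical reagents; weights and labels are made up for illustration.
reagents = [
    Reagent("R1", 107.2, frozenset({"primary_amine", "aromatic_ring"})),
    Reagent("R2", 59.1, frozenset({"primary_amine"})),
    Reagent("R3", 175.6, frozenset({"primary_amine", "aromatic_ring", "acid_chloride"})),
    Reagent("R4", 520.0, frozenset({"primary_amine", "aromatic_ring"})),
]
criteria = SearchCriteria(
    include=frozenset({"primary_amine", "aromatic_ring"}),
    exclude=frozenset({"acid_chloride"}),
    max_weight=300.0,
)
hits = [r.name for r in reagents if criteria.matches(r)]
```

- Here R2 fails the include constraint, R3 is removed by the exclude constraint and R4 exceeds the maximum molecular weight, leaving only R1.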
- Figure 5 illustrates, in more detail, the reagent filtering step 12 of figure 1 and corresponding workflow elements 38 of figure 2.
- An unfiltered reagent list 72 or a filtered reagent list 84, produced by a reagent searching process or an earlier filtering process, is processed by one or more of the reagent filtering elements 86 - 98, in any desired order or combination, to output a filtered reagent list, usually containing data defining a reduced set of reagents.
- A list of reagents rejected by a filtering step may be stored in the same or in a different results file.
- A separate GUI 100 is provided to enable a user to control each filtering element.
- The ordering of the exemplary filtering elements shown in figure 5 is only for ease of illustration, and one or more of the elements may be applied to produce a filtered reagent list 84 under the control of the user with the interface 60 illustrated in figure 3 in combination with GUIs 100.
- The filtering elements illustrated in figure 5 include a chemistry cleaning element 86, an unwanted fragments element 88, a toxicophore filtering element 90, a physico-chemical properties element 92, a manual picking element 94, a diversity analysis element 96 and a duplicate checking element 98.
- The chemistry cleaning element 86 provides the user with "strip salts", "filter isotopes" and "filter metals" options.
- The strip salts option removes known salt components present on reagents but retains the non-salt part of such reagents.
- The filter isotopes option rejects reagents containing particular undesirable isotopes as represented on a user editable list. Such a list could include, for example, ¹⁴C or ¹³C.
- The filter metals option presents the user with a configurable list of metals. Any reagent having a selected metal in its non-salt part is rejected. Following chemistry cleaning, the surviving reagents are stored in a filtered reagent list and the rejected reagents may be stored in a rejected reagent list.
- The strip salts option uses a predefined list of known salts.
- Reagents containing an unknown salt part are rejected altogether, rather than stripped of the unknown salt part.
- The user can amend the list of known salt parts, or use a list of their own choosing.
- The user may elect to be notified of reagents rejected because they contain an unknown salt part.
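- The strip salts behaviour described above, including outright rejection of reagents with an unknown salt part, can be sketched in Python on dot-separated SMILES strings. Several assumptions are made that are not stated in the source: the longest component string is treated as the non-salt part, salts are recognised by exact text comparison with no canonicalisation, and the contents of `KNOWN_SALTS` are illustrative only.

```python
# Illustrative list of known salt components (an assumption, user-editable in spirit).
KNOWN_SALTS = frozenset({"Cl", "Br", "[Na+]", "[K+]", "[Cl-]"})


def strip_salts(smiles, known_salts=KNOWN_SALTS):
    """Return the non-salt part, or None if any other component is an unknown salt."""
    components = smiles.split(".")
    parent = max(components, key=len)  # crude assumption: longest string is the parent
    others = list(components)
    others.remove(parent)
    if any(component not in known_salts for component in others):
        return None
    return parent


def clean(reagents, known_salts=KNOWN_SALTS):
    """Split reagents into stripped survivors and rejected originals."""
    kept, rejected = [], []
    for smiles in reagents:
        parent = strip_salts(smiles, known_salts)
        if parent is None:
            rejected.append(smiles)
        else:
            kept.append(parent)
    return kept, rejected


kept, rejected = clean([
    "NCc1ccccc1.Cl",                      # amine hydrochloride: known salt is stripped
    "CCO",                                # no salt part: kept unchanged
    "CCN(CC)CCc1ccccc1.OC(=O)C(F)(F)F",   # counter-acid not in the list: rejected
])
```

- Keeping the rejected originals in a separate list mirrors the option of notifying the user about reagents rejected for containing an unknown salt part.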
- the unwanted fragments element 88 presents the user with an editable display of molecules and molecule fragments. Reagents containing molecules or fragments selected by the user are rejected. Such fragments may include, for example, acetals, acyl halides, aldehydes, imines, nitrates and so on.
- the GUI 100 for this element preferably displays each fragment as a conventional chemical drawing, along with a selection box to tick. The user can edit the list of fragments to be used, or can provide a different list.
- the toxicophore filtering element 90 presents the user with an editable display of toxicophores.
- the GUI 100 for this element preferably displays each fragment or molecule as a conventional chemical drawing, along with a selection box. New toxicophores may be added, for example by launching a drawing tool, and pre-prepared toxicophore lists can be loaded as desired. Reagents falling into, or having fragments falling into the selected toxicophore classes, such as nitrates, peroxides, hydrazides and isothiocyanates, are rejected.
- the physico-chemical properties element 92 presents the user with a selectable list of properties such as clog P, TPSA, PK and a measure of the number of single rotatable bonds.
- a calculate button enables the user to trigger a calculation or recalculation of the selected properties for the current reagent list, although this calculation could be carried out at a different time, or automatically.
- the user can set maximum and minimum values for each selected property and the number of reagents falling within the chosen range is displayed, although other selection schemes according to property could be used.
- the GUI 100 for this element may provide histograms illustrating the distribution of the current reagent list in terms of selected properties, and may also provide a reagent table within which details of each reagent and the associated physico-chemical properties are displayed. Finally, a reduced reagent list, constrained by the selected property ranges, can be saved as a filtered reagent list.
- rejected reagents may be stored in a separate list or file.
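The range-based selection described for element 92 can be sketched as below. This is an illustrative sketch only: it assumes property values have already been calculated and are held in plain dictionaries, and the function name `filter_by_ranges`, the property keys and the example values are all hypothetical.

```python
def filter_by_ranges(reagents, ranges):
    """Split reagents into (kept, rejected) using inclusive min/max ranges.

    reagents: list of dicts holding a "name" and numeric property values.
    ranges:   dict mapping property name -> (minimum, maximum), where None
              means that side of the range is unbounded.
    """
    kept, rejected = [], []
    for reagent in reagents:
        in_range = True
        for prop, (low, high) in ranges.items():
            value = reagent[prop]
            if (low is not None and value < low) or (
                high is not None and value > high
            ):
                in_range = False
                break
        (kept if in_range else rejected).append(reagent)
    return kept, rejected


# Illustrative values only, not taken from any real reagent list.
reagent_list = [
    {"name": "A", "clogp": 1.2, "tpsa": 40.0},
    {"name": "B", "clogp": 6.1, "tpsa": 30.0},
    {"name": "C", "clogp": 2.0, "tpsa": 150.0},
]
kept, rejected = filter_by_ranges(
    reagent_list, {"clogp": (None, 5.0), "tpsa": (0.0, 140.0)}
)
print(len(kept), "of", len(reagent_list), "reagents fall within the chosen ranges")
```

The length of `kept` corresponds to the displayed count of reagents falling within the chosen ranges, and `rejected` to the optional separate list of rejected reagents.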
- the manual picking element 94 presents the user with a graphical and/or textual display of the reagents in a reagent list, from which the user can select or reject reagents to produce a filtered list.
- the diversity analysis element 96 enables a user to filter a reagent list according to various measures of chemical and/or physical diversity.
- the GUI 100 for this element may provide selections between different types of diversity analysis, such as K-means and min-max, of a maximum number of reagents to select, and of particular metrics such as physico-chemical properties upon which the diversity analysis is to be based.
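One way to select a diverse subset up to a maximum number of reagents is a greedy MaxMin pick, sketched below. Whether the min-max option of element 96 uses this exact variant is an assumption; the sketch also assumes each reagent is represented by a numeric property vector compared by Euclidean distance, and the name `maxmin_pick` is hypothetical.

```python
import math


def maxmin_pick(points, n_select, seed_index=0):
    """Greedily pick indices so each new pick is far from all earlier picks."""
    if not points or n_select <= 0:
        return []
    selected = [seed_index]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[seed_index]) for p in points]
    target = min(n_select, len(points))
    while len(selected) < target:
        chosen = set(selected)
        candidates = [i for i in range(len(points)) if i not in chosen]
        # Take the candidate whose nearest selected neighbour is farthest away.
        best = max(candidates, key=lambda i: min_dist[i])
        selected.append(best)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[best]))
    return selected


# Illustrative two-property vectors for four reagents.
picked = maxmin_pick([(0, 0), (1, 0), (10, 0), (5, 0)], n_select=3)
```

The `n_select` argument plays the role of the maximum number of reagents to select, and the choice of which properties make up each vector corresponds to the metrics on which the analysis is based.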
- the duplicate checking element 98 checks for exact matches between reagent structures. In some circumstances, a chemist may consider as duplicates reagents with different functional groups but the same core, because when used in particular chemical reactions, they lead to the same product. This is taken into consideration, where possible. Duplicates found by the element 98 may be displayed graphically, or as a list, and the user provided with an opportunity to verify the selection.
- the operation of the library enumeration element 40 shown in figure 2 is further illustrated in figure 6.
- a library enumeration engine 110 calculates and stores in a successful reaction products file 112 the chemical products expected from combination of reagents defined in one or more reagent lists or filtered reagent lists 84. This enumeration is carried out according to constraints imposed, for example in the form of reaction schemes, by means of GUI 114, which may in turn include or invoke one or more drawing tools 116.
- the Markush approach makes use of a Markush structure which represents a common scaffold with variation sites.
- the virtual chemical library is enumerated by systematic attachment of clipped reagents to the appropriate variation sites.
- the technique preferred for embodiments of the present invention is, however, the reaction transform approach (see Drug Discovery Today 5 (2000) 326 - 336).
- the reaction transform specifies which parts of the reacting molecules undergo chemical transformations, and the nature of these transformations. This approach mimics more closely than does the Markush approach the stages involved in actual chemical synthesis.
- the library enumeration GUI 114 allows the entry of one or more reaction transforms, by drawing the reaction using the associated drawing tool 116, or by importing a reaction transform from another source, for example from a file represented in the data file area 54 shown in the main user interface of figure 3.
- the reaction transform is validated and displayed in the reaction transform area 118 of the user interface 114.
- the number of reactants required for the transform is automatically determined.
- the user can then select whether to use automated or manual enumeration using tabbed panes 120 and 122. If automated enumeration is chosen, the tabbed pane 120 is divided into a number of rows 124 corresponding to the number of required reagents.
- the user then associates a reagent list, filtered or otherwise, with each required reagent.
- a row of radio buttons 126 provides the facility to choose combinatorial, non-combinatorial or transform reactions, depending on the number of reactants defined by the reaction transform, and may also allow the user to specify which combinations of more than one active site on a particular reagent class are to be used.
- the required reagent lists are still loaded by drag and drop actions from the data file area 54, but check boxes next to each displayed reagent are provided to enable the user to select limited numbers of reagents to combine in particular ways, so that the user has full control over which products are enumerated.
- the enumeration engine 110 preferably returns at least two categories of results.
- Data defining products resulting from successful combinations of reagents according to the reaction transform shown in reaction transform area 118 is written to a successful reaction products file 112.
- Data defining combinations for which the reaction transform cannot be successfully applied is written to an unsuccessful reaction products file 130.
- the unsuccessful combinations can then be used to debug the reaction transform or to identify reagents having structures which do not fit into the reaction scheme.
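The two result categories returned by the enumeration engine can be illustrated with a small combinatorial sketch. It is a toy under stated assumptions, not the engine 110: the "reaction transform" here is a hypothetical string-level rule on SMILES (an acid chloride written with a trailing `C(=O)Cl` joined to an amine written with a leading `N`), whereas a real transform would operate on molecular graphs; the function names are also hypothetical.

```python
from itertools import product


def toy_amide_transform(acid_chloride, amine):
    """Toy string-level stand-in for a reaction transform (hypothetical)."""
    if not acid_chloride.endswith("C(=O)Cl"):
        raise ValueError("no terminal acid chloride group")
    if not amine.startswith("N"):
        raise ValueError("no leading amine nitrogen")
    # Drop the leaving "Cl" and join the carbonyl carbon to the nitrogen.
    return acid_chloride[:-2] + amine


def enumerate_library(transform, *reagent_lists):
    """Apply the transform to every combination of one reagent per list."""
    successful, unsuccessful = [], []
    for combination in product(*reagent_lists):
        try:
            successful.append((combination, transform(*combination)))
        except ValueError as reason:
            unsuccessful.append((combination, str(reason)))
    return successful, unsuccessful


acids = ["CC(=O)Cl", "c1ccccc1C(=O)Cl", "CCO"]  # "CCO" does not fit the scheme
amines = ["NCC", "NC1CC1"]
successful, unsuccessful = enumerate_library(toy_amide_transform, acids, amines)
```

Recording the reason alongside each unsuccessful combination is one way to support debugging the transform or spotting reagents that do not fit the reaction scheme.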
- Multiple reaction steps may be used, each controlled by a different reaction transform.
- data defining successful reaction products 112 may be used to provide reagent data in a subsequent reaction, as illustrated by the broken lines in figure 6.
- Such multi-step enumeration may be provided by a dedicated enumeration element, which allows entry of multiple reaction transforms and all the necessary reagent lists into a single GUI, for simultaneous submission to one or more enumeration engines 110.
- Embodiments of library reduction elements 42 shown in figure 2 are illustrated in figure 7.
- Library reduction elements may provide the user with all the same mechanisms as provided by reagent filtering elements 38, but adapted if necessary to reduce the number of reaction products in a list, as well as tools which are suitable only for reaction products.
- the library reduction elements may include an in silico ADME/TOX element 200, a virtual screening element 202, and a physico-chemical profiling element 204 as well as, for example, a manual picking element 194, a diversity analysis element 196, a duplicate checking element 198 and a chemistry cleaning element 206.
- Library reduction elements operate on lists of reaction products 112 to produce filtered, or reduced lists of reaction products 210.
- the ADME/TOX element 200 enables the user to apply various ADME/TOX (absorption, distribution, metabolism, excretion and toxicity) models to lists of products and to reject products which fall beyond certain thresholds in respect of these models.
- the virtual screening element 202 enables the user to apply various virtual screening models and algorithms to lists of products, including QSAR/QSPR (quantitative structure-activity/quantitative structure-property) models, ligand-based pharmacophore models, and structure-based protein docking models.
- the physico-chemical profiling element 204 provides similar functionality to the corresponding reagent filtering element 92, in providing histograms of product distribution by property, and filtering of products by property ranges.
- Each library reduction element is associated with a corresponding GUI 207.
- At least some of the library reduction elements may be conveniently provided by simple application of a corresponding filtering element to a product data set. If desired, the library reduction elements may be applied to imported product data sets, as well as product sets generated using the described enumeration elements.
- An embodiment of the reagent listing element 44 of figure 2 is illustrated in figure 8.
- a filtered, reduced product list 210 is passed to a calculate reagents element 220 which, with reference to the relevant reagent lists 72, 84, calculates which reagents are required to form only the specified products.
- These required reagents are defined by data written to a required reagent list 222.
- spreadsheets or similar data structures 224 detailing, for example, reagent names, database sources and frequency of occurrence in the filtered products, are automatically generated. These spreadsheets may then be conveniently used in ordering reagents for in-vitro construction of the designed chemical library.
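The reagent listing step can be sketched as a frequency count over the reduced product list. This is a hedged illustration, assuming each product record carries the identifiers of the reagents it was enumerated from; the record layout and the names `required_reagents` and `to_csv` are hypothetical.

```python
import csv
import io
from collections import Counter


def required_reagents(reduced_products):
    """Count how often each reagent occurs in the reduced product list.

    reduced_products: iterable of (product_id, (reagent_id, ...)) pairs.
    """
    counts = Counter()
    for _product_id, reagent_ids in reduced_products:
        counts.update(reagent_ids)
    return counts


def to_csv(counts):
    """Render the required reagent list as spreadsheet-style CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["reagent", "frequency"])
    for reagent_id, frequency in sorted(counts.items()):
        writer.writerow([reagent_id, frequency])
    return buffer.getvalue()


# Illustrative identifiers only.
products = [("P1", ("A1", "B1")), ("P2", ("A1", "B2")), ("P3", ("A2", "B1"))]
counts = required_reagents(products)
sheet = to_csv(counts)
```

Only reagents that appear in at least one retained product end up in the count, which matches the aim of listing the reagents required to form only the specified products.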
- a reagent search element 32 was used to look for primary amines and acid chloride reagents in the ACD database.
- the following constraints were applied during the search by drawing appropriate molecule structures in the included and excluded areas of the reagent search GUI 78: (1) the primary amines should exclude any monomer with more than one primary amine function; and (2) the acid chlorides should not contain any primary amine fragments.
- This ACD database search identified 1060 acid chlorides and 4593 primary amines.
- the primary amine and acid chloride lists were submitted to various reagent filtering elements in turn. Chemical cleaning returned 4493 primary amines and 1012 acid chlorides. Applying a standard set of unwanted fragments with an unwanted fragments element 88 left 3369 primary amines and 795 acid chlorides. Application of a duplicates checking element 98 resulted in lists of 775 unique acid chlorides and 3427 unique primary amines. A diversity analysis element 96 was then used to select diverse subsets of 1000 primary amines and 100 acid chlorides. No further manual selection was performed.
- the 1000 primary amines and 100 acid chlorides were passed to a library enumeration element 40, along with an appropriate reaction transform, to result in 100000 successful amide products.
- the products list was processed using library reduction elements 42.
- a toxicophore fragment filter reduced the number of products to 83745.
- a drug-likeness filter was effected by a physico-chemical profiling element 204 by applying Clog P ≤ 5, Molecular weight ≤ 500, HBD ≤ 10 and HBA ≤ 5 (Lipinski's rule of five) and resulted in a further reduction, to 45123 products, all of which were retained by the duplicate checking element 198.
- a set of 1000 amide products was then chosen using the diversity analysis element 196, followed by a final visual inspection of a few products.
- a two tier client-data structure is used as illustrated in figure 9.
- a client application 250 written in Java is used to implement the various graphical user interfaces and supporting logic, while the underlying calculation and data services are grouped in the data tier 255.
- Two main calculation servers 260, 265 are used to integrate and provide all the required molecular modelling and chemo-informatics functions, incorporating some commercially available tools.
- a Microsoft NT server 265 is used to run Pipeline Pilot (Stevenson and Mulready, J.
- the second server 260 houses an object-oriented wrapper library 270 to expose various software libraries 275 via an Apache server process 280 with an internal Mod-Perl and an external SOAP interface.
- the data tier 255 also includes the ACD monomers database 285, from MDL Information Systems, Inc., connected to the client using ODBC protocol.
- the Java client 250 works on file-based results.
- the processing of very large operating system files tends to cause very heavy network traffic and memory problems.
- the client 250 is very large, encoding a great deal of logic, and the resulting structure is inflexible.
- the client 250 needs to keep running until servers 260, 265 return a requested result, to avoid processing state inconsistencies.
- a thinner Java client 300 is used in the client tier, with no direct connection with databases.
- the client 300 communicates only with an application server 302 implemented as a BEA Weblogic cluster of workstations running the Linux operating system, using the Weblogic implementation of Message Oriented Middleware 304.
- Message Driven Enterprise Java Beans 306 are used to handle requests from the client 300 and to send responses, so as to accomplish asynchronous processing.
- Session beans (SEJB)
- Data files resulting from the workflow elements of figure 2 as implemented by the Session Beans 308, Script language component 314 and Web Services 310 are stored in a dedicated Oracle database 316, rather than as operating system files. Since all the data processing occurs in the middle tier, only relevant GUI data is communicated with the client 300.
- One advantage of the second architecture is that the dedicated database 316 enables more efficient data querying, extraction and storage. Another advantage is that calculations and processing required by workflow elements can be implemented in the middle tier so as to operate asynchronously with the associated GUI elements implemented in the client tier. Requests, for example for a large library enumeration, can be made, following which the requesting client can close down and restart and still receive the results of the processing.
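The asynchronous request pattern described for the second architecture can be illustrated, loosely, in a single process. This sketch is an analogy only and makes no claim about the Weblogic, message bean or Oracle components: a thread pool stands in for the middle tier, a dictionary stands in for the dedicated database, and the class and method names are hypothetical.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class MiddleTier:
    """Toy stand-in for the application server and its results database."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._results = {}  # stands in for the dedicated database

    def submit(self, task, *args):
        """Queue a task and return a request id the client can keep."""
        request_id = str(uuid.uuid4())
        future = self._pool.submit(task, *args)
        # Store the outcome under the request id when the task finishes,
        # whether or not the requesting client is still running.
        future.add_done_callback(
            lambda done, rid=request_id: self._store(rid, done)
        )
        return request_id

    def _store(self, request_id, future):
        error = future.exception()
        if error is not None:
            self._results[request_id] = ("error", repr(error))
        else:
            self._results[request_id] = ("ok", future.result())

    def fetch(self, request_id):
        """Return ("ok", value) or ("error", text), or None if not ready."""
        return self._results.get(request_id)

    def shutdown(self):
        """Wait for all queued tasks to finish."""
        self._pool.shutdown(wait=True)
```

A client would keep only the request id returned by `submit`, so it could stop, restart and later call `fetch` with that id to collect the stored result.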
Abstract
A tool to assist in design of chemical libraries is disclosed. Facilities for reagent searching, filtering found reagents, library enumeration and library optimization are provided from a common user interface.
Description
PRD 2132
APPARATUS AND METHOD FOR CHEMICAL LIBRARY DESIGN
BACKGROUND TO THE INVENTION
The present invention relates to apparatus and methods for designing chemical libraries. In particular, but not exclusively, the invention relates to an integrated computer software tool for selecting reagents, calculating the chemical products which may be produced by combining the reagents (library enumeration) and assessing the chemical characteristics, such as diversity, of the anticipated chemical products (library profiling). In a typical application, the members of such chemical libraries, once synthesised in the laboratory, are screened for useful biochemical activity, for example against cancer cells or bacteria.
The introductions of combinatorial chemistry and high-throughput in-vitro screening were expected by pharmaceutical and biotechnology companies to deliver significant improvements in both the efficiency and productivity of the drug discovery process. Several factors have led to the frustration of these expectations. Pharmaceutical companies tend to be cautious in wanting to identify the best system before investing heavily in relatively new technologies. At the research chemistry level, exploration of new chemistry tends to be undertaken either in a very focussed manner due to the limited availability of chemical building blocks, or in a broad fashion that takes little account of knowledge of the biological target, diversity criteria, or basic pharmacokinetic characteristics. Selection of reagents for construction of chemical product libraries is often a largely intuitive process, and is frequently restricted to the use of easily available, for example in-house, building blocks. The selection of chemical reagents for combination to produce a chemical library is one of the most critical steps in synthesis planning and chemical library design. Availability, synthesis feasibility, structural or property similarity/dissimilarity, and cost are among criteria applied by chemists in selecting reagents for a particular chemical library or for synthesis of a single compound. Application of these criteria often determines the success or failure of lead generation or optimization programs. In selecting reagents, chemists often spend many hours classifying and selecting from in-house or commercial reagent lists.
Some of the published research on computational aspects of reagent selection and library design focuses on the problem of designing diverse libraries or selecting diverse sets of compounds, e.g. B. Ward and T. Juehne, Nucleic Acids Research 26 (1998) 879 - 886. However, it is generally recognised that diversity alone is largely insufficient in the lead optimization phase of the drug discovery process, e.g. Brown R D et al., J Mol Graph Model 18 (2000) 427 - 37, 537.
Various computer software tools are known for assisting chemists planning the laboratory preparation of chemical libraries. WO 02/25504 discloses a system for encoding and building products of a virtual combinatorial library. Particular data structures and software methods are used to aid efficient enumeration (calculation of expected chemical products from various reagent combinations) of large combinatorial libraries. US 6377895 discloses a chemical library design tool for automatically and intelligently selecting reagents appropriate for synthesising a chemical library having desired characteristics. A computer implemented expert system is used, including a knowledge base containing rules based on information obtained from experts in the appropriate subject matter. The expert system also contains an inference engine which applies the rules to select a set of reagents based on the desired characteristics.
Following automatic enumeration, the characteristics of the resulting virtual chemical library are assessed and compared against the desired characteristics. The results of the comparison may be used to update the knowledge base and the set of reagents for enumeration.
Usually, chemical libraries are designed by expert chemists who are very much aware of the chemistry issues involved in designing an appropriate chemical library for a particular purpose. A problem with the expert system based approach of US6377895 is that the reagent selection process is encapsulated and hidden from the library designer. So, although US6377895 addresses the problem of library design being a slow and labour-intensive process, it also frustrates the expert who requires careful control in library design.
It would therefore be desirable to provide a chemical library design tool which assists the chemist in the laborious task of preparing reagent lists, while at the same time enabling the chemist to maintain fine control over the process so as to benefit from his own expertise.
It would also be desirable to provide a chemical library design tool which assists the natural or normal workflow of the chemist.
It would also be desirable to provide a chemical library design tool which brings together existing available tools in a coherent package and common framework so as to make best use of chemist time and expertise during the process of library design.
It would also be desirable to provide a chemical library design tool which assists a chemist in designing useful chemical libraries which are smaller, require fewer reagents and/or are more effective in terms of library diversity, pharmacological characteristics and so on.
SUMMARY OF THE INVENTION
Accordingly, the invention provides a chemical library design apparatus comprising: a reagent search element adapted to extract from one or more reagent databases, according to user defined search criteria, data defining a set of reagents; one or more reagent filtering elements each adapted to form from data defining a set or subset of reagents, according to user defined filtering criteria, data defining a filtered subset of reagents; and a library enumeration element adapted to construct from one or more of said sets or filtered subsets of reagents, according to user defined reaction criteria, data defining a set of chemical products.
The apparatus may be put into effect on a single computer or on a group of suitably linked computers. Typically, the reagent databases may be available locally or over the Internet and these databases will define a large number of chemicals available for use in the laboratory. These chemicals will typically be defined in standard ways using known formats, such as the SDF, MOL, RXN, SMILES and SMIRKS formats which are all familiar to the skilled person.
The design apparatus may also include one or more library reduction elements. Each library reduction element is adapted to assist a user in selecting a subset of chemical products from a larger set, according to user defined reduction criteria. Such library reduction elements may be provided by means of re-use of some or all of the above-mentioned reagent filtering elements, supplemented as necessary by specialist product-oriented elements.
The reagent filtering and library reduction elements, such as the chemistry cleaning, the fragment filtering and the toxicophore filtering elements described in detail below, may be controlled, in part, by suitable pre-defined datasets of molecules, molecular fragments, metals, salts, isotopes and so on. However, the user can preferably edit these data sets and/or provide his own corresponding data sets tailored to individual preference and need. Various visualisation and/or profiling tools may also be provided to enable the user to study distributions of properties of reagents and products defined in the various data sets, as well as to study individual molecules and their properties.
A reagent listing element is preferably also provided, to generate a list or other data grouping defining which reagents are needed to synthesize the products of the reduced library. Further facilities may be provided to assist in ordering reagents from various sources.
Each of the elements mentioned above may be implemented and accessible from a common graphical user interface using common file or data handling facilities to assist in their integration. For example, a reagent set prepared using the reagent search element is easily passed to any of the reagent filtering elements. Filtered subsets of reagents are easily passed to the library enumeration element. Chemical products from the library enumeration element are easily passed to any of the library reduction elements.
The reagent search element preferably allows the user to define search criteria including at least an inclusion criterion and an exclusion criterion, these criteria defining, by structure and/or composition, classes of molecules to include in or exclude from the set of reagents to be formed. A molecular weight criterion may also be used.
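How inclusion, exclusion and molecular weight criteria might combine in a search can be sketched as follows. This is a simplified illustration, not the search element itself: real criteria would be substructure queries against a reagent database, whereas this sketch assumes each record already carries hypothetical functional-group counts, and the function name `search` and the small in-memory database are assumptions made for the example.

```python
def search(database, include, exclude=(), max_counts=None, mw_range=(None, None)):
    """Return names of records satisfying all criteria.

    include:    groups that must be present at least once.
    exclude:    groups that must be absent.
    max_counts: optional dict of group -> maximum allowed occurrences.
    mw_range:   (minimum, maximum) molecular weight; None means unbounded.
    """
    max_counts = max_counts or {}
    low, high = mw_range
    hits = []
    for record in database:
        groups = record["groups"]
        if any(groups.get(g, 0) == 0 for g in include):
            continue
        if any(groups.get(g, 0) > 0 for g in exclude):
            continue
        if any(groups.get(g, 0) > limit for g, limit in max_counts.items()):
            continue
        mw = record["mw"]
        if (low is not None and mw < low) or (high is not None and mw > high):
            continue
        hits.append(record["name"])
    return hits


# Hypothetical pre-annotated records standing in for a reagent database.
database = [
    {"name": "ethylamine", "mw": 45.08, "groups": {"primary_amine": 1}},
    {"name": "ethylenediamine", "mw": 60.10, "groups": {"primary_amine": 2}},
    {"name": "acetyl chloride", "mw": 78.50, "groups": {"acid_chloride": 1}},
    {"name": "glycyl chloride", "mw": 93.51,
     "groups": {"acid_chloride": 1, "primary_amine": 1}},
]

amines = search(database, include=["primary_amine"], max_counts={"primary_amine": 1})
acid_chlorides = search(database, include=["acid_chloride"], exclude=["primary_amine"])
```

Each criterion is applied independently, so a record is returned only if it passes every inclusion, exclusion, count and molecular weight test.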
The reagent filtering elements may include a variety of tools, including chemistry cleaning, physico-chemical filtering, manual picking, diversity analysis and duplicate checking tools. The library enumeration element or elements may be implemented to use any of several known techniques for enumeration of virtual chemical libraries. Preferably, the enumeration elements provide facilities to enable multi-step enumeration, and to provide combinatorial, non-combinatorial, transform and other types of enumeration, including manual selection of particular reagents to combine, as well as automatic selection.
The library reduction elements enable the user to cut down the size of the product library obtained in the enumeration step, and to this end similar tools to those provided for reagent filtering are provided. Further tools may include the application of models such as ADME/TOX models, protein docking models and pharmacophore models. Preferably, all of the above elements are available from the common graphical user interface. The various sets and subsets of reagents and products are preferably contained in a single database, such as an Oracle database. The invention also provides computer software providing each of the elements and facilities disclosed, and one or more computer readable media containing the software in a suitable computer readable form.
The invention also provides an integrated workflow tool for assisting a chemist in designing a combinatorial or non-combinatorial chemical library, comprising: a reagent search element for finding suitable reagents according to user defined search criteria; a reagent filtering element for optimising the set of reagents found using the search element, according to user defined filtering criteria; and a library enumeration element for calculating chemical products obtainable from said optimised set of reagents according to user defined reaction criteria. Such a workflow tool preferably also comprises a library reduction element for reducing the number of chemical products obtained, according to user defined reduction criteria and a reagent listing element for calculating which reagents are required to form the members of the reduced set of chemical products.
The invention also provides a method of providing a workflow tool for assisting a chemist in designing a chemical library, comprising: providing a reagent search facility for finding suitable reagents according to user defined search criteria; providing a reagent filtering facility for optimizing the set of reagents found, according to user defined filtering criteria; providing a library enumeration facility for calculating chemical products obtainable from said optimized set of reagents, according to user defined reaction criteria; and providing a graphical user interface giving direct access to the reagent search facility, reagent filtering facility and library enumeration facility.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, of which:
Figure 1 illustrates a chemical library design workflow scheme implemented by embodiments of the invention;
Figure 2 illustrates, schematically, a library design tool embodying the invention;
Figure 3 shows a main graphical user interface of the tool of figure 2;
Figure 4 illustrates a reagent search element of the tool of figure 2;
Figure 5 illustrates reagent filtering elements of the tool of figure 2;
Figure 6 illustrates a library enumeration element of the tool of figure 2;
Figure 7 illustrates library profiling/reduction elements of the tool of figure 2;
Figure 8 illustrates a reagent listing element of the tool of figure 2;
Figure 9 shows a first computer system architecture for implementing the tool of figure 2; and
Figure 10 shows a second computer system architecture for implementing the tool of figure 2.
DESCRIPTION OF PREFERRED EMBODIMENTS
Referring now to figure 1 there is shown, schematically, a chemical library design workflow scheme effected and enabled by embodiments of the invention subsequently described. The scheme comprises a plurality of workflow steps which implement the complete workflow of the chemist, from reagent retrieval to reagent ordering and purchasing, when designing a whole chemical library or just a small set of compounds.
In a reagent searching step (10) the user can search for and retrieve data defining reagents, from suitable reagent databases (5). The reagent search may identify a huge list of reagents with undesirable features such as particular atoms (especially selected isotopes and metals) and toxic fragments. It is important that these reagents are removed from the list before library enumeration and reagent purchasing. The set of reagents is therefore filtered in the reagent filtering step (12), using one or more filtering tools to remove reagents with undesirable properties. The data defining the filtered set of reagents is passed to a library enumeration step (14) in which the chemical products which may be obtained from the reagents, using one or more user defined reactions, are calculated. The number of chemical products can be reduced in the library reduction step (16) using a variety of tools, many of which may be similar to or the same as those used in the reagent filtering step (12). The data defining the reduced library of chemical products is then used in a reagent listing step (18) to calculate which reagents are required to synthesise the library. The reagent list may be stored in a reagent list database 20, and used to prepare spreadsheets and, ultimately, business orders to obtain the reagents from suppliers (step 22). Reagents obtained may then be used to synthesise the chemical library (step 24).
Figure 2 illustrates, in general terms, apparatus in the form of an integrated workflow tool, for putting the workflow of figure 1 into effect. The apparatus may be implemented on a single computer, or more usually, on a plurality of interlinked computers, suitably programmed to provide the functional elements shown. A control/GUI (graphical user interface) element 30 provides a user with a central point of control to implement the desired chemical library design. The control/GUI element 30 provides access to a plurality of workflow elements (32, 38, 40, 42, 44). Each illustrated workflow element may be implemented using multiple elements providing different functions as desired.
A reagent search element 32 provides facilities to search one or more reagent databases 5, which may be local or accessed over networks of various types, including the public Internet 34, to effect the reagent searching step 10 of figure 1. Data identifying reagents found is stored in a local database 34, as illustrated by data group 36. One or more reagent filtering elements 38 then provide facilities to enable the user to process the data group 36 according to the reagent filtering step 12 illustrated in figure 1. Library enumeration elements 40, library reduction elements 42 and a reagent listing element 44 are similarly provided. Facilities for assisting the user to prepare reagent summaries and order forms may also be provided.
Figure 3 illustrates aspects of an embodiment of the control/GUI element 30 of figure 2. A GUI window 50 is provided by a conventional operating system. A menu bar 52 provides menu access to the various workflow elements illustrated in figure 2, as will be seen from the menu headings "Search", "Filtering", "Enumerate" and "Profiling" (including library reduction facilities). Workflow elements falling under a particular heading but having various different detailed functionalities are accessed using detailed menus which drop down from the main menu bar.
A data file area 54 presents a tree of folders and files. Many of the files contain data sets output by a particular invocation of a workflow element. For example, a first file may contain data defining reagents found using the reagent search element on a particular occasion, while a second file might contain data defining products calculated using a library enumeration element. In general, any data file defining a set of reagents or products may be used as input to any appropriate workflow element.
Some files contain ancillary data for use in applying user-defined constraints during execution of workflow elements. Such files may contain, for example, definitions of classes of molecules for defining inclusion and exclusion constraints when executing the reagent search element 32 and reagent filtering elements 38, reaction schemes for application to reagent sets by a library enumeration element 40, and parameters defining toxicity models for application to product sets by library reduction elements 42.
Generally, but not necessarily, the data files accessible using the data file area 54 are logically grouped, for example into a number of projects, each project relating to the design of a particular molecular library. However, file handling facilities provided by the interface enable the user to organise data largely according to individual whim.
A main workspace area 56 is used to display GUI sub windows for interaction with particular workflow elements. In figure 3 a reagent search GUI 58 is shown including chemical structure drawings defining a first class of molecules to include (left) and a second class of molecules to exclude (right) from the reagent list to be constructed. A pending requests list 60 (shown as empty) lists any workflow element tasks executing in the background. Such tasks may include, for example, reagent search and library enumeration tasks which may take minutes to hours to complete, depending on the size of the task, the bandwidth of data channels and the computing power available.
Finally, a button bar 62 provides quick access to workflow elements, file management tools and other functions which are mostly also provided through the menu bar 52.
Figure 4 illustrates, in more detail, an embodiment of a reagent search workflow element 32 and its operation. The user is enabled to search and retrieve reagents (e.g. monomers) from databases 5 which may include locally available in-house databases and commercially available databases such as ACD (Available Chemicals Directory, MDL Information Systems, Inc.). Multiple instances of a searching engine element 70 can be used to carry out multi-threaded searching, for example to search more than one reagent database simultaneously and merge the results in a single unfiltered reagent list file 72 within the dataspace 74 for a project. User-defined constraints on the reagent search are typically defined using a molecule drawing tool 76. In a preferred embodiment, "Marvin sketch", available from ChemAxon Ltd, is used, but other well known drawing tools such as Isis-Draw from MDL Information Systems, Inc. or ChemDraw from CambridgeSoft Ltd could be used.
A reagent searching GUI 78 is used to control the reagent searching process, and the drawing tool 76 may be incorporated into this GUI. As illustrated in figures 3 and 4 the user can define both include and exclude constraints defining classes of molecules to be included and excluded from the resulting set of reagents. Other constraints may also be imposed, for example by defining minimum and/or maximum molecular weights as illustrated in the GUI 78 shown in figure 4.
The GUI 78 also enables the user to import, save, copy and paste search constraints prepared using external molecule drawing software. These constraints may vary from simple fragment queries (e.g. primary amine, acid chloride, halide, etc.) to multiple disconnected structures (e.g. primary amine including an aromatic ring not necessarily connected to the nitrogen atom). More sophisticated constraints may be built using the SMARTS language of Daylight Chemical Information Systems Ltd.
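The include, exclude and molecular weight constraints described above can be sketched as a simple predicate over reagent records. This is a minimal illustration in Python and not the tool's implementation: the `Reagent` record, its precomputed `groups` tags and the `search` function are hypothetical, and a real system would match drawn substructures or SMARTS queries using a cheminformatics toolkit rather than string tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    name: str
    mol_weight: float
    groups: frozenset  # hypothetical precomputed functional-group tags


def search(reagents, include, exclude=frozenset(), min_mw=None, max_mw=None):
    """Return reagents carrying every 'include' tag, no 'exclude' tag,
    and a molecular weight within the optional bounds."""
    hits = []
    for r in reagents:
        if not include <= r.groups:      # an include constraint is missing
            continue
        if exclude & r.groups:           # an exclude constraint is matched
            continue
        if min_mw is not None and r.mol_weight < min_mw:
            continue
        if max_mw is not None and r.mol_weight > max_mw:
            continue
        hits.append(r)
    return hits


# Illustrative records only; the tags are assigned by hand.
database = [
    Reagent("benzylamine", 107.16, frozenset({"primary_amine", "aromatic_ring"})),
    Reagent("ethylenediamine", 60.10, frozenset({"primary_amine", "diamine"})),
    Reagent("acetyl chloride", 78.50, frozenset({"acid_chloride"})),
]
amines = search(database, include=frozenset({"primary_amine"}),
                exclude=frozenset({"diamine"}), max_mw=300.0)
print([r.name for r in amines])
```

Here the exclude tag plays the role of the "second class of molecules to exclude" drawn in the search GUI, so the diamine is dropped even though it satisfies the include constraint.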
Figure 5 illustrates, in more detail, the reagent filtering step 12 of figure 1 and corresponding workflow elements 38 of figure 2. In this step an unfiltered reagent list 72 or a filtered reagent list 84, produced by a reagent searching process or an earlier filtering process, is processed by one or more of the reagent filtering elements 86 - 98, in any desired order or combination, to output a filtered reagent list, usually containing data defining a reduced set of reagents. A list of reagents rejected by a filtering step may be stored in the same or in a different results file.
A separate GUI 100 is provided to enable a user to control each filtering element. The ordering of the exemplary filtering elements shown in figure 5 is only for ease of illustration, and any one or more of the elements may be applied to produce a filtered reagent list 84 under the control of the user with the interface 60 illustrated in figure 3 in combination with GUIs 100. The filtering elements illustrated in figure 5 include a chemistry cleaning element 86, an unwanted fragments element 88, a toxicophore filtering element 90, a physico-chemical properties element 92, a manual picking element 94, a diversity analysis element 96 and a duplicate checking element 98.
The chemistry cleaning element 86 provides the user with "strip salts", "filter isotopes" and "filter metals" options. The strip salts option removes known salt components present on reagents but retains the non-salt part of such reagents. The filter isotopes option rejects reagents containing particular undesirable isotopes as represented on a user editable list. Such a list could include, for example, 14C or 13C. The filter metals option presents the user with a configurable list of metals. Any reagent having a selected metal in its non-salt part is rejected. Following chemical cleaning, the surviving reagents are stored in a filtered reagent list and the rejected reagents may be stored in a rejected reagent list. The strip salts option uses a predefined list of known salts. Reagents containing an unknown salt part are rejected altogether, rather than stripped of the unknown salt part. The user can amend the list of known salt parts, or use a list of their own choosing. The user may elect to be notified of reagents rejected because they contain an unknown salt part.
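The strip salts and filter isotopes behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the described element itself: reagents are taken to be SMILES strings whose components are separated by `.`, the `KNOWN_SALTS` and `UNWANTED_ISOTOPES` lists are illustrative, and the isotope check is a simple regular expression over bracket atoms rather than a proper structure parse.

```python
import re

KNOWN_SALTS = {"Cl", "[Na+]", "[Cl-]", "OC(=O)C(F)(F)F"}  # illustrative list only
UNWANTED_ISOTOPES = {"13C", "14C"}                        # user-editable in the text


def clean(smiles, known_salts=KNOWN_SALTS, isotopes=UNWANTED_ISOTOPES):
    """Return (cleaned_smiles, None) on success or (None, reason) on rejection."""
    parts = smiles.split(".")
    # Strip every component that is a known salt part.
    remaining = [p for p in parts if p not in known_salts]
    if not remaining:
        return None, "no non-salt part"
    if len(remaining) > 1:
        # Something other than a known salt is attached: reject altogether.
        return None, "unknown salt part"
    parent = remaining[0]
    # Isotope labels appear as a leading mass number inside a bracket atom.
    labels = set(re.findall(r"\[(\d+[A-Z][a-z]?)", parent))
    if labels & isotopes:
        return None, "unwanted isotope"
    return parent, None


filtered, rejected = [], []
for smi in ["NCc1ccccc1.Cl", "CC(=O)Cl", "NCCO.OS(=O)(=O)O", "N[14CH2]C"]:
    cleaned, reason = clean(smi)
    if cleaned is None:
        rejected.append((smi, reason))   # kept so the user can be notified
    else:
        filtered.append(cleaned)
print(filtered)
print(rejected)
```

Keeping the rejection reason alongside each rejected reagent corresponds to the option of notifying the user about reagents rejected for an unknown salt part. A filter metals option could be added in the same style by checking the non-salt part against a configurable list of metal symbols.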
The unwanted fragments element 88 presents the user with an editable display of molecules and molecule fragments. Reagents containing molecules or fragments selected by the user are rejected. Such fragments may include, for example, acetals, acyl halides, aldehydes, imines, nitrates and so on. The GUI 100 for this element preferably displays each fragment as a conventional chemical drawing, along with a selection box to tick. The user can edit the list of fragments to be used, or can provide a different list.
The toxicophore filtering element 90 presents the user with an editable display of toxicophores. As for the unwanted fragments display, the GUI 100 for this element preferably displays each fragment or molecule as a conventional chemical drawing, along with a selection box. New toxicophores may be added, for example by launching a drawing tool, and pre-prepared toxicophore lists can be loaded as desired. Reagents falling into, or having fragments falling into, the selected toxicophore classes, such as nitrates, peroxides, hydrazides and isothiocyanates, are rejected.
The manual picking element 94 presents the user with a graphical and/or textual display of the reagents in a reagent list, from which the user can select or reject reagents to produce a filtered list.
The physico-chemical properties element 92 presents the user with a selectable list of properties such as clog P, TPSA, PK and a measure of the number of single rotatable bonds. A calculate button enables the user to trigger a calculation or recalculation of the selected properties for the current reagent list, although this calculation could be carried out at a different time, or automatically. The user can set maximum and minimum values for each selected property and the number of reagents falling within the chosen range is displayed, although other selection schemes according to property could be used. The GUI 100 for this element may provide histograms illustrating the distribution of the current reagent list in terms of selected properties, and may also provide a reagent table within which details of each reagent and the associated physico-chemical properties are displayed. Finally, a reduced reagent list, constrained by the selected property ranges, can be saved as a filtered reagent list. As for the other filtering elements, rejected reagents may be stored in a separate list or file.
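Filtering by property ranges, together with the in-range count and histogram display, might look like the following sketch. The property names, the numeric values and the function names are hypothetical; in practice the properties would be calculated by a chemistry toolkit rather than typed in by hand.

```python
def filter_by_ranges(records, ranges):
    """Split records into (kept, rejected) using inclusive (min, max) bounds
    for each selected property."""
    kept, rejected = [], []
    for rec in records:
        ok = all(lo <= rec[prop] <= hi for prop, (lo, hi) in ranges.items())
        (kept if ok else rejected).append(rec)
    return kept, rejected


def histogram(records, prop, edges):
    """Count records per bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for rec in records:
        for i in range(len(edges) - 1):
            if edges[i] <= rec[prop] < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Made-up property values for three reagents.
reagents = [
    {"name": "A", "clogp": 1.2, "tpsa": 26.0, "rot_bonds": 1},
    {"name": "B", "clogp": 5.8, "tpsa": 12.0, "rot_bonds": 7},
    {"name": "C", "clogp": 3.1, "tpsa": 95.0, "rot_bonds": 4},
]
ranges = {"clogp": (-1.0, 5.0), "rot_bonds": (0, 5)}
kept, rejected = filter_by_ranges(reagents, ranges)
print(len(kept), "of", len(reagents), "reagents within range")
print(histogram(reagents, "clogp", [0, 2, 4, 6]))
```

Returning the rejected records as a second list mirrors the option of storing rejected reagents in a separate list or file.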
The diversity analysis element 96 enables a user to filter a reagent list according to various measures of chemical and/or physical diversity. The GUI 100 for this element may allow the user to select the type of diversity analysis, such as K-means or min-max, the maximum number of reagents to select, and the particular metrics, such as physico-chemical properties, upon which the diversity analysis is to be based.
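One common reading of min-max selection is a greedy procedure that repeatedly adds the candidate whose nearest already-selected neighbour is farthest away. The sketch below assumes that reading; the text does not specify the algorithm, and the descriptor vectors, the Euclidean metric and the function name `max_min_pick` are all illustrative choices.

```python
import math


def max_min_pick(points, k):
    """Greedy max-min selection: start from the first point, then repeatedly
    add the unpicked point whose nearest picked neighbour is farthest away.
    Returns the indices of the selected points."""
    if k <= 0 or not points:
        return []
    picked = [0]
    # Distance from every point to its nearest picked point so far.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(picked) < min(k, len(points)):
        candidates = [i for i in range(len(points)) if i not in picked]
        nxt = max(candidates, key=lambda i: nearest[i])
        picked.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return picked


# Two tight clusters plus one outlier, in a made-up 2D property space.
descriptors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 5.0)]
print(max_min_pick(descriptors, 3))
```

With a maximum of three reagents the selection takes one member from each cluster and the outlier, which is the behaviour a diversity filter is meant to provide.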
The duplicate checking element 98 checks for exact matches between reagent structures. In some circumstances, a chemist may consider as duplicates reagents with different functional groups but the same core, because when used in particular chemical reactions, they lead to the same product. This is taken into consideration, where possible. Duplicates found by the element 98 may be displayed graphically, or as a list, and the user provided with an opportunity to verify the selection.
The operation of the library enumeration element 40 shown in figure 2 is further illustrated in figure 6. A library enumeration engine 110 calculates and stores in a successful reaction products file 112 the chemical products expected from combination of reagents defined in one or more reagent lists or filtered reagent lists 84. This enumeration is carried out according to constraints imposed, for example in the form of reaction schemes, by means of GUI 114, which may in turn include or invoke one or more drawing tools 116.
Two main techniques are commonly used in the prior art for virtual enumeration of chemical libraries. The Markush approach makes use of a Markush structure which represents a common scaffold with variation sites. The virtual chemical library is enumerated by systematic attachment of clipped reagents to the appropriate variation sites. The technique preferred for embodiments of the present invention is, however, the reaction transform approach (see Drug Discovery Today 5 (2000) 326 - 336). The reaction transform specifies which parts of the reacting molecules undergo chemical transformations, and the nature of these transformations. This approach mimics more closely than does the Markush approach the stages involved in actual chemical synthesis.
The library enumeration GUI 114 allows the entry of one or more reaction transforms, by drawing the reaction using the associated drawing tool 116, or by importing a reaction transform from another source, for example from a file represented in the data file area 54 shown in the main user interface of figure 3. The reaction transform is validated and displayed in the reaction transform area 118 of the user interface 114. The number of reactants required for the transform is automatically determined. The user can then select whether to use automated or manual enumeration using tabbed panes 120 and 122. If automated enumeration is chosen, the tabbed pane 120 is divided into a number of rows 124 corresponding to the number of required reagents. The user then associates a reagent list, filtered or otherwise, with each required reagent. This may be achieved, for example, by a drag and drop action from the data file area 54 shown in figure 3. A row of radio buttons 126 provides the facility to choose combinatorial, non-combinatorial or transform reactions, depending on the number of reactants defined by the reaction transform, and may also allow the user to specify which combinations of more than one active site on a particular reagent class are to be used.
If manual enumeration is selected then the required reagent lists are still loaded by drag and drop actions from the data file area 54, but check boxes next to each displayed reagent are provided to enable the user to select limited numbers of reagents to combine in particular ways, so that the user has full control over which products are enumerated.
The enumeration engine 110 preferably returns at least two categories of results. Data defining products resulting from successful combinations of reagents according to the reaction transform shown in reaction transform area 118 is written to a successful reaction products file 112. Data defining combinations for which the reaction transform cannot be successfully applied is written to an unsuccessful reaction products file 130. The unsuccessful combinations can then be used to debug the reaction transform or to identify reagents having structures which do not fit into the reaction scheme. Multiple reaction steps may be used, each controlled by a different reaction transform. For example, data defining successful reaction products 112 may be used to provide reagent data in a subsequent reaction, as illustrated by the broken lines in figure 6. Such multi-step enumeration may be provided by a dedicated enumeration element, which allows entry of multiple reaction transforms and all the necessary reagent lists into a single GUI, for simultaneous submission to one or more enumeration engines 110.
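The split between successful and unsuccessful combinations can be illustrated with a deliberately simplified sketch. The `amide_transform` function below is a toy, string-level stand-in for a reaction transform and is an assumption of this example: it only handles SMILES written with the acid chloride group last and the amine nitrogen first, whereas a real enumeration engine would apply the transform to parsed molecular structures.

```python
from itertools import product


def amide_transform(acid_chloride, amine):
    """Toy stand-in for a reaction transform: R-C(=O)Cl + N-R' -> R-C(=O)N-R'.
    Returns None when the expected pattern is not present."""
    if not acid_chloride.endswith("C(=O)Cl") or not amine.startswith("N"):
        return None
    return acid_chloride[:-2] + amine   # drop 'Cl' and attach the amine via N


def enumerate_library(acid_chlorides, amines):
    """Combine every acid chloride with every amine (combinatorial mode)."""
    successful, unsuccessful = [], []
    for acid, amine in product(acid_chlorides, amines):
        result = amide_transform(acid, amine)
        if result is None:
            unsuccessful.append((acid, amine))
        else:
            successful.append({"product": result, "reagents": (acid, amine)})
    return successful, unsuccessful


ok, failed = enumerate_library(
    ["CC(=O)Cl", "c1ccccc1C(=O)Cl"],
    ["NCC", "NCc1ccccc1", "CCO"],       # the last entry is not an amine
)
print(len(ok), "successful;", len(failed), "unsuccessful")
for entry in ok:
    print(entry["product"])
```

Recording the reagent pair with each product makes it straightforward to feed successful products into a later reaction step, and the unsuccessful list points directly at the reagent that did not fit the scheme.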
Embodiments of library reduction elements 42 shown in figure 2 are illustrated in figure 7. Library reduction elements may provide the user with all the same mechanisms as provided by reagent filtering elements 38, but adapted if necessary to reduce the number of reaction products in a list, as well as tools which are suitable only for reaction products. Accordingly, the library reduction elements may include an in silico ADME/TOX element 200, a virtual screening element 202, and a physico-chemical profiling element 204 as well as, for example, a manual picking element 194, a diversity analysis element 196, a duplicate checking element 198 and a chemistry cleaning element 206. Library reduction elements operate on lists of reaction products 112 to produce filtered, or reduced, lists of reaction products 210.
The ADME/TOX element 200 enables the user to apply various ADME/TOX (absorption, distribution, metabolism, excretion and toxicity) models to lists of products and to reject products which fall beyond certain thresholds in respect of these models. Similarly, the virtual screening element 202 enables the user to apply various virtual screening models and algorithms to lists of products, including QSAR/QSPR (quantitative structure-activity/quantitative structure-property) models, ligand based pharmacophore models, and structure based protein docking models. The physico-chemical profiling element 204 provides similar functionality to the corresponding reagent filtering element 92, in providing histograms of product distribution by property, and filtering of products by property ranges. Each library reduction element is associated with a corresponding GUI 207.
At least some of the library reduction elements may be conveniently provided by simple application of a corresponding filtering element to a product data set. If desired, the library reduction elements may be applied to imported product data sets, as well as product sets generated using the described enumeration elements.
An embodiment of the reagent listing element 44 of figure 2 is illustrated in figure 8. A filtered, reduced product list 210 is passed to a calculate reagents element 220 which, with reference to the relevant reagent lists 72, 84, calculates which reagents are required to form only the specified products. These required reagents are defined by data written to a required reagent list 222. Preferably, spreadsheets or similar data structures 224 detailing, for example, reagent names, database sources and frequency of occurrence in the filtered products, are automatically generated. These spreadsheets may then be conveniently used in ordering reagents for in vitro construction of the designed chemical library.
An example process of applying the described chemical library design tool to the design of an amide library will now be presented. A reagent search element 32 was used to look for primary amines and acid chloride reagents in the ACD database. The following constraints were applied during the search by drawing appropriate molecule structures in the included and excluded areas of the reagent search GUI 78: (1) the primary amines should exclude any monomer with more than one primary amine function; and (2) the acid chlorides should not contain any primary amine fragments. This ACD database search identified 1060 acid chlorides and 4593 primary amines.
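The calculation of required reagents and their frequency of occurrence can be sketched as a simple count over the retained products. The record layout (each product carrying the names of its reagents), the `sources` lookup and the column names are assumptions made for this example only.

```python
import csv
import io
from collections import Counter


def required_reagents(products, sources):
    """List each reagent needed for the retained products, with its
    database source and how often it occurs in those products."""
    usage = Counter(r for p in products for r in p["reagents"])
    return [
        {"reagent": name, "source": sources.get(name, "unknown"), "count": n}
        for name, n in sorted(usage.items())
    ]


# Hypothetical reduced product list and reagent sources.
reduced_products = [
    {"product": "P1", "reagents": ("acid_1", "amine_1")},
    {"product": "P2", "reagents": ("acid_1", "amine_2")},
    {"product": "P3", "reagents": ("acid_2", "amine_2")},
]
sources = {"acid_1": "ACD", "acid_2": "in-house", "amine_1": "ACD"}

rows = required_reagents(reduced_products, sources)

# Write the rows as CSV, a simple stand-in for the spreadsheet output.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["reagent", "source", "count"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Only reagents that appear in at least one retained product are listed, which matches the aim of ordering just what is needed for the specified products.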
The primary amine and acid chloride lists were submitted to various reagent filtering elements in turn. Chemical cleaning returned 4493 primary amines and 1012 acid chlorides. Applying a standard set of unwanted fragments with an unwanted fragments element 88 left 3369 primary amines and 795 acid chlorides. Application of a duplicates checking element 98 resulted in lists of 775 unique acid chlorides and 3427 unique primary amines. A diversity analysis element 96 was then used to select diverse subsets of 1000 primary amines and 100 acid chlorides. No further manual selection was performed.
The 1000 primary amines and 100 acid chlorides were passed to a library enumeration element 40, along with an appropriate reaction transform, to result in 100000 successful amide products. The products list was processed using library reduction elements 42. A toxicophore fragment filter reduced the number of products to 83745. A drug-likeness filter was effected by a physico-chemical profiling element 204 by applying Clog P<5, Molecular weight <500, HBD<5 and HBA<10 (Lipinski's rule of five) and resulted in a further reduction, to 45123 products, all of which were retained by the duplicate checking element 198. A set of 1000 amide products was then chosen using the diversity analysis element 196, followed by a final visual inspection of a few products.
The chemical library design tool discussed above with reference to figures 2 to 8 may be implemented using known software tools for carrying out the required chemistry, general data handling, user interface and operating system functions, brought together as discussed herein. Two particular architectures which may be used are described below, although others are within the scope of the abilities of a suitably skilled person.
In a first architecture, a two-tier client-data structure is used as illustrated in figure 9. A client application 250 written in Java is used to implement the various graphical user interfaces and supporting logic, while the underlying calculation and data services are grouped in the data tier 255. Two main calculation servers 260, 265 are used to integrate and provide all the required molecular modelling and chemo-informatics functions, incorporating some commercially available tools. A Microsoft NT server 265 is used to run Pipeline Pilot (Stevenson and Mulready, J. Am. Chem. Soc. 2003) to provide the client 250 with the required molecular modelling and chemo-informatics protocol handling services. The second server 260 houses an object-oriented wrapper library 270 to expose various software libraries 275 via an Apache server process 280 with an internal Mod-Perl and an external SOAP interface.
The data tier 255 also includes the ACD monomers database 285, from MDL Information Systems, Inc., connected to the client using ODBC protocol.
In the first architecture the Java client 250 works on file-based results. The processing of very large operating system files tends to cause very heavy network traffic and memory problems. The client 250 is very large, encoding a great deal of logic, and the resulting structure is inflexible. Moreover, the client 250 needs to keep running until servers 260, 265 return a requested result, to avoid processing state inconsistencies.
In a second architecture, three tiers are used, by adding a middle tier as illustrated in figure 10. A thinner Java client 300 is used in the client tier, with no direct connection with databases. The client 300 communicates only with an application server 302 implemented as a BEA Weblogic cluster of workstations running the Linux operating system, using the Weblogic implementation of Message Oriented Middleware 304.
In the application server 302, Message Driven Enterprise Java Beans 306 (MDEJB) are used to handle requests from the client 300 and to send responses, so as to accomplish asynchronous processing. Session beans 308 (SEJB) are used to implement processing logic, and to do so have access to third party Web Services 310, database components 312 and a Script language component 314. Data files resulting from the workflow elements of figure 2 as implemented by the Session Beans 308, Script language component 314 and Web Services 310 are stored in a dedicated Oracle database 316, rather than as operating system files. Since all the data processing occurs in the middle tier, only relevant GUI data is communicated with the client 300.
One advantage of the second architecture is that the dedicated database 316 enables more efficient data querying, extraction and storage. Another advantage is that calculations and processing required by workflow elements can be implemented in the middle tier so as to operate asynchronously with the associated GUI elements implemented in the client tier. Requests, for example for a large library enumeration, can be made, following which the requesting client can close down and restart and still receive the results of the processing.
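The asynchronous behaviour described above, in which a client submits a request, shuts down, and later collects the result, can be sketched with a persistent job table. This is only an illustration of the idea: a file-backed SQLite database stands in for both the message-oriented middleware and the dedicated database, and the table layout and the `submit`, `process_pending` and `fetch` functions are hypothetical.

```python
import json
import os
import sqlite3
import tempfile


def connect(path):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS jobs ("
               "id INTEGER PRIMARY KEY, request TEXT, result TEXT)")
    return db


def submit(db, request):
    """Client side: record a request and return its job id immediately."""
    cur = db.execute("INSERT INTO jobs (request) VALUES (?)",
                     (json.dumps(request),))
    db.commit()
    return cur.lastrowid


def process_pending(db):
    """Middle-tier side: compute results for jobs that have none yet."""
    rows = db.execute(
        "SELECT id, request FROM jobs WHERE result IS NULL").fetchall()
    for job_id, request in rows:
        req = json.loads(request)
        # Placeholder for a real enumeration: just count the combinations.
        result = {"n_products": req["n_amines"] * req["n_acid_chlorides"]}
        db.execute("UPDATE jobs SET result = ? WHERE id = ?",
                   (json.dumps(result), job_id))
    db.commit()


def fetch(db, job_id):
    """Client side: return the stored result, or None if still pending."""
    row = db.execute("SELECT result FROM jobs WHERE id = ?",
                     (job_id,)).fetchone()
    return json.loads(row[0]) if row and row[0] else None


path = os.path.join(tempfile.mkdtemp(), "jobs.db")

client = connect(path)
job_id = submit(client, {"n_amines": 1000, "n_acid_chlorides": 100})
client.close()                 # the client shuts down

server = connect(path)
process_pending(server)        # work continues without the client
server.close()

client = connect(path)         # the client restarts later
print(fetch(client, job_id))
client.close()
```

Because the request and its result live in the database rather than in the client's memory, the client holds no processing state between submission and retrieval.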
The foregoing description of preferred embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. For example, a range of other reagent filtering and library reduction elements may be provided. The described method and system may be adapted for design of chemical libraries for a variety of uses other than pharmacological, such as for chemicals for use in agriculture (fertilizers, herbicides, fungicides, pesticides), polymer production, biopolymers for medical engineering, drug delivery and so on, and any other situation in which synthesis reactions are used to produce chemical entities from building blocks. Accordingly, it is intended that the scope of the invention be defined by the following claims and their equivalents.
Claims
1. A chemical library design apparatus comprising: a reagent search element adapted to extract from one or more reagent databases, according to user defined search criteria, data defining a set of reagents; one or more reagent filtering elements each adapted to form from data defining a set or subset of reagents, according to user defined filtering criteria, data defining a filtered subset of said reagents; and a library enumeration element adapted to construct from one or more of said sets or filtered subsets of reagents, according to user defined reaction criteria, data defining a set of chemical products.
2. The apparatus of claim 1 wherein said user defined search criteria include at least an inclusion criterion defining by structure and/or composition a class of molecules to include in said set of reagents and an exclusion criterion defining by structure and/or composition a class of molecules to exclude from said set of reagents.
3. The apparatus of claim 2 wherein said user defined search criteria further include a molecular weight criterion defining by molecular weight a class of molecules to exclude from said set of reagents.
4. The apparatus of claim 1 wherein said reagent filtering elements include filtering elements arranged to automatically exclude, from said filtered subset of reagents, reagents comprising one or more of the following: selected metal atoms; selected chemical fragments; selected toxicophores; and isotopes.
5. The apparatus of claim 1 wherein said reagent filtering elements include a chemistry cleaning filtering element operable to automatically strip salt parts from reagents and to include, in said filtered subset of reagents, reagents having had their salt parts so stripped.
6. The apparatus of claim 1 wherein said reagent filtering elements include a physico-chemical properties filtering element arranged to automatically exclude, from said filtered subset of reagents, reagents having physico-chemical properties falling outside a user-defined class of physico-chemical properties.
7. The apparatus of claim 1 wherein said reagent filtering elements include a manual-picking filtering element adapted to enable a user to exclude, from said filtered subset of reagents, reagents specifically selected by the user.
8. The apparatus of claim 1 wherein said reagent filtering elements include a diversity analysis filtering element arranged to automatically exclude, from said filtered subset of reagents, reagents rejected according to user defined diversity criteria so as to maintain the diversity of the filtered subset.
9. The apparatus of claim 1 wherein said reagent filtering elements include a duplicate checking filtering element arranged to construct said filtered subset so as to contain only a single instance of each reagent.
10. The apparatus of claim 1 wherein one or more of said reagent filtering elements is adapted to form, from said data defining a set or subset of reagents, data defining a rejected subset of reagents not included in said filtered subset.
11. The apparatus of claim 1 wherein said library enumeration element provides an automatic enumeration function which attempts to combine all provided reagents according to the user defined reaction criteria.
12. The apparatus of claim 11 wherein said library enumeration element operating according to said automatic enumeration function further forms data defining single reagents or classes of two or more reagents which could not be combined according to the user defined reaction scheme.
13. The apparatus of claim 1 wherein said library enumeration element provides a manual enumeration function which attempts to combine provided reagents only according to user defined reagent combination criteria.
14. The apparatus of claim 1 further comprising one or more library reduction elements, each adapted to select from data defining a set or subset of chemical products, according to user defined reduction criteria, data defining a subset of said products.
15. The apparatus of claim 14 wherein said library reduction elements include reduction elements arranged to automatically exclude, from said subset of products, products comprising one or more of the following: selected metal atoms; selected chemical fragments; selected toxicophores; and isotopes.
16. The apparatus of claim 14 wherein said library reduction elements include a chemistry cleaning element operable to automatically strip salt parts from products and to include, in said subset of products, products having had their salt parts so stripped.
17. The apparatus of claim 14 wherein said library reduction elements include a physico-chemical properties reduction element arranged to automatically exclude, from said reduced subset of products, products having physico-chemical properties falling outside a user-defined class of physico-chemical properties.
18. The apparatus of claim 14 wherein said library reduction elements include one or more virtual screening reduction elements arranged to automatically exclude, from said reduced subset of products, products falling outside classes arising from user defined constraints applied to one or more of: an ADME/TOX model; a pharmacophore model; and a protein docking model.
19. The apparatus of claim 1 further comprising a reagent listing element adapted to construct a required reagent dataset defining those reagents required to synthesise a particular set or subset of chemical products.
20. The apparatus of claim 1 wherein said elements are accessible from a common graphical user interface.
21. The apparatus of claim 1 wherein said set of reagents, said filtered subset of reagents and said set of chemical products are stored in a database.
22. The apparatus of claim 1 wherein said apparatus comprises: a client layer providing graphical user interface elements to enable user interaction with said reagent search element, said reagent filtering elements and said library enumeration element; a middle layer comprising processing logic for putting into effect the extraction of said reagent search element, the forming of said reagent filtering elements and the constructing of said library enumeration element; and a data layer providing storage of said set of reagents, said subsets of reagents and said sets of chemical products.
23. An integrated work flow tool for assisting a chemist in designing a combinatorial or non-combinatorial chemical library, comprising: a reagent search element for finding suitable reagents according to user defined search criteria; a reagent filtering element for optimizing the set of reagents found using the search element, according to user defined filtering criteria; and a library enumeration element for calculating chemical products obtainable from said optimized set of reagents according to user defined reaction criteria.
24. The workflow tool of claim 23 further comprising a library reduction element for reducing the number of chemical products according to user defined reduction criteria.
25. A computer readable medium comprising computer program code for providing a chemical library design tool, the program code including: a reagent search element adapted to extract from one or more reagent databases, according to user defined search criteria, data defining a set of reagents; one or more reagent filtering elements each adapted to form from data defining a set or subset of reagents, according to user defined filtering criteria, data defining a filtered subset of said reagents; and a library enumeration element adapted to construct from one or more of said sets or filtered subsets of reagents, according to user defined reaction criteria, data defining a set of chemical products.
26. The medium of claim 25 further comprising one or more library reduction elements, each adapted to select from data defining a set or subset of chemical products, according to user defined reduction criteria, data defining a subset of said products.
27. The medium of claim 25 further comprising a reagent listing element adapted to construct a required reagent dataset defining those reagents required to synthesise a particular set or subset of chemical products.
28. A method of providing a workflow tool for assisting a chemist in designing a chemical library, comprising: providing a reagent search facility for finding suitable reagents according to user defined search criteria; providing a reagent filtering facility for optimizing the set of reagents found, according to user defined filtering criteria; providing a library enumeration facility for calculating chemical products obtainable from said optimized set of reagents, according to user defined reaction criteria; and providing a graphical user interface giving direct access to the reagent search facility, reagent filtering facility and library enumeration facility.
29. A method according to claim 28 further comprising providing a library reduction facility for optimizing the chemical products calculated, according to user defined reduction criteria.
30. A method according to claim 29 further comprising providing, in the graphical user interface, a common display and common controls of a plurality of data sets, said data sets including reagent data sets generated by the reagent search and filtering facilities, product data sets generated by the library enumeration and reduction facilities, and constraint data sets defining filtering criteria and reduction criteria.
31. A method according to claim 30 further comprising providing, in the graphical user interface, drag and drop control of said data sets between said common display and said facilities.
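The elements recited in the claims above (reagent search, reagent filtering, library enumeration, library reduction and reagent listing) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Reagent` and `Product` classes, the function names, the example data and the use of summed molecular weight as a criterion are all assumptions made for illustration, and enumeration is modelled simply as the Cartesian product of two reagent sets.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable


@dataclass(frozen=True)
class Reagent:
    reagent_id: str
    reagent_class: str  # hypothetical label, e.g. "amine" or "acid"
    mol_weight: float


@dataclass(frozen=True)
class Product:
    reagents: tuple[Reagent, ...]

    @property
    def mol_weight(self) -> float:
        # Toy estimate: sum of reagent weights, ignoring any leaving group.
        return sum(r.mol_weight for r in self.reagents)


def search(database: Iterable[Reagent],
           criterion: Callable[[Reagent], bool]) -> list[Reagent]:
    """Reagent search element: extract reagents matching a search criterion."""
    return [r for r in database if criterion(r)]


def filter_reagents(reagents: Iterable[Reagent],
                    criterion: Callable[[Reagent], bool]) -> list[Reagent]:
    """Reagent filtering element: keep the subset passing a filtering criterion."""
    return [r for r in reagents if criterion(r)]


def enumerate_library(*reagent_sets: Iterable[Reagent]) -> list[Product]:
    """Library enumeration element: one product per combination of reagents."""
    return [Product(combo) for combo in product(*reagent_sets)]


def reduce_library(products: Iterable[Product],
                   criterion: Callable[[Product], bool]) -> list[Product]:
    """Library reduction element: select products passing a reduction criterion."""
    return [p for p in products if criterion(p)]


def required_reagents(products: Iterable[Product]) -> list[str]:
    """Reagent listing element: ids of reagents needed to make the products."""
    return sorted({r.reagent_id for p in products for r in p.reagents})


# Invented example data standing in for a reagent database.
DATABASE = [
    Reagent("A1", "amine", 45.08),
    Reagent("A2", "amine", 310.00),
    Reagent("C1", "acid", 60.05),
    Reagent("C2", "acid", 122.12),
    Reagent("X1", "halide", 78.50),
]

amines = filter_reagents(search(DATABASE, lambda r: r.reagent_class == "amine"),
                         lambda r: r.mol_weight <= 200)
acids = filter_reagents(search(DATABASE, lambda r: r.reagent_class == "acid"),
                        lambda r: r.mol_weight <= 200)
library = enumerate_library(amines, acids)
reduced = reduce_library(library, lambda p: p.mol_weight <= 150)
shopping_list = required_reagents(reduced)
```

Each stage takes a data set and a user-supplied criterion and returns a new data set, so the stages can be chained or repeated in any order, which mirrors the way the claims allow filtering and reduction elements to operate on "a set or subset".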
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EPPCT/EP03/11249 | 2003-10-09 | | |
| EP0311249 | 2003-10-09 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2005038689A1 (en) | 2005-04-28 |
Family
ID=34442838
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2004/052438 WO2005038689A1 (en) | 2003-10-09 | 2004-10-05 | Apparatus and method for chemical library design |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2005038689A1 (en) |
- 2004-10-05: WO application PCT/EP2004/052438 (WO2005038689A1), status: active, Application Filing
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6377895B1 (en) * | 1996-02-26 | 2002-04-23 | Pharmacopeia, Inc. | Method for planning the generation of combinatorial chemistry libraries |
| US20020077754A1 (en) * | 1998-10-28 | 2002-06-20 | Malcolm J. Mcgregor | Pharmacophore fingerprinting in primary library design |
Non-Patent Citations (5)
| Title |
|---|
| DARVAS F ET AL: "Diversity measures for enhancing ADME admissibility of combinatorial libraries.", JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES. 2000 MAR-APR, vol. 40, no. 2, March 2000 (2000-03-01), pages 314 - 322, XP002294631, ISSN: 0095-2338 * |
| GOBBI ALBERTO ET AL: "Developing an in-house system to support combinatorial chemistry", PERSPECTIVES IN DRUG DISCOVERY AND DESIGN, vol. 7-8, no. 0, December 1997 (1997-12-01), pages 131 - 158, XP008034345, ISSN: 0928-2866 * |
| LEACH A R ET AL: "Implementation of a system for reagent selection and library enumeration, profiling, and design.", JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES. 1999 NOV-DEC, vol. 39, no. 6, November 1999 (1999-11-01), pages 1161 - 1172, XP002294632, ISSN: 0095-2338 * |
| LEACH A R ET AL: "The in silico world of virtual libraries", DRUG DISCOVERY TODAY, ELSEVIER SCIENCE LTD, GB, vol. 5, no. 8, August 2000 (2000-08-01), pages 326 - 336, XP000952162, ISSN: 1359-6446 * |
| OPREA T I ET AL: "Chemical information management in drug discovery: optimizing the computational and combinatorial chemistry interfaces", JOURNAL OF MOLECULAR GRAPHICS AND MODELLING, ELSEVIER SCIENCE, NEW YORK, NY, US, vol. 18, no. 4/5, August 2000 (2000-08-01), pages 512 - 524, XP001059443, ISSN: 1093-3263 * |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2015021912A1 (en) * | 2013-08-15 | 2015-02-19 | International Business Machines Corporation | Incrementally retrieving data for objects to provide a desired level of detail |
| US10223401B2 (en) | 2013-08-15 | 2019-03-05 | International Business Machines Corporation | Incrementally retrieving data for objects to provide a desired level of detail |
| US10445310B2 (en) | 2013-08-15 | 2019-10-15 | International Business Machines Corporation | Utilization of a concept to obtain data of specific interest to a user from one or more data storage locations |
| US10515069B2 (en) | 2013-08-15 | 2019-12-24 | International Business Machines Corporation | Utilization of a concept to obtain data of specific interest to a user from one or more data storage locations |
| US10521416B2 (en) | 2013-08-15 | 2019-12-31 | International Business Machines Corporation | Incrementally retrieving data for objects to provide a desired level of detail |
| US9767222B2 (en) | 2013-09-27 | 2017-09-19 | International Business Machines Corporation | Information sets for data management |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US6721754B1 (en) | System and method for database similarity join | |
| Angello et al. | Closed-loop optimization of general reaction conditions for heteroaryl Suzuki-Miyaura coupling | |
| Engkvist et al. | Computational prediction of chemical reactions: current status and outlook | |
| Warr | A short review of chemical reaction database systems, computer‐aided synthesis design, reaction prediction and synthetic feasibility | |
| US6434542B1 (en) | Statistical deconvoluting of mixtures | |
| WO2002099725A1 (en) | Systems, methods and computer program products for integrating biological/chemical databases to create an ontology network | |
| US20050177280A1 (en) | Methods and systems for discovery of chemical compounds and their syntheses | |
| US7991730B2 (en) | Methods for similarity searching of chemical reactions | |
| US7127458B1 (en) | Matching and cleansing of part data | |
| Dubovenko et al. | Functional analysis of OMICs data and small molecule compounds in an integrated “knowledge-based” platform | |
| Willett | From chemical documentation to chemoinformatics: 50 years of chemical information science | |
| Schuffenhauer et al. | Molecular diversity management strategies for building and enhancement of diverse and focused lead discovery compound screening collections | |
| WO2011123712A2 (en) | Systems and methods for entity registration and management | |
| CN103562905A (en) | Improved data visualization configuration system and method | |
| Ertl et al. | Web-based cheminformatics tools deployed via corporate Intranets | |
| Cheng et al. | Four association coefficients for relating molecular similarity measures | |
| Bleicher et al. | Enhanced utility of AI/ML methods during lead optimization by inclusion of 3D ligand information | |
| WO2005038689A1 (en) | Apparatus and method for chemical library design | |
| Kozlowski et al. | Computer-aided design of chiral ligands: Part I. Database search methods to identify chiral ligand types for asymmetric reactions | |
| AU2022241571A1 (en) | Methods and systems for generating workflows for analysing large data sets | |
| Yasri et al. | REALISIS: a medicinal chemistry-oriented reagent selection, library design, and profiling platform | |
| US20110029493A1 (en) | Generic Method To Build Catalogs For Change Management | |
| Lushington et al. | Chemical informatics and the drug discovery knowledge pyramid | |
| Liang et al. | MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools | |
| US8103610B2 (en) | Dynamic categorization of rules in expert systems wherein a profile definition yields classification data that classifies rules and allows for rules to be searchable | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | 122 | EP: PCT application non-entry in European phase | |