WO2002093297A2 - Method and system for planning, executing, and evaluating high-throughput screening of multicomponent chemical compositions and solid forms of compounds - Google Patents

Method and system for planning, executing, and evaluating high-throughput screening of multicomponent chemical compositions and solid forms of compounds

Info

Publication number
WO2002093297A2
Authority
WO
WIPO (PCT)
Prior art keywords
compound
model
throughput
solid
screening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2002/014601
Other languages
English (en)
Other versions
WO2002093297A3 (fr
Inventor
Douglas Levinson
Christopher Mcnulty
Christopher Moore
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transform Pharmaceuticals Inc
Original Assignee
Transform Pharmaceuticals Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transform Pharmaceuticals Inc
Priority to AU2002303683A (publication AU2002303683A1)
Priority to IL15878702A (publication IL158787A0)
Priority to CA002447047A (publication CA2447047A1)
Priority to EP02731725A (publication EP1395808A4)
Publication of WO2002093297A2
Publication of WO2002093297A3

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G16C20/64 Screening of libraries
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Definitions

  • the present invention relates to the field of computerized data processing of experimental data relating to chemical compounds or compositions and formulations and solid forms of chemical compounds or compositions.
  • solubility, bioavailability, shelf-life, usability, taste and many other properties of the chemical product may vary in a complex way with the formulation due to interactions among the active agent and the excipients that make up the chemical product, and the particular use or administration method, thereof.
  • properties of the solid form of an ingredient such as its crystal habit and morphology, can significantly affect properties such as stability, bioavailability, and industrial processing. Selection of optimal formulations and solid form can therefore significantly alter the performance of pharmaceuticals and other chemical products. Dietary supplements, alternative medicines, nutraceuticals, sensory compounds, agrochemicals, and consumer and industrial formulations, also can benefit from reformulation and new solid forms.
  • NORVIR brand ritonavir was introduced in 1996 as a semisolid capsule formulation and as a liquid formulation.
  • many lots of the Norvir capsules started to fail dissolution testing, because a large portion of the active pharmaceutical ingredient (ritonavir) was precipitating out of the semisolid formulated product.
  • a previously unknown crystal polymorph, called Form II, was in the precipitates.
  • Form II is more thermodynamically stable than Form I, and has a much lower solubility in the solvents used to formulate the NORVIR product, so that the formulation was very supersaturated with respect to Form II.
  • Form II continued to be produced and to precipitate out during the manufacturing process to the point where all attempts to formulate the semisolid capsules were unsuccessful. This quickly caused a shortage of the product and resulted in a marketing crisis for Abbott. In addition, while attempting to address the problem, Abbott encountered the further problem that its methods for synthesizing ritonavir could no longer produce Form I, either at the bench scale or in bulk drug manufacturing, as all attempts to synthesize Form I resulted in production of Form II.
  • Process impurities and degradants that are formed during the manufacture of a particular compound can profoundly impact the crystallization of that compound, e.g., by inhibiting nucleation or crystal growth.
  • Such process impurities or degradants may resemble the compound-of-interest, and be selectively absorbed by a crystal nucleus of the compound, possibly functioning as a potent growth inhibitor. Inhibition of a desired polymorph may result in nucleation and growth of an undesired polymorph, as thought to be the case for ritonavir.
  • excipients are currently available for designing pharmaceutical compositions.
  • a search for an optimum combination of excipients and active agents for even a relatively simple pharmaceutical composition has been unfeasible in the past. Not only does one need to determine which of those excipients would be compatible with the active agent, but one has to determine the optimum values for such parameters as pH and relative concentrations of the components.
  • conventional formulation techniques have generally been a search for an adequate formulation in an adequate time period, rather than a search for an optimum or near-optimum among significant numbers of adequate formulations.
  • new active agents are often "force fitted" into standard formulation recipes that are modified as little as possible to result in an adequate formulation.
  • a molecular descriptor as used herein is an empirical or theoretical datum that may be used in a quantitative structure-activity or structure-property relationship to predict molecular properties in complex environments.
  • For a discussion of molecular descriptors, see Karelson, Molecular Descriptors in QSAR/QSPR, John Wiley & Sons, Inc. (2000), which is incorporated herein by reference. Many categories of compounds, such as pharmaceutical excipients, have been characterized based on a large number of molecular descriptors. Commercial and noncommercial databases of such characterizations are often available. Typically the molecular descriptors relevant to a desired property or properties are a small fraction of those that are measurable, calculable, or known. Moreover, the relationship between the relevant molecular descriptors and the desired property or properties often cannot be easily determined.
  • the present invention comprises a method for determining a formulation of a pharmaceutical, comprising the steps of: performing high-throughput formulation screening of the pharmaceutical; computing an optimization algorithm to select a plurality of molecular descriptors and a model accepting the molecular descriptors as parameters to optimize the predictive power of the model; determining the formulation of the pharmaceutical.
  • the present invention comprises a method for generating a plurality of solid forms of a pharmaceutical, comprising the steps of: performing high-throughput solid-form screening of the pharmaceutical; computing an optimization algorithm to select a plurality of molecular descriptors and a model accepting the molecular descriptors as parameters to optimize the predictive power of the model; determining the formulation of the pharmaceutical.
  • the foregoing methods may further comprise the steps of: generating values of experimental parameters using the model; performing high-throughput screening using the generated values; comparing the high-throughput experimental results with the results predicted by the model; adjusting the model based on the high-throughput experimental results.
  • the generated values are preferably targeted to find an extremum of an expected property of an experiment, boundaries between solid forms, regions in which desired properties of formulations change rapidly with respect to changes in experimental parameters, regions in which desired properties of formulations change slowly with respect to changes in experimental parameters, or regions of ambiguity or low confidence in classification or regression results.
  • the predictive power is determined with respect to an extremum of an expected property of an experiment, with respect to boundaries between solid forms, with respect to regions in which desired properties of formulations or solid forms change rapidly with respect to changes in experimental parameters, or with respect to one or more regions within class boundaries.
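One possible reading of "select a plurality of molecular descriptors and a model ... to optimize the predictive power of the model" is a subset search scored by cross-validation. The sketch below illustrates that reading only; it is not the method claimed in the source. It assumes a 1-nearest-neighbour predictor, leave-one-out mean absolute error as the measure of predictive power, and an exhaustive search over small subsets. The function names and the data are hypothetical.

```python
from itertools import combinations

def loo_error(rows, targets, cols):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour predictor
    that only looks at the descriptor columns listed in `cols`."""
    total = 0.0
    for i, row in enumerate(rows):
        best_dist, best_target = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # hold out the sample being predicted
            dist = sum((row[c] - other[c]) ** 2 for c in cols)
            if best_dist is None or dist < best_dist:
                best_dist, best_target = dist, targets[j]
        total += abs(best_target - targets[i])
    return total / len(rows)

def select_descriptors(rows, targets, max_size=2):
    """Score every descriptor subset of size 1..max_size and return
    (best_subset, best_error). Ties keep the first subset found."""
    n_cols = len(rows[0])
    best = None
    for size in range(1, max_size + 1):
        for cols in combinations(range(n_cols), size):
            err = loo_error(rows, targets, cols)
            if best is None or err < best[1]:
                best = (cols, err)
    return best

# Hypothetical data: descriptor 0 tracks the measured property,
# descriptor 1 is unrelated to it.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0), (5.0, 3.0)]
targets = [10.0, 20.0, 30.0, 40.0, 50.0]
best_cols, best_err = select_descriptors(rows, targets)
print(best_cols, best_err)
```

Exhaustive search is only practical for a handful of descriptors; with the large descriptor sets the text mentions, a greedy or stochastic search would be the more realistic choice.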
  • an approximately maximally diverse set of values of experimental parameters for high-throughput screening is generated using a diversification algorithm and a metric for measuring diversification.
  • a set of values of experimental parameters for high-throughput screening is generated based on a structure-activity model.
  • the present invention comprises a method for selecting a compound for further testing, comprising the steps of: receiving information of a plurality of compounds; performing high-throughput solid-form screening of at least one of the plurality of compounds to identify at least one solid-form; based on the at least one property of each identified solid-form, selecting at least one of the plurality of compounds for further testing.
  • the present invention comprises a method for selecting a compound for further testing, comprising the steps of: receiving information of a plurality of compounds; performing high-throughput formulation screening on at least one of the plurality of compounds; based on at least one tested property, selecting at least one of the plurality of compounds for further testing.
  • the present invention comprises a method for selecting a solid form of a compound for further testing, comprising the steps of: receiving information of a compound; performing high-throughput solid-form screening to identify at least two solid forms of the compound; based on the results of the high-throughput solid-form screening, selecting a solid form of the compound for further testing.
  • the present invention comprises a method for selecting a formulation of a compound for further testing, comprising the steps of: receiving information of a compound; performing high-throughput formulation screening of the compound; based on the results of the high-throughput formulation screening, selecting a formulation of the compound for further testing.
  • the present invention comprises a method for determining whether to further test at least one compound, comprising the steps of: receiving information of the at least one compound; performing high-throughput formulation screening of the at least one compound; based on at least one tested property, determining whether to further test the at least one compound.
  • the invention comprises a method for determining whether to further test at least one compound, comprising the steps of: receiving information of the at least one compound; performing high-throughput solid-form screening of the at least one compound; based on at least one tested property, determining whether to further test the at least one compound.
  • the foregoing methods may preferably further comprise the step of: based on the results of the high-throughput screening, generating a model to estimate at least one property of the compound.
  • the methods of the present invention further comprise the steps of: based on the results of the high-throughput screening, generating a classifier to assign each solid form to a class.
  • the classes may correspond to a variety of solid forms of a compound.
  • the methods further comprise the step of applying at least one unsupervised learning or clustering algorithm to at least a subset of the results of the high-throughput screening.
  • a variety of unsupervised or clustering algorithms may be used as described below.
  • the methods may be used to prioritize testing. They may also be used to select a solid form based on high-throughput formulation testing. These and other embodiments are described below.
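The clustering step can be pictured with a small example. The sketch below assumes that each screening result has been reduced to a set of binned spectral peak positions, that similarity is measured with the Tanimoto coefficient (the figure list in the source mentions a Tanimoto matrix), and that groups are formed by single linkage at a fixed threshold. The threshold, the peak values, and the function names are hypothetical, and the source may use different algorithms.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of peak positions."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def cluster(fingerprints, threshold=0.5):
    """Single-linkage grouping: two samples share a label when a chain of
    pairwise similarities >= threshold connects them."""
    labels = list(range(len(fingerprints)))  # every sample starts alone
    for i in range(len(fingerprints)):
        for j in range(i + 1, len(fingerprints)):
            if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
                old, new = labels[j], labels[i]
                # merge the two groups by relabelling every member of `old`
                labels = [new if lab == old else lab for lab in labels]
    return labels

# Hypothetical binned peak positions for four samples
spectra = [
    {100, 200, 300},
    {100, 200, 310},
    {500, 600},
    {500, 600, 700},
]
labels = cluster(spectra)
print(labels)
```

Samples that share a label are candidates for the same solid form. A supervised classifier, as described in the text, could then be trained on the labelled groups.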
  • Fig. 1 is a schematic illustration of one example preferred embodiment.
  • Fig. 2 is an illustration of a display of a high-dimensional visualization in which the experimental results are represented as points of varying size, in a representation of a projection of a multidimensional space.
  • Fig. 3 is an illustration of an identification of certain groups of experimental results as exhibiting measured results of interest.
  • Fig. 4 is an illustration of additional data points corresponding to distinct experiments to characterize a formulation at higher resolution near results of interest.
  • Fig. 5 schematically illustrates a preferred method to assess a first collection of experimental results in a search for novel or known solid forms.
  • Fig. 6 schematically illustrates an architecture of a preferred example embodiment.
  • Fig. 7 depicts an example multivariate display.
  • Fig. 8 schematically depicts a preferred method to plan and assess experiments.
  • Fig. 9 schematically depicts a preferred method to plan and assess experiments.
  • Fig. 10 schematically depicts stages of clustering spectra.
  • Fig. 11 depicts an example filtered and unfiltered Raman spectrum.
  • Fig. 12 depicts a multivariate display of a dendrogram-sorted Tanimoto matrix.
  • Fig. 13 schematically depicts a simplified set of stages of pharmaceutical compound development and a corresponding qualitative indication of the reduction in the number of compounds at each stage in the process.
  • Fig. 14 schematically depicts a simplified overview of the pharmaceutical research and development process.
  • Fig. 15 schematically depicts a simplified overview of the pharmaceutical research and development process.
  • the present invention provides a system and associated methods for chemical knowledge acquisition through data acquisition, retrieval, and mining technologies, methods for applying the system and associated methods to assess whether a compound has properties suitable for commercial use, and for directing research and development expenditures towards compounds more likely to prove suitable for commercial uses, and away from compounds having properties that make commercial uses more difficult or impossible.
  • Substances such as pharmaceutical compounds can assume many different crystal forms and sizes. The pharmaceutical industry has placed particular emphasis on these crystal characteristics (especially polymorphic form, crystal size, crystal habit, and crystal-size distribution), since crystal structure and size can affect manufacturing, formulation, and pharmacokinetics, including bioavailability.
  • composition refers to whether the solid-form is a single compound or a mixture of compounds.
  • solid-forms can be present in their neutral form, e.g., the free base of a compound having a basic nitrogen, or as a salt, e.g., the hydrochloride salt of a basic nitrogen-containing compound.
  • Composition also refers to crystals containing adduct molecules. During crystallization or precipitation an adduct molecule (e.g., a solvent or water) can be incorporated into the matrix, adsorbed on the surface, or trapped within the particle or crystal.
  • Examples include hydrates (water molecule incorporated in the matrix) and solvates (solvent trapped within a matrix). Whether a crystal forms as a hydrate or solvate can have a profound effect on the properties, such as the bioavailability or ease of processing or manufacture of a pharmaceutical. For example, hydrates or solvates may dissolve more or less readily, have different mechanical properties or strength, or have different physical and/or chemical stability than the corresponding non-hydrated or non-solvated compounds.
  • a crystal habit refers to the external shape that a crystal assumes upon crystallization, which depends on, among other things, the composition of the crystallizing medium. Those shapes may be cubic, tetragonal, orthorhombic, monoclinic, triclinic, rhomboidal, or hexagonal. Such information is important because the crystal habit has a large influence on the crystal's surface-to-volume ratio. Although a single crystal polymorph may have different crystal habits, each having the same internal structure and thus the same single crystal- and powder-diffraction patterns, different crystal habits can exhibit different pharmaceutical properties (Haleblian 1975, J. Pharm. Sci., 64:1269). Thus, discovering conditions or excipients that affect crystal habit is needed.
  • Polymorphism refers to the phenomenon in which a compound crystallizes into more than one distinct crystalline species (i.e., having a different internal structure) or shifts from one crystalline species to another.
  • the distinct species, which are known as polymorphs, can exhibit different optical properties, melting points, solubilities, chemical reactivities, physical stability, dissolution rates, and different bioavailabilities.
  • different polymorphs of the same pharmaceutical can have different pharmacokinetics; for example, different polymorphs may give rise to different levels of absorption of the compound.
  • only one polymorphic form of a given pharmaceutical may have solubility, bioavailability or other properties suitable for disease treatment.
  • discovering novel or beneficial polymorphs is extremely important, especially in the pharmaceutical area.
  • Amorphous solids cannot be characterized according to habit or polymorphic form.
  • An amorphous solid is in a high-energy structural state relative to its crystalline form, which can give rise to instability problems. It may crystallize during storage or shipping, or it may be more sensitive to oxidation (Pikal et al., 1997, J. Pharm. Sci. 66:1312).
  • a common amorphous solid is glass in which the atoms and molecules exist in a nonuniform array.
  • Amorphous solids are usually the result of rapid solidification and can be conveniently identified (but not characterized) by x-ray powder diffraction, since these solids give very diffuse lines or no crystal diffraction pattern.
  • Crystals are normally obtained by dissolving a compound in a suitable solvent and then adjusting the conditions to induce crystal growth.
  • the crystallization process commonly involves dissolving the compound to saturation and then lowering the temperature. Upon cooling, the solution becomes supersaturated which often leads to the appearance of the crystals.
  • crystal formation is induced by mechanically disturbing the solution, such as by scratching the inner surface of the solution container, or by seeding the solution with dust or crystals of the same compound.
  • the pH, rate of cooling, type of solvent, solute-solvent ratio, additives such as surfactants, and inhibitors not only affect the purity of the crystals that form, but they may affect the crystal habit or polymorph that predominates.
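The cooling step described above can be quantified with the supersaturation ratio, the dissolved concentration divided by the equilibrium solubility at the current temperature. The short sketch below assumes that definition. The solubility values are invented for illustration and do not describe any real compound.

```python
def supersaturation_ratio(concentration, solubility):
    """Ratio of dissolved concentration to equilibrium solubility.
    A value above 1 means the solution is supersaturated."""
    if solubility <= 0:
        raise ValueError("solubility must be positive")
    return concentration / solubility

# Hypothetical solubility (mg/mL) at three temperatures (degrees C)
hypothetical_solubility = {60: 50.0, 40: 25.0, 20: 10.0}

# Dissolve to saturation at 60 degrees C, then cool without losing solute.
concentration = hypothetical_solubility[60]
ratios = {
    temp: supersaturation_ratio(concentration, sol)
    for temp, sol in hypothetical_solubility.items()
}
print(ratios)
```

The same ratio explains the ritonavir case earlier in the text: a formulation that is only saturated with respect to one polymorph can be strongly supersaturated with respect to a less soluble one.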
  • the term "array" means a plurality of samples, preferably at least 24 samples, each sample comprising a compound-of-interest and at least one component, wherein: (a) an amount of the compound-of-interest in each sample is less than about 100 milligrams; and (b) at least one of the samples comprises a solid-form of the compound-of-interest.
  • An array can comprise 2 or more samples, for example, 24, 36, 48, 96, or more samples, preferably 1000 or more samples, more preferably, 10,000 or more samples.
  • An array can comprise one or more groups of samples also known as sub-arrays.
  • a group can be a 96-vessel plate of sample vessels (such as sample tubes) or a 96-well plate of sample wells in an array consisting of 100 or more plates.
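To make the array and sub-array bookkeeping concrete, the sketch below maps a running sample index to a plate number and well label and counts the plates needed for a given number of samples. It assumes the conventional 8 x 12 layout of a 96-well plate with row letters A to H, which the source does not state, and the function names are hypothetical.

```python
def well_position(sample_index, rows=8, cols=12):
    """Map a zero-based sample index to (plate number, well label),
    filling each plate row by row. Assumes rows <= 26 for the letters."""
    per_plate = rows * cols
    plate, offset = divmod(sample_index, per_plate)
    row, col = divmod(offset, cols)
    return plate, f"{chr(ord('A') + row)}{col + 1}"

def plates_needed(n_samples, per_plate=96):
    """Number of plates required to hold n_samples (ceiling division)."""
    return -(-n_samples // per_plate)

print(well_position(0))
print(well_position(95))
print(well_position(96))
print(plates_needed(10_000))
```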
  • Each sample or selected samples or each sample group of selected sample groups in the array can be subjected to the same or different processing parameters; each sample or sample group can have different components or concentrations of components; or both to induce, inhibit, prevent, or reverse formation of solid-forms of the compound-of-interest.
  • Arrays can be prepared by preparing a plurality of samples, each sample comprising a compound-of-interest and one or more components, then processing the samples to induce, inhibit, prevent, or reverse formation of solid-forms of the compound-of-interest.
  • sample means a mixture of a compound-of-interest and one or more additional components to be subjected to various processing parameters and then screened to detect the presence or absence of solid-forms, preferably, to detect desired solid-forms with new or enhanced properties.
  • the sample comprises one or more components, preferably, 2 or more components, or 3 or more components.
  • Each additional component adds one or more additional degrees of freedom to the experiment, greatly increasing the number of possible experiments, and in some cases enhancing the ability of the informatics system to perform its functions.
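The growth in the number of possible experiments is multiplicative: each added component or parameter multiplies the design space by its number of levels. The sketch below shows this for a full-factorial design. The parameter names and levels are made up for illustration.

```python
from itertools import product
from math import prod

def count_experiments(levels):
    """Size of the full-factorial design: product of the level counts."""
    return prod(len(values) for values in levels.values())

def enumerate_experiments(levels):
    """Yield each combination of levels as a dict keyed by parameter name."""
    names = list(levels)
    for combo in product(*(levels[name] for name in names)):
        yield dict(zip(names, combo))

# Hypothetical design space
levels = {
    "solvent": ["water", "ethanol", "acetone"],
    "pH": [3, 5, 7, 9],
    "excipient": ["A", "B", "C", "D", "E"],
}
base_count = count_experiments(levels)

# One more degree of freedom doubles the design in this case.
levels_with_cooling = dict(levels, cooling_rate=["slow", "fast"])
extended_count = count_experiments(levels_with_cooling)
print(base_count, extended_count)
```

This is why the text pairs high-throughput screening with model-guided selection of experimental values: exhaustive enumeration quickly exceeds even a 10,000-sample array.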
  • a sample will comprise one compound-of-interest but can comprise multiple compounds-of-interest.
  • a sample comprises less than about 1 g of the compound-of-interest, preferably, less than about 100 mg, more preferably, less than about 25 mg, even more preferably, less than about 1 mg, still more preferably less than about 100 micrograms, and optimally less than about 100 nanograms of the compound-of-interest.
  • the sample has a total volume of less than about 100 to about 250 µl.
  • pharmaceutical means any substance that has a therapeutic, disease preventive, diagnostic, or prophylactic effect when administered to an animal or a human.
  • pharmaceutical includes prescription pharmaceuticals and over the counter pharmaceuticals. Pharmaceuticals suitable for use in the invention include all those known or to be developed.
  • a pharmaceutical can be a large molecule (i.e., a molecule having a molecular weight of greater than about 1000 g/mol), such as oligonucleotides, polynucleotides, oligonucleotide conjugates, polynucleotide conjugates, proteins, peptides, peptidomimetics, or polysaccharides, or a small molecule (i.e., a molecule having a molecular weight of less than about 1000 g/mol), such as hormones, steroids, nucleotides, nucleosides, or amino acids.
  • suitable small molecule pharmaceuticals include, but are not limited to, cardiovascular pharmaceuticals, such as amlodipine, losartan, irbesartan, diltiazem, clopidogrel, digoxin, abciximab, furosemide, amiodarone, beraprost, tocopheryl; anti-infective components, such as amoxicillin, clavulanate, azithromycin, itraconazole, acyclovir, fluconazole, terbinafine, erythromycin, and acetyl sulfisoxazole; psychotherapeutic components, such as sertraline, venlafaxine, bupropion, olanzapine, buspirone, alprazolam, methylphenidate, fluvoxamine, and ergoloid; gastrointestinal products, such as lansoprazole, ranitidine, famotidine, ondansetron, granisetron, sulfas,
  • Suitable veterinary pharmaceuticals include, but are not limited to, vaccines, antibiotics, growth enhancing components, and dewormers. Other examples of suitable veterinary pharmaceuticals are listed in The Merck Veterinary Manual, 8th ed.,
  • dietary supplement means a non-caloric or insignificant-caloric substance administered to an animal or a human to provide a nutritional benefit or a non-caloric or insignificant-caloric substance administered in a food to impart the food with an aesthetic, textural, stabilizing, or nutritional benefit.
  • Dietary supplements include, but are not limited to, fat binders, such as caducean; fish oils; plant extracts, such as garlic and pepper extracts; vitamins and minerals; food additives, such as preservatives, acidulents, anticaking components, antifoaming components, antioxidants, bulking components, coloring components, curing components, dietary fibers, emulsifiers, enzymes, firming components, humectants, leavening components, lubricants, non-nutritive sweeteners, food-grade solvents, thickeners; fat substitutes, and flavor enhancers; and dietary aids, such as appetite suppressants.
  • suitable dietary supplements are listed in (1994) The Encyclopedia of
  • alternative medicine means a substance, preferably a natural substance, such as a herb or an herb extract or concentrate, administered to a subject or a patient for the treatment of disease or for general health or well being, wherein the substance does not require approval by the FDA.
  • suitable alternative medicines include, but are not limited to, ginkgo biloba, ginseng root, valerian root, oak bark, kava kava, echinacea, and harpagophyti radix; others are listed in The Complete German Commission E Monographs: Therapeutic Guide to Herbal Medicine, Mark Blumenthal et al. eds., Integrative Medicine Communications 1998, incorporated by reference herein.
  • nutraceutical means a food or food product having both caloric value and pharmaceutical or therapeutic properties.
  • nutraceuticals include garlic, pepper, brans and fibers, and health drinks. Examples of suitable nutraceuticals are listed in M.C. Linder, ed. Nutritional Biochemistry and Metabolism with Clinical Applications, Elsevier, New York, 1985; Pszczola et al., 1998 Food Technology 52:30-37 and Shukla et al., 1992 Cereal Foods World 37:665-666.
  • the term "sensory-material" means any chemical or substance, known or to be developed, that is used to provide an olfactory or taste effect in a human or an animal, preferably, a fragrance material, a flavor material, or a spice.
  • a sensory-material also includes any chemical or substance used to mask an odor or taste.
  • fragrance materials include, but are not limited to, musk materials, such as civetone, ambrettolide, ethylene brassylate, musk xylene, Tonalide®, and Galaxolide®; amber materials, such as ambrox, ambreinolide, and ambrinol; sandalwood materials, such as α-santalol, β-santalol, Sandalore®, and Bacdanol®; patchouli and woody materials, such as patchouli oil, patchouli alcohol, Timberol® and Polywood®; materials with floral odors, such as Givescone®, damascone, irones, linalool, Lilial®, Lilestralis®, and dihydrojasmonate.
  • fragrance materials for use in the invention are listed in Perfumes: Art, Science, Technology, P.M. Muller ed. Elsevier, New York, 1991, incorporated herein by reference.
  • suitable flavor materials include, but are not limited to, benzaldehyde, anethole, dimethyl sulfide, vanillin, methyl anthranilate, nootkatone, and cinnamyl acetate.
  • suitable spices include, but are not limited to, allspice, tarragon, clove, pepper, sage, thyme, and coriander.
  • suitable flavor materials and spices are listed in Flavor and Fragrance Materials-1989, Allured Publishing Corp.
  • agrochemical means any substance known or to be developed that is used on the farm, yard, or in the house or living area to benefit gardens, crops, ornamental plants, shrubs, or vegetables or kill insects, plants, or fungi.
  • suitable agrochemicals for use in the invention include pesticides, herbicides, fungicides, insect repellants, fertilizers, and growth enhancers.
  • Pesticides include chemicals, compounds, and substances administered to kill vermin such as bugs, mice, and rats and to repel garden pests such as deer and woodchucks.
  • suitable pesticides that can be used according to the invention include, but are not limited to, abamectin (acaricide), bifenthrin (acaricide), cyphenothrin (insecticide), imidacloprid (insecticide), and prallethrin (insecticide).
  • Other examples of suitable pesticides for use in the invention are listed in Crop Protection Chemicals Reference, 6th ed., Chemical and Pharmaceutical Press, John Wiley & Sons Inc., New York, 1990; (1996) The Encyclopedia of Chemical Technology, 18 Kirk-Othmer (4th ed. at 311-341); and Hayes et al., Handbook of Pesticide Toxicology, Academic Press, Inc., San Diego, CA, 1990, all of which are incorporated by reference herein.
  • Herbicides include selective and non-selective chemicals, compounds, and substances administered to kill plants or inhibit plant growth.
  • suitable herbicides include, but are not limited to, photosystem I inhibitors, such as acifluorfen; photosystem II inhibitors, such as atrazine; bleaching herbicides, such as fluridone and difunon; chlorophyll biosynthesis inhibitors, such as DTP, clethodim, sethoxydim, methyl haloxyfop, tralkoxydim, and alachlor; inducers of damage to antioxidative system, such as paraquat; amino-acid and nucleotide biosynthesis inhibitors, such as phaseolotoxin and imazapyr; cell division inhibitors, such as pronamide; and plant growth regulator synthesis and function inhibitors, such as dicamba, chloramben, diclofop, and ancymidol.
  • herbicides are listed in Herbicide Handbook, 6th ed., Weed Science Society of America, Champaign, IL, 1989; (1995) The Encyclopedia of Chemical Technology, 13 Kirk-Othmer (4th ed. at 73-136); and Duke, Handbook of Biologically Active Phytochemicals and Their Activities, CRC Press, Boca Raton, FL, 1992, all of which are incorporated herein by reference.
  • Fungicides include chemicals, compounds, and substances administered to plants and crops that selectively or non-selectively kill fungi.
  • a fungicide can be systemic or non-systemic.
  • Suitable non-systemic fungicides include, but are not limited to, thiocarbamate and thiuram derivatives, such as ferbam, ziram, thiram, and nabam; imides, such as captan, folpet, captafol, and dichlofluanid; aromatic hydrocarbons, such as quintozene, dinocap, and chloroneb; and dicarboximides, such as vinclozolin, chlozolinate, and iprodione.
  • Examples of systemic fungicides include, but are not limited to, mitochondrial respiration inhibitors, such as carboxin, oxycarboxin, flutolanil, fenfuram, mepronil, and methfuroxam; microtubulin polymerization inhibitors, such as thiabendazole, fuberidazole, carbendazim, and benomyl; inhibitors of sterol biosynthesis, such as triforine, fenarimol, nuarimol, imazalil, triadimefon, propiconazole, flusilazole, dodemorph, tridemorph, and fenpropidin; RNA biosynthesis inhibitors, such as ethirimol and dimethirimol; and phospholipid biosynthesis inhibitors, such as ediphenphos and iprobenphos.
  • a "consumer formulation" means a formulation for consumer use, not intended to be absorbed or ingested into the body of a human or animal, comprising an active component. Preferably, it is the active component that is investigated as the compound-of-interest in the arrays and methods of the invention.
  • Consumer formulations include, but are not limited to, cosmetics, such as lotions, facial makeup, antiperspirants and deodorants, shaving products, and nail care products; hair products, such as shampoos, colorants, and conditioners; hand and body soaps; paints; lubricants; adhesives; and detergents and cleaners.
  • an "industrial formulation" means a formulation for industrial use, not intended to be absorbed or ingested into the body of a human or animal, comprising an active component.
  • it is the active component of industrial formulation that is investigated as the compound-of-interest in the arrays and methods of the invention.
  • Industrial formulations include, but are not limited to, polymers; rubbers; plastics; industrial chemicals, such as solvents, bleaching agents, inks, dyes, fire retardants, antifreezes and formulations for deicing roads, cars, trucks, jets, and airplanes; industrial lubricants; industrial adhesives; industrial enzymes; construction materials, such as cements.
  • active and inactive components used in consumer and industrial formulations and set up arrays according to the invention.
  • Such active components and inactive components are well known in the literature and the following references are provided merely by way of example.
  • Active components and inactive components for use in cosmetic formulations are listed in (1993) The Encyclopedia of Chemical Technology, 1 Kirk-Othmer (4th ed. at 572-619); M.G. de Navarre, The Chemistry and Manufacture of Cosmetics, D. Van Nostrand Company, Inc., New York, 1941; CTFA International Cosmetic Ingredient Dictionary and Handbook, 8th Ed., CTFA, Washington, D.C., 2000; and A. Nowak, Cosmetic Preparations, Micelle Press, London, 1991.
  • Active components and inactive components for use in hair care products are listed in (1994) The Encyclopedia of Chemical Technology, 12 Kirk-Othmer (4th ed. at 881-890) and Shampoos and Hair Preparations in ECT 1st ed., Vol. 12, pp. 221-243, by F. E. Wall, both of which are incorporated by reference herein. Active components and inactive components for use in hand and body soaps are listed in (1997) The Encyclopedia of Chemical Technology, 22 Kirk-Othmer (4th ed. at 297-396), incorporated by reference herein. Active components and inactive components for use in paints are listed in (1996) The Encyclopedia of Chemical
  • Active components and inactive components for use with industrial chemicals are listed in Ash et al., Handbook of Industrial Chemical Additives, VCH Publishers, New York 1991, incorporated herein by reference.
  • Active components and inactive components for use in bleaching components are listed in (1992) The Encyclopedia of Chemical Technology, 4 Kirk-Othmer (4th ed. at 271-311), incorporated herein by reference.
  • Active components and inactive components for use in inks are listed in (1995) The Encyclopedia of Chemical Technology, 14 Kirk-Othmer (4th ed. at 482-503), incorporated herein by reference.
  • Active components and inactive components for use in dyes are listed in (1993) The Encyclopedia of Chemical Technology, 8 Kirk-Othmer (4th ed. at 533-860), incorporated herein by reference. Active components and inactive components for use in fire retardants are listed in (1993) The Encyclopedia of Chemical Technology, 10 Kirk-Othmer (4th ed. at 930-1022), incorporated herein by reference.
  • Active components and inactive components for use in antifreezes and deicers are listed in (1992) The Encyclopedia of Chemical Technology, 3 Kirk-Othmer (4th ed. at 347-367), incorporated herein by reference. Active components and inactive components for use in cement are listed in (1993) The Encyclopedia of Chemical Technology, 5 Kirk-Othmer (4th ed. at 564), incorporated herein by reference.
  • the term "component" means any substance that is combined, mixed, or processed with the compound-of-interest to form a sample or impurities, for example, trace impurities left behind after synthesis or manufacture of the compound-of-interest.
  • component includes solvents in the sample.
  • the term component also encompasses the compound-of-interest itself.
  • the compound-of-interest to be screened can be any useful compound including, but not limited to, pharmaceuticals, dietary supplements, nutraceuticals, agrochemicals, or alternative medicines.
  • the invention is particularly well-suited for screening solid-forms of a single low-molecular-weight organic molecule. Thus, the invention encompasses arrays of diverse solid-forms of a single low-molecular-weight molecule.
  • a single substance can exist in one or more physical states having different properties thereby classified herein as different components.
  • the amorphous and crystalline forms of an identical compound are classified as different components.
  • Components can be large molecules (i.e., molecules having a molecular weight of greater than about 1000 g/mol), such as large-molecule pharmaceuticals, oligonucleotides, polynucleotides, oligonucleotide conjugates, polynucleotide conjugates, proteins, peptides, peptidomimetics, or polysaccharides, or small molecules (i.e., molecules having a molecular weight of less than about 1000 g/mol), such as small-molecule pharmaceuticals, hormones, nucleotides, nucleosides, steroids, or amino acids.
  • Components can also be chiral or optically-active substances or compounds, such as optically-active solvents, optically-active reagents, or optically-active catalysts.
  • components promote, inhibit, or otherwise affect precipitation, formation, crystallization, or nucleation of solid-forms, preferably, solid-forms of the compound-of-interest.
  • a component can be a substance whose intended effect in an array sample is to induce, inhibit, prevent, modify, or reverse formation of solid-forms of the compound-of-interest.
  • components include, but are not limited to, excipients; solvents; salts; acids; bases; gases; small molecules, such as hormones, steroids, nucleotides, nucleosides, and amino acids; large molecules, such as oligonucleotides, polynucleotides, oligonucleotide and polynucleotide conjugates, proteins, peptides, peptidomimetics, and polysaccharides; pharmaceuticals; dietary supplements; alternative medicines; nutraceuticals; sensory compounds; agrochemicals; the active component of a consumer formulation; the active component of an industrial formulation; crystallization additives, such as additives that promote and/or control nucleation, additives that affect crystal habit, and additives that affect polymorphic form; additives that affect particle or crystal size; additives that structurally stabilize crystalline or amorphous solid-forms; additives that dissolve solid-forms; additives that inhibit crystallization or solid formation; optically-active solvents
  • Components include acidic substances and basic substances. Such substances can react to form a salt with the compound-of-interest or other components present in a sample. When a salt of the compound-of-interest is desired, salt forming components will generally be used in stoichiometric quantities. Components that are basic in nature are capable of forming a wide variety of salts with various inorganic and organic acids.
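  • The stoichiometric dosing mentioned above can be illustrated with a short calculation. This is a minimal sketch, not the patent's procedure: the function name, the `equivalents` parameter, and the example molecular weights are assumptions chosen for illustration.

```python
def salt_former_mass_mg(compound_mass_mg, compound_mw, salt_former_mw, equivalents=1.0):
    """Mass of acid or base (mg) needed to match the compound-of-interest
    at the requested molar ratio.

    compound_mw and salt_former_mw are molecular weights in g/mol.
    equivalents is the molar ratio of salt former to compound (assumed 1:1 by default).
    """
    if compound_mw <= 0 or salt_former_mw <= 0:
        raise ValueError("molecular weights must be positive")
    millimoles = compound_mass_mg / compound_mw  # mg / (g/mol) = mmol
    return millimoles * equivalents * salt_former_mw


# Hypothetical example: 10 mg of a 300 g/mol base paired with HCl (36.46 g/mol).
hcl_mg = salt_former_mass_mg(10.0, 300.0, 36.46)
```

  • A distribution mechanism could use such a value to decide how much acid or base stock to dispense into each sample site.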
  • suitable acids are those that form the following salts with basic compounds: chloride, bromide, iodide, acetate, salicylate, benzenesulfonate, benzoate, bicarbonate, bitartrate, calcium edetate, camsylate, carbonate, citrate, edetate, edisylate, estolate, esylate, fumarate, gluceptate, gluconate, glutamate, glycollylarsanilate, hexylresorcinate, hydrabamine, hydroxynaphthoate, isethionate, lactate, lactobionate, malate, maleate, mandelate, mesylate, methylsulfate, mucate, napsylate, nitrate, pantothenate, phosphate/diphosphate, polygalacturonate, salicylate, stearate, succinate, sulfate, tannate, tartrate, t
  • excipient refers to substances used to formulate actives into pharmaceutical formulations. Preferably, an excipient does not lower or interfere with the primary therapeutic effect of the active, more preferably, an excipient is therapeutically inert.
  • excipient encompasses carriers, solvents, diluents, vehicles, stabilizers, and binders. Excipients can also be those substances present in a pharmaceutical formulation as an indirect result of the manufacturing process. Preferably, excipients are approved for or considered to be safe for human and animal administration, i.e., GRAS substances (generally regarded as safe).
  • GRAS substances are listed by the Food and Drug Administration in the Code of Federal Regulations (CFR) at 21 CFR 182 and 21 CFR 184, incorporated herein by reference.
  • suitable excipients include, but are not limited to, acidulents, such as lactic acid, hydrochloric acid, and tartaric acid; solubilizing components, such as non-ionic, cationic, and anionic surfactants; absorbents, such as bentonite, cellulose, and kaolin; alkalizing components, such as diethanolamine, potassium citrate, and sodium bicarbonate; anticaking components, such as calcium phosphate tribasic, magnesium trisilicate, and talc; antimicrobial components, such as benzoic acid, sorbic acid, benzyl alcohol, benzethonium chloride, bronopol, alkyl parabens, cetrimide, phenol, phenylmercuric acetate, thimerosal, and phenoxyethanol; antioxidants, such as ascorbic acid, alpha
  • the arrays of the invention will contain a solvent as one of the components. Solvents may influence and direct the formation of solid-forms through polarity, viscosity, boiling point, volatility, charge distribution, and molecular shape. The solvent identity and concentration are one way to control saturation.
  • the term "experimental parameters" means the physical or chemical conditions under which a sample is subjected and the time during which the sample is subjected to such conditions.
  • Experimental parameters include, but are not limited to, the temperature, time, pH, amount or the concentration of a component, component identity, solvent removal rate, and solvent composition.
  • Sub-arrays or even individual samples within an array can be subjected to processing parameters that are different from the processing parameters to which other sub-arrays or samples, within the same array, are subjected. Processing parameters will differ between sub-arrays or samples when they are intentionally varied to induce a measurable change in the sample's properties. Thus, according to the invention, minor variations, such as those introduced by slight adjustment errors, are not considered intentionally varied.
  • an “interaction” means that the components as a mixture display a property (e.g., the ability to solubilize a specific pharmaceutical) of a different magnitude or value than the same property displayed by each component in isolation. Interactions between components will affect the properties of samples. Merely for example, a particular combination and ratio of excipients can interact such that the combination has a high solubilizing power for a particular pharmaceutical. Once such an interaction is detected, it can be exploited to develop enhanced formulations for the pharmaceutical.
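  • The definition of an interaction above can be turned into a simple screening check. This is a minimal sketch under stated assumptions: the function name and the relative tolerance `rel_tol` (used to ignore differences within measurement noise) are hypothetical and are not taken from the text.

```python
def shows_interaction(mixture_value, isolated_values, rel_tol=0.10):
    """Return True if the mixture's property value differs from the value
    displayed by every component in isolation by more than rel_tol.

    rel_tol is an assumed noise threshold, expressed as a fraction of each
    isolated value.
    """
    return all(
        abs(mixture_value - v) > rel_tol * max(abs(v), 1e-12)
        for v in isolated_values
    )


# Hypothetical solubilizing power (mg/mL) of two excipients alone and combined.
combined_is_different = shows_interaction(12.0, [1.0, 2.0])
```

  • In practice the tolerance would be set from the measured repeatability of the assay rather than a fixed fraction.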
  • the term "property" means a structural, physical, pharmacological, or chemical characteristic of a sample, preferably, a structural, physical, pharmacological, or chemical characteristic of a compound-of-interest.
  • the properties of a sample, as well as the interactions or the manifestations or outcomes of those interactions arising from or involving the original sample, can be analyzed using methods or techniques known in the art.
  • Some examples of these methods or techniques are Raman and infrared spectroscopy, ultraviolet spectroscopy, x-ray diffraction, scanning electron microscopy, transmission electron microscopy, near field scanning optical microscopy, far field scanning optical microscopy, atomic force microscopy, micro-thermal analysis, differential analysis, nuclear magnetic resonance spectroscopy, gas chromatography, and high-pressure or high-performance liquid chromatography.
  • Preferred properties are those that relate to the efficacy, safety, stability, or utility of the compound-of-interest or a formulation thereof.
  • properties include physical properties, such as stability, solubility, dissolution, permeability, and partitioning; mechanical properties, such as compressibility, compactability, and flow characteristics; the formulation's sensory properties, such as color, taste, and smell; and properties that affect the utility, such as absorption, bioavailability, toxicity, metabolic profile, and potency.
  • Other properties include those which affect the compound-of-interest's behavior and ease of processing in a crystallizer or a formulating machine.
  • Such processing properties are closely related to the solid-form's mechanical properties and its physical state, especially degree of agglomeration.
  • Concerning pharmaceuticals, dietary supplements, alternative medicines, and nutraceuticals, optimizing physical and utility properties of their solid-forms can result in a lowered required dose for the same therapeutic effect.
  • Structural properties include, but are not limited to, whether the compound-of-interest can be crystallized, whether it is solid and, if solid, whether it is crystalline or amorphous, and, if crystalline, the polymorphic form and a description of the crystal habit. Structural properties also include the composition, such as whether the solid-form is a hydrate, solvate, or a salt. Other examples of structural properties are surface-to-volume ratio and the degree of agglomeration of the particles. Surface-to-volume ratio decreases with the degree of agglomeration. It is well known that a high surface-to-volume ratio improves the solubility rate. Small-size particles have a high surface-to-volume ratio.
  • the surface-to-volume ratio is also influenced by the crystal habit, for example, the surface-to-volume ratio increases from spherical shape to needle shape to dendritic shape. Porosity also affects the surface-to-volume ratio, for example, solid-forms having channels or pores (e.g., inclusions, such as hydrates and solvates) have a high surface-to-volume ratio.
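  • The dependence of surface-to-volume ratio on particle size and habit can be checked with idealized geometry. The sketch below assumes a perfect sphere and models a needle as a solid cylinder with a hypothetical length-to-diameter ratio of 10; real crystal habits are more complex, so the numbers are illustrative only.

```python
import math


def sphere_surface_to_volume(volume):
    """Surface-to-volume ratio of a sphere with the given volume (equals 3 / r)."""
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 3.0 / radius


def needle_surface_to_volume(volume, aspect_ratio=10.0):
    """Surface-to-volume ratio of a cylinder whose length is
    aspect_ratio times its diameter (an assumed stand-in for a needle habit)."""
    # V = pi * r^2 * L with L = 2 * aspect_ratio * r
    radius = (volume / (2.0 * aspect_ratio * math.pi)) ** (1.0 / 3.0)
    length = 2.0 * aspect_ratio * radius
    area = 2.0 * math.pi * radius ** 2 + 2.0 * math.pi * radius * length
    return area / volume


# Same volume, different habit: the needle exposes more surface.
sphere_ratio = sphere_surface_to_volume(1.0)
needle_ratio = needle_surface_to_volume(1.0)
```

  • For equal volume the needle gives the larger ratio, and shrinking the sphere raises its ratio, consistent with the statements about habit and particle size.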
  • Still another structural property is particle size and particle-size distribution. For example, depending on concentrations, the presence of inhibitors or impurities, and other conditions, particles can form from solution in different sizes and size distributions.
  • Particulate matter produced by precipitation or crystallization, has a distribution of sizes that varies in a definite way throughout the size range.
  • Particle- and crystal-size distribution is generally expressed as a population distribution relating to the number of particles at each size.
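  • A population distribution of this kind is a count of particles per size class. The following is a minimal sketch; the function name, the micrometre units, and the bin edges are assumptions for illustration, and sizes outside the bins are simply ignored.

```python
def size_population(sizes_um, bin_edges_um):
    """Count particles in each half-open size bin [edge_i, edge_i+1).

    sizes_um: measured particle sizes (assumed to be in micrometres).
    bin_edges_um: ascending bin edges; len(result) == len(bin_edges_um) - 1.
    """
    counts = [0] * (len(bin_edges_um) - 1)
    for size in sizes_um:
        for i in range(len(counts)):
            if bin_edges_um[i] <= size < bin_edges_um[i + 1]:
                counts[i] += 1
                break
    return counts


# Hypothetical measurements binned into 0-10, 10-20, and 20-50 micrometres.
population = size_population([1, 2, 5, 12, 15, 40], [0, 10, 20, 50])
```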
  • particle and crystal size distribution have very important clinical aspects, such as bioavailability.
  • compounds or compositions that promote small crystal size can be of clinical importance.
  • Physical properties include, but are not limited to, physical stability, melting point, solubility, strength, hardness, compressibility, and compactability.
  • Physical stability refers to a compound's or composition's ability to maintain its physical form, for example maintaining particle size; maintaining crystal or amorphous form; maintaining complexed form, such as hydrates and solvates; resistance to absorption of ambient moisture; and maintaining of mechanical properties, such as compressibility and flow characteristics.
  • Methods for measuring physical stability include spectroscopy, sieving or testing, microscopy, sedimentation, stream scanning, and light scattering. Polymorphic changes, for example, are usually detected by differential scanning calorimetry or quantitative infrared analysis.
  • Solubility refers to the equilibrium solubility or steady state and is measured as weight component/volume solvent.
  • If an active component, such as a pharmaceutical substance, has an aqueous solubility of less than about 1 milligram/milliliter in the physiological pH range of 1-7, a potential bioavailability problem exists.
  • Descriptive terms used to describe solubility given in parts of solvent for 1 part of solute are: very soluble (< 1 part); freely soluble (from 1 to 10 parts); soluble (from 10 to 30 parts); sparingly soluble (from 30 to 100 parts); slightly soluble (from 100 to 1,000 parts); very slightly soluble (from 1,000 to 10,000 parts); and insoluble (> 10,000 parts).
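  • The descriptive solubility terms and the 1 mg/mL screening threshold lend themselves to a small lookup. This is a sketch, not a pharmacopoeial implementation: the function names are hypothetical, and because the ranges in the text share their end points, assigning each shared boundary value to the more soluble class is an assumption.

```python
def solubility_term(parts_solvent_per_part_solute):
    """Map parts of solvent required per part of solute to a descriptive term."""
    parts = parts_solvent_per_part_solute
    if parts < 1:
        return "very soluble"
    # Shared boundaries (10, 30, 100, ...) go to the more soluble class (assumption).
    for upper, term in [
        (10, "freely soluble"),
        (30, "soluble"),
        (100, "sparingly soluble"),
        (1_000, "slightly soluble"),
        (10_000, "very slightly soluble"),
    ]:
        if parts <= upper:
            return term
    return "insoluble"


def potential_bioavailability_problem(aqueous_solubility_mg_per_ml):
    """Flag an active whose aqueous solubility is below about 1 mg/mL."""
    return aqueous_solubility_mg_per_ml < 1.0
```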
  • the solubility can be tested by mixing the sample with a test solvent and agitating the sample at a constant temperature until equilibrium is achieved. Equilibrium usually occurs upon agitating the samples for 6 to 24 hours. If the component is acidic or basic, its solubility can be influenced by pH and one of skill in the art will take such factors into consideration when testing the solubility properties of a sample. Once equilibrium has occurred, the sample can be tested to determine the amount of component dissolved using standard technology, such as mass spectroscopy, HPLC, UV spectroscopy, fluorescence spectroscopy, gas chromatography, optical density, or by colorimetry. For a discussion of the theory and methods of measuring solubility see Streng et al., 1984 J. Pharm. Sci.
  • Dissolution refers to the rate at which a solid enters into solution.
  • Factors that affect dissolution include solubility, particle size, crystalline state, and the presence of diluents, disintegrants, or other excipients.
  • Chemical properties include, but are not limited to chemical stability, such as susceptibility to oxidation and reactivity with other compounds, such as acids, bases, or chelating agents. Chemical stability refers to resistance to chemical reactions induced, for example, by heat, ultraviolet radiation, moisture, chemical reactions between components, or oxygen.
  • Well known methods for measuring chemical stability include mass spectroscopy, UV-VIS spectroscopy, HPLC, gas chromatography, and liquid chromatography-mass spectroscopy (LC-MS).
  • solid-form means a form of a solid substance, element, or chemical compound that is defined and differentiated from other solid-forms according to its physical state and properties.
  • the basic requirements for array and sample preparation and screening thereof are: (1) manually or electronically designing the experiment; (2) a distribution mechanism to add components and the compound-of-interest to separate sites, for example, on an array plate having sample wells or sample vessels.
  • the experiment design is performed electronically using computer software, and the distribution mechanism is automated and controlled by computer software, which can optionally be linked to the experimental design software, and can vary at least one addition variable, e.g., the identity of the component(s) and/or the component concentration, more preferably, two or more variables.
  • Such material handling technologies and robotics are well known to those skilled in the art.
  • individual components can be placed at the appropriate sample site manually. This pick and place technique is also known to those skilled in the art.
  • the testing mechanism is automated and driven by a computer.
  • the system further comprises a processing mechanism to process the samples after component addition.
  • the system can have a processing station to process the samples after preparation.
  • automated experimentation apparatus and cognates thereof means a high-throughput apparatus for performing large numbers of experiments having at least one experimental step performed by computer-controlled apparatus. Human operators may direct the apparatus, or manually perform some portions of the process (e.g. moving groups of plates from one automated station to another, or performing an experimental procedure on results identified using a computer).
  • Fully automated experimentation apparatus and cognates thereof means a high-throughput apparatus for performing large numbers of experiments in which all experimental steps are performed by computer-controlled apparatus.
  • High throughput experimentation apparatus and cognates thereof means an apparatus for performing at least two simultaneous experiments.
  • high-throughput solid-form screening means: performing a method for screening a plurality of solid-forms of a compound-of-interest, the method comprising the steps of (a) preparing at least 24 samples, each sample comprising the compound-of-interest and one or more components, wherein an amount of the compound- of-interest in each sample is less than about 1 gram; (b) processing at least 24 of the samples to generate an array wherein at least two of the processed samples comprise a solid-form of the compound-of-interest; and (c) analyzing the processed samples to detect at least one solid-form.
  • one or more experiments are performed to characterize at least one detected solid form.
  • high-throughput formulation screening means: performing a method to (1) measure or detect an interaction between components; or (2) test or optimize one or more properties of a formulation of an active-component; the method comprising the steps of: (a) preparing an array of samples, each sample comprising a component-in-common and at least one additional component, wherein each sample differs from a plurality of other samples with respect to at least one of:
  • model means a computational entity that accepts as inputs data representing values of experimental parameters and/or results and produces as output data representing an estimate of one or more properties expected to result from an experiment corresponding to the input.
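  • One way to picture a model in this sense is an object that takes experimental parameters as input and returns an estimated property. The text does not specify any particular model type, so the nearest-neighbour lookup below, together with its class name, method names, and parameter keys, is purely a hypothetical stand-in.

```python
class NearestNeighborModel:
    """Estimate a property from the most similar previously recorded experiment."""

    def __init__(self):
        self._records = []  # list of (parameters dict, measured property value)

    def add_result(self, parameters, value):
        """Store one experiment's parameters and its measured result."""
        self._records.append((dict(parameters), value))

    def predict(self, parameters):
        """Return the result of the stored experiment closest to `parameters`."""
        if not self._records:
            raise ValueError("no experimental results recorded yet")

        def squared_distance(record):
            stored, _ = record
            return sum((stored[k] - parameters[k]) ** 2 for k in parameters)

        return min(self._records, key=squared_distance)[1]


model = NearestNeighborModel()
model.add_result({"temperature_c": 25, "ph": 7}, 1.0)
model.add_result({"temperature_c": 60, "ph": 3}, 5.0)
estimate = model.predict({"temperature_c": 55, "ph": 4})
```

  • A practical model would normally scale the parameters and interpolate between results; the sketch only shows the input and output roles named in the definition.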
  • "compound" or "compound-of-interest" include, but are not limited to, pharmaceuticals, dietary supplements, alternative medicines, nutraceuticals, sensory compounds, agrochemicals, the active component of a consumer formulation, and the active component of an industrial formulation. In one preferred embodiment, the compound or compound-of-interest is a pharmaceutical.
  • array systems that can be adapted for use in the invention disclosed herein. Such systems may require modification, which is well within ordinary skill in the art.
  • Examples of companies having array systems include Gene Logic of Gaithersburg, MD (see U.S. patent No. 5,843,767 to Beattie), Luminex Corp., Austin, TX, Beckman Instruments, Fullerton, CA, MicroFab Technologies, Plano, TX, Nanogen, San Diego, CA, and Hyseq, Sunnyvale, CA. These devices test samples based on a variety of different systems. All include thousands of microscopic channels that direct components into test wells, where reactions can occur. These systems are connected to computers for analysis of the data using appropriate software and data sets.
  • the Beckman Instruments system can deliver nanoliter samples of 96 or 384-arrays, and is particularly well suited for hybridization analysis of nucleotide molecule sequences.
  • the MicroFab Technologies system delivers sample using inkjet printers to aliquot discrete samples into wells. These and other systems can be adapted as required for use herein.
  • the combinations of the compound-of-interest and various components at various concentrations and combinations can be generated using standard formulating software (e.g., Matlab software, commercially available from Mathworks, Natick, Massachusetts).
  • the combinations thus generated can be downloaded into a spreadsheet, such as Microsoft EXCEL, or stored in a relational database.
  • a work list can be generated for instructing the automated distribution mechanism to prepare an array of samples according to the various combinations generated by the formulating software.
  • the work list can be generated using standard programming methods according to the automated distribution mechanism that is being used.
  • the use of so-called work lists simply allows a file to be used as the process command rather than discrete programmed steps.
  • the work list combines the formulation output of the formulating program with the appropriate commands in a file format directly readable by the automatic distribution mechanism.
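  • The flow from generated combinations to a work list can be sketched as follows. The column names, the CSV layout, and the 96-well addressing are assumptions made for illustration; a real distribution mechanism defines its own work-list file format, which is not described here.

```python
import csv
import io
import itertools


def generate_combinations(solvents, additives, concentrations_mg_ml):
    """Enumerate every solvent / additive / concentration combination."""
    return [
        {"solvent": s, "additive": a, "conc_mg_ml": c}
        for s, a, c in itertools.product(solvents, additives, concentrations_mg_ml)
    ]


def write_work_list(combinations, rows="ABCDEFGH", columns=12):
    """Assign one combination per well and return the work list as CSV text."""
    wells = [f"{r}{c}" for r in rows for c in range(1, columns + 1)]
    if len(combinations) > len(wells):
        raise ValueError("more combinations than wells on the plate")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["well", "solvent", "additive", "conc_mg_ml"])
    for well, combo in zip(wells, combinations):
        writer.writerow([well, combo["solvent"], combo["additive"], combo["conc_mg_ml"]])
    return buffer.getvalue()


combos = generate_combinations(["water", "ethanol"], ["none", "PEG"], [1.0, 5.0])
work_list_text = write_work_list(combos)
```

  • The resulting text could be saved to a file and translated into whatever command format the chosen instrument reads.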
  • the automated distribution mechanism delivers at least one compound-of-interest, such as a pharmaceutical, as well as various additional components, such as solvents and additives, to each sample well.
  • the automated distribution mechanism can deliver multiple amounts of each component.
  • Automated liquid and solid distribution systems are well known and commercially available, such as the Tecan Genesis, from Tecan-US, RTP, North Carolina.
  • the robotic arm can collect and dispense the solutions, solvents, additives, or compound-of-interest from the stock plate to a sample well or sample vessel.
  • the process is repeated until the array is completed, for example, generating an array that moves from wells at left to right and from top to bottom in increasing polarity or non-polarity of solvent. Alternatively, it is often appropriate to randomize the positions of the samples in the array rather than placing them in order.
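  • Both layout strategies, ordered by solvent polarity or randomized, can be expressed with one helper. This is a minimal sketch: the function name, the fixed random seed, and the polarity scores in the example are illustrative assumptions rather than values from the text.

```python
import random


def assign_wells(samples, rows="ABCDEFGH", columns=12, randomize=False, seed=0):
    """Map samples to wells left to right, then top to bottom.

    With randomize=True the sample order is shuffled first, using a fixed
    seed so that the layout can be reproduced.
    """
    wells = [f"{r}{c}" for r in rows for c in range(1, columns + 1)]
    if len(samples) > len(wells):
        raise ValueError("more samples than wells on the plate")
    order = list(samples)
    if randomize:
        random.Random(seed).shuffle(order)
    return dict(zip(wells, order))


# (solvent, illustrative polarity score); sorted so polarity increases across the plate.
solvents = [("water", 10.2), ("hexane", 0.1), ("ethanol", 4.3)]
ordered = assign_wells(sorted(solvents, key=lambda item: item[1]))
shuffled = assign_wells(solvents, randomize=True, seed=42)
```

  • Randomizing positions helps separate composition effects from position effects such as edge evaporation, which is presumably why the text notes it is often appropriate.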
  • the samples are then mixed. For example, the robotic arm moves up and down in each well plate for a set number of times to ensure proper mixing.
  • Liquid handling devices manufactured by vendors such as Tecan, Hamilton and Advanced Chemtech are all capable of being used in the invention.
  • a prerequisite for all liquid handling devices is the ability to dispense to a sealed or sealable reaction vessel and have chemical compatibility for a wide range of solvent properties.
  • liquid handling devices specifically manufactured for organic syntheses are the most desirable for application to crystallization due to the chemical compatibility issues.
  • Robbins Scientific manufactures the Flexchem reaction block which consists of a Teflon reaction block with removable gasketed top and bottom plates. This reaction block is in the standard footprint of a 96-well microtiter plate and provides for individually sealed reaction chambers for each well.
  • the gasketing material is typically Viton, neoprene/Viton, or Teflon coated Viton, and acts as a septum to seal each well.
  • the Flexchem reaction vessel is designed to be reusable in that the reaction block can be cleaned and reused with new gasket material.
  • An array can be prepared, processed, and screened as follows.
  • the first step comprises selecting the component sources, preferably, at one or more concentrations.
  • at least one component source can deliver a compound-of-interest and one can deliver a solvent.
  • adding the compound-of-interest and components to a plurality of sample sites, such as sample wells or sample vessels on a sample plate to give an array of unprocessed samples.
  • the array can then be processed according to the purpose and objective of the experiment, and one of skill in the art will readily ascertain the appropriate processing conditions.
  • the automated distribution mechanism as described above is used to distribute or add components.
  • solid formation can be induced by introducing a nucleation or precipitation event.
  • this involves subjecting a supersaturated solution to some form of energy, such as ultrasound or mechanical stimulation, or inducing supersaturation by adding additional components.
  • actively inducing solid formation is not required, and solid formation occurs spontaneously with the passage of time and/or changes in temperature.
  • the array can be processed according to the design and objective of the experiment.
  • Processing includes mixing; agitating; heating; cooling; adjusting the pressure; adding additional components, such as crystallization aids, nucleation promoters, nucleation inhibitors, acids, or bases, etc.; stirring; milling; filtering; centrifuging; emulsifying; subjecting one or more of the samples to mechanical stimulation, ultrasound, or laser energy; subjecting the samples to a temperature gradient; or simply allowing the samples to stand for a period of time at a specified temperature.
  • processing will comprise dissolving either the compound-of-interest or one or more components.
  • Solubility is commonly controlled by the composition (identity of components and/or the compound-of-interest) or by the temperature. The latter is most common in industrial crystallizers where a solution of a substance is cooled from a temperature at which it is in solution to one at which the solubility is exceeded.
  • the array can be processed by heating to a temperature (T1), preferably to a temperature at which all the solids are completely in solution. The samples are then cooled to a lower temperature (T2). The presence of solids can then be determined. Implementation of this approach in arrays can be done on an individual sample site basis or for the entire array (i.e., all the samples in parallel).
  • each sample site could be warmed by local heating to a point at which the components and the compound-of-interest are dissolved. This step is followed by cooling through local thermal conduction or convection.
  • a temperature sensor in each sample site can be used to record the temperature when the first crystal or precipitate is detected. In one embodiment, all the sample sites are processed individually with respect to temperature and small heaters, cooling coils, and temperature sensors for each sample site are provided and controlled. This approach is useful if each sample site has the same composition and the experiment is designed to sample a large number of temperature profiles to find those profiles that produce desired solid-forms.
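  • Recording the temperature at which solid first appears amounts to scanning a site's readings in time order. The sketch below assumes each reading is a hypothetical (temperature, solid detected) pair; how detection itself is performed is outside its scope.

```python
def onset_temperature(readings):
    """Return the temperature of the first reading in which solid was detected.

    readings: iterable of (temperature_c, solid_detected) pairs in time order.
    Returns None if no solid was detected during the run.
    """
    for temperature_c, solid_detected in readings:
        if solid_detected:
            return temperature_c
    return None


# Hypothetical cooling run for one sample site.
site_readings = [(60.0, False), (45.0, False), (38.0, True), (30.0, True)]
first_solid_at = onset_temperature(site_readings)
```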
  • the composition of each sample site is controlled and the entire array is heated and cooled as a unit. The advantage of the latter approach is that much simpler heating, cooling, and controlling systems can be utilized.
  • thermal profiles are investigated by simultaneous experiments on identical array stages. Thus, a high-throughput matrix of experiments in both composition and thermal profiles can be obtained by parallel operation.
  • Temperature can be controlled in either a static or dynamic manner.
  • Static temperature means that a set incubation temperature is used throughout the experiment.
  • a temperature gradient can be used. For example, the temperature can be lowered at a certain rate throughout the experiment.
  • temperature can be controlled in a way as to have both static and dynamic components. For example, a constant temperature (e.g., 60°C) is maintained during the mixing of crystallization reagents. After mixing of reagents is complete, controlled temperature decline is initiated (e.g., 60°C to about 25°C over 35 minutes).
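The combined static and dynamic profile described above can be written as a simple setpoint function. The following is a minimal sketch, not a controller implementation: the function name `setpoint_c` and the 10-minute mixing hold are assumptions made for illustration, while the 60°C start, 25°C end, and 35-minute decline come from the example in the text.

```python
def setpoint_c(t_min, hold_min=10.0, t_start=60.0, t_end=25.0, ramp_min=35.0):
    """Return the temperature setpoint in deg C at time t_min (minutes).

    hold_min is a hypothetical mixing duration; the text does not give one.
    """
    if t_min <= hold_min:
        return t_start  # static phase: constant temperature during mixing
    # dynamic phase: linear decline, clamped once the ramp is finished
    elapsed = min(t_min - hold_min, ramp_min)
    rate = (t_end - t_start) / ramp_min  # deg C per minute (negative when cooling)
    return t_start + rate * elapsed


# Sample the profile every 5 minutes for a quick look at its shape.
profile = [(t, setpoint_c(t)) for t in range(0, 61, 5)]
```

With these values the decline rate works out to 1°C per minute, so the setpoint midway through the ramp is 42.5°C.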
  • Stand-alone devices employing Peltier-effect cooling and Joule heating are commercially available for use with microtiter plate footprints.
  • a standard thermocycler used for PCR, such as those manufactured by MJ Research or PE Biosystems, can also be used to accomplish the temperature control.
  • the use of these devices necessitates the use of conical vials or conical-bottom micro-well plates. If greater throughput or increased user autonomy is required, then full-scale systems such as the Advanced ChemTech Benchmark Omega 96™ or Venture 596™ would be the platforms of choice. Both of these platforms utilize 96-well reaction blocks made from Teflon™. These reaction blocks can be rapidly and precisely controlled from -70 to 150°C with complete isolation between individual wells. Also, both systems operate under inert atmospheres of nitrogen or argon and utilize all chemically inert liquid handling elements.
  • the Omega 496 system has simultaneous independent dual coaxial probes for liquid handling, while the Venture 596 system has 2 independent 8-channel probe heads with independent z-control. Moreover, the Venture 596 system can process up to 10,000 reactions simultaneously. Both systems offer complete autonomy of operation. Array samples can be incubated for various lengths of time (e.g., 5 minutes, 60 minutes, 48 hours, etc.).
  • Because phase changes can be time dependent, it can be advantageous to monitor array experiments as a function of time.
  • time control is very important. For example, the first solid-form to crystallize may not be the most stable form, but rather a metastable form that can then convert to a stable form over a period of time. This process is called "ageing".
  • Ageing also can be associated with changes in crystal size and/or habit. This type of ageing phenomenon is called Ostwald ripening.
  • the pH of the sample medium can determine the physical state and properties of the solid phase that is generated.
  • the pH can be controlled by the addition of inorganic and organic acids and bases.
  • the pH of samples can be monitored with standard pH meters modified according to the volume of the sample.
  • the system is used in conjunction with one or more high-throughput automated experimentation apparatus, such as Transform Pharmaceuticals' FAST™ formulation system or CRYSTALMAX™ crystal discovery system.
  • FAST and CRYSTALMAX systems are described in U.S. Patent Applications 09/628,667 and 09/756,092, respectively (the FAST™ and CRYSTALMAX™ applications), which are incorporated herein by reference. Words used herein are intended to be consistent with the FAST™ and CRYSTALMAX™ applications.
  • the system is used to plan and assess experiments performed with the CRYSTALMAX™ and FAST™ systems.
  • This embodiment includes a process informatics subsystem for controlling and acquiring data from the CRYSTALMAX and FAST systems, and a computational informatics subsystem for performing data mining, simulation, molecular modeling, high-dimensional multivariate visualizations of data, data clustering, categorizations, and other data processing.
  • These subsystems operate on a shared database system used to store experimental results and analyses, as well as data derived from sources other than the process informatics subsystem, such as external databases and literature.
  • a combination of experimental parameters which may be varied by an automated experimentation apparatus such as FAST or CRYSTALMAX is selected 801.
  • a plurality of distinct combinations of values of the experimental parameters is then determined, each combination corresponding to a distinct experiment 802.
  • the automated experimentation apparatus is caused to conduct a set of experiments, each experiment corresponding to a distinct combination of the plurality of distinct combinations 803.
  • the process informatics subsystem is also used to determine a collection of experimental results of the first set of experiments, the collection comprising a plurality of individual result sets, where each individual result set corresponds to a distinct experiment 804.
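The step of determining a plurality of distinct combinations of parameter values (802) can be illustrated with a full factorial grid. This is only a sketch of one possible design: the parameter names and levels below are hypothetical, and the text does not say that a full factorial layout is used.

```python
from itertools import product


def enumerate_experiments(levels):
    """Return one dict per distinct combination of parameter values.

    levels maps a parameter name to the list of values to be tried.
    """
    names = sorted(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


# Hypothetical parameters and levels, chosen only for illustration.
levels = {
    "excipient_a_mg_ml": [0.0, 1.0, 2.0],
    "excipient_b_mg_ml": [0.0, 5.0],
    "temp_c": [25.0, 40.0],
}
experiments = enumerate_experiments(levels)
```

Each dictionary in `experiments` corresponds to one distinct experiment that could be passed to the automated apparatus. For stochastic processes such as crystallization, each entry could simply be repeated to obtain replicates.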
  • the experimental results determined in the methods of the present invention may be obtained using one or more of the following techniques: spectroscopy; sieving or testing; microscopy; optical imaging; sedimentation; stream scanning; light scattering; differential scanning calorimetry; infrared spectroscopy; quantitative infrared analysis; x-ray diffraction or x-ray powder diffraction; or Raman spectroscopy, including dispersive Raman spectroscopy and Fourier transform Raman or FT-Raman spectroscopy.
  • Raman microscopes are available from a number of commercial sources, including, for example, the ALMEGATM Dispersive Raman and the FT-Raman 960 (both available from Thermo Nicolet Corporation, Madison, WI, USA).
  • the first collection of experimental results is processed through the computational informatics subsystem to determine a second combination of parameters variable by the automated experimentation apparatus 801, and a second plurality of distinct combinations of values of the experimental parameters 802, each combination of the second plurality corresponding to a distinct experiment.
  • This process preferably may be iterated indefinitely to yield a third, fourth, fifth, or arbitrary number of subsequent pluralities of distinct combinations of experimental parameters, each combination corresponding to a distinct experiment.
  • Although each combination preferably corresponds to a distinct experiment, in some circumstances multiples of each experiment are preferably performed to provide reliable data, particularly in stochastic processes such as crystallization.
  • one or more multivariate visualizations 805, generated models 806 and 807, and/or unsupervised learning or clustering methods 808 are preferably employed.
  • Generated models preferably comprise one or more regression models 806 and/or one or more classification models 807.
  • a classification model takes one or more inputs and provides at least one class assignment as an output.
  • a regression model takes one or more inputs and provides at least one output representing a variable that has a continuous range (e.g. at least one real or complex interval).
  • a multivariate visualization of the results of a clustering calculation may be used to determine a classifier, as described more fully below.
  • a classification model comprising a qualitative solubility assay may, for example, be used in conjunction with the FAST automated experimentation apparatus to assign a soluble/not soluble label to each individual experimental result set.
  • a regression model comprising a quantitative solubility assay may, for example, be used with FAST to assign an estimated solubility, expressed for example in mg/ml.
  • a classification model may, for example, be used to assign a polymorph label to each individual experimental result set producing a solid form.
  • a regression model may be used with CRYSTALMAX to, for example, provide an estimated nucleation time.
  • the input may comprise experimental parameters and/or results.
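The difference between the two model types can be shown on a toy solubility data set. The sketch below assumes a single input (an excipient concentration) and uses two deliberately simple stand-ins: ordinary least squares on a line for the regression model and a one-nearest-neighbor rule for the classification model. The data values and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (a regression model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor_label(x, train_x, train_labels):
    """Assign the label of the closest training input (a classification model)."""
    idx = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_labels[idx]


# Hypothetical measurements: excipient concentration vs. solubility in mg/ml.
conc = [0.0, 1.0, 2.0, 3.0]
solubility = [1.0, 3.0, 5.0, 7.0]
labels = ["not soluble", "not soluble", "soluble", "soluble"]

slope, intercept = fit_line(conc, solubility)
estimated = slope * 2.5 + intercept                   # continuous output
assigned = nearest_neighbor_label(2.6, conc, labels)  # class output
```

The regression model returns a number on a continuous range, while the classifier returns one of a fixed set of labels, which matches the distinction drawn in the text.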
  • Regression models may include (but are not limited to) linear regression, stepwise linear regression, additive models (AM), projection pursuit regression (PPR), recursive partitioning regression (RPR), alternating conditional expectations (ACE), additivity and variance stabilization (AVAS), locally weighted regression (LOESS), neural networks, Multivariate Adaptive Regression Splines (MARS), principal components regression, partial least squares regression, and support vector regression.
  • Classification models may include (but are not limited to) decision trees (generated by algorithms such as C4.5, C5.0, or CART), support vector machines, neural networks, k-nearest neighbor classifiers, Bayesian classifiers (with probability density functions preferably determined using Gaussian Mixture Models or Parzen windowing), and self-organizing maps. These and other classification models are described in Duda, Richard O., et al. Pattern Classification, second edition. John Wiley & Sons, Inc. 2001.
  • One or more models may preferably be generated based on the results of unsupervised learning and/or clustering applied to one or more collections of experimental result sets.
  • a collection of individual experimental result sets is received, a similarity measure is calculated between a plurality of pairs of individual experimental result sets, and based on the similarity measure, a plurality of clusters of experimental result sets is determined, and one or more properties is determined for at least one solid form from each of at least two of the clusters.
  • a three-dimensional visualization is preferably used to display the clusters.
  • each experimental result set in each cluster corresponds to a single solid form, preferably a single crystal polymorph.
  • solid form labels may be determined for each experimental result set for each cluster. Based on these labels and the experimental result sets and experimental parameters, a classifier model and/or a regression model may be generated.
  • Unsupervised learning and clustering methods may include hierarchical clustering, including agglomerative and stepwise-optimal hierarchical clustering, k-means clustering, Gaussian mixture model clustering, or self-organizing-map (SOM)-based clustering, clustering using the Chameleon, DBScan, CURE, or Rock clustering algorithms, unsupervised Bayesian learning, Principal Component Analysis, Nonlinear Component Analysis, Independent Component Analysis, and multidimensional scaling. See Kohonen, T., "Self-organizing Maps", Springer Series in Information Sciences, Vol.
  • the experimental result sets comprise Raman spectra
  • the similarity measure comprises the Tanimoto distance between bit-vectors representing peaks in Raman spectra
  • the clustering method comprises hierarchical k-means clustering.
  • the results of the preferred hierarchical clustering of Raman spectra described above are preferably displayed using a three-dimensional representation (two spatial coordinates plus color or shading) as shown in Fig. 12.
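The Tanimoto distance between peak bit-vectors can be sketched as follows. Several things here are assumptions rather than details from the text: peaks are binned into 10 cm^-1 windows, each bit-vector is stored as the set of occupied bin indices, and a plain single-linkage agglomerative procedure stands in for the hierarchical k-means method named above. The peak positions are invented for illustration.

```python
def peaks_to_bits(peak_positions, bin_width=10.0):
    """Represent a spectrum as the set of bin indices that contain a peak."""
    return {int(p // bin_width) for p in peak_positions}


def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of set-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def single_linkage_clusters(items, threshold):
    """Merge the closest clusters until the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(items))]

    def dist(c1, c2):
        return min(tanimoto_distance(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, i, j = min((dist(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical Raman peak positions (cm^-1) for four samples.
spectra = [
    [101, 455, 1203, 1610],
    [103, 452, 1207, 1615],
    [230, 780, 1350],
    [232, 785, 1352, 1610],
]
bits = [peaks_to_bits(s) for s in spectra]
clusters = single_linkage_clusters(bits, threshold=0.5)
```

Samples 0 and 1 share every occupied bin, samples 2 and 3 differ by one bin, and the two groups share at most one bin, so a threshold of 0.5 separates them into two clusters. A sample solid form from each cluster could then be characterized further and used to label the rest of its cluster.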
  • additional combinations of experimental parameters are determined to meet one or more experimental objectives.
  • the experimental objectives preferably include determining boundaries between solid forms, determining regions in which desired properties of formulations change rapidly with respect to changes in experimental parameters (not necessarily with respect to time), extrema (e.g. maxima or minima) of experimental results or parameters, regions within a class boundary, or regions of ambiguity or low confidence in classification or regression results.
  • data representing a collection of experimental results is processed as a collection of points in a space, such as a topological space, a metric space, or a vector space comprising dimensions corresponding to the dimensions of the experimental parameters 105.
  • regions of the space are determined based on one or more experimental objectives such as determining boundaries between solid forms 106.
  • the second (or subsequent) plurality of distinct combinations of values of the experimental parameters is preferably selected 107 to more fully define such boundaries or regions, and to include combinations of parameters as far as possible from such boundaries or regions.
  • the second (or subsequent) plurality of distinct combinations of values of the experimental parameters may be selected 107 to more fully define regions in which desired properties of formulations change rapidly with respect to changes in experimental parameters, extrema (e.g. maxima or minima) of experimental results or parameters, regions within a class boundary, and regions of ambiguity or low confidence in classification or regression results.
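One simple way to choose follow-up experiments that sharpen a boundary between solid forms is to propose the midpoint between any two nearby experiments that gave different forms. This is a hedged sketch of that idea, not the selection procedure of the described system; `boundary_midpoints`, the distance cutoff, and the example data are all assumptions.

```python
import math
from itertools import combinations


def boundary_midpoints(points, labels, max_dist):
    """Propose midpoints between nearby experiments with different labels."""
    proposals = []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        if label_p != label_q and math.dist(p, q) <= max_dist:
            proposals.append(tuple((a + b) / 2 for a, b in zip(p, q)))
    return proposals


# Hypothetical first-round results in a two-parameter space.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
labels = ["form I", "form I", "form II", "form II"]
second_round = boundary_midpoints(points, labels, max_dist=1.0)
```

Here the only neighboring pair with different labels lies between 1.0 and 2.0 on the first axis, so a single new experiment is proposed at its midpoint. Repeating the procedure on each new round of results halves the uncertainty in the boundary position along that line.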
  • the first collection of experimental results may preferably be processed by the computational informatics subsystem to display a multivariate visualization in which the experimental results are represented as points of varying size, color, shape or other indicia in a multidimensional space representing the space or a projection thereof, such as that shown in Fig. 2.
  • By viewing such a visualization, an operator of the computational informatics subsystem may visually identify boundaries or regions of rapid change.
  • Fig. 7 depicts a multivariate visualization of one thousand experimental formulations, each with three excipients and one measured property of the formulation.
  • Four axes 701, 702, 703 and 704 are depicted. Distinct formulations appear along the length of the axes, with each formulation appearing at the same place along all four axes.
  • the width of each line on each axis is proportional to the normalized magnitude of the value represented by the axis for the corresponding experiment. For example, concentrations of excipients may be shown by the widths along axes 702, 703, and 704 and solubility may be shown by the widths along 701.
  • combinations of values of the second plurality may be selected along a line or curve fitted to the data using a regression model, or selected based on a predicted classification.
  • Other examples of selection include random or uniform selection within a range of values for results exhibiting desired properties, or selection within a range determined by use of one or more classification algorithms, such as a range classified as likely to correspond to a single solid form, or a range classified as likely to include a boundary between sets of experimental conditions within which two distinct solid forms are produced.
  • Selection of additional values may also include a change of experimental parameters such as selection of different reagents or excipients likely to interact with observed species or solid forms.
  • the automated experimentation apparatus is activated to conduct a second set of experiments, each experiment of the second set corresponding to a distinct combination of values of the second plurality 108.
  • the process informatics subsystem is also used to determine a second collection of experimental results of the second set of experiments, the second collection comprising a plurality of individual results, each individual result corresponding to a distinct experiment 109.
  • the computational informatics subsystem is then used to select a multicomponent chemical composition of matter or solid form based on the first collection of experimental results and the second collection of experimental results. Alternatively, additional iterations of experimentation may be performed prior to selecting the multicomponent chemical composition or solid form.
  • data representing the second or subsequent collection of experimental results is processed as a collection of points in a space such as topological space, metric space, or vector space comprising dimensions corresponding to the dimensions of the experimental parameters 110.
  • a set of experimental parameter values and a resulting multicomponent chemical composition of matter or solid form is preferably selected having optimum or near-optimum properties that do not change significantly within a region of the space corresponding to an expected range of conditions of manufacture, storage, and administration or use 111.
  • an experimental search for a formulation having an optimized solubility is performed.
  • This example is schematically illustrated in Figs. 2 - 4.
  • a combination of experimental parameters which may be varied by an automated experimentation apparatus is selected.
  • the selected experimental parameters are concentrations of three selected excipients, schematically illustrated as a three-dimensional metric space in Fig. 2 comprising axes 201, 202, 203 of plot 204.
  • a first plurality of distinct combinations of values of the experimental parameters is then determined, each combination corresponding to a distinct experiment. The combinations of values correspond to the coordinates of each of the data points 204 shown in Fig. 2.
  • each experiment comprises a sample formulation.
  • Each sample formulation comprises one or more target active agents at fixed concentrations and a combination of excipients having concentrations corresponding to one of the data points 204 of Fig. 2.
  • the process informatics subsystem is also used to determine a first collection of experimental results of the first set of experiments, the first collection comprising a plurality of individual result sets, where each individual result set corresponds to a distinct experiment.
  • Each individual result set in this example embodiment comprises a measurement of the amount of an active component dissolved using standard technology such as mass spectroscopy, HPLC, UV spectroscopy, fluorescence spectroscopy, gas chromatography, optical density or colorimetry.
  • the measured experimental results are stored in a shared database, and thereby made available to the computational informatics subsystem.
  • the computational informatics subsystem may then be used to visualize the experimental data in a high-dimensional multivariate display. In the display illustrated in Fig. 2, the size of the plotted data points is used to depict the measured solubility of the active portion of the formulations corresponding to the data points 204, wherein larger sizes indicate greater solubility.
  • a second plurality of distinct combinations of values of the experimental parameters is determined, based on the measured experimental results. For example, as shown in Fig. 3, certain experimental results or groups of experimental results 305, 306, 307 are identified as exhibiting measured results of interest. As shown in Fig. 4, additional data points 406, 407, 408 corresponding to distinct experiments may be selected to more accurately characterize the formulation near the results of interest.
  • a portion of the experimental results 305 of interest are solubility maxima or near-maxima in the sample.
  • Another portion of the results of interest are groups of results 306 for which the rate of solubility change with respect to one or more experimental parameters is high relative to other groups of the sample.
  • a third set of results of interest in this example are results 307 for which the rate of change of solubility with respect to one or more experimental parameters is low relative to other groups of the sample.
  • each experiment comprises a sample formulation of the same one or more active agents and excipients as the first set of experiments.
  • concentration or identity of the one or more target active agents, or the identities of one or more excipients could be changed (or the numbers of excipients increased) for the second set of experiments.
  • the process informatics subsystem is also used to determine a second collection of experimental results of the second set of experiments. In this example embodiment, the same measurement of solubility used for the first set of experiments is performed for the second set. Alternatively, a different measurement could be used for the second set.
  • measured experimental results are stored in the shared database, and thereby made available to the computational informatics subsystem.
  • the computational informatics subsystem may then be used to visualize the experimental data in a multivariate display.
  • the size of the plotted data points is used to depict the measured solubility of the active portion of the formulations corresponding to the data points of the second set of experiments 406. Additional iterations of selecting additional data points and automated experimentation may be performed.
  • an optimum formulation is selected.
  • an optimum formulation is one having a high relative solubility, but comprising a combination of concentrations of excipients away from areas in which solubility changes relatively rapidly with concentration of one or more excipients.
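The selection rule described above, high solubility but away from regions where solubility changes rapidly, can be sketched with a finite-difference sensitivity estimate. The function names, the neighborhood radius, the rate limit, and the data are assumptions for illustration, and the sketch assumes that no two experiments share the same coordinates.

```python
import math


def local_sensitivity(i, points, values, radius):
    """Largest |change in value| per unit distance to neighbors within radius."""
    rates = []
    for j in range(len(points)):
        if j == i:
            continue
        d = math.dist(points[i], points[j])
        if d <= radius:
            rates.append(abs(values[j] - values[i]) / d)
    return max(rates) if rates else float("inf")


def select_robust_optimum(points, values, radius, max_rate):
    """Index of the highest value whose local sensitivity is at most max_rate."""
    candidates = [i for i in range(len(points))
                  if local_sensitivity(i, points, values, radius) <= max_rate]
    if not candidates:
        return None
    return max(candidates, key=lambda i: values[i])


# Hypothetical one-excipient scan: concentration vs. measured solubility.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
values = [2.0, 9.0, 10.0, 8.0, 8.2]
robust = select_robust_optimum(points, values, radius=1.0, max_rate=0.5)
peak = select_robust_optimum(points, values, radius=1.0, max_rate=10.0)
```

With a strict rate limit the sketch prefers the flat region near concentration 4.0 even though the raw maximum sits at 2.0, which is the trade-off the text describes. Loosening the limit recovers the raw maximum.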
  • An example preferred method to assess a collection of experimental results in a search for novel or known solid forms is schematically illustrated in Fig. 5.
  • the method comprises the steps of: determining low-energy crystal polymorphs via simulation model 501; characterizing the low-energy crystal polymorphs according to expected experimental results by standard techniques such as by calculated X-ray powder or single-crystal diffraction results 502; conducting a first collection of crystallization experiments 503; measuring a collection of actual experimental results such as actual X-ray powder diffraction for the crystals produced by the first collection of crystallization experiments 504; comparing the expected experimental results with the actual experimental results 505; determining if any lowest-energy structures were not included in the solid forms produced by a first collection of experiments 506.
  • low-energy polymorphs are determined by using multivariate optimization such as hydrogen-bond-biased simulated annealing to locate a plurality of lowest-energy structures with the model.
  • Lattice energy is determined by summing all the pairwise atom-atom interactions between a central molecule and all the surrounding molecules. The calculation of lattice energy is discussed in Myerson, Molecular Modeling Applications in Crystallization, pp. 117-125, Cambridge University Press (1999), which is incorporated herein in its entirety by reference. The lattice energy is a useful parameter because its calculated value can be compared with the experimental enthalpy of sublimation. This allows one to verify the description of the intermolecular interactions by the force field in question.
  • An advantage of the calculated value of the crystal lattice energy is that it can be separated into specific interactions along certain directions and into the constituent atom-atom pair-wise contributions. This provides the link between molecular and crystal structures.
  • the calculation of lattice energies thus provides a profile of the important intermolecular interactions that correspond to particular classes of compounds. It also provides an understanding of the nature of the intermolecular interactions that lead to a particular crystal packing arrangement.
  • the potentials used include those that incorporate attractive or repulsive components, coulombic interaction, or hydrogen-bonding interaction.
  • $V_{vdw} = -A/r^{6} + B/r^{12}$
  • A and B are the atom-atom parameters and r is the interatomic distance.
  • the parameters A and B can be obtained by fitting the chosen potential to observable properties such as crystal structure, heats of sublimation, and hardness measurements. In accordance with the present invention, the results of a first principles calculation can also be used in the curve fitting step as an alternative to using actual experimental data to determine the parameters A and B.
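The curve-fitting step for A and B can be illustrated with a least-squares fit. The sketch assumes the common 6-12 form of the van der Waals potential, V(r) = -A/r^6 + B/r^12, with A on the attractive term; because this form is linear in A and B, the fit reduces to a 2x2 system of normal equations. The reference energies below are synthetic, generated from known parameters, and stand in for experimental or first-principles data.

```python
def lj_energy(r, a, b):
    """6-12 pair potential with A on the attractive term (assumed form)."""
    return -a / r ** 6 + b / r ** 12


def fit_lj(rs, energies):
    """Least-squares estimate of (A, B) from distances and reference energies."""
    us = [-(r ** -6) for r in rs]   # basis function multiplying A
    ws = [r ** -12 for r in rs]     # basis function multiplying B
    suu = sum(u * u for u in us)
    sww = sum(w * w for w in ws)
    suw = sum(u * w for u, w in zip(us, ws))
    suv = sum(u * v for u, v in zip(us, energies))
    swv = sum(w * v for w, v in zip(ws, energies))
    det = suu * sww - suw * suw
    a = (suv * sww - suw * swv) / det
    b = (suu * swv - suw * suv) / det
    return a, b


# Synthetic reference data generated from A = 2.0, B = 1.0 (arbitrary units).
rs = [0.9, 1.0, 1.1, 1.3, 1.6, 2.0]
reference = [lj_energy(r, 2.0, 1.0) for r in rs]
a_fit, b_fit = fit_lj(rs, reference)
```

Under the assumed form, the energy minimum sits at r = (2B/A)^(1/6), which for these parameters is r = 1 with a well depth of -1. Real fitting targets such as heats of sublimation would need a lattice sum rather than a single pair energy.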
  • An example preferred multivariate optimization method used to search for a low-energy crystal structure is the hydrogen-bond-biased simulated annealing Monte Carlo (SAMC) method; see J. Am. Chem. Soc. 1999, 121, 2115-2122, the entirety of which is incorporated herein by reference.
  • the molecule is built using a molecular modeling program such as QUANTA, and its energy is minimized using a program such as CHARMm, also available from Molecular Simulations Inc.
  • an academic version of the program, referred to as CHARMM, is also available from Harvard University.
  • the molecular frame of reference is preferably positioned at the molecule's center of mass.
  • a trial crystal structure with a given space group is built using a program such as CHARMM.
  • the limits used are: (a) a "loose" window for the lengths of the axes of the unit cell (for example, 30% greater than the largest molecular dimension as an upper limit and 3% less than the smallest dimension of the molecule as the lower limit); and (b) a range of angles corresponding to the allowable degree of molecular rotation.
  • the above limits are chosen to ensure that any van der Waals interaction or contact present in the initially found crystal structure is not energetically unfavorable.
  • crystal energy (CE)
  • $CE = \sum_{ij} \left[ A_{ij}/r_{ij}^{12} - B_{ij}/r_{ij}^{6} + q_i q_j/(4\pi\varepsilon r_{ij}) \right]$
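A pairwise sum of this kind can be sketched directly. The code assumes the crystal energy is a sum over atom pairs of a 12-6 term plus a Coulomb term, with A on the repulsive r^-12 term and B on the attractive r^-6 term. The atom tuples, the parameter table keyed by atom-type pair, and the unit constant `COULOMB_K` (standing in for 1/(4*pi*epsilon)) are illustrative assumptions, not a real force field.

```python
import math

COULOMB_K = 1.0  # 1/(4*pi*epsilon) in the chosen unit system (assumed value)


def pair_energy(r, a, b, qi, qj, k=COULOMB_K):
    """Repulsive r^-12 term, attractive r^-6 term, and Coulomb term."""
    return a / r ** 12 - b / r ** 6 + k * qi * qj / r


def crystal_energy(central_atoms, surrounding_atoms, params):
    """Sum pair energies between a central molecule and its surroundings.

    Each atom is (position, atom_type, charge); params maps a sorted pair of
    atom types to the (A, B) coefficients for that pair.
    """
    total = 0.0
    for pos_i, type_i, q_i in central_atoms:
        for pos_j, type_j, q_j in surrounding_atoms:
            a, b = params[tuple(sorted((type_i, type_j)))]
            total += pair_energy(math.dist(pos_i, pos_j), a, b, q_i, q_j)
    return total


# Toy system: one central atom and two neighbors at unit distance.
params = {("C", "O"): (1.0, 2.0)}
central = [((0.0, 0.0, 0.0), "C", 1.0)]
neighbors = [((1.0, 0.0, 0.0), "O", -1.0), ((0.0, 1.0, 0.0), "O", 0.0)]
ce = crystal_energy(central, neighbors, params)
```

Keeping the pair contributions separate in this way is what allows the lattice energy to be broken down by direction or by atom-atom pair, as the text notes.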
  • unit cell dimensions are used as variables to be searched in the presence of the crystalline environment, and structures are chosen based on whether or not their hydrogen-bonding energies exceed a given value.
  • the SAMC method may be summarized as follows:
  • (1) building, parameterizing, and minimizing the energy of a molecule that will be used for the crystal construction.
  • (2) creating a reference crystal structure based on the molecule created in step (1) by randomly varying the unit cell parameters appropriate for the given crystal space group and the preselected molecular rotational constraint.
  • (3) calculating the crystal energy of the reference crystal and setting the value obtained as CE 0 .
  • (4) generating another crystal as in step (2) based on the given molecular constraints.
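The listed steps can be placed in a generic simulated annealing loop. The sketch below is not the SAMC method itself: the steps after (4) are not reproduced in this section, so the acceptance rule is assumed to be the standard Metropolis criterion, the cooling schedule is a plain geometric one, hydrogen-bond biasing is omitted, and a toy quadratic function stands in for the crystal energy. All names and values are hypothetical.

```python
import math
import random


def anneal(energy, bounds, steps=2000, t_start=10.0, t_end=0.01, seed=0):
    """Generic simulated annealing over parameters drawn within bounds."""
    rng = random.Random(seed)

    def random_cell():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    current = random_cell()    # step (2): reference structure
    ce0 = energy(current)      # step (3): reference energy CE_0
    best, best_e = current, ce0
    for k in range(steps):
        # geometric cooling from t_start down to t_end
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        trial = random_cell()  # step (4): another trial structure
        ce = energy(trial)
        # Metropolis rule (assumed): always accept downhill moves, and accept
        # uphill moves with probability exp(-(ce - ce0) / t)
        if ce <= ce0 or rng.random() < math.exp(-(ce - ce0) / t):
            current, ce0 = trial, ce
            if ce < best_e:
                best, best_e = trial, ce
    return best, best_e


def toy_energy(cell):
    """Stand-in for a crystal energy; minimum at axis lengths (5, 6, 7)."""
    return sum((x - x0) ** 2 for x, x0 in zip(cell, (5.0, 6.0, 7.0)))


bounds = [(3.0, 10.0)] * 3
best_cell, best_energy = anneal(toy_energy, bounds)
```

In a real search, `energy` would build the trial crystal for the chosen space group and evaluate its lattice energy, and the bounds would come from the unit cell and rotation limits described earlier.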
  • the selected structures are characterized, according to expected experimental results, for solid forms corresponding to the structures.
  • the lowest energy structures are characterized by calculating X-ray powder diffraction results for each structure.
  • After characterizing the lowest energy structures according to expected experimental results, the process informatics subsystem preferably compares the expected experimental results with a set of actual experimental results from a first set of experiments. Based on the comparison, the process informatics subsystem assesses which, if any, of the lowest energy structures was produced by each experiment.
  • the process informatics subsystem compares the expected experimental results with the actual experimental results of the first set of experiments by comparing calculated X-ray powder diffraction results for the lowest energy structures with experimentally measured X-ray powder diffraction results for the first set of experiments.
  • the comparison is preferably performed by calculating a similarity measure of the expected experimental results and the actual experimentally measured results.
  • the similarity measure is calculated as
  • SI = d · F · d
  • each experiment is classified as to which predicted lowest energy form was produced. This may be accomplished by classifying each experiment as the predicted low-energy structure having a calculated X-ray powder diffraction pattern most similar to the measured X-ray powder diffraction pattern according to the similarity measure applied. Using the preferred similarity measure, each experiment is classified as the predicted low-energy structure for which the SI measure is the least. Preferably, a threshold is also applied, so that measured patterns for which the least SI is above the threshold are classified as "unknown."
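The least-SI classification with an "unknown" threshold can be sketched as follows. The exact form of SI is not assumed to be known here; the sketch uses a weighted sum of squared intensity differences, with patterns sampled on a common grid of diffraction angles. The weights, pattern values, and threshold are hypothetical.

```python
def similarity_index(measured, predicted, weights=None):
    """Weighted sum of squared differences; smaller means more similar."""
    diffs = [m - p for m, p in zip(measured, predicted)]
    if weights is None:
        weights = [1.0] * len(diffs)
    return sum(w * d * d for w, d in zip(weights, diffs))


def classify_pattern(measured, predicted_patterns, threshold):
    """Return the predicted form with the least SI, or "unknown"."""
    scores = {name: similarity_index(measured, pattern)
              for name, pattern in predicted_patterns.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= threshold else "unknown"


# Hypothetical normalized intensities on a shared grid of four angles.
predicted_patterns = {
    "form I": [1.0, 0.0, 0.5, 0.0],
    "form II": [0.0, 1.0, 0.0, 0.5],
}
close_match = classify_pattern([0.9, 0.1, 0.5, 0.0], predicted_patterns, 0.1)
no_match = classify_pattern([0.5, 0.5, 0.5, 0.5], predicted_patterns, 0.1)
```

Any predicted form that is never returned for any experiment is a candidate missing form, which is what the planning of additional experiments targets.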
  • One preferred way of planning additional experiments to find missing expected solid forms is schematically illustrated in Fig. 5: generating a predictive model, such as a regression model, of the experimental parameters and results from the first set of experiments 507, and interpolating or extrapolating those results to determine sets of experimental parameters likely to produce predicted low-energy structures not produced in the first set of experiments 508.
  • a predictive model, such as a regression model, may be generated using Multivariate Adaptive Regression Splines (MARS).
  • Other regression methods such as linear regression, stepwise linear regression, additive models (AM), projection pursuit regression (PPR), recursive partitioning regression (RPR), alternating conditional expectations (ACE), additivity and variance stabilization (AVAS), locally weighted regression (LOESS), neural networks, principal components regression, partial least squares regression, and support vector regression may also be used.
  • the model is used to determine a second set of distinct combinations of experimental parameters that, according to the model, should produce predicted solid forms that were not produced in the first set of experiments. This may be accomplished by setting the response variable to a value corresponding to a missing predicted solid form and solving the predictive model for one or more sets of values of experimental parameters giving that result.
  • the solution may be found using algebraic or numerical methods readily apparent to those of ordinary skill in the art of using such predictive models.
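A numerical solution of this kind can be sketched with bisection. The example assumes a model with a single experimental parameter and a continuous response that crosses the target value somewhere in the search interval; the fitted model shown is a hypothetical linear one, and a multi-parameter model would need a different solver.

```python
def solve_for_parameter(model, target, lo, hi, tol=1e-6, max_iter=200):
    """Find x in [lo, hi] with model(x) close to target, or None if the
    response does not bracket the target on that interval."""
    f_lo = model(lo) - target
    f_hi = model(hi) - target
    if f_lo * f_hi > 0:
        return None  # no sign change, so bisection cannot start
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = model(mid) - target
        if abs(f_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


def fitted_model(x):
    """Hypothetical fitted response as a function of one parameter."""
    return 0.5 * x + 1.0


# Which parameter value should give a response of 3.5?
x_target = solve_for_parameter(fitted_model, target=3.5, lo=0.0, hi=20.0)
```

Returning `None` when the target is not bracketed makes it explicit that the model predicts no setting in the searched range for that response, which may itself suggest changing reagents or widening the range.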
  • the automated experimentation apparatus is activated to conduct a second set of experiments, each experiment of the second set corresponding to a distinct combination of experimental parameters determined using the predictive model.
  • the second set of experimental results are preferably again compared against predicted experimental results as described above to classify the results according to predicted solid forms and to determine if all predicted low-energy structures have been produced.
  • an optimum or near-optimum solid form is selected 509.
  • data representing the collection of experimental results is processed as a collection of points in a space, such as a topological space, metric space, or vector space comprising dimensions corresponding to the dimensions of the experimental parameters 510.
  • regions of the space in which the selected solid form is produced, and the boundaries between such regions and regions in which other forms or no solid forms are produced may be determined. Additional sets of experiments may be performed to define such regions with greater resolution 511.
  • A set of experimental parameters is thereby determined as far as possible from such boundaries 512. Such a set of parameters is advantageous for manufacture because small variations in manufacturing conditions are less likely to produce a solid form other than the selected form.
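Choosing conditions as far as possible from a boundary can be approximated from the experimental points alone. In this sketch, the distance from a boundary is estimated as the distance to the nearest experiment that produced a different outcome, and the experiment with the largest such margin is selected. The function name and data are hypothetical, and the parameters are assumed to be scaled so that Euclidean distance is meaningful.

```python
import math


def most_robust_conditions(points, labels, target):
    """Return the target-form experiment farthest from any other outcome,
    together with its margin (distance to the nearest other outcome)."""
    others = [p for p, label in zip(points, labels) if label != target]
    best, best_margin = None, -1.0
    for p, label in zip(points, labels):
        if label != target:
            continue
        margin = min(math.dist(p, q) for q in others) if others else math.inf
        if margin > best_margin:
            best, best_margin = p, margin
    return best, best_margin


# Hypothetical results: parameter coordinates and the solid form obtained.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = ["form I", "form I", "form I", "form II", "none"]
conditions, margin = most_robust_conditions(points, labels, "form I")
```

A larger margin means that larger deviations in manufacturing conditions can be tolerated before a different form is expected. Denser sampling near the boundary would make the margin estimate more reliable.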
  • FIG. 9 Another, more prefe ⁇ ed, example method to assess collection of experimental results in a search for novel or known solid forms is schematically illustrated in Fig. 9.
  • the method comprises the steps of: calculating a plurality of clusters of experiments resulting in a solid form based on a measure of similarity of characteristics of the experimental results and/or parameters 905; further characterizing at least one sample solid form from each cluster 907; based on the characterization, assigning a solid-form label to each experiment of each cluster 908.
  • The method also comprises additional optional steps of: displaying clusters in a multivariate display 906; generating a classifier to assign a solid-form label to an input comprising experimental parameters and/or results 909; generating a regression model 910 to estimate one or more expected properties based on an input comprising experimental parameters and/or results; selecting a combination of experimental parameters variable by an automated experimentation apparatus 901; generating a plurality of sets of values of the experimental parameters, providing one or more of the sets to a classifier and/or regression model as input, and, based on the output of the classifier and/or regression model, selecting a plurality of sets of values of experimental parameters corresponding to experiments to be performed 902; providing selected sets of values of experimental parameters to an automated experimentation apparatus 903; and determining Raman spectra for experiments that produce solid forms 904.
  • the method further optionally also comprises the step of: providing one or more individual experimental result sets as input to a classifier and/or regression model.
  • the foregoing steps may be iterated an arbitrary number of times, with variations in the steps performed in each iteration.
  • A preferred embodiment for implementing this method comprises the CRYSTALMAX automated experimentation apparatus configured to determine Raman spectra of solid forms, as described more fully in application 60/318,138, which is incorporated herein by reference.
  • The computational informatics subsystem receives from the process informatics subsystem a plurality of Raman spectra, each spectrum corresponding to a distinct experiment.
  • the computational informatics subsystem then preferably processes the spectra in six stages as schematically illustrated in the flow chart 270 in FIG. 10: preprocessing 271, peak finding 275, similarity matrix calculation 281, spectral clustering 283, and visualization 285.
  • This process preferably also includes a binary spectra generation stage 279 between peak finding 275 and similarity matrix calculation 281.
  • Each of these stages will be described in detail in the following sections. The following discussion relates to Raman spectra, but the same steps can easily be modified and applied to other types of spectra, or other forms of data.
  • the purpose of the preprocessing step is to eliminate artifacts of the Raman spectra that are not caused by Raman scattering and to make the Raman scattering peaks as sharp as possible.
  • Raman spectra often contain large fluorescence peaks spread over a broad spectral range and much smaller, narrower peaks caused by measurement, glass background, and instrument noise.
  • Filtering techniques that can be used to eliminate these deleterious features include Fourier filtering, wavelet filtering, and matched filtering.
  • The preferred embodiment uses a matched filter approach where the filter kernel is a zero-mean, symmetric product of sinusoids matched approximately to an average Raman peak width.
  • the specific form of the matched filter is given by the following equation:
  • the matched filter equation includes a normalization term as follows:
  • the normalization factor ensures that the magnitude of a filtered signal is about the same as the magnitude of the original, and that all peaks point in the right direction.
  • filtered points having a value less than zero are automatically set to equal zero.
  • the bandwidth of the main kernel peak is set to be equal to or slightly smaller than the bandwidth of an average Raman peak.
  • When matched filters of this type are viewed in the Fourier domain, they may be seen to perform as bandpass filters, almost completely attenuating low- and high-frequency spectral components.
  • this filter detects peaks that are very close to each other. A raw, unfiltered spectrum will often display two close peaks as a main peak with a "shoulder" on one of its sides. After a matched filtering step, though, the shoulder will often be distinguished as a separate peak. This separation is useful for the peak picking procedure described below.
  • FIG. 11 shows the Raman intensity of a fluorescent sample as a function of Raman shift and the corresponding filtered spectrum after the fluorescence has been removed.
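  • The preprocessing step described above can be illustrated with a minimal sketch. The kernel equation itself is not reproduced in this text, so the specific kernel below (a slow cosine taper multiplied by a faster cosine), the `half_len` default, and the sum-of-squares normalization are all assumptions chosen only to satisfy the stated properties: zero mean, symmetry, a main lobe about one peak width wide, and clipping of negative output to zero.

```python
import math


def matched_kernel(peak_width, half_len=None):
    """Build a zero-mean, symmetric kernel from a product of two cosines.

    The functional form is an assumption; the source only states the
    kernel's general properties.
    """
    if half_len is None:
        half_len = 2 * peak_width  # hypothetical default support
    xs = range(-half_len, half_len + 1)
    # Slow cosine tapers the kernel to zero at its ends; fast cosine has a
    # central positive lobe roughly peak_width samples wide.
    raw = [
        math.cos(math.pi * x / (2 * half_len)) * math.cos(math.pi * x / peak_width)
        for x in xs
    ]
    mean = sum(raw) / len(raw)
    zero_mean = [v - mean for v in raw]  # flat baselines now filter to zero
    scale = sum(v * v for v in zero_mean)  # assumed normalization term
    return [v / scale for v in zero_mean]


def matched_filter(signal, kernel):
    """Correlate signal with kernel and clip negative output to zero."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)  # clamp indices at the edges
            acc += k * signal[idx]
        out.append(max(acc, 0.0))  # filtered points below zero are set to zero
    return out
```

  • Because the kernel is symmetric and sums to zero, a constant or linearly sloping background contributes essentially nothing away from the edges of the spectrum, which is the bandpass behavior the text describes.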
  • Peak Finding
  • The process of finding peaks in a spectrum is an important aspect of many spectral processing techniques, and there are many commercially available programs for performing this task. Many variations of peak finding algorithms can be found in the literature. An example of a simple algorithm is to find the zero-crossings of the first derivative of a smoothed or unsmoothed spectrum, and then to select the concave-down zero-crossings that meet certain height and separation criteria.
  • the peak finding function available in the software provided with the Almega dispersive Raman spectrometer (Thermo Nicolet, OMNIC software) was used. This function allows the threshold and sensitivity values to be set by the user. The threshold sets the lowest peak height that will be counted as a peak, and the sensitivity controls how far apart each peak must be to count as a separate peak.
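  • The simple derivative-based algorithm mentioned above can be sketched as follows. This is not the OMNIC function; the names `threshold` and `min_separation` only loosely mirror the threshold and sensitivity settings described in the text, and the rule of keeping the taller of two crowded peaks is an assumption.

```python
def find_peaks(spectrum, threshold=0.0, min_separation=1):
    """Return indices of local maxima that pass height and separation checks."""
    candidates = []
    for i in range(1, len(spectrum) - 1):
        left = spectrum[i] - spectrum[i - 1]
        right = spectrum[i + 1] - spectrum[i]
        # Concave-down zero-crossing of the first difference:
        # the slope goes from positive to non-positive.
        if left > 0 and right <= 0 and spectrum[i] >= threshold:
            candidates.append(i)
    # Visit candidates from tallest to shortest so that, when two peaks are
    # closer than min_separation, the taller one is the one kept.
    peaks = []
    for i in sorted(candidates, key=lambda k: spectrum[k], reverse=True):
        if all(abs(i - p) >= min_separation for p in peaks):
            peaks.append(i)
    return sorted(peaks)
```

  • For example, on the list `[0, 1, 3, 1, 0, 2, 0, 5, 4.5, 4.8, 1, 0]` a threshold of 2.5 with a minimum separation of 3 keeps the peaks at indices 2 and 7 and rejects the small peak at index 5 and the crowded peak at index 9.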
  • binary spectral representations are preferably created for all of the spectra.
  • These binary spectra representations comprise vectors of ones and zeros. Each zero represents the absence of a peak feature and each one represents the presence of a peak feature.
  • a peak feature is simply a peak that occurs within a certain spectral range, preferably a few wave numbers.
  • The vectors for all of the spectra are preferably the same length, and corresponding elements of these vectors correspond to the same peak feature.
  • the peaks are clustered into ranges of peak features.
  • The process used to perform this peak clustering is a modified form of a 1-dimensional iterative k-means clustering algorithm.
  • the process begins with the picked peaks from a single spectrum. These peak positions are used to define the centers of peak feature ranges.
  • the peak feature bins cover a range of wave numbers that can be specified by a user (the default is 5 wave numbers).
  • the rest of the spectra are then iteratively added to the peak feature representation. At each step any peak that fits into a pre-existing peak feature range is added to that range. For any peak that does not fit into a range, a new range is created. Centers are not permitted to move so that peak feature ranges overlap.
  • Spectrum 1, for example, has a peak in each of the ranges corresponding to wave numbers 270, 350, 430 and 510, but does not have a peak in the bin associated with wave number 390.
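  • A minimal sketch of the binary spectrum construction follows. It is a simplification, not the modified k-means procedure itself: bin centers are fixed at the first peak that opens each bin and are never updated, and nothing here enforces that neighboring ranges stay disjoint. The function names and the peak positions in the usage note are hypothetical.

```python
def build_peak_features(peak_lists, width=5.0):
    """Collect peak-feature centers, visiting the spectra one at a time."""
    centers = []
    for peaks in peak_lists:
        for p in peaks:
            # A peak joins an existing range if it lies within half the bin
            # width of that range's center; otherwise it opens a new range.
            if not any(abs(p - c) <= width / 2 for c in centers):
                centers.append(p)
    return sorted(centers)


def to_binary(peaks, centers, width=5.0):
    """Encode one spectrum as ones (peak present) and zeros (peak absent)."""
    return [
        1 if any(abs(p - c) <= width / 2 for p in peaks) else 0
        for c in centers
    ]
```

  • With hypothetical peak lists `[271, 349, 430, 511]` and `[270, 390, 432]`, the centers become `[271, 349, 390, 430, 511]` and the first spectrum encodes as `[1, 1, 0, 1, 1]`, with the zero marking the missing peak near wave number 390.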
  • a similarity measure between pairs of spectra is calculated.
  • the similarity measure is calculated between each distinct pair of spectra.
  • This similarity measurement is used to determine one or more clusters of similar spectra.
  • Example similarity measurements include metric distances such as Hamming, Lp, or Euclidean distance, or non-metric similarity indices such as the Tversky similarity index (or its derivatives such as the Tanimoto or Dice coefficients) or functions thereof.
  • N_mn = number of peak values in a first spectrum within 5 cm⁻¹ of a peak value in a second spectrum
  • N_m = number of peak values in the first spectrum
  • N_n = number of peak values in the second spectrum. Similarity can then be calculated using various methods, e.g.,
  • FIGS. 12A and 12B were generated using the foregoing method.
  • a = number of 1's in a first spectrum that are zeros in a second spectrum
  • b = number of 1's in a second spectrum that are zeros in the first spectrum
  • c = number of 1's in the first spectrum that are ones in the second spectrum
  • The Tanimoto coefficient is equal to the Tversky index with α and β equal to 1.
  • The Dice coefficient is equal to the Tversky index with α and β equal to 0.5.
  • 1 - Tanimoto coefficient is used as the (dis)similarity measure. Additional metrics, including metrics based on other metrics, may be used in alternative embodiments of the invention.
  • the selected similarity measure is preferably calculated for each distinct pair of spectra. This calculation may be represented as a symmetric similarity matrix with each element (i,j) of the matrix representing the distance or similarity between spectra i and j.
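  • The counts a, b and c defined above combine into the Tversky index in its standard form, c / (c + αa + βb). The sketch below computes it for binary spectra and builds the symmetric matrix of 1 - Tanimoto distances; returning a similarity of 1.0 for two spectra with no peaks at all is an assumption, since the text does not address that case.

```python
def tversky(x, y, alpha=1.0, beta=1.0):
    """Tversky index of two equal-length binary vectors.

    alpha = beta = 1 gives the Tanimoto coefficient;
    alpha = beta = 0.5 gives the Dice coefficient.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # only in x
    b = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # only in y
    c = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # in both
    denom = c + alpha * a + beta * b
    return c / denom if denom else 1.0  # assumed value for two empty spectra


def distance_matrix(spectra):
    """Symmetric matrix whose element (i, j) is 1 - Tanimoto(i, j)."""
    n = len(spectra)
    return [
        [1.0 - tversky(spectra[i], spectra[j]) for j in range(n)]
        for i in range(n)
    ]
```

  • For the binary spectra `[1, 1, 0, 1, 1]` and `[1, 0, 1, 1, 0]`, a = 2, b = 1 and c = 2, so the Tanimoto coefficient is 2/5 and the Dice coefficient is 4/7.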
  • Spectral Clustering
  • Using the similarity measure calculated between spectra, a clustering algorithm is applied to determine one or more clusters of similar spectra.
  • Various clustering algorithms may be used.
  • Hierarchical clustering including agglomerative and stepwise-optimal hierarchical clustering, k-means clustering, Gaussian mixture model clustering, or self-organizing-map (SOM) -based clustering, clustering using the Chameleon, DBScan, CURE, or Rock clustering algorithms are some of the clustering methods that may be used. See Kohonen, T., “Self-organizing Maps", Springer Series in Information Sciences, Vol.
  • hierarchical clustering is used as a first-pass method of spectral data processing. Using the information from the hierarchical clustering run, a step of k-means clustering is then performed with user-defined cluster numbers and initial centroid positions.
  • The number of clusters can be automatically selected in order to minimize some metric, such as the sum-of-squared error or the trace or determinant of the within-cluster scatter matrix. See Duda, R., Hart, P., and Stork, D., "Pattern
  • Hierarchical clustering produces a dendrogram-sorted list of spectra, so that similar spectra are very close to each other.
  • This dendrogram-sorted list is used to rearrange both axes of the original similarity matrix and then present the "sorted similarity" matrix in a coded manner wherein similarity indicia are used for each similarity region, including without limitation different symbols (such as cross-hatching), shades of color, or different colors.
  • The "sorted similarity" matrix is presented in a color-coded manner, with regions of high similarity in warm colors and regions of low similarity in cool colors.
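  • The dendrogram-sorted display can be sketched with a small agglomerative clustering routine. Average linkage is an assumption (the text does not name the linkage used), and the leaf order is obtained simply by concatenating the member lists of merged clusters. The k-means refinement step is not shown.

```python
def dendrogram_order(dist):
    """Agglomerative clustering; returns spectrum indices in leaf order."""
    clusters = [[i] for i in range(len(dist))]

    def average_linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        # Merging by concatenation keeps similar spectra adjacent in the list.
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return clusters[0]


def reorder(dist, order):
    """Rearrange both axes of the matrix into the given order."""
    return [[dist[i][j] for j in order] for i in order]
```

  • After reordering, small distances gather in blocks along the diagonal, which is what makes groups of similar spectra visible when the matrix is color-coded.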
  • An example Raman clustering application is written in Visual Basic (VB).
  • This VB program allows a user to select a group of spectra and set processing parameters. Preprocessing is performed within the VB application and then the filtered spectra are sent to OMNIC for peak finding through the Macros/Pro DDE communication layer provided by OMNIC. Once peaks are found, binary spectrum and distance matrix generation is performed in the main VB application. Then, the distance matrix is sent to MATLAB through a socket communication layer. In MATLAB, clusters are generated and visualizations are created. These visualizations are made available to the main VB application through a web server present on the same machine as the MATLAB instance. The resulting visualization allows for the easy identification of groups of samples that all have similar physical structure.
  • After clusters have been calculated, it is desirable to correlate clusters with corresponding solid forms. This is preferably accomplished by selecting one sample, or preferably a plurality of samples, from each cluster and characterizing the selected sample or samples with additional experimental techniques, such as powder X-ray diffraction and/or differential calorimetry. In a preferred embodiment, the clustering techniques result in clusters of experimental results all of which produced the same solid form. Based on the additional experimental characterization, solid-form labels reflecting the solid form produced by the experiments of the cluster are associated with the experimental result sets by the computational informatics subsystem.
  • Regression models are preferably used in combination with the experimental result sets and the corresponding values of experimental parameters to generate one or more regression models and/or classifiers for use in planning and assessing further experiments, or estimating properties for conditions that have not been experimentally verified.
  • regression models may be used to estimate properties over a continuous range reflecting an infinite number of different conditions.
  • One preferred approach to generating the first set of experiments in what may be a succession of iterative experiments is to systematically create a diverse set of experiments in a property/descriptor space of potential interest.
  • Experimental parameters that may be varied by the automated experimentation apparatus must be selected, and values for those parameters determined, in order to conduct a set of experiments. Parameters may be selected by scientists acting on knowledge of the chemistry of the compound-of-interest, or the computational informatics system may guide the selection or suggest parameters by querying the database for similar compounds of interest and analyzing which descriptors were significant in prior experiments and/or simulations. The descriptors may then be mapped onto parameters that may be varied by the automated experimentation apparatus.
  • Stepwise algorithms are straightforward, but can lead to suboptimal results.
  • a regression or classification is performed using each possible independent variable. The variable that performs the best is added to the model. The regression or classification is then performed again with the first variable and all possible second variables. The best second variable is then added to the model. Additional variables are added in similar fashion. This process is preferably continued a set number of times or until some measure of predictive ability reaches a minimum.
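  • The stepwise procedure above can be sketched as a greedy loop. The `score` callable is a stand-in supplied by the caller for whatever measure of predictive ability is used (higher is better here); it and the variable names in the usage note are hypothetical and not taken from the source.

```python
def forward_stepwise(candidates, score, max_vars):
    """Greedily add the variable that most improves score(subset)."""
    selected = []
    best_score = float("-inf")
    while len(selected) < max_vars:
        # Refit with the current model plus each remaining candidate variable.
        trials = [
            (score(selected + [v]), v) for v in candidates if v not in selected
        ]
        if not trials:
            break
        trial_score, trial_var = max(trials, key=lambda t: t[0])
        if trial_score <= best_score:
            break  # stop once no candidate improves the model
        selected.append(trial_var)
        best_score = trial_score
    return selected, best_score
```

  • Because each step only considers adding one variable to the model already chosen, the result depends on the order of selection, which is why stepwise methods can settle on a suboptimal subset.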
  • Genetic algorithms randomly create a population of models (with say 2-10 independent variables). Then regression or classification is performed with each of the models, and each model is scored for predictive power. Genetic operations (e.g. mutation - adding, deleting, or changing variables; or crossover - taking variables from one model and mixing them with variables from another model) are then performed on the models based on their score. The process is iterated either a set number of times or until some condition is met (such as 10-100 iterations without improvement). The best model from any population is then selected. Simulated annealing starts with a randomly generated model (the "current solution"). The model is then perturbed (e.g. add, delete, or change a variable).
  • The perturbed model is accepted as the new current solution if it scores better than the current solution, or if a random number generated uniformly between 0 and 1 is less than $\exp\left(-(R^2_{\mathrm{current}} - R^2_{\mathrm{new}})/T\right)$.
  • The parameter T is a temperature parameter that is lowered over the course of iterations, making it increasingly harder to accept a "worse" solution as the current solution. This process continues until a stopping condition is met (such as T reaching a predetermined value). See Zheng, W. and Tropsha, A., Novel Variable Selection Quantitative Structure-Property Relationship Approach Based on the k-Nearest-Neighbor Principle, J. Chem. Inf. Comput. Sci. 2000, 40, 185-194.
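  • The acceptance rule and cooling loop can be sketched as follows. The geometric cooling schedule, the default temperatures, and the `perturb` and `score` callables are assumptions for illustration; the text specifies only the acceptance test and that T is lowered until a stopping value is reached.

```python
import math
import random


def accept(r2_current, r2_new, temperature, rng):
    """Accept a better model always, a worse one with probability exp(-d/T)."""
    if r2_new >= r2_current:
        return True
    return rng.random() < math.exp(-(r2_current - r2_new) / temperature)


def anneal(initial, perturb, score, t_start=1.0, t_stop=1e-3, cooling=0.9, seed=0):
    """Simulated annealing over models; returns the best model seen."""
    rng = random.Random(seed)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    t = t_start
    while t > t_stop:  # stopping condition: T reaches a predetermined value
        candidate = perturb(current, rng)  # e.g. add, delete, or change a variable
        candidate_score = score(candidate)
        if accept(current_score, candidate_score, t, rng):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        t *= cooling  # assumed geometric cooling schedule
    return best, best_score
```

  • With a score drop of 0.1, the acceptance probability is about 0.37 at T = 0.1 but falls to about 0.00005 at T = 0.01, which shows how lowering T makes worse solutions increasingly unlikely to be accepted.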
  • Another choice that a user or informatics system preferably makes is the manner in which the selected descriptors will be mathematically combined in order to generate values for experimental parameters corresponding to experiments to be performed.
  • the arithmetic mean, the geometric mean, the standard deviation, or the geometric standard deviation may be computed by weight, by volume, or by mole fraction to combine descriptors mathematically to determine values for experimental parameters.
  • the Hansen solubility parameter for a mixture is calculated by taking the arithmetic mean by volume of the Hansen solubility parameters for each of the mixture components.
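  • The volume-weighted arithmetic mean described above is a short calculation. In the sketch below each component is a tuple of three Hansen parameters (dispersion, polar, hydrogen bonding); the function names and the numbers in the usage note are placeholders, not measured solvent data.

```python
def mix_descriptor(values, volume_fractions):
    """Arithmetic mean of a descriptor weighted by volume fraction."""
    total = sum(volume_fractions)  # tolerate fractions that do not sum to 1
    return sum(v * f for v, f in zip(values, volume_fractions)) / total


def mix_hansen(components, volume_fractions):
    """Mix (dispersion, polar, hydrogen-bonding) parameters component-wise."""
    return tuple(
        mix_descriptor([c[k] for c in components], volume_fractions)
        for k in range(3)
    )
```

  • For two placeholder components `(16.0, 8.0, 20.0)` and `(15.0, 16.0, 40.0)` mixed 75:25 by volume, the mixture parameters are `(15.75, 10.0, 25.0)`.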
  • the user or the informatics system preferably also chooses an algorithm for performing diversification and a metric by which diversification is to be measured.
  • a tournament scheme is used in which for every mixture added to an experiment, 20-100 possible mixtures are randomly generated (random in their components), their mixture descriptors are calculated, and the one that is furthest from any other point already in the experiment is added to the experiment. This method seeks to maximize the minimum distance between any two experiments. Other algorithms may be used.
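  • The tournament scheme can be sketched as below. Euclidean distance in descriptor space, the random choice of the first experiment, and the `generate` callable that produces one random candidate's descriptor vector are all assumptions; the text leaves the distance metric to the user or informatics system.

```python
import math
import random


def min_distance(point, chosen):
    """Distance from point to its nearest already-selected experiment."""
    return min(math.dist(point, c) for c in chosen)


def tournament_select(n_experiments, generate, n_candidates=50, seed=0):
    """Build a diverse design by repeated maximin tournaments."""
    rng = random.Random(seed)
    chosen = [generate(rng)]  # the first experiment is simply random
    while len(chosen) < n_experiments:
        # Generate a pool of random candidates and keep the one whose nearest
        # selected neighbor is farthest away.
        candidates = [generate(rng) for _ in range(n_candidates)]
        chosen.append(max(candidates, key=lambda p: min_distance(p, chosen)))
    return chosen
```

  • A caller would pass something like `lambda rng: (rng.random(), rng.random())` as `generate` for a two-descriptor space; `n_candidates` corresponds to the 20-100 random mixtures per added experiment mentioned in the text.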
  • the user also preferably selects the maximum number of components in an experiment.
  • the methods and systems of the present invention may be used to great advantage in a system and method for pharmaceutical product development "pipeline" management.
  • Pharmaceutical companies typically have a large number of compounds and new therapeutic uses of compounds at a variety of stages in the development, testing, and marketing process, or "pipeline." Many of these stages, particularly pharmaceutical testing stages, are dramatically expensive, and the number of compounds that proceed from one stage to the next is often reduced by an order of magnitude.
  • pharmaceutical testing means any investigations required or used for approval of a New Drug Application by the U.S. Food and Drug Administration.
  • Fig. 13 schematically represents a simplified set of development, testing and marketing stages, and a corresponding qualitative indication of the manifold reduction in the number of compounds at each stage in the process.
  • the various stages in the product development process also provide data that change the ultimate form, formulation, manufacturing and distribution of the very small percentage of compounds in the development process that ultimately are marketed as pharmaceuticals. Many of these changes must be reflected in the Food and Drug Administration New Drug Application process, and may form a part of the labeling for the product. Because some results from very expensive portions of the product development process, such as human safety or effectiveness testing, may not be usable if the product must be reformulated or produced in a different solid form, it is desirable to determine the full range of solid forms of a candidate compound that may be produced, and to assess the properties of the selected solid form and formulation before large amounts of resources are expended on a solid form or formulation that differs from the final product.
  • an important goal of market-driven pharmaceutical concerns is to maximize the profitability of the entire pipeline.
  • candidate compounds that pose expensive difficulties due to unfavorable solid forms or formulation difficulties may be lowered in priority until those difficulties are overcome, and those candidates with acceptable formulations and solid forms may be increased in priority.
  • the process of drug research and development is extremely complex and successfully taking a pharmaceutical product through the complex pathway from discovery of the API through subsequent safety and efficacy testing requires scientific expertise in many different areas.
  • The stages of drug research and development include, but are not limited to: discovery of an API; synthesis, and chemical and physical characterization of the API; pharmacology; pharmacokinetics; formulation development, synthesis, and chemical and physical characterization of formulations; animal safety testing; chemistry, manufacturing, and control testing; and clinical studies and human studies, including without limitation Phase I, Phase II, Phase III, and Phase IV, and various "sub-phases" of the same.
  • There are many ways to describe or group the pharmaceutical research and development process, and one example is shown in Figure 14. It is important to recognize that while Figure 14 is a linear representation, drug research and development is often a circular, or iterative, process, as the results of one step may indicate that additional work in a previously completed step needs to be performed. Moreover, many steps occur throughout the research and development process, and beyond product launch. For example, chemistry, manufacturing and controls (CMC) testing (which includes, for example, analytical data, quality analysis, and stability testing in accordance with cGMP and other global regulatory standards) is essential at all phases of the research and development process, as well as the production and post-approval processes.
  • formulation development and scale-up are sometimes designated as "preclinical” because of the nature of the work being performed, or they can be designated as “clinical” because they generally occur in association with and throughout the human testing phase of the process.
  • the goals of the preclinical or nonclinical safety evaluation include: characterization of toxic effects with respect to target organs, dose dependence, relationship to exposure, and potential reversibility. This information is important for estimating an initial safe starting dose for the human trials and the identification of parameters for clinical monitoring for potential adverse effects.
  • Preclinical research or testing includes, but is not limited to API synthesis, chemical and physical characterization of API, pharmacology, toxicology, metabolism, bioanalysis, pharmaceutical analysis, and biosafety testing.
  • Preclinical research can also encompass studies that relate to the transition from preclinical to clinical, including Phase I studies, which typically provide a preliminary evaluation of a compound's safety, tolerance, and pharmacokinetics.
  • preclinical research is carried out throughout all phases of the drug research and development process, and thorough preclinical research or testing maximizes the likelihood that a drug will be successful in the clinical phases of the process.
  • Chemical and physical characterization of a compound (or API) during the preclinical phase can include, but is not limited to: identification (e.g., by spectroscopic analysis); determining chromatographic purity; hygroscopicity determination; solubility studies; pKa determination; partitioning studies; characterization studies; short-term or accelerated stability; early formulation and excipient compatibility studies; developing chromatographic analysis of the API; residual solvents identification and quantitation; and reference standard certification.
  • Preclinical pharmacology and toxicology studies can include, but are not limited to, studies on or of: the pharmacological actions of the compound relating to its proposed therapeutic indication(s); defining the pharmacological properties of the compound; possible adverse effects of the compound; the toxicological effects of the compound relating to the compound's intended clinical uses, including without limitation assessing acute, sub-acute, and chronic toxicity (including single and repeated dose toxicity studies), and carcinogenicity; toxicities related to the compound's particular mode of administration or conditions of use; local tolerance; the effects of the compound on reproduction and on developing fetuses, or reproduction toxicity; genotoxicity; and the absorption, distribution, metabolism, and excretion of the compound in animals.
  • Metabolism studies relate to determining how a compound is absorbed, distributed, metabolized, and eliminated (ADME) from the body.
  • Such studies can include whole-body autoradiography (WBA). Human metabolism (AME) can be studied in human clinical trials using C14 or another radiolabeled form of the compound.
  • Safety pharmacology studies in general are studies that investigate the potential undesirable pharmacodynamic effects of a compound on physiological functions when a subject is exposed to the compound in the proposed therapeutic dosage range and above.
  • The goals of safety pharmacology studies include (1) identifying undesirable pharmacodynamic properties that may have relevance to human safety, (2) evaluating adverse pharmacodynamic and/or pathophysiological effects of a compound observed in toxicology and/or clinical studies, and (3) investigating the mechanism or mode of action of observed and/or suspected adverse pharmacodynamic effects.
  • Some safety pharmacology determinations or endpoints may be incorporated into the design of various toxicology, pharmacokinetic, and clinical studies, while in other cases specific safety pharmacology studies are needed.
  • the specific safety pharmacology studies that should be conducted and their design will vary based on the individual properties and intended uses of a compound or pharmaceuticals, but in general, factors to be considered when determining what safety pharmacology studies are needed include, but are not limited to: adverse effects related to the therapeutic class of the API, as the mechanism of action of the API may suggest certain adverse effects; adverse effects associated with members of the chemical or therapeutic class, but independent of the primary pharmacodynamic effects; ligand binding or enzyme assay data suggesting a potential for adverse effects; results from previous safety pharmacology studies, from secondary pharmacodynamic studies, from toxicology studies, or from human use that suggest further investigation to establish and characterize the relevance of the findings to potential adverse effects in humans.
  • a hierarchy of organ systems can be used, wherein the hierarchy is established based on an organ's importance with respect to life-supporting functions.
  • Vital organs or systems whose functions are acutely critical for life (e.g., the respiratory, cardiovascular, and central nervous systems), are the most important to assess in safety pharmacology studies.
  • Other organ systems (e.g., the gastrointestinal or renal systems)
  • Safety pharmacology evaluation of these other systems may be of particular importance when considering factors such as likely clinical trial or patient populations (e.g., gastrointestinal tract in Crohn's disease, or immune system in immunocompromised patients).
  • Preclinical or nonclinical studies such as pharmacokinetics and pharmacodynamics provide very important information for transitioning a compound from the preclinical stage to Phase I of the clinical stage.
  • The areas or parameters that are encompassed by pharmacokinetic studies include, without limitation, studies directed to: bioavailability/bioequivalence; absolute bioavailability; food effect; age or gender effects; nutritional effects; dose tolerance escalation; dose proportionality; controlled release; drug/drug interaction; and radiolabeled AME.
  • Parameters that can be evaluated or measured in pharmacodynamic studies include, but are not limited to: gastric acid secretion; central nervous system (CNS) effects; cardiovascular effects; platelet aggregation; blood coagulation; bronchial challenge; wheal and flare response; endocrinology; and immunology.
  • Human clinical trials are conducted to demonstrate the efficacy and safety of a compound, beginning with a relatively low exposure in a small number of subjects, followed by clinical trials in which exposure usually increases by dose, duration, and/or size of the exposed patient population.
  • Clinical trials are extended based on the demonstration of adequate safety in the previous clinical trial(s) as well as additional preclinical or nonclinical safety information that is available as the clinical trials proceed. Serious adverse clinical or nonclinical findings may influence whether clinical trials are continued and/or suggest the need for additional nonclinical studies and a reevaluation of previous clinical adverse events to resolve the issue.
  • Phase I clinical studies are the first human exposure studies, and generally consist of single dose studies, followed by dose escalation and short-term repeated dose studies to evaluate pharmacokinetic parameters and tolerance. Phase I clinical trials are often conducted in healthy volunteers (or "normals") but may also include patients suffering from the indication to be treated. Phase II clinical studies (also referred to as therapeutic exploratory studies) comprise exploratory efficacy and safety studies in patients. Phase III clinical studies (also referred to as therapeutic confirmatory studies) comprise confirmatory clinical trials for efficacy and safety in patient populations.
  • Phase IV studies are also referred to as therapeutic use studies or periapproval studies. Phase IV studies generally go beyond the prior demonstration of the drug's safety, effectiveness, and dose definition. While Phase IV studies are not necessary for product approval, they often are very important for optimizing the product's or drug's use, and such studies include, but are not limited to: additional drug-drug interaction studies; dose-response, or safety studies; and studies designed to support an extended claim under the approved indication (e.g., mortality/morbidity studies).
  • Phase IV is also used to refer to studies focused on other aspects, including without limitation: expanding scientific understanding of the product; competitive comparisons to support marketing claims; safety confirmation, such as evaluating rare or infrequent adverse events; evaluating the drug for specialized markets such as pediatric use; and expansion of the product labeling and optimization of dose.
  • CMC testing is a continuous element of the pharmaceutical research and development process, and studies or tests that are referred to as CMC testing are often also grouped into other steps of the R&D process.
  • CMC studies or tests include without limitation:
  • Discovery/preclinical to Phase I studies including but not limited to, identification (e.g., by spectroscopic analysis), determining chromatographic purity, hygroscopicity determination, solubility studies, pKa determination, partitioning studies, characterization studies, short-term or accelerated stability, excipient compatibility studies, developing chromatographic analysis of the API, residual solvents identification and quantitation, and reference standard certification;
  • Phase I, II, and III studies including but not limited to, stability testing of product used in Phase I studies, validation of analytical methods for API and Phase I product, release of Phase I, II and III clinical test materials and product for long-term toxicology studies, long term stability, refinements and validations of analytical methods resulting from product and process improvements, microbial testing, stress testing, extractability/leaching studies on product container & closures, cleaning validation support, and analysis of manufacturing process validation samples; and
  • Post- Approval studies including but not limited to, quality control (QC) release of excipients, QC release of API, QC release of formulated product, scale-up and post- approval changes, and post-approval stability studies.
  • The architecture of a preferred example embodiment is schematically illustrated in Fig. 6.
  • The computational informatics subsystem comprises two database systems: an Online Transaction Processing (OLTP) database system 601 and an Online Analytical Processing (OLAP) database system 602.
  • The OLTP system 601 comprises an Oracle 8i object-oriented relational database management system with partitioning option running under Solaris 2.8 on a Sun Microsystems Sunfire 4810 with four 750 megahertz UltraSparc III CPUs and 4 gigabytes of RAM.
  • The OLAP system 602 comprises an Oracle 9i Data Warehouse database utilizing a snowflake schema running under Solaris 2.8 on a Sun Microsystems Sunfire 6800 with sixteen 900 megahertz UltraSparc III CPUs and 24 gigabytes of RAM.
  • Windows systems 603 preferably comprise a variety of personal workstation hardware ranging from typical desktop PCs to high-performance workstations with visualization hardware.
  • The OLTP system 601 and the OLAP system 602 are preferably interconnected with gigabit Ethernet.
  • Windows systems 603 are typically connected to the computational informatics subsystem by a variety of heterogeneous networks, including the Internet.
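
The snowflake schema mentioned for the OLAP system 602 can be illustrated with a minimal sketch. It is not the schema of the described embodiment, which is not disclosed here. It assumes an in-memory SQLite database as a stand-in for the Oracle data warehouse, and every table name, column name, and data value (dim_compound, dim_sample, fact_result, solubility_mg_ml, "compound A") is hypothetical. The point it shows is the defining feature of a snowflake schema: a fact table references a dimension table, which is itself normalized into a further dimension table, so analytical queries join through the chain.

```python
import sqlite3


def build_snowflake(conn):
    """Create a tiny snowflake schema (all names are hypothetical)."""
    conn.executescript(
        """
        -- Outer dimension: one row per compound.
        CREATE TABLE dim_compound (
            compound_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        -- Inner dimension: a sample is normalized to point at its compound.
        CREATE TABLE dim_sample (
            sample_id   INTEGER PRIMARY KEY,
            compound_id INTEGER NOT NULL REFERENCES dim_compound(compound_id),
            solvent     TEXT NOT NULL
        );
        -- Fact table: one measured value per screened sample.
        CREATE TABLE fact_result (
            result_id        INTEGER PRIMARY KEY,
            sample_id        INTEGER NOT NULL REFERENCES dim_sample(sample_id),
            solubility_mg_ml REAL NOT NULL
        );
        """
    )


def load_example(conn):
    """Insert made-up illustrative rows; these are not data from the source."""
    conn.executemany(
        "INSERT INTO dim_compound VALUES (?, ?)",
        [(1, "compound A"), (2, "compound B")],
    )
    conn.executemany(
        "INSERT INTO dim_sample VALUES (?, ?, ?)",
        [(10, 1, "water"), (11, 1, "ethanol"), (12, 2, "water")],
    )
    conn.executemany(
        "INSERT INTO fact_result VALUES (?, ?, ?)",
        [(100, 10, 2.0), (101, 11, 6.0), (102, 12, 1.5)],
    )
    conn.commit()


def mean_solubility_by_compound(conn):
    """Analytical query: join fact -> sample -> compound and aggregate."""
    return conn.execute(
        """
        SELECT c.name, AVG(f.solubility_mg_ml)
        FROM fact_result AS f
        JOIN dim_sample   AS s ON s.sample_id = f.sample_id
        JOIN dim_compound AS c ON c.compound_id = s.compound_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_snowflake(conn)
load_example(conn)
print(mean_solubility_by_compound(conn))
```

In an arrangement like the one described, individual screening results would presumably be written to the transactional (OLTP) side and periodically loaded into a warehouse of this shape for aggregate queries; that loading step is an assumption and is not shown.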

Landscapes

  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Library & Information Science (AREA)
  • Medicinal Preparation (AREA)
  • Analysing Materials By The Use Of Radiation (AREA)

Abstract

The invention relates to a method and a system for planning and evaluating the results of high-throughput screening of solid forms and high-throughput screening of formulations. The invention also relates to methods and systems for using high-throughput screening of solid forms and high-throughput screening of formulations to select compounds and formulations for further analysis, or to prioritize that analysis.
PCT/US2002/014601 2001-05-11 2002-05-10 Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds Ceased WO2002093297A2 (fr)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU2002303683A AU2002303683A1 (en) 2001-05-11 2002-05-10 Methods for high-throughput screening and computer modelling of pharmaceutical compounds
IL15878702A IL158787A0 (en) 2001-05-11 2002-05-10 Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
CA002447047A CA2447047A1 (fr) 2001-05-11 2002-05-10 Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
EP02731725A EP1395808A4 (fr) 2001-05-11 2002-05-10 Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US29032001P 2001-05-11 2001-05-11
US60/290,320 2001-05-11

Publications (2)

Publication Number Publication Date
WO2002093297A2 true WO2002093297A2 (fr) 2002-11-21
WO2002093297A3 WO2002093297A3 (fr) 2003-09-25

Family

ID=23115464

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/014601 Ceased WO2002093297A2 (fr) 2001-05-11 2002-05-10 Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds

Country Status (5)

Country Link
EP (1) EP1395808A4 (fr)
AU (1) AU2002303683A1 (fr)
CA (1) CA2447047A1 (fr)
IL (1) IL158787A0 (fr)
WO (1) WO2002093297A2 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7061605B2 (en) 2000-01-07 2006-06-13 Transform Pharmaceuticals, Inc. Apparatus and method for high-throughput preparation and spectroscopic classification and characterization of compositions
CN106897331A (zh) * 2016-06-07 2017-06-27 阿里巴巴集团控股有限公司 Method and device for acquiring user key location data
WO2017199006A1 (fr) * 2016-05-17 2017-11-23 Tap Biosystems (Phc) Limited Automated bioprocess development
EP2676215A4 (fr) * 2011-02-14 2018-01-24 Carnegie Mellon University Learning to predict the effects of compounds on targets
CN109142574A (zh) * 2018-08-29 2019-01-04 广东药科大学 SVR-based method for studying the material basis by which Gegen Qinlian decoction improves insulin resistance
CN112154174A (zh) * 2018-05-22 2020-12-29 埃克森美孚化学专利公司 Methods of forming films and associated computing devices
CN113963756A (zh) * 2021-05-18 2022-01-21 杭州剂泰医药科技有限责任公司 Platform for developing pharmaceutical formulations and its application
WO2024169983A1 (fr) * 2023-02-17 2024-08-22 Basf Se Optimization of recipes with respect to the combustion behavior of polyurethane foams

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107357966B (zh) * 2017-06-21 2020-07-03 山东科技大学 Evaluation method for predicting the stability of rock surrounding a mining roadway
US12446600B2 (en) 2021-01-26 2025-10-21 Citrine Informatics, Inc. Two-stage sampling for accelerated deformulation generation

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5705333A (en) * 1994-08-05 1998-01-06 The Regents Of The University Of California Peptide-based nucleic acid mimics(PENAMS)
AU9378098A (en) * 1997-09-05 1999-03-22 Molecular Simulations, Inc. Modeling interactions with atomic parameters including anisotropic dipole polarizability
US20020061599A1 (en) * 1999-12-30 2002-05-23 Elling Christian E. Method of identifying ligands of biological target molecules
US6907350B2 (en) * 2000-03-13 2005-06-14 Chugai Seiyaku Kabushiki Kaisha Method, system and apparatus for handling information on chemical substances
CA2441931A1 (fr) * 2001-03-23 2002-10-03 Douglas A. Levinson Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7061605B2 (en) 2000-01-07 2006-06-13 Transform Pharmaceuticals, Inc. Apparatus and method for high-throughput preparation and spectroscopic classification and characterization of compositions
EP2676215A4 (fr) * 2011-02-14 2018-01-24 Carnegie Mellon University Learning to predict the effects of compounds on targets
US11525836B2 (en) 2016-05-17 2022-12-13 The Automation Partnership (Cambridge) Limited Automated bioprocess development
WO2017199006A1 (fr) * 2016-05-17 2017-11-23 Tap Biosystems (Phc) Limited Automated bioprocess development
CN109154588A (zh) * 2016-05-17 2019-01-04 自动化合作关系(剑桥)有限公司 Automated bioprocess development
CN106897331B (zh) * 2016-06-07 2020-09-11 阿里巴巴集团控股有限公司 Method and device for acquiring user key location data
CN106897331A (zh) * 2016-06-07 2017-06-27 阿里巴巴集团控股有限公司 Method and device for acquiring user key location data
CN112154174A (zh) * 2018-05-22 2020-12-29 埃克森美孚化学专利公司 Methods of forming films and associated computing devices
CN112154174B (zh) * 2018-05-22 2024-02-06 埃克森美孚化学专利公司 Methods of forming films and associated computing devices
CN109142574A (zh) * 2018-08-29 2019-01-04 广东药科大学 SVR-based method for studying the material basis by which Gegen Qinlian decoction improves insulin resistance
CN109142574B (zh) * 2018-08-29 2021-05-18 广东药科大学 SVR-based method for studying the material basis by which Gegen Qinlian decoction improves insulin resistance
CN113963756A (zh) * 2021-05-18 2022-01-21 杭州剂泰医药科技有限责任公司 Platform for developing pharmaceutical formulations and its application
WO2024169983A1 (fr) * 2023-02-17 2024-08-22 Basf Se Optimization of recipes with respect to the combustion behavior of polyurethane foams

Also Published As

Publication number Publication date
IL158787A0 (en) 2004-05-12
EP1395808A4 (fr) 2007-01-03
WO2002093297A3 (fr) 2003-09-25
EP1395808A2 (fr) 2004-03-10
CA2447047A1 (fr) 2002-11-21
AU2002303683A1 (en) 2002-11-25

Similar Documents

Publication Publication Date Title
US20050089923A9 (en) Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
US20020177167A1 (en) Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
US20030162226A1 (en) High-throughput formation, identification, and analysis of diverse solid-forms
US11080570B2 (en) Systems and methods for applying a convolutional network to spatial data
Merlot et al. Chemical substructures in drug discovery
WO2017062382A1 (fr) Systems and methods for applying a convolutional network to spatial data
WO2002093297A2 (fr) Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
WO2002077772A2 (fr) Method and system for planning, performing, and assessing high-throughput screening of multicomponent chemical compositions and solid forms of compounds
Solomon Molecular modelling and drug design
Durojaye et al. Csc01 shows promise as a potential inhibitor of the oncogenic G13D mutant of KRAS: an in silico approach
Rani et al. Artificial Intelligence and Machine Learning‐Based New Drug Discovery Process with Molecular Modelling
Kolpak et al. Enhanced SAR maps: expanding the data rendering capabilities of a popular medicinal chemistry tool
Tradigo et al. Algorithms for structure comparison and analysis: docking
Vahedi et al. QSAR Study of PARP Inhibitors by GA-MLR, GA-SVM and GA-ANN Approaches
Ejiofor Computational phytochemistry, databases, and tools
Zhou Chemoinformatics and library design
CA2379160A1 (fr) Sample arrays and high-throughput testing thereof to detect interactions
Kameswaran et al. Different Tools for Modern Drug Discovery Research
Kulkarni Artificial intelligence: Drug discovery and development prospective in medicinal chemistry
Anshar et al. Molecular docking and molecular dynamics simulations of Lannea coromandelica bioactive compounds as antinociceptive activity
Waszkowycz De Novo Drug Design
Rollinger et al. Computational approaches for the discovery of natural lead structures
Dubey et al. Computer aided drug design: A review
Dey et al. 2D AND 3D QSAR MODELING OF 2-AMINOPYRIMIDINE DERIVATIVES OR FLT3 KINASE INHIBITOR FOR THE TREATMENT OF ACUTE MYELOID LEUKAEMIA
Bhatt et al. MULTIDISCIPLINARY TECHNOVATION

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 158787

Country of ref document: IL

WWE Wipo information: entry into national phase

Ref document number: 2447047

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2002731725

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2002731725

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP