AU2002314715A1 - Method for correlating gene expression profiles with protein expression profiles
Method for correlating gene expression profiles with protein expression profiles
Info
- Publication number
- AU2002314715A1
- Authority
- AU
- Australia
- Prior art keywords
- sample
- adsorbent
- mrna
- analyte
- identifying
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
- 108090000623 proteins and genes Proteins 0.000 title claims description 178
- 238000000034 method Methods 0.000 title claims description 144
- 102000004169 proteins and genes Human genes 0.000 title claims description 131
- 230000014509 gene expression Effects 0.000 title claims description 67
- 239000003463 adsorbent Substances 0.000 claims description 353
- 239000000523 sample Substances 0.000 claims description 168
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 135
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 128
- 229920001184 polypeptide Polymers 0.000 claims description 123
- 238000009739 binding Methods 0.000 claims description 103
- 230000027455 binding Effects 0.000 claims description 102
- 210000004027 cell Anatomy 0.000 claims description 102
- 108020004999 messenger RNA Proteins 0.000 claims description 65
- 239000000126 substance Substances 0.000 claims description 59
- 238000004949 mass spectrometry Methods 0.000 claims description 33
- 230000000717 retained effect Effects 0.000 claims description 30
- 239000012472 biological sample Substances 0.000 claims description 28
- 239000012634 fragment Substances 0.000 claims description 27
- 230000014759 maintenance of location Effects 0.000 claims description 23
- 239000003446 ligand Substances 0.000 claims description 21
- 239000012071 phase Substances 0.000 claims description 19
- 229910052751 metal Inorganic materials 0.000 claims description 16
- 239000002184 metal Substances 0.000 claims description 16
- 239000007790 solid phase Substances 0.000 claims description 16
- 239000003795 chemical substances by application Substances 0.000 claims description 13
- 239000013522 chelant Substances 0.000 claims description 10
- 238000002966 oligonucleotide array Methods 0.000 claims description 10
- 230000001575 pathological effect Effects 0.000 claims description 10
- 230000002797 proteolythic effect Effects 0.000 claims description 10
- 239000013592 cell lysate Substances 0.000 claims description 9
- 230000013595 glycosylation Effects 0.000 claims description 9
- 238000006206 glycosylation reaction Methods 0.000 claims description 9
- 230000026731 phosphorylation Effects 0.000 claims description 9
- 238000006366 phosphorylation reaction Methods 0.000 claims description 9
- 230000015556 catabolic process Effects 0.000 claims description 7
- 206010028980 Neoplasm Diseases 0.000 claims description 6
- 201000011510 cancer Diseases 0.000 claims description 6
- 238000006731 degradation reaction Methods 0.000 claims description 6
- 238000003499 nucleic acid array Methods 0.000 claims description 6
- 238000001616 ion spectroscopy Methods 0.000 claims description 5
- 238000001502 gel electrophoresis Methods 0.000 claims description 4
- 231100000167 toxic agent Toxicity 0.000 claims description 4
- 239000003440 toxic substance Substances 0.000 claims description 4
- 210000005260 human cell Anatomy 0.000 claims description 3
- 230000005855 radiation Effects 0.000 claims description 3
- 125000003275 alpha amino acid group Chemical group 0.000 claims 3
- 239000012491 analyte Substances 0.000 description 207
- 235000018102 proteins Nutrition 0.000 description 107
- 230000003993 interaction Effects 0.000 description 90
- 239000000758 substrate Substances 0.000 description 86
- 150000007523 nucleic acids Chemical class 0.000 description 39
- 230000002209 hydrophobic effect Effects 0.000 description 38
- 102000039446 nucleic acids Human genes 0.000 description 36
- 108020004707 nucleic acids Proteins 0.000 description 36
- 238000010828 elution Methods 0.000 description 33
- 150000002500 ions Chemical class 0.000 description 32
- 230000006870 function Effects 0.000 description 31
- 238000005406 washing Methods 0.000 description 31
- 125000003729 nucleotide group Chemical group 0.000 description 27
- 150000001413 amino acids Chemical group 0.000 description 26
- 238000001514 detection method Methods 0.000 description 26
- 239000002773 nucleotide Substances 0.000 description 26
- 238000000672 surface-enhanced laser desorption--ionisation Methods 0.000 description 21
- 108091033319 polynucleotide Proteins 0.000 description 20
- 102000040430 polynucleotide Human genes 0.000 description 20
- 239000002157 polynucleotide Substances 0.000 description 20
- 239000000203 mixture Substances 0.000 description 18
- 238000009396 hybridization Methods 0.000 description 16
- -1 phosphorotriesters Chemical class 0.000 description 16
- 239000000243 solution Substances 0.000 description 16
- 238000003491 array Methods 0.000 description 15
- 239000011159 matrix material Substances 0.000 description 15
- 229910021645 metal ion Inorganic materials 0.000 description 15
- 108020004414 DNA Proteins 0.000 description 13
- 238000004458 analytical method Methods 0.000 description 13
- 238000004587 chromatography analysis Methods 0.000 description 13
- 238000000816 matrix-assisted laser desorption--ionisation Methods 0.000 description 13
- 125000000539 amino acid group Chemical group 0.000 description 12
- 125000000129 anionic group Chemical group 0.000 description 12
- 229920001222 biopolymer Polymers 0.000 description 11
- 239000007788 liquid Substances 0.000 description 11
- 239000000463 material Substances 0.000 description 11
- 238000006116 polymerization reaction Methods 0.000 description 11
- 108091034117 Oligonucleotide Proteins 0.000 description 10
- 235000001014 amino acid Nutrition 0.000 description 10
- 239000002585 base Substances 0.000 description 10
- 125000002091 cationic group Chemical group 0.000 description 10
- 239000002299 complementary DNA Substances 0.000 description 10
- 125000000524 functional group Chemical group 0.000 description 10
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 9
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 9
- 239000003153 chemical reaction reagent Substances 0.000 description 9
- 150000001875 compounds Chemical class 0.000 description 9
- 230000000694 effects Effects 0.000 description 9
- 239000007789 gas Substances 0.000 description 9
- 230000008569 process Effects 0.000 description 9
- 239000012465 retentate Substances 0.000 description 9
- 238000000926 separation method Methods 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 8
- 238000001212 derivatisation Methods 0.000 description 8
- 239000003599 detergent Substances 0.000 description 8
- 239000000975 dye Substances 0.000 description 8
- 238000011223 gene expression profiling Methods 0.000 description 8
- 229920000642 polymer Polymers 0.000 description 8
- 239000000047 product Substances 0.000 description 8
- 108020003175 receptors Proteins 0.000 description 8
- 102000005962 receptors Human genes 0.000 description 8
- 239000007787 solid Substances 0.000 description 8
- 210000001519 tissue Anatomy 0.000 description 8
- 238000013518 transcription Methods 0.000 description 8
- 230000035897 transcription Effects 0.000 description 8
- 102000004190 Enzymes Human genes 0.000 description 7
- 108090000790 Enzymes Proteins 0.000 description 7
- 102000003886 Glycoproteins Human genes 0.000 description 7
- 108090000288 Glycoproteins Proteins 0.000 description 7
- 108091028043 Nucleic acid sequence Proteins 0.000 description 7
- 239000011324 bead Substances 0.000 description 7
- 229940088598 enzyme Drugs 0.000 description 7
- 238000005194 fractionation Methods 0.000 description 7
- 239000000499 gel Substances 0.000 description 7
- 229920002521 macromolecule Polymers 0.000 description 7
- 241000894007 species Species 0.000 description 7
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 7
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 108091005461 Nucleic proteins Proteins 0.000 description 6
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 6
- 241000700605 Viruses Species 0.000 description 6
- 150000001412 amines Chemical group 0.000 description 6
- 238000003556 assay Methods 0.000 description 6
- 230000008901 benefit Effects 0.000 description 6
- 238000004590 computer program Methods 0.000 description 6
- 229940079593 drug Drugs 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 230000001747 exhibiting effect Effects 0.000 description 6
- 238000010195 expression analysis Methods 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 150000003839 salts Chemical class 0.000 description 6
- 230000035945 sensitivity Effects 0.000 description 6
- 239000002904 solvent Substances 0.000 description 6
- 239000013598 vector Substances 0.000 description 6
- 244000187656 Eucalyptus cornuta Species 0.000 description 5
- 108060003951 Immunoglobulin Proteins 0.000 description 5
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 5
- 108091005804 Peptidases Proteins 0.000 description 5
- 102000035195 Peptidases Human genes 0.000 description 5
- 239000002253 acid Substances 0.000 description 5
- 239000000427 antigen Substances 0.000 description 5
- 108091007433 antigens Proteins 0.000 description 5
- 102000036639 antigens Human genes 0.000 description 5
- 150000001720 carbohydrates Chemical class 0.000 description 5
- 239000002738 chelating agent Substances 0.000 description 5
- 230000000875 corresponding effect Effects 0.000 description 5
- 238000009826 distribution Methods 0.000 description 5
- 239000003814 drug Substances 0.000 description 5
- 230000009881 electrostatic interaction Effects 0.000 description 5
- 230000007613 environmental effect Effects 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 102000018358 immunoglobulin Human genes 0.000 description 5
- 125000005647 linker group Chemical group 0.000 description 5
- 230000002441 reversible effect Effects 0.000 description 5
- 238000003196 serial analysis of gene expression Methods 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- 229920002307 Dextran Polymers 0.000 description 4
- 108091060211 Expressed sequence tag Proteins 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 241000233866 Fungi Species 0.000 description 4
- 239000004365 Protease Substances 0.000 description 4
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 4
- 229910052770 Uranium Inorganic materials 0.000 description 4
- 239000011358 absorbing material Substances 0.000 description 4
- 238000003287 bathing Methods 0.000 description 4
- 230000001588 bifunctional effect Effects 0.000 description 4
- 238000004422 calculation algorithm Methods 0.000 description 4
- 235000014633 carbohydrates Nutrition 0.000 description 4
- 238000003776 cleavage reaction Methods 0.000 description 4
- 230000000295 complement effect Effects 0.000 description 4
- 238000003795 desorption Methods 0.000 description 4
- 238000007598 dipping method Methods 0.000 description 4
- 239000011521 glass Substances 0.000 description 4
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 4
- 239000001257 hydrogen Substances 0.000 description 4
- 229910052739 hydrogen Inorganic materials 0.000 description 4
- 230000001965 increasing effect Effects 0.000 description 4
- 239000003112 inhibitor Substances 0.000 description 4
- 238000001819 mass spectrum Methods 0.000 description 4
- 239000003607 modifier Substances 0.000 description 4
- 238000002823 phage display Methods 0.000 description 4
- 239000002243 precursor Substances 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 230000007017 scission Effects 0.000 description 4
- 229910052814 silicon oxide Inorganic materials 0.000 description 4
- 238000002791 soaking Methods 0.000 description 4
- 238000004611 spectroscopical analysis Methods 0.000 description 4
- 238000005507 spraying Methods 0.000 description 4
- 239000004094 surface-active agent Substances 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 3
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 3
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 3
- 101710163270 Nuclease Proteins 0.000 description 3
- 102000004142 Trypsin Human genes 0.000 description 3
- 108090000631 Trypsin Proteins 0.000 description 3
- 238000005411 Van der Waals force Methods 0.000 description 3
- 230000002378 acidificating effect Effects 0.000 description 3
- 150000007513 acids Chemical class 0.000 description 3
- 150000001338 aliphatic hydrocarbons Chemical class 0.000 description 3
- 230000004075 alteration Effects 0.000 description 3
- 229910052782 aluminium Inorganic materials 0.000 description 3
- XAGFODPZIPBFFR-UHFFFAOYSA-N aluminium Chemical compound [Al] XAGFODPZIPBFFR-UHFFFAOYSA-N 0.000 description 3
- 230000003321 amplification Effects 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- 125000003118 aryl group Chemical group 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 230000003139 buffering effect Effects 0.000 description 3
- 239000004202 carbamide Substances 0.000 description 3
- 229920002678 cellulose Polymers 0.000 description 3
- 239000001913 cellulose Substances 0.000 description 3
- 230000003196 chaotropic effect Effects 0.000 description 3
- 230000002596 correlated effect Effects 0.000 description 3
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 3
- 125000002228 disulfide group Chemical group 0.000 description 3
- 238000007876 drug discovery Methods 0.000 description 3
- 239000013604 expression vector Substances 0.000 description 3
- 239000000835 fiber Substances 0.000 description 3
- 150000004676 glycans Chemical class 0.000 description 3
- 229940072221 immunoglobulins Drugs 0.000 description 3
- 239000003550 marker Substances 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 239000000178 monomer Substances 0.000 description 3
- 230000007935 neutral effect Effects 0.000 description 3
- 238000003199 nucleic acid amplification method Methods 0.000 description 3
- 210000000056 organ Anatomy 0.000 description 3
- 239000006174 pH buffer Substances 0.000 description 3
- 229920001282 polysaccharide Polymers 0.000 description 3
- 239000005017 polysaccharide Substances 0.000 description 3
- 230000004481 post-translational protein modification Effects 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 230000002285 radioactive effect Effects 0.000 description 3
- 125000003396 thiol group Chemical group [H]S* 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 230000014616 translation Effects 0.000 description 3
- 239000012588 trypsin Substances 0.000 description 3
- 238000000539 two dimensional gel electrophoresis Methods 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- NWUYHJFMYQTDRP-UHFFFAOYSA-N 1,2-bis(ethenyl)benzene;1-ethenyl-2-ethylbenzene;styrene Chemical compound C=CC1=CC=CC=C1.CCC1=CC=CC=C1C=C.C=CC1=CC=CC=C1C=C NWUYHJFMYQTDRP-UHFFFAOYSA-N 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 229920000936 Agarose Polymers 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 208000023275 Autoimmune disease Diseases 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 102000019034 Chemokines Human genes 0.000 description 2
- 108010012236 Chemokines Proteins 0.000 description 2
- 108091033380 Coding strand Proteins 0.000 description 2
- 108020004635 Complementary DNA Proteins 0.000 description 2
- 102000004127 Cytokines Human genes 0.000 description 2
- 108090000695 Cytokines Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- VTLYFUHAOXGGBS-UHFFFAOYSA-N Fe3+ Chemical compound [Fe+3] VTLYFUHAOXGGBS-UHFFFAOYSA-N 0.000 description 2
- 230000005526 G1 to G0 transition Effects 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 2
- 108700005091 Immunoglobulin Genes Proteins 0.000 description 2
- 108091092195 Intron Proteins 0.000 description 2
- XEEYBQQBJWHFJM-UHFFFAOYSA-N Iron Chemical compound [Fe] XEEYBQQBJWHFJM-UHFFFAOYSA-N 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- 108090001090 Lectins Proteins 0.000 description 2
- 102000004856 Lectins Human genes 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 108010085220 Multiprotein Complexes Proteins 0.000 description 2
- 102000007474 Multiprotein Complexes Human genes 0.000 description 2
- 238000000636 Northern blotting Methods 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 108010033276 Peptide Fragments Proteins 0.000 description 2
- 102000007079 Peptide Fragments Human genes 0.000 description 2
- 108091093037 Peptide nucleic acid Proteins 0.000 description 2
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 2
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 2
- 206010036790 Productive cough Diseases 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 2
- FKNQFGJONOIPTF-UHFFFAOYSA-N Sodium cation Chemical compound [Na+] FKNQFGJONOIPTF-UHFFFAOYSA-N 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 239000002250 absorbent Substances 0.000 description 2
- 230000002745 absorbent Effects 0.000 description 2
- 239000000370 acceptor Substances 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 210000004102 animal cell Anatomy 0.000 description 2
- 150000001450 anions Chemical class 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 238000001818 capillary gel electrophoresis Methods 0.000 description 2
- 230000022131 cell cycle Effects 0.000 description 2
- 238000011098 chromatofocusing Methods 0.000 description 2
- 229920001577 copolymer Polymers 0.000 description 2
- 229920006037 cross link polymer Polymers 0.000 description 2
- 239000013078 crystal Substances 0.000 description 2
- 238000007405 data analysis Methods 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 238000004925 denaturation Methods 0.000 description 2
- 230000036425 denaturation Effects 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 239000000539 dimer Substances 0.000 description 2
- 239000006185 dispersion Substances 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 239000000284 extract Substances 0.000 description 2
- 229940083124 ganglion-blocking antiadrenergic secondary and tertiary amines Drugs 0.000 description 2
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 2
- 239000010931 gold Substances 0.000 description 2
- 238000004128 high performance liquid chromatography Methods 0.000 description 2
- 229940088597 hormone Drugs 0.000 description 2
- 239000005556 hormone Substances 0.000 description 2
- 238000003018 immunoassay Methods 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 239000003456 ion exchange resin Substances 0.000 description 2
- 229920003303 ion-exchange polymer Polymers 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 239000002523 lectin Substances 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 239000004530 micro-emulsion Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 229910052759 nickel Inorganic materials 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 229920001542 oligosaccharide Polymers 0.000 description 2
- 150000002482 oligosaccharides Chemical class 0.000 description 2
- 239000003960 organic solvent Substances 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 2
- 229920001223 polyethylene glycol Polymers 0.000 description 2
- 235000019419 proteases Nutrition 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 239000011347 resin Substances 0.000 description 2
- 229920005989 resin Polymers 0.000 description 2
- 239000012266 salt solution Substances 0.000 description 2
- 238000010845 search algorithm Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 239000010703 silicon Substances 0.000 description 2
- 229910052710 silicon Inorganic materials 0.000 description 2
- 229910001415 sodium ion Inorganic materials 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- 210000003802 sputum Anatomy 0.000 description 2
- 208000024794 sputum Diseases 0.000 description 2
- 150000003431 steroids Chemical class 0.000 description 2
- 235000000346 sugar Nutrition 0.000 description 2
- 239000000725 suspension Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 229910001428 transition metal ion Inorganic materials 0.000 description 2
- 241001515965 unidentified phage Species 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- 235000012431 wafers Nutrition 0.000 description 2
- WYTZZXDRDKSJID-UHFFFAOYSA-N (3-aminopropyl)triethoxysilane Chemical compound CCO[Si](OCC)(OCC)CCCN WYTZZXDRDKSJID-UHFFFAOYSA-N 0.000 description 1
- WXTMDXOMEHJXQO-UHFFFAOYSA-N 2,5-dihydroxybenzoic acid Chemical compound OC(=O)C1=CC(O)=CC=C1O WXTMDXOMEHJXQO-UHFFFAOYSA-N 0.000 description 1
- UMCMPZBLKLEWAF-BCTGSCMUSA-N 3-[(3-cholamidopropyl)dimethylammonio]propane-1-sulfonate Chemical compound C([C@H]1C[C@H]2O)[C@H](O)CC[C@]1(C)[C@@H]1[C@@H]2[C@@H]2CC[C@H]([C@@H](CCC(=O)NCCC[N+](C)(C)CCCS([O-])(=O)=O)C)[C@@]2(C)[C@@H](O)C1 UMCMPZBLKLEWAF-BCTGSCMUSA-N 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 206010001935 American trypanosomiasis Diseases 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- 201000001320 Atherosclerosis Diseases 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 102100031680 Beta-catenin-interacting protein 1 Human genes 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 229910001369 Brass Inorganic materials 0.000 description 1
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 108010001857 Cell Surface Receptors Proteins 0.000 description 1
- 208000024699 Chagas disease Diseases 0.000 description 1
- 241000282552 Chlorocebus aethiops Species 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 108091062157 Cis-regulatory element Proteins 0.000 description 1
- 101710094648 Coat protein Proteins 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 241000938605 Crocodylia Species 0.000 description 1
- JPVYNHNXODAKFH-UHFFFAOYSA-N Cu2+ Chemical compound [Cu+2] JPVYNHNXODAKFH-UHFFFAOYSA-N 0.000 description 1
- 102000018832 Cytochromes Human genes 0.000 description 1
- 108010052832 Cytochromes Proteins 0.000 description 1
Description
METHOD FOR CORRELATING GENE EXPRESSION PROFILES WITH PROTEIN EXPRESSION PROFILES
CROSS-REFERENCES TO RELATED APPLICATIONS
The present application claims priority to USSN 60/269,772, filed February 16, 2001, herein incorporated by reference in its entirety.
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
Not applicable.
BACKGROUND OF THE INVENTION
Genomics is the study of the collective set of genes (the genome) of a species, as well as the study of the function and activity of those genes, in different cells and in the same cell, temporally, developmentally, and under varying environmental conditions. Differential gene function and activity play a significant role in the development of a cell for a specialized activity in the body and in the transformation of a cell from healthy to pathologic.
The expression of genetic information in a cell is carried out through the transcription of an intermediate molecule, mRNA. The cell translates expressed mRNAs into polypeptides, or proteins. Proteins carry out the majority of functions encoded by the genes. The study of the collective set of proteins (the proteome) of a species, and of the activity and function of those proteins in a cell, is the subject of a new field of biology called "proteomics."
Because the character of a cell depends on the genes expressed by the cell, gene expression profiling has become an important method in genomics. Gene expression profiling seeks to determine which genes are expressed in a cell and the level of their expression. Thus, the gene expression profile of a cell provides a "fingerprint" that is characteristic of the cell, indicating both the identity of the cell and its activity. Comparing the gene expression profiles of different cells is a process called "differential gene expression." This method can provide information about the genes that are responsible for the different phenotypes of cells. Genes that are differentially expressed in healthy and pathologic cells can function as diagnostic markers and are candidate targets for therapeutic intervention. Thus, obtaining accurate profiles of gene expression in different cell types is an important goal.
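The comparison of two gene expression profiles described above can be illustrated with a minimal sketch. The gene names, expression levels, pseudocount, and fold-change threshold below are hypothetical and chosen only for illustration; the specification does not prescribe any particular statistic.

```python
import math

def differential_expression(profile_a, profile_b, min_log2_fold=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes that differ between two profiles.

    profile_a, profile_b: dicts mapping gene name -> expression level
    (hypothetical, arbitrary units). A gene absent from a profile is
    treated as having level 0. The pseudocount avoids division by zero.
    """
    changed = {}
    for gene in sorted(set(profile_a) | set(profile_b)):
        a = profile_a.get(gene, 0.0) + pseudocount
        b = profile_b.get(gene, 0.0) + pseudocount
        log2_fold = math.log2(b / a)
        # Keep only genes whose level changes by at least the threshold.
        if abs(log2_fold) >= min_log2_fold:
            changed[gene] = log2_fold
    return changed

# Hypothetical profiles of a healthy cell and a pathologic cell.
healthy = {"GENE_A": 100.0, "GENE_B": 50.0, "GENE_C": 7.0}
pathologic = {"GENE_A": 105.0, "GENE_B": 5.0, "GENE_C": 31.0}
changed = differential_expression(healthy, pathologic)
```

In this sketch GENE_A is nearly unchanged and is dropped, while GENE_B (lower in the pathologic cell) and GENE_C (higher in the pathologic cell) are retained as candidate markers.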
There are numerous methods presently used to generate gene expression profiles of a cell. These methods include traditional methods such as northern blots, RT-PCR, nuclease protection, differential display, cDNA fingerprinting, and subtractive hybridization, as well as newer techniques such as the generation of expressed sequence tag, or "EST," libraries and arrays, cDNA arrays, mRNA arrays, oligonucleotide arrays, and serial analysis of gene expression, or "SAGE" (see generally Lockhart & Winzeler, Nature 405:827-836 (2000); see also Velculescu et al., Science 270:484-487 (1995)).
In one example, nucleic acid arrays such as oligonucleotide arrays are used for expression profiling. These arrays are collections of specifically chosen oligonucleotides that are bound to a solid support at predetermined and addressable locations. In certain embodiments, these arrays comprise an oligonucleotide that specifically identifies each of the known genes in a genome. Messenger RNAs or cDNAs derived from a cell are applied to the array. Each mRNA or cDNA hybridizes with an oligonucleotide that corresponds to the particular gene from which it was transcribed. Because the identity and location of each immobilized oligonucleotide are predetermined, each hybridization event indicates that a particular gene has been expressed by the cell. One commercialized version of an oligonucleotide array is the GeneChip™ from Affymetrix. In yet another example of commercialized array methodology, beads coated with an array, or cells, are each attached to an optical sensor molecule. To provide an address, the beads are then drawn into wells at the end of fibers in a fiber optic bundle (see, e.g., Bead Array™ (Illumina)). In yet another example, arrays can be made from EST libraries. EST libraries are generated by reverse-transcribing the set of expressed mRNA in a cell. Frequently, the entire mRNA is not reverse transcribed, but a sufficient portion of it is to uniquely identify the gene from which the mRNA was expressed. The ESTs are sequenced and identified in a genomic database.
Despite the power of existing gene expression technologies, it is acknowledged that levels of mRNA transcription do not always correlate directly with levels of protein expression, for a number of reasons: (1) different mRNAs may be translated into polypeptides with different efficiencies; (2) an mRNA may be differentially spliced to produce different proteins in different cells; (3) expressed polypeptides may be degraded at different rates; and (4) polypeptides can be subject to post-translational modifications so that the same polypeptide can assume a different form or function in the same cell and in different cells. Thus, there is a need to correlate mRNA expression with protein expression (see, e.g., Hancock et al., Anal. Chem. News & Features, November 1, 1999, pages 742A-748A; Nelson et al., Electrophoresis 21:1823-1831 (2000)). At the same time, current methods of protein expression profiling, such as mass spectrometry, 2D gel electrophoresis, and chromatography, may suffer from limitations in sensitivity and resolution (see, e.g., Pandey & Mann, Nature 405:837-846 (2000)).
The present invention therefore addresses this issue by combining gene expression profiling and protein profiling to more quickly and accurately identify proteins of interest in a particular cell type. Gene expression profiling is used to select a candidate transcript or transcripts that are expressed in a cell. The transcripts are typically sequenced and used to deduce the amino acid sequence of the encoded protein. The amino acid sequence is then used to predict and identify physio-chemical characteristics of the protein encoded by the transcript, e.g., molecular weight, isoelectric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, ligand binding sequence, charge at specified pH, or metal chelate binding. The physio-chemical characteristics are then employed to improve the sensitivity and resolution of protein profiling, thereby providing improved information about the proteins encoded by mRNA expressed in a particular cell type. This invention provides methods for making such a correlation and provides other advantages, as well.
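The step of deducing an amino acid sequence from a transcript and predicting one physio-chemical characteristic, molecular weight, can be sketched as follows. This is a simplified illustration and not the procedure of the invention: it assumes the coding sequence is already in frame and given as cDNA, uses the standard genetic code and standard average residue masses, and ignores post-translational modifications, which would shift the observed mass. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini

def translate(coding_dna):
    """Translate an in-frame cDNA coding sequence, stopping at the first stop codon."""
    seq = coding_dna.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def average_mass(protein):
    """Predicted average molecular weight of an unmodified polypeptide, in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER

# Hypothetical toy transcript: Met-Gly-Ala followed by a stop codon.
protein = translate("ATGGGCGCATAA")
mass = average_mass(protein)
```

Other characteristics listed above, such as isoelectric point or charge at a specified pH, could be predicted from the same deduced sequence, but depend on the pKa values assumed and are not shown here.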
SUMMARY OF THE INVENTION
The present invention therefore provides methods for correlating gene expression with protein expression. The methods involve performing gene expression profiling on a sample, selecting one or more expressed genes for further study, determining a physio-chemical property characteristic of the proteins encoded by these genes, and determining whether the proteins are expressed in the sample using the physio-chemical property as an identifier in a protein expression profile of the sample. In certain embodiments, the selected gene is differentially expressed in two cells or samples of interest, for example, a healthy cell and a pathologic cell, or two cells at different stages of a cell cycle, maturation, or differentiation pathway, or under different environmental conditions. In a preferred embodiment, the proteins are fractionated using mass spectrometry. In another preferred embodiment, the proteins are fractionated using SELDI (surface enhanced laser desorption ionization).
The methods of the invention are therefore useful in the identification of target proteins for drug discovery, and for the identification of diagnostic markers, for disease states such as cancer, e.g., prostate, breast, lung, bladder, ovarian, colon, brain and kidney; cancer metastasis; diabetes, both juvenile and late-onset; autoimmune disease such as rheumatoid arthritis and multiple sclerosis; heart disease, e.g., myocardial infarction, atherosclerosis, and cardiomyopathy; cerebrovascular disease, e.g., stroke; renal disease; lung disease, e.g., emphysema; viral infection, e.g., HIV, HCV, CMV, HPV, HBV; bacterial infection, e.g., M. tuberculosis, toxigenic E. coli, Streptococcus sp., Staphylococcus sp.; fungal infection; protozoal infection, e.g., malaria, schistosomiasis, Chagas disease. The methods of the present invention are also useful for investigating the expression products of different alleles, for, e.g., pharmacogenetic applications. The methods of the present invention are also useful for toxicology studies, and for investigating the effects of exposure of a cell to varying environmental conditions, such as radiation, e.g., UV radiation, heat, and cold.
BRIEF DESCRIPTION OF THE DRAWINGS
Not applicable.
DETAILED DESCRIPTION OF THE INVENTION
Introduction
The present invention provides methods that combine RNA and protein expression profiling to identify genes and the proteins expressed in cells under different conditions, e.g., at different times in the cell cycle; under varying environmental conditions (such as ion influx or efflux, or exposure to a toxin, a drug, a ligand, e.g., a hormone, a cytokine, or a chemokine, or a pathogen such as a virus, bacterium, protozoan, or fungus); under varying pathological conditions, such as cancer or autoimmune disease; at different times during maturation and differentiation; at different times during development of the organism; during responses such as inflammation; in different tissue types or organs; between individuals with different phenotypic traits, e.g., responders vs. non-responders to a particular pharmaceutical drug; etc. The methods of the present invention allow one of skill in the art, e.g., to identify a list of candidate genes expressed in a cell or biological sample, and then to further identify a subset of proteins of interest encoded by those genes. The methods of the invention are also useful for combining information related to mRNA expression with information on the expression and function of the protein encoded by the mRNA.
The invention therefore provides a method of correlating gene and protein expression in a cell, comprising the steps of obtaining a biological sample; generating a gene expression profile of the sample, thereby identifying one or more mRNAs expressed in the sample; predicting and identifying one or more physio-chemical properties of the polypeptides encoded by the mRNAs; and identifying one or more polypeptides encoded by the mRNAs, the polypeptides comprising the physio-chemical property in the sample, by fractionating the polypeptides in the sample, thereby correlating gene and protein expression in the sample.
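As a hedged illustration of the property-prediction step, the sketch below computes the average molecular mass of a polypeptide from its deduced amino acid sequence and then screens observed masses from a protein profile against that prediction. The residue masses are standard average values; the function names, the 0.1% relative tolerance, and the example inputs are assumptions made for illustration and do not come from the specification. The sketch ignores post-translational modifications, which would shift the observed mass.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # mass of the terminal H and OH groups


def predicted_mass(sequence):
    """Average mass of an unmodified polypeptide with the given sequence."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def matching_masses(observed_masses, sequence, tolerance=0.001):
    """Return observed masses within a relative tolerance of the prediction.

    The tolerance (0.1% by default) is a hypothetical choice.
    """
    target = predicted_mass(sequence)
    return [m for m in observed_masses if abs(m - target) <= tolerance * target]
```

In use, the sequence would be the one deduced from a selected transcript, and the observed masses would come from the fractionated sample; a non-empty result suggests, but does not prove, that the encoded polypeptide is present.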
In one embodiment, the step of generating the gene expression profile comprises identifying expressed mRNA with an EST array, an mRNA array, or an oligonucleotide array.
In another embodiment, the step of identifying the polypeptide comprises fractionating polypeptides in the sample using 2-D electrophoresis, chromatography, mass spectrometry, or SELDI.
In another embodiment, the physiochemical characteristic is selected from the group consisting of amino acid sequence, molecular weight, iso-electric point, hydrophobicity, hydrophilicity, charge (e.g., isoelectric point), glycosylation, phosphorylation, epitope sequence or antibody binding, ligand binding, dye binding, and metal chelate binding. In another embodiment, the step of identifying a physiochemical characteristic comprises predicting the masses of proteolytic fragments generated by the encoded polypeptide upon degradation of the encoded polypeptide by a selected proteolytic agent, and the step of identifying a polypeptide comprises subjecting polypeptides in the sample to degradation by the agent and identifying actual proteolytic fragments in the sample having masses that correspond to the masses of the predicted fragments.
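The fragment-mass embodiment can be sketched as follows, assuming trypsin as the selected proteolytic agent (the specification does not name one). The sketch applies the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), predicts monoisotopic fragment masses, and reports the observed masses that fall within a tolerance. The function names and the 0.02 Da tolerance are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_fragments(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    seq = sequence.upper()
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal fragment
    return fragments


def fragment_mass(fragment):
    """Monoisotopic mass of an unmodified, uncharged fragment."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in fragment) + WATER


def match_fragments(sequence, observed_masses, tolerance=0.02):
    """Map each predicted fragment to the observed masses that agree with it."""
    matches = {}
    for fragment in tryptic_fragments(sequence):
        predicted = fragment_mass(fragment)
        hits = [m for m in observed_masses if abs(m - predicted) <= tolerance]
        if hits:
            matches[fragment] = hits
    return matches
```

A polypeptide would typically be considered identified only when several predicted fragments are matched; how many are required is a design decision left open here.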
In another embodiment, the sample comprises a human cell. In another embodiment, the sample comprises a cell lysate from a normal or healthy cell. In another embodiment, the sample comprises a cell lysate from a pathological cell. In another embodiment, the sample comprises a cell lysate from a cell that has been contacted with a toxic compound. In another embodiment, the biological sample comprises a cell lysate from a cell of a subject who responds to a drug treatment or a subject who does not respond to a drug treatment.
In one embodiment, the sample is tissue from a human. In another embodiment, the mRNA is differentially expressed in two biological samples. In another embodiment, the two biological samples are a normal or healthy cell and a pathological cell, e.g., a cancer cell. In another embodiment, the two biological samples are derived from a healthy cell and a cell exposed to a toxic compound.
In another embodiment, the sample comprises a biopsy; cultured cells, e.g., transformed cells, cells from a cell line, an explant, or a primary culture; blood, serum, sputum, stool, or urine.
In a preferred aspect of the invention, the method comprises the steps of: obtaining a biological sample; generating a gene expression profile of the cell using a nucleic acid array, thereby identifying one or more mRNAs expressed in the cell; identifying one or more physio-chemical properties of a polypeptide encoded by the mRNA; and identifying a polypeptide comprising the physio-chemical property by fractionating the polypeptides in the sample with mass spectrometry; thereby correlating gene and protein expression in the cell.
In a preferred aspect of the invention, the method comprises the steps of: obtaining a biological sample comprising a cell; generating a gene expression profile of the cell using an oligonucleotide array, thereby identifying one or more mRNAs expressed in the cell; identifying one or more physio-chemical properties of a polypeptide encoded by the mRNA; and identifying a polypeptide comprising the physio-chemical property by fractionating the polypeptides in the sample with SELDI, wherein SELDI comprises fractionating by affinity retention on solid phase-bound adsorbent followed by fractionating retained proteins from the solid phase by gas phase ion spectrometry; thereby correlating gene and protein expression in the cell. In one embodiment, the method comprises using more than one technique to identify either mRNA or proteins expressed in the sample.
In one embodiment, the genomics arrays compare expression of housekeeping genes with other tissue specific genes. In one embodiment, the genomics arrays compare differential levels of gene expression. In one embodiment, the genomics arrays compare similar levels of gene expression.
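The selection of differentially expressed mRNAs from two samples could be sketched as below. This is only an illustration: the log2 fold-change criterion, the two-fold default threshold, and the function and transcript names are assumptions, not requirements of the method.

```python
import math


def differentially_expressed(sample_a, sample_b, min_fold=2.0):
    """Return transcripts whose signal differs by at least min_fold.

    sample_a and sample_b map transcript names to array signal intensities.
    The result maps each selected transcript to log2(sample_b / sample_a).
    """
    selected = {}
    for transcript in sample_a.keys() & sample_b.keys():
        a, b = sample_a[transcript], sample_b[transcript]
        if a <= 0 or b <= 0:
            continue  # skip signals that cannot be log-transformed
        log2_ratio = math.log2(b / a)
        if abs(log2_ratio) >= math.log2(min_fold):
            selected[transcript] = log2_ratio
    return selected


# Hypothetical signals for a healthy and a pathological sample.
healthy = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
diseased = {"geneA": 400.0, "geneB": 210.0, "geneC": 10.0}
candidates = differentially_expressed(healthy, diseased)
```

Here geneA (four-fold up) and geneC (five-fold down) are selected, while geneB is not; the selected transcripts would then be carried forward to property prediction and protein profiling.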
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991). As used herein, the following terms have the meanings ascribed to them unless specified otherwise.
"Biological sample" refers to a sample derived from a virus, cell, tissue, organ or organism (either eukaryotic or prokaryotic) including, without limitation, cell, tissue or organ lysates or homogenates, or body fluid samples, such as blood, urine, sputum, or cerebrospinal fluid. Such samples include, but are not limited to, tissue isolated from humans, or explants, primary, and transformed cell cultures derived therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histologic purposes. A biological sample can be obtained from a eukaryotic organism such as fungi, plants, insects, protozoa, birds, fish, reptiles, and preferably a mammal such as rat, mice, cow, dog, guinea pig, or rabbit, and most preferably a primate such as chimpanzees or humans.
"Biopolymer" refers to a polymer of biological origin, e.g., polypeptides, polynucleotides, polysaccharides or polyglycerides (e.g., di- or tri-glycerides). "Polypeptide" refers to a polymer composed of amino acid residues, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof linked via peptide bonds, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof. Synthetic polypeptides can be synthesized, for example, using an automated polypeptide synthesizer. The term "protein" typically refers to large polypeptides. The term "peptide" typically refers to short polypeptides.
"Polynucleotide" or "nucleic acid" refers to a polymer composed of nucleotide units. Polynucleotides include naturally occurring nucleic acids, such as deoxyribonucleic acid ("DNA") and ribonucleic acid ("RNA") as well as nucleic acid analogs. Nucleic acid analogs include those which include non-naturally occurring bases, nucleotides that engage in linkages with other nucleotides other than the naturally occurring phosphodiester bond or which include bases attached through linkages other than phosphodiester bonds. Thus, nucleotide analogs include, for example and without limitation, phosphorothioates, phosphorodithioates, phosphorotriesters,
phosphoramidates, boranophosphates, methylphosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), and the like. Such polynucleotides can be synthesized, for example, using an automated DNA synthesizer. The term "nucleic acid" typically refers to large polynucleotides. The term "oligonucleotide" typically refers to short polynucleotides, generally no greater than about 50 nucleotides. It will be understood that when a nucleotide sequence is represented by a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which "U" replaces "T."
"Detectable moiety" or a "label" refers to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include 32P, 35S, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin-streptavadin, dioxigenin, haptens and proteins for which antisera or monoclonal antibodies are available, or nucleic acid molecules with a sequence complementary to a target. The detectable moiety often generates a measurable signal, such as a radioactive, chromogenic, or fluorescent signal, that can be used to quantitate the amount of bound detectable moiety in a sample. The detectable moiety can be incorporated in or attached to a primer or probe either covalently, or through ionic, van der Waals or hydrogen bonds, e.g., incoφoration of radioactive nucleotides, or biotinylated nucleotides that are recognized by streptavadin. The detectable moiety may be directly or indirectly detectable. Indirect detection can involve the binding of a second directly or indirectly detectable moiety to the detectable moiety. For example, the detectable moiety can be the ligand of a binding partner, such as biotin, which is a binding partner for streptavadin, or a nucleotide sequence, which is the binding partner for a complementary sequence, to which it can specifically hybridize. The binding partner may itself be directly detectable, for example, an antibody may be itself labeled with a fluorescent molecule. The binding partner also may be indirectly detectable, for example, a nucleic acid having a complementary nucleotide sequence can be a part of a branched DNA molecule that is in turn detectable through hybridization with other labeled nucleic acid molecules (see, e.g., Fahrlander & Klausner, Biotechnology 6:1165 (1988)). Quantitation of the signal is achieved by, e.g., scintillation counting, densitometry, or flow cytometry.
The terms "isolated," "purified," or "biologically pure" refer to material that is substantially or essentially free from components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using
analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein or nucleic acid that is the predominant species present in a preparation is substantially purified. In particular, an isolated nucleic acid is separated from open reading frames that flank the gene and encode proteins other than protein encoded by the gene. The term "purified" denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and most preferably at least 99% pure.
"Purify" or "purification" means removing at least one contaminant from the composition to be purified. Purification does not require that the purified compound be 100% pure.
The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.
"Recombinant polynucleotide" refers to a polynucleotide having sequences that are not naturally joined together. An amplified or assembled recombinant polynucleotide may be included in a suitable vector, and the vector can be used to transform a suitable host cell. A host cell that comprises the recombinant polynucleotide is referred to as a "recombinant host cell." The gene is then expressed in the recombinant host cell to produce, e.g., a "recombinant polypeptide." A recombinant polynucleotide may serve a non-coding function (e.g., promoter, origin of replication, ribosome-binding site, etc.) as well. Appropriate unicellular hosts include any of those routinely used in expressing eukaryotic or mammalian polynucleotides, including, for example, prokaryotes, such as E. coli; and eukaryotes, including for example, fungi, such as yeast; and mammalian cells, including insect cells (e.g., Sf9) and animal cells such as CHO, Rl.l, B-W, L-M, African Green Monkey Kidney cells (e.g. COS 1, COS 7, BSC 1, BSC 40 and BMT 10) and cultured human cells.
The term "heterologous" when used with reference to portions of a nucleic acid indicates that the nucleic acid comprises two or more subsequences that are not found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences from unrelated genes arranged to make a new functional nucleic acid, e.g., a promoter from one source and a coding region from another source. Similarly, a heterologous protein indicates that the protein comprises two or more subsequences that are not found in the same relationship to each other in nature (e.g., a fusion protein).
The phrase "selectively (or specifically) hybridizes to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent hybridization conditions when that sequence is present in a complex mixture (e.g., total cellular or library DNA or RNA). The phrase "stringent hybridization conditions" refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology— Hybridization with Nucleic Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays" (1993). Generally, stringent conditions are selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For high stringency hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. 
Exemplary high stringency or stringent hybridization conditions include: 50% formamide, 5x SSC and 1% SDS incubated at 42°C or 5x SSC and 1% SDS incubated at 65°C, with a wash in 0.2x SSC and 0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency amplification, although annealing temperatures may vary between about 32°C and 48°C depending on primer length. For high stringency PCR amplification, a temperature of about 62°C is typical, although high stringency annealing temperatures can range from about 50°C to about 65°C, depending on the primer length and specificity. Typical cycle conditions for both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for 30 sec - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about 72°C for 1 - 2 min.
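The PCR cycle conditions above can be written down as a simple data structure. The sketch below is illustrative only: the class and function names are hypothetical, the default durations and the 94°C denaturation temperature are arbitrary picks from within the stated ranges, and the approximate annealing ranges are treated as exact bounds.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temperature_c: float
    seconds: int


def annealing_stringency(temperature_c):
    """Classify an annealing temperature using the ranges given in the text."""
    if 50.0 <= temperature_c <= 65.0:
        return "high"
    if 32.0 <= temperature_c <= 48.0:
        return "low"
    return "outside the stated ranges"


def pcr_cycle(annealing_c, denature_c=94.0, denature_s=60,
              anneal_s=60, extend_s=90):
    """One cycle: denaturation, annealing, then extension at about 72 C."""
    return [
        Step("denaturation", denature_c, denature_s),
        Step("annealing", annealing_c, anneal_s),
        Step("extension", 72.0, extend_s),
    ]


high_stringency_cycle = pcr_cycle(62.0)
```

The same cycle structure serves both stringency levels; only the annealing temperature changes, which is why it is the single required argument.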
"Plurality" means at least two.
A "ligand" is a compound that specifically binds to a target molecule.
A "receptor" is compound that specifically binds to a ligand. "Antibody" refers to a polypeptide comprising a framework region from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. This term also encompasses, e.g., polyclonal, monoclonal, single-chain, humanized, chimeric antibodies, and fragments thereof.
An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively. Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 348:552-554 (1990)).
For preparation of monoclonal or polyclonal antibodies, any technique known in the art can be used (see, e.g., Kohler & Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such as other mammals, may be used to express humanized antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554 (1990); Marks et al., Biotechnology 10:779-783 (1992)).
A ligand or a receptor (e.g., an antibody) "specifically binds to" or "is specifically immunoreactive with" a compound analyte when the ligand or receptor functions in a binding reaction which is determinative of the presence of the analyte in a sample of heterogeneous compounds. Thus, under designated assay (e.g., immunoassay) conditions, the ligand or receptor binds preferentially to a particular analyte and does not bind in a significant amount to other compounds present in the sample. For example, a polynucleotide specifically binds under hybridization conditions to an analyte polynucleotide comprising a complementary sequence; an antibody specifically binds under immunoassay conditions to an antigen analyte bearing an epitope against which the antibody was raised; and an adsorbent specifically binds to an analyte under proper elution conditions.
"Agent" refers to a chemical compound, a mixture of chemical compounds, a sample of undetermined composition, a combinatorial small molecule array, a biological macromolecule, a bacteriophage peptide display library, a bacteriophage antibody (e.g., scFv) display library, a polysome peptide display library, or an extract made from biological materials such as bacteria, plants, fungi, or animal cells or tissues. Suitable techniques involve selection of libraries of recombinant antibodies in phage or similar vectors (see, e.g., Huse et al., Science 246:1275-1281 (1989); and Ward et al., Nature 341 :544-546 (1989)). The protocol described by Huse is rendered more
efficient in combination with phage display technology (see, e.g., WO 91/17271 and WO 92/01047.
"Expression control sequence" refers to a nucleotide sequence in a polynucleotide that regulates the expression (transcription and/or translation) of a nucleotide sequence operatively linked to it. "Operatively linked" refers to a functional relationship between two parts in which the activity of one part (e.g., the ability to regulate transcription) results in an action on the other part (e.g., transcription of the sequence). Expression control sequences can include, for example and without limitation, sequences of promoters (e.g., inducible, repressible or constitutive), enhancers, transcription terminators, a start codon (i.e., ATG), splicing signals for introns, and stop codons.
"Expression vector" refers to a vector comprising a recombinant polynucleotide comprising expression control sequences operatively linked to a nucleotide sequence to be expressed. An expression vector comprises sufficient cis- acting elements for expression; other elements for expression can be supplied by the host cell or in vitro expression system. Expression vectors include all those known in the art, such as cosmids, plasmids (e.g., naked or contained in liposomes) and viruses that incorporate the recombinant polynucleotide.
"Encoding" refers to the inherent property of specific sequences of nucleotides in a polynucleotide, such as a gene, a cDNA, or an mRNA, to serve as templates for synthesis of other polymers and macromolecules in biological processes having either a defined sequence of nucleotides (i.e., rRNA, tRNA and mRNA) or a defined sequence of amino acids and the biological properties resulting therefrom. Thus, a gene encodes a protein if transcription and translation of mRNA produced by that gene produces the protein in a cell or other biological system. Both the coding strand, the nucleotide sequence of which is identical to the mRNA sequence and is usually provided in sequence listings, and non-coding strand, used as the template for transcription, of a gene or cDNA can be referred to as encoding the protein or other product of that gene or cDNA. Unless otherwise specified, a "nucleotide sequence encoding an amino acid sequence" includes all nucleotide sequences that are degenerate versions of each other and that encode the same amino acid sequence. Nucleotide sequences that encode proteins and RNA may include introns.
"Energy absorbing molecule" refers to refers to a molecule that absorbs energy from an energy source in a desorption spectrometer thereby enabling desorption of
analyte from a probe surface. Energy absorbing molecules used in MALDI are frequently referred to as "matrix." Cinnamic acid derivatives (such as alpha-4-cyano-4-hydroxy- cinammic acid), cinnapinic acid and dihydroxybenzoic acid are frequently used as energy absorbing molecules in laser desorption of bioorganic molecules. "Probe" refers to a device that is removably insertable into a gas phase ion spectrometer (e.g., a mass spectrometer) that contains a substrate having a surface adapted for the presentation of an analyte for detection. The probes may be modified as a result of the analysis and may be disposable.
"Gas phase ion spectrometer" refers to an apparatus that measures a parameter which can be translated into mass-to-charge ratios of ions formed when a sample is volatilized and ionized. Generally ions created by laser desorption ionization bear a single charge, and mass-to-charge ratios are often simply referred to as mass. Gas phase ion spectrometers include, for example, mass spectrometers, ion mobility spectrometers, and total ion current measuring devices. "Mass spectrometer" refers to a gas phase ion spectrometer that includes an inlet system, an ionization source, an ion optic assembly, a mass analyzer, and a detector. Examples of mass spectrometers are time-of-flight, magnetic sector, quadrapole filter, ion trap, ion cyclotron resonance and hybrids of these.
"Laser desorption mass spectrometer" refers to a mass spectrometer which uses laser as means to desorb, volatilize, and ionize an analyte.
"Mass spectrometry" refers to the analysis of sample by a mass spectrometer.
A "quadrupole time-of-flight mass spectrometer" refers to a mass spectrometer that contains a collisional damping interface that cools the ions formed by the energy source before the ions enter a quadrupole Q. The quadrupole time-of-flight mass spectrophotometer can also contain a collision cell.
"Analyte" refers to a component of a sample which is desirably retained and detected. The term can refer to a single component or a set of components in the sample. "Adsorbent" refers to any material capable of adsorbing an analyte. The term "adsorbent" is used herein to refer both to a single material ("monoplex adsorbent") (e.g., a compound or functional group) to which the analyte is exposed, and to a plurality of different materials ("multiplex adsorbent") to which a sample is exposed. The adsorbent materials in a multiplex adsorbent are referred to as "adsorbent species." For
example, an addressable location on a substrate can comprise a multiplex adsorbent characterized by many different adsorbent species (e.g., anion exchange materials, metal chelators, or antibodies), having different binding characteristics.
"Adsorb" refers to the detectable binding between an absorbent and an analyte either before or after washing with an eluant (selectivity threshold modifier).
"Substrate" refers to a solid phase to which an adsorbent is attached or deposited.
"Binding characteristic" refers to a chemical and physical feature that dictates the attraction of an adsorbent for an analyte. Two adsorbents have different binding characteristics if, under the same elution conditions, the adsorbents bind the same analyte with different degrees of affinity. Binding characteristics include, for example, degree of salt-promoted interaction, degree of hydrophobic interaction, degree of hydrophilic interaction, degree of electrostatic interaction, and others described herein.
"Binding conditions" refer to the binding characteristics to which an analyte is exposed.
"Eluant" refers to an agent, typically a solution, that is used to mediate adsoφtion of an analyte to an adsorbent. Eluants also are referred to as "selectivity threshold modifiers."
"Elution characteristic" refers to a feature that dictates the ability of a particular eluant (selectivity threshold modifier) to mediate adsoφtion between an analyte and an absorbent. Two eluants have different elution characteristics if, when put in contact with an analyte and adsorbent, the degree of affinity of the analyte for the adsorbent differs. Elution characteristics include, for example, pH, ionic strength, modification of water structure, detergent strength, modification of hydrophobic interactions, and others described herein.
"Elution conditions" refer to the elution characteristics to which an analyte is exposed.
"Selectivity characteristic" refers to a feature of the combination of an adsorbent having particular binding characteristics and an eluant having particular elution characteristics that dictate the specificity with which the analyte is retained to the adsorbent after washing with the eluant.
"Selectivity conditions" refer to the selectivity characteristics to which an analyte is exposed.
"Basis for attraction" refers to the chemical and/or physio-chemical properties which cause one molecule to be attracted to another.
"Strength of attraction" refers to the intensity of the attraction of one molecule for another (also known as affinity). "Resolve," "resolution," or "resolution of analyte" refers to the detection of at least one analyte in a sample. Resolution includes the detection of a plurality of analytes in a sample by separation and subsequent differential detection. Resolution does not require the complete separation of an analyte from all other analytes in a mixture. Rather, any separation that allows the distinction between at least two analytes suffices. "High information resolution" refers to resolution of an analyte in a manner that permits not only detection of the analyte, but also at least one physiochemical property of the analyte to be evaluated, e.g., molecular mass.
"Desoφtion spectrometry" refers to a method of detecting an analyte in which the analyte is exposed to energy which desorbs the analyte from a stationary phase into a gas phase, and the desorbed analyte or a distinguishable portion of it is directly detected by a detector, without an intermediate step of capturing the analyte on a second stationary phase.
"Detect" refers to identifying the presence, absence or amount of the object to be detected. "Retention" refers to an adsoφtion of an analyte by an adsorbent after washing with an eluant.
"Retention data" refers to data indicating the detection (optionally including detecting mass) of an analyte retained under a particular selectivity condition.
"Retention map" refers to a value set specifying retention data for an analyte retained under a plurality of selectivity conditions.
"Recognition profile" refers to a value set specifying relative retention of an analyte under a plurality of selectivity conditions.
"Complex" refers to analytes formed by the union of 2 or more analytes.
"Fragment" refers to the products of the chemical, enzymatic, or physical breakdown of an analyte. Fragments may be in a neutral or ionic state.
"Differential expression" refers to a detectable difference in the qualitative or quantitative presence of an analyte.
"Gene expression profile" refers to the identification of at least one mRNA expressed in a biological sample.
"Physio-chemical property" refers to a physical or chemical property of a molecule that is characteristic the molecule. Physio-chemical properties of proteins include, without limitation, amino acid sequence, molecular weight, iso-electric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, ligand binding sequence, charge at a specified pH (isoelectric point), dye binding, and metal chelate binding. A physiochemical property is used, e.g., as an identifier or means of fractionation or isolation in a protein profile. For example, an amino acid sequence feature such as a hexa-histidine sequence, ligand binding motif or sequence, domain, protease cleavage site, metal chelate binding site, or epitope, can be used to fractionate, isolate or identify a polypeptide comprising such a sequence. In another example, phosphorylated polypeptide can be fractionated, isolated or identified via interaction with a corresponding kinase or phosphorylase, or by a colorimetric enzyme reaction, or by an antibody that binds to the phosphorylated portion of the polypeptide. Similarly, a glycosylated polypeptide can be fractionated, isolated, or identified via an interaction with a binding partner, or an antibody that binds to the glycosylated portion of the polypeptide, or by an antibody that recognizes the carbohydrate, or by a lectin, or enzymatically. In another example, buffers and solutions of varying pH, or anionic or cationic resins, can be used to fractionate, isolate or identify polypeptides according to their charge at a given pH, or their pi or isoelectric point. In another example, buffers, solutions, and resins of varying hydrophilicity can be used to fractionate, isolate, or identify polypeptides based on their hydrophobicity or hydrophilicity. In another example, mass or molecular weight, or the mass or molecular weight of proteolytic fragments of the polypeptide can be used to isolate, identify, or fractionate the polypeptide. 
"Nucleic acid array" refers to an array of addressable locations (i.e., a location characterized by a distinctive, interrogatable address), each addressable location comprising a characteristic nucleic acid attached thereto. A nucleic acid can be any nucleic acid as defined herein, e.g., a naturally occurring or synthetic nucleic acid, e.g., an oligonucleotide or polynucleotide. In an oligonucleotide array, the nucleic acid is an oligonucleotide (e.g., corresponding to an exon, EST, or a portion of a gene, transcript, or cDNA); in an EST array the nucleic acid is an EST or portion thereof; in an mRNA array the nucleic acid is an mRNA or portion thereof, or a corresponding cDNA. An oligonucleotide can be from 4, 6, 8, 10, or 12 nucleotides or longer in length, often 10, 30, 40, or 50 nucleotides in length, up to about 100 nucleotides in length.
Gene Expression Profiling
A first step in the methods of the invention is performing gene expression profiling of a sample of interest. Gene expression profiling refers to examining expression of one or more RNAs in a cell, preferably mRNA. Often at least, or up to, 10, 100, 1000, 10,000 or more different mRNAs are examined in a single experiment. In one embodiment, differential profiling (comparison with another cell, e.g., that has a different phenotype, or is at a different temporal or developmental stage, or has been exposed to different environmental conditions, e.g., physical or chemical conditions, etc.) provides useful information about the cell of interest, e.g., genes that are preferentially or selectively expressed in a given cell type. Often, a gene of interest is highly expressed in one cell but not another. In other embodiments, the gene of interest has a similar expression pattern in different cells. In other embodiments, the gene of interest has low expression in one cell as compared to another.
Methods for examining gene expression, often but not always hybridization based, include, e.g., northern blots; dot blots; primer extension; nuclease protection; subtractive hybridization and isolation of non-duplexed molecules using, e.g., hydroxyapatite; solution hybridization; filter hybridization; amplification techniques such as RT-PCR and other PCR-related techniques such as differential display, LCR, AFLP, RAP, etc. (see, e.g., U.S. Patents 4,683,195 and 4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al., eds, 1990); Liang & Pardee, Science 257:967-971 (1992); Hubank & Schatz, Nuc. Acids Res. 22:5640-5648 (1994); Perucho et al., Methods Enzymol. 254:275-290 (1995)); fingerprinting, e.g., with restriction endonucleases (Ivanova et al., Nuc. Acids Res. 23:2954-2958 (1995); Kato, Nuc. Acids Res. 23:3685-3690 (1995); and Shimkets et al., Nature Biotechnology 17:798-803; see also U.S. Patent No. 5,871,697); and the use of structure specific endonucleases (see, e.g., De Francesco, The Scientist 12:16 (1998)). mRNA expression can also be analyzed using mass spectrometry techniques (e.g., MALDI or SELDI), liquid chromatography, and capillary gel electrophoresis, as described below. For a general description of these techniques, see also Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989), see, e.g., pages 7.37-7.39, 7.53-7.54, 7.58-7.66, and 7.71-7.79; Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 1994).
Techniques have been developed that expedite expression analysis and sequencing of large numbers of nucleic acid samples. For example, nucleic acid arrays have been developed for high density and high throughput expression analysis (see, e.g., Granjeaud et al., BioEssays 21:781-790 (1999); Lockhart & Winzeler, Nature 405:827-836 (2000)). Nucleic acid arrays refer to large numbers (e.g., hundreds, thousands, tens of thousands, or more) of nucleic acid probes bound to solid substrates, such as nylon, glass, or silicon wafers (see, e.g., Fodor et al., Science 251:767-773 (1991); Brown & Botstein, Nature Genet. 21:33-37 (1999); Eberwine, Biotechniques 20:584-591 (1996)). A single array can contain, e.g., probes corresponding to an entire genome, or to all genes expressed by the genome. The arrays can be DNA oligonucleotide arrays (e.g., GeneChip™, see, e.g., Lipshutz et al., Nat. Genet. 21:20-24 (1999)), mRNA arrays, cDNA arrays, EST arrays, or optically encoded arrays on fiber optic bundles (e.g., BeadArray™). The samples applied to the arrays for expression analysis can be, e.g., PCR products, cDNA, mRNA, etc.
Additional techniques for rapid gene sequencing and analysis of gene expression include, e.g., SAGE (serial analysis of gene expression). For SAGE, a short sequence tag (typically about 10-14 bp) contains sufficient information to uniquely identify a transcript. These sequence tags can be linked together to form long serial molecules that can be cloned and sequenced. Quantitation of the number of times a particular tag is observed provides the expression level of the corresponding transcript (see, e.g., Velculescu et al., Science 270:484-487 (1995); Velculescu et al., Cell 88 (1997); and de Waard et al., Gene 226:1-8 (1999)).
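The tag-counting step of SAGE described above can be illustrated with a short sketch. It assumes the tags have already been extracted from the sequenced concatemers; the function name `transcript_levels`, the tag sequences, and the tag-to-transcript lookup table are hypothetical and serve only as an illustration.

```python
from collections import Counter

def transcript_levels(tags, tag_to_transcript):
    """Return the fraction of observed tags assigned to each transcript.

    tags: list of short sequence tags (assumed already extracted).
    tag_to_transcript: hypothetical lookup from tag to transcript name.
    Tags missing from the lookup are pooled under "unassigned".
    """
    counts = Counter(tag_to_transcript.get(tag, "unassigned") for tag in tags)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# Illustrative data only: three observations of one tag, one each of two others.
example_tags = ["AAACCCGGGT"] * 3 + ["TTTGGGCCCA", "GGGGGGGGGG"]
example_lookup = {"AAACCCGGGT": "geneA", "TTTGGGCCCA": "geneB"}
levels = transcript_levels(example_tags, example_lookup)
```

Here the relative tag frequency stands in for the expression level; an actual SAGE analysis would also need to handle sequencing errors and tags that match more than one transcript.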
Physio-chemical Properties
As described herein, each of these techniques can be used, alone or in combination, to identify a candidate gene or set of candidate genes of interest that are expressed in a cell. Transcripts of interest are identified and isolated using techniques known to those of skill in the art. The transcript so identified is sequenced and, using the encoded amino acid sequence information, is analyzed for physiochemical characteristics, such as molecular weight, iso-electric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, protease fragmentation, ligand binding sequence, charge at a specified pH, and metal chelate binding. Often, bioinformatics and sequence databases can be used to identify a function of the protein encoded by the transcript. Genes of interest include, e.g., ion channels, receptors, e.g., G protein coupled receptors, cytokines, chemokines, signal transduction proteins, housekeeping proteins, cell cycle regulation proteins, transcription factors, zinc finger proteins, chromatin remodeling proteins, etc.
The physio-chemical properties so identified are tools for correlating the level of expression of a transcript with the level of expression of the protein encoded by the transcript. Using the protein analysis tools described below, one or more of the physio-chemical characteristics of the protein can be used to fractionate the proteins of interest, while reducing background and increasing sensitivity of protein detection. In this manner, a candidate transcript or transcripts of interest can be further correlated with the level of expression of the encoded protein in a cell. This information can be used to select a subset of transcripts and proteins for use in, e.g., diagnostic and therapeutic applications.
Protein Fractionation Analysis of Samples
Polypeptides in the sample are then fractionated based on at least one physio-chemical property of the polypeptide encoded by the identified expressed mRNA. For example, the identity of the polypeptide will indicate several predicted physiochemical characteristics of the polypeptide. Amino acid sequence will provide a predicted molecular mass of the protein. The amino acid sequence also can be used to predict the isoelectric point of the polypeptide, whether the polypeptide is hydrophilic or hydrophobic and whether the polypeptide has metal chelate binding ability. Amino acid sequence also can indicate whether the polypeptide includes glycosylation or phosphorylation sites. Post-translational modifications of the polypeptide will be reflected in changes to molecular weight. Amino acid sequence also can identify epitopes which, in turn, may be targets for antibody binding. An exact measurement of the physiochemical property is not necessary; it is sufficient to obtain some information so that upon fractionation into a plurality of aliquots based on that characteristic, the polypeptide is expected to be preferentially fractionated among the aliquots.
The polypeptides in the sample are then fractionated based on a physiochemical characteristic of the polypeptide. A particularly useful basis for separation is molecular weight, as there are many useful methods to separate proteins based on this characteristic including, for example, SDS gel electrophoresis and gas phase ion spectrometry, e.g., mass spectrometry. Another useful physiochemical characteristic is isoelectric point. Isoelectric focusing, affinity chromatography and solid phase extraction on an ion exchange resin will fractionate proteins in a sample based on this property.
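As an illustration of predicting molecular mass from an amino acid sequence, the following sketch sums standard average residue masses and adds one water for the free termini. The function name `predicted_average_mass` is hypothetical, and the sketch assumes an unmodified, single-chain polypeptide written in one-letter code; post-translational modifications, which shift the observed mass as noted above, are not modeled.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N-terminal H and C-terminal OH

def predicted_average_mass(sequence):
    """Predict the average mass (Da) of an unmodified linear polypeptide."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS

# Illustrative use: a dipeptide, glycyl-alanine.
gly_ala_mass = predicted_average_mass("GA")
```

A predicted mass of this kind only needs to be approximate for the purpose described above: it indicates in which fraction or mass window the polypeptide is expected to appear.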
Methods of fractionating proteins are used to examine the level of expression of a selected protein in a cell. As described above, the use of one or more selected physiochemical characteristics can enhance the sensitivity of fractionation and reduce background. The techniques described herein can be used to examine one or more proteins expressed in a cell, up to tens, hundreds, thousands, or tens of thousands of proteins. Any one technique or a combination of techniques can be used to fractionate the proteins, based on one or more physio-chemical properties. Methods of fractionation include, e.g., two dimensional gels; capillary gel electrophoresis; mass spectrometry, e.g., MALDI, SELDI; ICAT (isotope coded affinity tag, see, e.g., Mann, Nature Biotechnology 17:954-955 (1999); Gygi et al., Nature Biotechnology 17:994-999 (1999)); chromatography, e.g., gel-filtration, ion-exchange, affinity, immunoaffinity, and metal chelate chromatography; HPLC, e.g., reversed phase, ion-exchange, and size exclusion HPLC; western blotting; immunohistochemistry techniques such as ELISA and in situ screening with antibodies, etc. (see, e.g., Blackstock & Weir, Trends in Biotech. 17:121-127 (1999); Dutt & Lee, Biochemical Engineering, pages 176-179 (April 2000); Page et al., Drug Discovery Today 4:55-62 (1999); Wang & Hewick, Drug Discovery Today 4:129-133 (1999); Regnier et al., Trends in Biotech. 17:101-106 (1999); and Pandey & Mann, Nature 405:837-846 (2000)). The proteins of interest are identified and isolated using techniques known to those of skill in the art.
For a general description of these techniques, see also Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989); and Current Protocols in Molecular Biology (Ausubel et al., eds., 1994).
In one embodiment, two-dimensional electrophoresis can be used to fractionate the proteins of the invention. This technique fractionates proteins based on the physio-chemical characteristics of pI and molecular weight. 2D gel electrophoresis and the techniques described herein can be used alone, or in combination with other techniques such as mass spectrometry, e.g., MALDI and SELDI, described herein below.
In another embodiment, described below, MALDI is a mass spectrometry technique that fractionates proteins based on mass, and is often combined with size and/or affinity chromatography techniques to increase resolution.
In another embodiment, described below, SELDI is a mass spectrometry technique that couples affinity fractionation with mass spectrometry. An affinity matrix or probe based on, e.g., pI (ion exchange resin and wash), antibody binding, glycosylation, phosphorylation, histidine residues, etc. is used in SELDI, in combination with mass spectrometry, to identify proteins with high resolution, accuracy, and sensitivity. When using this technique, an affinity matrix that enriches for the candidate polypeptides can be determined, based on the physio-chemical characteristics of the protein encoded by the transcript.
Mass Spectrometry Analysis of Samples
Introduction
The polypeptides of the invention or fragments thereof can be analyzed using mass spectrometry methods. This method fractionates the polypeptides based on mass. In certain embodiments, a gas phase ion spectrometer is used. In other embodiments, laser-desorption ionization mass spectrometry is used to analyze the sample on the substrate-bound adsorbent. Modern laser desorption/ionization mass spectrometry ("LDI-MS") can be practiced in two main variations: matrix assisted laser desorption/ionization ("MALDI") mass spectrometry and surface-enhanced laser desorption/ionization ("SELDI"). Mass spectrometers utilizing laser desorption/ionization mass spectrometry can be further coupled to a quadrupole time-of-flight mass spectrometer.
In MALDI, the analyte, which may contain biological molecules, is mixed with a solution containing a matrix, and a drop of the liquid is placed on the surface of a substrate. The matrix solution then co-crystallizes with the biological molecules. The substrate is inserted into the mass spectrometer. Laser energy is directed to the substrate surface where it desorbs and ionizes the biological molecules without significantly fragmenting them. However, MALDI has limitations as an analytical tool. It does not provide means for fractionating the sample, and the matrix material can interfere with detection, especially for low molecular weight analytes. See, e.g., U.S. Patent 5,118,937 (Hillenkamp et al.), and U.S. Patent 5,045,694 (Beavis & Chait).
In SELDI, the substrate surface is modified so that it is an active participant in the desorption process. In one variant, the surface is derivatized with affinity reagents that selectively bind the analyte. In another variant, the surface is derivatized with energy absorbing molecules that are not desorbed when struck with the laser. In another variant, the surface is derivatized with molecules that bind the analyte and that contain a photolytic bond that is broken upon application of the laser. In each of these methods, the derivatizing agent generally is localized to a specific location on the substrate surface where the sample is applied. See, e.g., U.S. Patents 5,719,060 and 6,020,208 (Hutchens & Yip) and WO 98/59360, WO 98/59361, and WO 98/59362 (Hutchens & Yip). The two methods can be combined by, for example, using a SELDI affinity surface to capture an analyte and adding matrix-containing liquid to the captured analyte to provide the energy absorbing material.
In certain embodiments, the laser desorption/ionization mass spectrometer is further coupled to a quadrupole time-of-flight mass spectrometer, QqTOF MS (see, e.g., Kratchinsky et al., WO 99/38185). Methods such as MALDI-QqTOF MS (Kratchinsky et al., WO 99/38185; Shevchenko et al. (2000) Anal. Chem. 72:2132-2141), ESI-QqTOF MS (Figeys et al. (1998) Rapid Comm'ns. Mass Spec. 12:1435-144) and chip capillary electrophoresis (chip-CE)-QqTOF MS (Li et al. (2000) Anal. Chem. 72:599-609) have been described previously.
In one embodiment, a mass spectrometer is used to fractionate protein samples of the invention. In a typical mass spectrometer, a substrate containing a polypeptide analyte is introduced into an inlet system of the mass spectrometer. The analyte is then desorbed by a desorption source such as a laser, fast atom bombardment, high energy plasma, electrospray ionization, thermospray ionization, liquid secondary ion MS, field desorption, etc. The generated desorbed, volatilized species consist of preformed ions or neutrals which are ionized as a direct consequence of the desorption event. Generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions. The ions exiting the mass analyzer are detected by a detector.
The detector then translates information of the detected ions into mass-to-charge ratios. Detection of the presence of a marker or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of a polypeptide bound to the substrate. Mass spectrometers and their techniques are well known to those of skill in the art. As any person skilled in the art understands, any of the components of a mass spectrometer (e.g., desorption source, mass analyzer, detector, etc.) can be combined with other suitable components described herein or those known in the art. For additional information regarding mass spectrometers, see, e.g., Principles of Instrumental Analysis, 3rd ed., Skoog, Saunders College Publishing, Philadelphia, 1985; and Kirk-Othmer Encyclopedia of Chemical Technology, 4th ed. Vol. 15 (John Wiley & Sons, New York 1995), pp. 1071-1094.
In one embodiment, a laser desorption time-of-flight mass spectrometer is used with the substrate of the present invention. In laser desorption mass spectrometry, a substrate with a bound marker is introduced into an inlet system. The marker is desorbed and ionized into the gas phase by a laser from the ionization source. The ions generated are collected by an ion optic assembly, and then, in a time-of-flight mass analyzer, ions are accelerated through a short high voltage field and allowed to drift into a high vacuum chamber. At the far end of the high vacuum chamber, the accelerated ions strike a sensitive detector surface at different times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ion formation and ion detector impact can be used to identify the presence or absence of molecules of specific mass to charge ratio.
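The dependence of flight time on mass can be made concrete with an idealized sketch. It assumes that each ion acquires kinetic energy zeV in the acceleration field and then drifts a field-free length L at constant velocity, so that t = L·sqrt(m / (2zeV)); the time spent in the acceleration region, initial ion velocities, and instrument calibration are ignored. The function names, the 20 kV voltage, and the 1 m drift length are assumptions chosen for illustration.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
DALTON_KG = 1.66053906660e-27          # kilograms per dalton

def flight_time(mass_da, charge, accel_voltage, drift_length_m):
    """Idealized drift time in seconds for an ion of the given mass and charge."""
    mass_kg = mass_da * DALTON_KG
    velocity = math.sqrt(2.0 * charge * ELEMENTARY_CHARGE_C * accel_voltage / mass_kg)
    return drift_length_m / velocity

def mass_to_charge(time_s, accel_voltage, drift_length_m):
    """Invert the relation: m/z in daltons per elementary charge from drift time."""
    kg_per_charge = 2.0 * ELEMENTARY_CHARGE_C * accel_voltage * (time_s / drift_length_m) ** 2
    return kg_per_charge / DALTON_KG

# Illustrative values: singly charged ions of 1,000 Da and 4,000 Da.
t_small = flight_time(1000.0, 1, 20000.0, 1.0)
t_large = flight_time(4000.0, 1, 20000.0, 1.0)
```

Under these assumptions, quadrupling the mass doubles the flight time, which is why ions of different mass-to-charge ratio reach the detector at different times.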
Retentate chromatography is a method for the multidimensional resolution of analytes in a sample. The method involves (1) selectively adsorbing analytes from a sample to a substrate under a plurality of different adsorbent/eluant combinations ("selectivity conditions") and (2) detecting the retention of adsorbed analytes by desorption spectrometry. Each selectivity condition provides a first dimension of separation, separating adsorbed analytes from those that are not adsorbed. Desorption mass spectrometry provides a second dimension of separation, separating adsorbed analytes from each other according to mass. Because retentate chromatography involves using a plurality of different selectivity conditions, many dimensions of separation are achieved. The relative adsorption of one or more analytes under the two selectivity conditions also can be determined. This multidimensional separation provides both resolution of the analytes and their characterization.
Further, the analytes thus separated remain docked in a retentate map that is amenable to further manipulation to examine, for example, analyte structure and/or function. Also, the docked analytes can, themselves, be used as adsorbents to dock other analytes exposed to the substrate. In sum, the present invention provides a rapid, multidimensional and high information resolution of analytes.
The method can take several forms. In one embodiment, the analyte is adsorbed to two different adsorbents at two physically different locations and each adsorbent is washed with the same eluant (selectivity threshold modifier). In another embodiment, the analyte is adsorbed to the same adsorbent at two physically different locations and washed with two different eluants. In another embodiment, the analyte is adsorbed to two different adsorbents in physically different locations and washed with two different eluants. In another embodiment, the analyte is adsorbed to an adsorbent and washed with a first eluant, and retention is detected; then, the adsorbed analyte is washed with a second, different eluant, and subsequent retention is detected.
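The retention map and recognition profile defined earlier can be sketched as simple data structures built from the spectra obtained under each selectivity condition. In this sketch a selectivity condition is an (adsorbent, eluant) pair and each spectrum is a mapping from detected mass to signal intensity; the function names, the adsorbent and eluant labels, and all numerical values are hypothetical, and relative retention is assumed to be the signal normalized to the strongest signal for that analyte.

```python
def build_retention_map(spectra):
    """Regroup per-condition spectra into per-analyte retention data.

    spectra: {(adsorbent, eluant): {mass: intensity}}
    returns: {mass: {(adsorbent, eluant): intensity}}
    """
    retention = {}
    for condition, peaks in spectra.items():
        for mass, intensity in peaks.items():
            retention.setdefault(mass, {})[condition] = intensity
    return retention

def recognition_profile(retention, mass, conditions):
    """Relative retention of one analyte across an ordered list of conditions."""
    signals = [retention.get(mass, {}).get(c, 0.0) for c in conditions]
    strongest = max(signals, default=0.0)
    return [s / strongest if strongest else 0.0 for s in signals]

# Illustrative conditions and spectra (all values invented for the example).
conditions = [
    ("anion exchange", "pH 7 buffer"),
    ("anion exchange", "pH 5 buffer"),
    ("hydrophobic", "pH 7 buffer"),
]
spectra = {
    conditions[0]: {12000.0: 8.0, 6500.0: 2.0},
    conditions[1]: {12000.0: 4.0},
    conditions[2]: {6500.0: 5.0},
}
retention = build_retention_map(spectra)
profile_12k = recognition_profile(retention, 12000.0, conditions)
```

Matching peaks across spectra by exact mass is a simplification; measured masses would normally be matched within a tolerance.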
Methods Of Performing Retentate Chromatography
Retentate chromatography is a particularly useful method for fractionating polypeptides in a sample. According to this method, the polypeptides are fractionated on a solid phase adsorbent which binds polypeptides based on particular physio-chemical properties. Unbound polypeptides are washed away. Then the retained polypeptides are further fractionated by mass spectrometry, thereby providing fractionation based on at least two physio-chemical properties.
Exposing The Analyte to Selectivity Conditions
Substrate preparation: In performing retentate chromatography, an analyte that is retained by an adsorbent is presented to an energy source on a substrate. A sample containing the analyte may be contacted to the adsorbent before or after the adsorbent is affixed to the substrate that will serve to present the analyte to the desorption means. For contacting purposes, the adsorbent may be in liquid form or solid form (i.e., on a substrate or solid phase). Specifically, the adsorbent may be in the form of a solution, suspension, dispersion, water-in-oil emulsion, oil-in-water emulsion, or microemulsion. When the adsorbent is provided in the form of a suspension, dispersion, emulsion or microemulsion, a suitable surfactant may also be present. In this embodiment, the sample may be contacted with the adsorbent by admixing a liquid sample with the liquid adsorbent. Alternatively, the sample may be provided on a solid support and contacting will be accomplished by bathing, soaking, or dipping the sample-containing solid support in the liquid adsorbent. In addition, the sample may be contacted by spraying or washing over the solid support with the liquid adsorbent. In this embodiment, different adsorbents may be provided in different containers.
In one embodiment, the adsorbent is provided on a substrate. The substrate can be any material which is capable of binding or holding the adsorbent. Typically, the substrate is comprised of glass; ceramic; electrically conducting polymers (e.g. carbonized PEEK); TEFLON® coated materials; organic polymers; native biopolymers; metals (e.g., nickel, brass, steel or aluminum); films; porous and non-porous beads of cross-linked polymers (e.g., agarose, cellulose or dextran); other insoluble polymers; or combinations thereof.
In one embodiment, the substrate takes the form of a probe or a sample presenting means that is inserted into a desorption detector. For example, referring to Fig. 1, the substrate can take the form of a strip. The adsorbent can be attached to the substrate in the form of a linear array of spots, each of which can be exposed to the analyte. Several strips can be joined together so that the plurality of adsorbents form an array 30 having discrete spots in defined rows. The substrate also can be in the form of a plate having an array of horizontal and vertical rows of adsorbents which form a regular geometric pattern such as a square, rectangle or circle.
Probes can be produced as follows. The substrate can be any solid material, for example, stainless steel, aluminum or a silicon wafer. A metal substrate can then be coated with a material that allows derivatization of the surface. For example, a metal surface can be coated with silicon oxide, titanium oxide or gold.
The surface is then derivatized with a bifunctional linker. The linker includes at one end a functional group that can covalently bind with a functional group on the surface. Thus the functional group can be an inorganic oxide or a sulfhydryl group for gold. The other end of the linker generally has an amino functionality. Useful bifunctional linkers include aminopropyl triethoxysilane or aminoethyl disulfide.
Once bound to the surface, the linkers are further derivatized with groups that function as the adsorbent. Generally the adsorbent is added to addressable locations on the probe. In one type of probe, spots of about 3 mm in diameter are arranged in an orthogonal array. The adsorbents can, themselves, be part of bifunctional molecules containing a group reactive with the available amino group and the functional group that acts as the adsorbent. Functional groups include, for example, normal phase (silicon oxide), reverse phase (C18 aliphatic hydrocarbon), quaternary amine and sulphonate. Also, the surface can be further derivatized with other bifunctional molecules such as carbodiimide and N-hydroxysuccinimide, creating a pre-activated blank. These blanks can be functionalized with bioorganic adsorbents (e.g., nucleic acids, antibodies and other protein ligands). Biopolymers can bind the functional groups on the blanks through amine residues or sulfhydryl residues. In one embodiment, the adsorbents are bound to cross-linked polymers (e.g., films) that are themselves bound to the surface of the probe through the available functional groups. Such polymers include, for example, cellulose, dextran, carboxymethyl dextran, polyacrylamide and mixtures of these. Probes with attached adsorbents are ready for use.
In another embodiment, the adsorbent is attached to a first substrate to provide a solid phase, such as a polymeric or glass bead, which is subsequently positioned on a second substrate which functions as the means for presenting the sample to the desorbing energy of the desorption detector. For example, the second substrate can be in the form of a plate having a series of wells at predetermined addressable locations. The wells can function as containers for a first substrate derivatized with the adsorbent, e.g., polymeric beads derivatized with the adsorbent. One advantage of this embodiment is that the analyte can be adsorbed to the first substrate in one physical context, and transferred to the sample presenting substrate for analysis by desorption spectrometry.
Typically, the substrate is adapted for use with the detectors employed in the methods of the present invention for detecting the analyte bound to and retained by the adsorbent. In one embodiment, the substrate is removably insertable into a desorption detector where an energy source can strike the spot and desorb the analyte. The substrate can be suitable for mounting in a horizontally and/or vertically translatable carriage that horizontally and/or vertically moves the substrate to successively position each predetermined addressable location of adsorbent in a path for interrogation by the energy source and detection of the analyte bound thereto. The substrate can be in the form of a conventional mass spectrometry probe.
The strips, plates, or probes of substrate can be produced using conventional techniques. Thereafter, the adsorbent can be directly or indirectly coupled, fitted, or deposited on the substrate prior to contacting with the sample containing the analyte. The adsorbent may be directly or indirectly coupled to the substrate by any suitable means of attachment or immobilization. For example, the adsorbent can be directly coupled to the substrate by derivatizing the substrate with the adsorbent to directly bind the adsorbent to the substrate through covalent or non-covalent bonding.
Attachment of the adsorbent to the substrate can be accomplished through a variety of mechanisms. The substrate can be derivatized with a fully prepared adsorbent molecule by attaching the previously prepared adsorbent molecule to the substrate. Alternatively, the adsorbent can be formed on the substrate by attaching a precursor molecule to the substrate and subsequently adding additional precursor molecules to the growing chain bound to the substrate by the first precursor molecule. This mechanism of building the adsorbent on the substrate is particularly useful when the adsorbent is a polymer, particularly a biopolymer such as a DNA or RNA molecule. A biopolymer adsorbent can be provided by successively adding bases to a first base attached to the substrate using methods known in the art of oligonucleotide chip technology. See, e.g., U.S. Patent No. 5,445,934 (Fodor et al.).
As can be seen from Fig. 2, as few as two and as many as 10, 100, 1000, 10,000 or more adsorbents can be coupled to a single substrate. The size of the adsorbent site may be varied, depending on experimental design and purpose. However, it need not be larger than the diameter of the impinging energy source (e.g., laser spot diameter). The spots can contain the same or different adsorbents. In some cases, it is advantageous to provide the same adsorbent at multiple locations on the substrate to permit evaluation against a plurality of different eluants or so that the bound analyte can be preserved for future use or reference, perhaps in secondary processing. By providing a substrate with a plurality of different adsorbents, it is possible to utilize the plurality of binding characteristics provided by the combination of different adsorbents with respect to a single sample and thereby bind and detect a wider variety of different analytes. The use of a plurality of different adsorbents on a substrate for evaluation of a single sample is essentially equivalent to concurrently conducting multiple chromatographic experiments, each with a different chromatography column, but the present method has the advantage of requiring only a single system.
When the substrate includes a plurality of adsorbents, it is particularly useful to provide the adsorbents in predetermined addressable locations. By providing the adsorbents in predetermined addressable locations, it is possible to wash an adsorbent at a first predetermined addressable location with a first eluant and to wash an adsorbent at a second predetermined addressable location with a second eluant. In this manner, the binding characteristics of a single adsorbent for the analyte can be evaluated in the presence of multiple eluants which each selectively modify the binding characteristics of the adsorbent in a different way. The addressable locations can be arranged in any pattern, but preferably in regular patterns, such as lines, orthogonal arrays, or regular curves, such as circles. Similarly, when the substrate includes a plurality of different adsorbents, it is possible to evaluate a single eluant with respect to each different adsorbent in order to evaluate the binding characteristics of a given adsorbent in the presence of the eluant. It is also possible to evaluate the binding characteristics of different adsorbents in the presence of different eluants.
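The idea of predetermined addressable locations can be illustrated with a minimal sketch. It assumes a rectangular layout in which each row carries one adsorbent and each column is washed with one eluant; the function name `build_array` and the adsorbent and eluant names are hypothetical and are not taken from the specification.

```python
from itertools import product


def build_array(adsorbents, eluants):
    """Assign each (adsorbent, eluant) pair to one addressable location.

    Rows index adsorbents and columns index eluants, so location (r, c)
    carries adsorbents[r] and is washed with eluants[c].
    """
    return {
        (r, c): {"adsorbent": a, "eluant": e}
        for (r, a), (c, e) in product(enumerate(adsorbents), enumerate(eluants))
    }


# Hypothetical names, used only for illustration.
adsorbents = ["hydrophobic", "anionic", "metal chelate"]
eluants = ["pH 4 buffer", "pH 7 buffer"]
layout = build_array(adsorbents, eluants)

# Look up the selectivity condition applied at one address.
print(layout[(1, 0)])
```

With such a layout, reading across a row compares one adsorbent under several eluants, and reading down a column compares several adsorbents under one eluant.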
Incremental or Gradient Adsorbent Surfaces: A series of adsorbents having different binding characteristics can be provided by synthesizing a plurality of different polymeric adsorbents on the substrate. The different polymeric adsorbents can be provided by attaching a precursor molecule to the substrate, initiating the polymerization reaction, and terminating the polymerization reaction at varied degrees of completion for each adsorbent. Also, the terminal functional groups in the polymers can be reacted so as to chemically derivatize them to varying degrees with different affinity reagents (e.g., -NH3+ or -COO-). By terminating the polymerization or derivatization reaction at different points, adsorbents of varying degrees of polymerization or derivatization are produced. The varying degrees of polymerization or derivatization provide different binding characteristics for each different polymeric adsorbent. This embodiment is particularly useful for providing a plurality of different biopolymer adsorbents on a substrate. If desired, the polymerization reactions can be carried out in a reaction vessel, rather than on the substrate itself. For example, polymeric adsorbents of varying binding characteristics can be provided by extracting an aliquot of product from the reaction vessel as the polymerization/derivatization reaction is proceeding. The aliquots, having been extracted at various points during the polymerization/derivatization reaction, will exhibit varied degrees of polymerization/derivatization to yield a plurality of different adsorbents. The different aliquots of product can then be utilized as adsorbents having different binding characteristics. Alternatively, a plurality of different adsorbents can be provided by sequentially repeating the steps of terminating the reaction, withdrawing an aliquot of product, and re-starting the polymerization/derivatization reaction. The products extracted at each termination point will exhibit varying degrees of polymerization/derivatization and as a result will provide a plurality of adsorbents having different binding characteristics.
In one embodiment, a substrate is provided in the form of a strip or a plate that is coated with adsorbent in which one or more binding characteristics vary in a one- or two-dimensional gradient. For example, a strip is provided having an adsorbent that is weakly hydrophobic at one end and strongly hydrophobic at the other end. Or, a plate is provided that is weakly hydrophobic and anionic in one corner, and strongly hydrophobic and anionic in the diagonally opposite corner. Such adsorption gradients are useful in the qualitative analysis of an analyte. Adsorption gradients can be made by a controlled spray application or by flowing material across a surface in a time-wise manner to allow incremental completion of a reaction over the dimension of the gradient. This process can be repeated, at right angles, to provide orthogonal gradients of similar or different adsorbents with different binding characteristics.
The sample containing the analyte may be contacted to the adsorbent either before or after the adsorbent is positioned on the substrate using any suitable method which will enable binding between the analyte and the adsorbent. The adsorbent can simply be admixed or combined with the sample. The sample can be contacted to the adsorbent by bathing or soaking the substrate in the sample, or dipping the substrate in the sample, or spraying the sample onto the substrate, by washing the sample over the substrate, or by generating the sample or analyte in contact with the adsorbent. In addition, the sample can be contacted to the adsorbent by solubilizing the sample in or admixing the sample with an eluant and contacting the solution of eluant and sample to the adsorbent using any of the foregoing techniques (i.e., bathing, soaking, dipping, spraying, or washing over).
Contacting the analyte to the adsorbent: Exposing the sample to an eluant prior to binding the analyte to the adsorbent has the effect of modifying the selectivity of the adsorbent while simultaneously contacting the sample to the adsorbent. Those components of the sample which will bind to the adsorbent and thereby be retained will include only those components which will bind the adsorbent in the presence of the particular eluant which has been combined with the sample, rather than all components which will bind to the adsorbent in the absence of elution characteristics which modify the selectivity of the adsorbent. The sample should be contacted to the adsorbent for a period of time sufficient to allow the analyte to bind to the adsorbent. Typically, the sample is contacted with the adsorbent for a period of between about 30 seconds and about 12 hours. Preferably, the sample is contacted with the adsorbent for a period of between about 30 seconds and about 15 minutes. The temperature at which the sample is contacted to the adsorbent is a function of the particular sample and adsorbents selected. Typically, the sample is contacted to the adsorbent under ambient temperature and pressure conditions; however, for some samples, modified temperature (typically 4°C through 37°C) and pressure conditions can be desirable and will be readily determinable by those skilled in the art. Another advantage of the present invention over conventional detection techniques is that the present invention enables numerous different experiments to be conducted on a very small amount of sample. Generally, a volume of sample containing from a few attomoles to 100 picomoles of analyte in about 1 μl to 500 μl is sufficient for binding to the adsorbent. Analyte may be preserved for future experiments after binding to the adsorbent because any adsorbent locations which are not subjected to the steps of desorbing and detecting all of the retained analyte will retain the analyte thereon. Therefore, in the case where only a very small fraction of sample is available for analysis, the present invention provides the advantage of enabling a multitude of experiments with different adsorbents and/or eluants to be carried out at different times without wasting sample.
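To give a sense of scale for the sample amounts mentioned above, the following short sketch converts an amount of analyte and a sample volume into a molar concentration. The helper name `molar_concentration` is hypothetical, and the two cases shown simply combine the stated extremes of amount and volume; they are illustrations, not recommended operating points.

```python
ATTOMOLE = 1e-18    # mol
PICOMOLE = 1e-12    # mol
MICROLITER = 1e-6   # L


def molar_concentration(amount_mol, volume_l):
    """Return the concentration in mol/L for a given amount and volume."""
    return amount_mol / volume_l


# One attomole in 1 microliter (low end of the stated range).
low = molar_concentration(1 * ATTOMOLE, 1 * MICROLITER)

# 100 picomoles in 500 microliters (high end of the stated range).
high = molar_concentration(100 * PICOMOLE, 500 * MICROLITER)

print(f"low:  {low:.1e} M")
print(f"high: {high:.1e} M")
```

Under these assumptions the range spans roughly picomolar to sub-micromolar analyte concentrations.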
Washing the Adsorbent with Eluants: After the sample is contacted with the adsorbent, resulting in the binding of the analyte to the adsorbent, the adsorbent is washed with eluant. Typically, to provide a multi-dimensional analysis, each adsorbent location is washed with at least a first and a second, different eluant. Washing with the eluants modifies the analyte population retained on a specified adsorbent. The combination of the binding characteristics of the adsorbent and the elution characteristics of the eluant provides the selectivity conditions which control the analytes retained by the adsorbent after washing. Thus, the washing step selectively removes sample components from the adsorbent.
The washing step can be carried out using a variety of techniques. For example, as seen above, the sample can be solubilized in or admixed with the first eluant prior to contacting the sample to the adsorbent. Exposing the sample to the first eluant prior to or simultaneously with contacting the sample to the adsorbent has, to a first approximation, the same net effect as binding the analyte to the adsorbent and subsequently washing the adsorbent with the first eluant. After the combined solution is contacted to the adsorbent, the adsorbent can be washed with the second or subsequent eluants.
Washing an adsorbent having the analyte bound thereto can be accomplished by bathing, soaking, or dipping the substrate having the adsorbent and analyte bound thereon in an eluant; or by rinsing, spraying, or washing over the substrate with the eluant. The introduction of eluant to small diameter spots of affinity reagent is best achieved by a microfluidics process.
When the analyte is bound to adsorbent at only one location and a plurality of different eluants are employed in the washing step, information regarding the selectivity of the adsorbent in the presence of each eluant individually may be obtained. The analyte bound to adsorbent at one location may be determined after each washing with eluant by following a repeated pattern of washing with a first eluant, desorbing and detecting retained analyte, followed by washing with a second eluant, and desorbing and detecting retained analyte. The steps of washing followed by desorbing and detecting can be sequentially repeated for a plurality of different eluants using the same adsorbent. In this manner the adsorbent with retained analyte at a single location may be reexamined with a plurality of different eluants to provide a collection of information regarding the analytes retained after each individual washing.
The foregoing method is also useful when adsorbents are provided at a plurality of predetermined addressable locations, whether the adsorbents are all the same or different. However, when the analyte is bound to either the same or different adsorbents at a plurality of locations, the washing step may alternatively be carried out using a more systematic and efficient approach involving parallel processing. Namely, the step of washing can be carried out by washing an adsorbent at a first location with eluant, then washing a second adsorbent with eluant, then desorbing and detecting the analyte retained by the first adsorbent and thereafter desorbing and detecting analyte retained by the second adsorbent. In other words, all of the adsorbents are washed with eluant and thereafter analyte retained by each is desorbed and detected for each location of adsorbent. If desired, after detection at each adsorbent location, a second stage of washings for each adsorbent location may be conducted followed by a second stage of desorption and detection. The steps of washing all adsorbent locations, followed by desorption and detection at each adsorbent location, can be repeated for a plurality of different eluants. In this manner, an entire array may be utilized to efficiently determine the character of analytes in a sample. The method is useful whether all adsorbent locations are washed with the same eluant in the first washing stage or whether the plurality of adsorbents are washed with a plurality of different eluants in the first washing stage.
Detection
Analytes retained by the adsorbent after washing are adsorbed to the substrate. Analytes retained on the substrate are detected by desorption spectrometry: desorbing the analyte from the adsorbent and directly detecting the desorbed analytes.
Methods For Desorption: Desorbing the analyte from the adsorbent involves exposing the analyte to an appropriate energy source. Usually this means striking the analyte with radiant energy or energetic particles. For example, the energy can be light energy in the form of laser energy (e.g., UV laser) or energy from a flash lamp. Alternatively, the energy can be a stream of fast atoms. Heat may also be used to induce/aid desorption.
Methods of desorbing and/or ionizing analytes for direct analysis are well known in the art. One such method is called matrix-assisted laser desorption/ionization, or MALDI. In MALDI, the analyte solution is mixed with a matrix solution and the mixture is allowed to crystallize after being deposited on an inert probe surface; trapping the analyte within the crystals may enable desorption. The matrix is selected to absorb the laser energy and apparently impart it to the analyte, resulting in desorption and ionization. Generally, the matrix absorbs in the UV range. MALDI for large proteins is described in, e.g., U.S. patent 5,118,937 (Hillenkamp et al.) and U.S. patent 5,045,694 (Beavis and Chait).
Surface-enhanced laser desorption/ionization, or SELDI, represents a significant advance over MALDI in terms of specificity, selectivity and sensitivity. SELDI is described in United States patent 5,719,060 (Hutchens and Yip). SELDI is a solid phase method for desorption in which the analyte is presented to the energy stream on a surface that enhances analyte capture and/or desorption. In contrast, MALDI is a liquid phase method in which the analyte is mixed with a liquid material that crystallizes around the analyte.
One version of SELDI, called SEAC (Surface-Enhanced Affinity Capture), involves presenting the analyte to the desorbing energy in association with an affinity capture device (i.e., an adsorbent). It was found that when an analyte is so adsorbed, it can be presented to the desorbing energy source with a greater opportunity to achieve desorption of the target analyte. An energy absorbing material can be added to the probe to aid desorption. Then the probe is presented to the energy source for desorbing the analyte.
Another version of SELDI, called SEND (Surface-Enhanced Neat Desorption), involves the use of a layer of energy absorbing material onto which the analyte is placed. A substrate surface comprises a layer of energy absorbing molecules chemically bonded to the surface and/or essentially free of crystals. Analyte is then applied alone (i.e., neat) to the surface of the layer, without being substantially mixed with it.
The energy absorbing molecules, as matrix molecules do, absorb the desorbing energy and cause the analyte to be desorbed. This improvement is substantial because analytes can now be presented to the energy source in a simpler and more homogeneous manner, since the preparation of solution mixtures and random crystallization are eliminated. This provides more uniform and predictable results that enable automation of the process. The energy absorbing material can be classical matrix material or can be matrix material whose pH has been neutralized or brought into the basic range. The energy absorbing molecules can be bound to the probe through covalent or noncovalent means.
Another version of SELDI, called SEPAR (Surface-Enhanced Photolabile Attachment and Release), involves the use of photolabile attachment molecules. A photolabile attachment molecule is a divalent molecule having one site covalently bound to a solid phase, such as a flat probe surface or another solid phase, such as a bead, that can be made part of the probe, and a second site that can be covalently bound with the affinity reagent or analyte. The photolabile attachment molecule, when bound to both the surface and the analyte, also contains a photolabile bond that can release the affinity reagent or analyte upon exposure to light. The photolabile bond can be within the attachment molecule or at the site of attachment to either the analyte (or affinity reagent) or the probe surface.
Method For Direct Detection Of Analytes: The desorbed analyte can be detected by any of several means. When the analyte is ionized in the process of desorption, such as in laser desorption/ionization mass spectrometry, the detector can be an ion detector. Mass spectrometers generally include means for determining the time-of-flight of desorbed ions. This information is converted to mass. However, one need not determine the mass of desorbed ions to resolve and detect them: the fact that ionized analytes strike the detector at different times provides detection and resolution of them.
Alternatively, the analyte can be detectably labeled with, e.g., a fluorescent moiety or with a radioactive moiety. In these cases, the detector can be a fluorescence or radioactivity detector. A plurality of detection means can be implemented in series to fully interrogate the analyte components and function associated with retentate at each location in the array.
Desorption Detectors: Desorption detectors comprise means for desorbing the analyte from the adsorbent and means for directly detecting the desorbed analyte. That is, the desorption detector detects desorbed analyte without an intermediate step of capturing the analyte in another solid phase and subjecting it to subsequent analysis. Detection of an analyte normally will involve detection of signal strength. This, in turn, reflects the quantity of analyte adsorbed to the adsorbent.
Beyond these two elements, the desorption detector also can have other elements. One such element is means to accelerate the desorbed analyte toward the detector. Another element is means for determining the time-of-flight of analyte from desorption to detection by the detector. A preferred desorption detector is a laser desorption/ionization mass spectrometer, which is well known in the art. The mass spectrometer includes a port into which the substrate that carries the adsorbed analytes, e.g., a probe, is inserted. Desorption is accomplished by striking the analyte with energy, such as laser energy. The device can include means for translating the surface so that any spot on the array is brought into line with the laser beam. Striking the analyte with the laser results in desorption of the intact analyte into the flight tube and its ionization. The flight tube generally defines a vacuum space. Electrified plates in a portion of the vacuum tube create an electrical potential which accelerates the ionized analyte toward the detector. A clock measures the time of flight, and the system electronics determine the velocity of the analyte and convert this to mass. As any person skilled in the art understands, any of these elements can be combined with other elements described herein in the assembly of desorption detectors that employ various means of desorption, acceleration, detection, measurement of time, etc.
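The conversion from time of flight to mass can be sketched as follows. This is a minimal illustration, assuming an idealized linear instrument in which an ion of charge z is accelerated through a potential U and then drifts a length L at constant velocity, so that zeU = ½mv² with v = L/t. The function names, the flight length, and the accelerating voltage are hypothetical; the specification does not give instrument parameters, and real instruments are typically calibrated against known standards rather than computed from first principles.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # atomic mass constant, kg


def tof_to_mass_da(time_s, flight_length_m, accel_voltage_v, charge=1):
    """Convert a measured time of flight to ion mass in daltons.

    Uses z*e*U = 0.5*m*v**2 with v = L/t (idealized field-free drift).
    """
    velocity = flight_length_m / time_s
    mass_kg = 2 * charge * E_CHARGE * accel_voltage_v / velocity**2
    return mass_kg / DALTON


def mass_da_to_tof(mass_da, flight_length_m, accel_voltage_v, charge=1):
    """Inverse relation: expected flight time for an ion of known mass."""
    mass_kg = mass_da * DALTON
    return flight_length_m * math.sqrt(
        mass_kg / (2 * charge * E_CHARGE * accel_voltage_v)
    )


# Hypothetical instrument: 1 m flight tube, 20 kV accelerating potential.
L_M, U_V = 1.0, 20_000.0
t = mass_da_to_tof(14_300.0, L_M, U_V)   # a protein of about 14.3 kDa
print(f"flight time: {t * 1e6:.1f} microseconds")
print(f"recovered mass: {tof_to_mass_da(t, L_M, U_V):.1f} Da")
```

Because flight time scales with the square root of mass in this model, an ion four times heavier arrives after twice the time, which is why arrival time alone is enough to resolve ions of different mass.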
Selectivity Conditions
One advantage of the invention is the ability to expose the analytes to a variety of different binding and elution conditions, thereby providing both increased resolution of analytes and information about them in the form of a recognition profile. As in conventional chromatographic methods, the ability of the adsorbent to retain the analyte is directly related to the attraction or affinity of the analyte for the adsorbent as compared to the attraction or affinity of the analyte for the eluant or the eluant for the adsorbent. Some components of the sample may have no affinity for the adsorbent and therefore will not bind to the adsorbent when the sample is contacted to the adsorbent. Due to their inability to bind to the adsorbent, these components will be immediately separated from the analyte to be resolved. However, depending upon the nature of the sample and the particular adsorbent utilized, a number of different components can initially bind to the adsorbent.
Adsorbents
Adsorbents are the materials that bind analytes. A plurality of adsorbents can be employed in retentate chromatography. Different adsorbents can exhibit grossly different binding characteristics, somewhat different binding characteristics, or subtly different binding characteristics. Adsorbents which exhibit grossly different binding characteristics typically differ in their bases of attraction or mode of interaction. The basis of attraction is generally a function of chemical or biological molecular recognition. Bases for attraction between an adsorbent and an analyte include, for example, (1) a salt-promoted interaction, e.g., hydrophobic interactions, thiophilic interactions, and immobilized dye interactions; (2) hydrogen bonding and/or van der Waals forces interactions and charge transfer interactions, such as in the case of hydrophilic interactions; (3) electrostatic interactions, such as an ionic charge interaction, particularly positive or negative ionic charge interactions; (4) the ability of the analyte to form coordinate covalent bonds (i.e., coordination complex formation) with a metal ion on the adsorbent; (5) enzyme-active site binding; (6) reversible covalent interactions, for example, disulfide exchange interactions; (7) glycoprotein interactions; (8) biospecific interactions; or (9) combinations of two or more of the foregoing modes of interaction. That is, the adsorbent can exhibit two or more bases of attraction, and thus be known as a "mixed functionality" adsorbent.
Salt-promoted Interaction Adsorbents: Adsorbents which are useful for observing salt-promoted interactions include hydrophobic interaction adsorbents. Examples of hydrophobic interaction adsorbents include matrices having aliphatic hydrocarbons, specifically C1-C18 aliphatic hydrocarbons; and matrices having aromatic hydrocarbon functional groups such as phenyl groups.
Hydrophobic interaction adsorbents bind analytes which include uncharged solvent exposed amino acid residues, and specifically amino acid residues which are commonly referred to as nonpolar, aromatic and hydrophobic amino acid residues, such as phenylalanine and tryptophan. Specific examples of analytes which will bind to a hydrophobic interaction adsorbent include lysozyme and DNA. Without wishing to be bound by a particular theory, it is believed that DNA binds to hydrophobic interaction adsorbents by the aromatic nucleotides in DNA, specifically, the purine and pyrimidine groups.
Another adsorbent useful for observing salt-promoted interactions includes thiophilic interaction adsorbents, such as, for example, T-GEL®, which is one type of thiophilic adsorbent commercially available from Pierce, Rockford, Illinois. Thiophilic interaction adsorbents bind, for example, immunoglobulins such as IgG. The mechanism of interaction between IgG and T-GEL® is not completely known, but solvent exposed Trp residues are suspected to play a role.
A third adsorbent which involves salt-promoted ionic interactions and also hydrophobic interactions includes immobilized dye interaction adsorbents. Immobilized dye interaction adsorbents include matrices of immobilized dyes such as, for example, CIBACHRON™ blue available from Pharmacia Biotech, Piscataway, New Jersey. Immobilized dye interaction adsorbents bind proteins and DNA generally. One specific example of a protein which binds to an immobilized dye interaction adsorbent is bovine serum albumin (BSA).
Hydrophilic Interaction Adsorbents: Adsorbents which are useful for observing hydrogen bonding and/or van der Waals forces on the basis of hydrophilic interactions include surfaces comprising normal phase adsorbents such as silicon-oxide (i.e., glass). The normal phase or silicon-oxide surface, acts as a functional group. In addition, adsorbents comprising surfaces modified with hydrophilic polymers such as polyethylene glycol, dextran, agarose, or cellulose can also function as hydrophilic interaction adsorbents. Most proteins will bind hydrophilic interaction adsorbents because of a group or combination of amino acid residues (i.e., hydrophilic amino acid residues) that bind through hydrophilic interactions involving hydrogen bonding or van der Waals forces. Examples of proteins which will bind hydrophilic interaction adsorbents include myoglobin, insulin and cytochrome C.
In general, proteins with a high proportion of polar or charged amino acids will be retained on a hydrophilic surface. Alternatively, glycoproteins with surface exposed hydrophilic sugar moieties also have high affinity for hydrophilic adsorbents.
Electrostatic Interaction Adsorbents: Adsorbents which are useful for observing electrostatic or ionic charge interactions include anionic adsorbents such as, for example, matrices of sulfate anions (i.e., SO3⁻) and matrices of carboxylate anions (i.e., COO⁻) or phosphate anions (OPO3⁻). Matrices having sulfate anions are permanently negatively charged. However, matrices having carboxylate anions have a negative charge only at a pH above their pKa. At a pH below the pKa, the matrices exhibit a substantially neutral charge. Suitable anionic adsorbents also include anionic adsorbents which are matrices having a combination of sulfate and carboxylate anions and phosphate anions. The combination provides an intensity of negative charge that can be continuously varied as a function of pH. These adsorbents attract and bind proteins and macromolecules having positive charges, such as, for example, ribonuclease and lactoferrin. Without wishing to be bound by a particular theory, it is believed that the electrostatic interaction between an adsorbent and positively charged amino acid residues, including lysine residues, arginine residues, and histidyl residues, is responsible for the binding interaction.
Other adsorbents which are useful for observing electrostatic or ionic charge interactions include cationic adsorbents. Specific examples of cationic adsorbents include matrices of secondary, tertiary or quaternary amines. Quaternary amines are permanently positively charged. However, secondary and tertiary amines have charges that are pH dependent. At a pH below the pKa, secondary and tertiary amines are positively charged, and at a pH above their pKa, they are substantially neutral. Suitable cationic adsorbents also include cationic adsorbents which are matrices having combinations of different secondary, tertiary, and quaternary amines. The combination provides an intensity of positive charge that can be continuously varied as a function of pH. Cationic interaction adsorbents bind anionic sites on molecules including proteins having solvent exposed amino acid residues, such as aspartic acid and glutamic acid residues.
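The pH dependence of charge on titratable groups can be illustrated with the Henderson-Hasselbalch relation. The sketch below is an assumption-laden illustration, not part of the specification: it treats each group as an independent, ideal titratable site, and the function name `fraction_charged` and the pKa values (about 4.5 for a carboxylate, about 10 for an amine) are hypothetical round numbers.

```python
def fraction_charged(ph, pka, kind):
    """Fraction of ideal titratable groups that carry a charge at a given pH.

    kind="acid": group is negatively charged when deprotonated
                 (e.g., a carboxylate).
    kind="base": group is positively charged when protonated
                 (e.g., a secondary or tertiary amine).
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical pKa values, used only for illustration.
CARBOXYL_PKA = 4.5
AMINE_PKA = 10.0

for ph in (2.0, 4.5, 7.0, 10.0, 12.0):
    neg = fraction_charged(ph, CARBOXYL_PKA, "acid")
    pos = fraction_charged(ph, AMINE_PKA, "base")
    print(f"pH {ph:4.1f}: carboxylate charged {neg:.3f}, amine charged {pos:.3f}")
```

In this model a group is half charged at its pKa and close to fully charged or fully uncharged two pH units away, which is how mixing groups with different pKa values gives a charge density that varies smoothly with pH.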
In the case of ionic interaction adsorbents (both anionic and cationic) it is often desirable to use a mixed mode ionic adsorbent containing both anions and cations. Such adsorbents provide a continuous buffering capacity as a function of pH. The continuous buffering capacity enables the exposure of a combination of analytes to eluants having differing buffering components, especially in the pH range of from 2 to 11. This results in the generation of local pH environments on the adsorbent which are defined by immobilized titratable proton exchange groups. Such systems are equivalent to the solid phase separation technique known as chromatofocusing. Follicle stimulating hormone isoforms, which differ mainly in the charged carbohydrate components, are separated on a chromatofocusing adsorbent.
Still other adsorbents which are useful for observing electrostatic interactions include dipole-dipole interaction adsorbents in which the interactions are electrostatic but no formal charge or titratable proton donor or acceptor is involved.
Coordinate Covalent Interaction Adsorbents: Adsorbents which are useful for observing the ability to form coordinate covalent bonds with metal ions include matrices bearing, for example, divalent and trivalent metal ions. Matrices of immobilized metal ion chelators provide immobilized synthetic organic molecules that have one or more electron donor groups which form the basis of coordinate covalent interactions with transition metal ions. The primary electron donor groups functioning as immobilized metal ion chelators include oxygen, nitrogen, and sulfur. The metal ions are bound to the immobilized metal ion chelators, resulting in a metal ion complex having some number of remaining sites for interaction with electron donor groups on the analyte. Suitable metal ions include in general transition metal ions such as copper, nickel, cobalt, zinc, iron, and other metal ions such as aluminum and calcium. Without wishing to be bound by any particular theory, metal ions are believed to interact selectively with specific amino acid residues in peptides, proteins, or nucleic acids. Typically, the amino acid residues involved in such interactions include histidine residues, tyrosine residues, tryptophan residues, cysteine residues, and amino acid residues having oxygen groups such as aspartic acid and glutamic acid. For example, immobilized ferric ions interact with phosphoserine, phosphotyrosine, and phosphothreonine residues on proteins. Depending on the immobilized metal ion, only those proteins with sufficient local densities of the foregoing amino acid residues will be retained by the adsorbent. Some interactions between metal ions and proteins can be so strong that the protein cannot be severed from the complex by conventional means. Human β casein, which is highly phosphorylated, binds very strongly to immobilized Fe(III). Recombinant proteins which are expressed with a 6-Histidine tag bind very strongly to immobilized Cu(II) and Ni(II).
Enzyme-Active Site Interaction Adsorbents: Adsorbents which are useful for observing enzyme-active site binding interactions include proteases (such as trypsin), phosphatases, kinases, and nucleases.
The interaction is a sequence-specific interaction of the enzyme binding site on the analyte (typically a biopolymer) with the catalytic binding site on the enzyme. Enzyme binding sites of this type include, for example, active sites of trypsin interacting with proteins and peptides having lysine-lysine or lysine-arginine pairs in their sequence. More specifically, soybean trypsin inhibitor interacts with and binds to an adsorbent of immobilized trypsin. Alternatively, serine proteases are selectively retained on immobilized L-arginine adsorbent.
Reversible Covalent Interaction Adsorbents: Adsorbents which are useful for observing reversible covalent interactions include disulfide exchange interaction adsorbents. Disulfide exchange interaction adsorbents include adsorbents comprising immobilized sulfhydryl groups, e.g., immobilized mercaptoethanol or immobilized dithiothreitol. The interaction is based upon the formation of covalent disulfide bonds between the adsorbent and solvent exposed cysteine residues on the analyte. Such adsorbents bind proteins or peptides having cysteine residues and nucleic acids including bases modified to contain reduced sulfur compounds.
Glycoprotein Interaction Adsorbents: Adsorbents which are useful for observing glycoprotein interactions include glycoprotein interaction adsorbents such as adsorbents having immobilized lectins (i.e., proteins that bind oligosaccharides) therein, an example of which is CONCANAVALIN™, which is commercially available from Pharmacia Biotech of Piscataway, New Jersey. Such adsorbents function on the basis of the interaction involving molecular recognition of carbohydrate moieties on macromolecules. Examples of analytes which interact with and bind to glycoprotein interaction adsorbents include glycoproteins, particularly histidine-rich glycoproteins, whole cells and isolated subcellular fractions.
Biospecific Interaction Adsorbent: Adsorbents which are useful for observing biospecific interactions are generically termed "biospecific affinity adsorbents." Adsorption is considered biospecific if it is selective and the affinity (equilibrium dissociation constant, Kd) is at least 10⁻³ M (e.g., 10⁻⁵ M, 10⁻⁷ M, 10⁻⁹ M). Examples of biospecific affinity adsorbents include any adsorbent which specifically interacts with and binds a particular biomolecule. Biospecific affinity adsorbents include, for example, immobilized antibodies which bind to antigens; immobilized DNA which binds to DNA binding proteins, DNA, and RNA; immobilized substrates or inhibitors which bind to proteins and enzymes; immobilized drugs which bind to drug binding proteins; immobilized ligands which bind to receptors; immobilized receptors which bind to ligands; immobilized RNA which binds to DNA and RNA binding proteins; immobilized avidin or streptavidin which bind biotin and biotinylated molecules; immobilized phospholipid membranes and vesicles which bind lipid-binding proteins. Enzymes are useful adsorbents that can modify an analyte adsorbed thereto. Cells are useful as adsorbents. Their surfaces present complex binding characteristics. Adsorption to cells is useful for identifying, e.g., ligands or signal molecules that bind to surface receptors. Viruses or phage also are useful as adsorbents. Viruses frequently have ligands for cell surface receptors (e.g., gp120 for CD4). Also, in the form of a phage display library, phage coat proteins act as agents for testing binding to targets. Biospecific interaction adsorbents rely on known specific interactions such as those described above. Other examples of biospecific interactions for which adsorbents can be utilized will be readily apparent to those skilled in the art and are contemplated by the present invention.
In one embodiment, the biospecific adsorbent can further comprise an auxiliary, or "helper", molecule that does not directly participate in binding the target analyte.
Degrees of Binding Specificity: By exposure to adsorbents having different modes of interaction, the components of a sample can be grossly divided based upon their interaction with the different adsorbents. Thus, the attraction of the analyte for adsorbents having different modes of interaction provides a first separation parameter. For example, by exposing a sample containing the analyte to a first adsorbent with a basis of attraction involving hydrophobicity and a second adsorbent with a basis of attraction involving ionic charge, it is possible to separate from the sample those analytes which bind to a hydrophobic adsorbent and to separate those analytes which bind to an adsorbent having the particular ionic charge.
Adsorbents having different bases of attraction provide resolution of the analyte with a low degree of specificity because the adsorbent will bind not only the analyte, but any other component in the sample which also exhibits an attraction for the adsorbent by the same basis of attraction. For example, a hydrophobic adsorbent will bind not only a hydrophobic analyte, but also any other hydrophobic components in the sample; a negatively charged adsorbent will bind not only a positively charged analyte, but also any other positively charged component in the sample; and so on. The resolution of analytes based upon the basis of attraction of the analyte for the adsorbent can be further refined by exploiting binding characteristics of relatively intermediate specificity or altered strength of attraction. Resolution of the analyte on the basis of binding characteristics of intermediate specificity can be accomplished, for example, by utilizing mixed functionality adsorbents. Once the resolution of the analyte is accomplished with relatively low specificity, the binding characteristic found to attract the analyte of interest can be exploited in combination with a variety of other binding and elution characteristics to remove still more undesired components and thereby resolve the analyte.
For example, if the analyte binds to hydrophobic adsorbents, the analyte can be further resolved from other hydrophobic sample components by providing a mixed functionality adsorbent which exhibits as one basis of attraction a hydrophobic interaction and also exhibits a second, different basis of attraction. The mixed functionality adsorbent may exhibit hydrophobic interactions and negatively charged ionic interactions so as to bind hydrophobic analytes which are positively charged. Alternatively, the mixed functionality adsorbent can exhibit hydrophobic interactions and the ability to form coordinate covalent bonds with metal ions so as to bind hydrophobic analytes having the ability to form coordination complexes with metal ions on the adsorbent. Still further examples of adsorbents exhibiting binding characteristics of intermediate specificity will be readily apparent to those skilled in the art based upon the disclosure and examples set forth above.
The resolution of analytes on the basis of binding characteristics of intermediate specificity can be further refined by exploiting binding characteristics of relatively high specificity. Binding characteristics of relatively high specificity can be exploited by utilizing a variety of adsorbents exhibiting the same basis of attraction but a different strength of attraction. In other words, although the basis of attraction is the same, further resolution of the analyte from other sample components can be achieved by utilizing adsorbents having different degrees of affinity for the analyte.
For example, an analyte that binds an adsorbent based upon the analyte's acidic nature may be further resolved from other acidic sample components by utilizing adsorbents having affinity for analytes in specific acidic pH ranges. Thus the analyte may be resolved using one adsorbent attracted to sample components of pH 1-2, another adsorbent attracted to sample components of pH of 3-4, and a third adsorbent attracted to sample components of pH of 5-6. In this manner, an analyte having a specific affinity for an adsorbent which binds analyte of pH of 5-6 will be resolved from sample components of pH of 1-4. Adsorbents of increasing specificity can be utilized by decreasing the interval of attraction, i.e., the difference between the binding characteristics of adsorbents exhibiting the same basis of attraction.
A primary analyte adsorbed to a primary adsorbent can, itself, have adsorbent properties. In this case, the primary analyte adsorbed to a substrate can become a secondary adsorbent for isolating secondary analytes. In turn, the retained secondary analyte can function as a tertiary adsorbent to isolate a tertiary analyte from a sample. This process can continue through several iterations.
Eluants
The eluants, or wash solutions, selectively modify the threshold of adsorption between the analyte and the adsorbent. The ability of an eluant to desorb and elute a bound analyte is a function of its elution characteristics. Different eluants can exhibit grossly different elution characteristics, somewhat different elution characteristics, or subtly different elution characteristics.
The temperature at which the eluant is contacted to the adsorbent is a function of the particular sample and adsorbents selected. Typically, the eluant is contacted to the adsorbent at a temperature of between 0°C and 100°C, preferably between 4°C and 37°C. However, for some eluants, modified temperatures can be desirable and will be readily determinable by those skilled in the art.
As in the case of adsorbents, eluants which exhibit grossly different elution characteristics generally differ in their basis of attraction. For example, various bases of attraction between the eluant and the analyte include charge or pH, ionic strength, water structure, concentrations of specific competitive binding reagents, surface tension, dielectric constant and combinations of two or more of the above.
pH-Based Eluants: Eluants which modify the selectivity of the adsorbent based upon pH (i.e., charge) include known pH buffers, acidic solutions, and basic solutions. By washing an analyte bound to a given adsorbent with a particular pH buffer, the charge can be modified and therefore the strength of the bond between the adsorbent and the analyte in the presence of the particular pH buffer can be challenged. Those analytes which are less competitive than others for the adsorbent at the pH of the eluant will be desorbed from the adsorbent and eluted, leaving bound only those analytes which bind more strongly to the adsorbent at the pH of the eluant.
Ionic Strength-Based Eluants: Eluants which modify the selectivity of the adsorbent with respect to ionic strength include salt solutions of various types and concentrations. The amount of salt solubilized in the eluant solution affects the ionic strength of the eluant and modifies the adsorbent binding ability correspondingly. Eluants containing a low concentration of salt provide a slight modification of the adsorbent binding ability with respect to ionic strength. Eluants containing a high concentration of salt provide a greater modification of the adsorbent binding ability with respect to ionic strength.
Water Structure-Based Eluants: Eluants which modify the selectivity of the adsorbent by alteration of water structure or concentration include urea and chaotropic salt solutions. Typically, urea solutions include, e.g., solutions ranging in concentration from 0.1 to 8 M. Chaotropic salts which can be used to provide eluants include sodium thiocyanate. Water structure-based eluants modify the ability of the adsorbent to bind the analyte due to alterations in hydration or bound water structure. Eluants of this type include, for example, glycerol, ethylene glycol and organic solvents. Chaotropic anions increase the water solubility of nonpolar moieties thereby decreasing hydrophobic interactions between the analyte and the adsorbent.
Detergent-Based Eluants: Eluants which modify the selectivity of the adsorbent with respect to surface tension and analyte structure include detergents and surfactants. Suitable detergents for use as eluants include ionic and nonionic detergents such as CHAPS, TWEEN and NP-40. Detergent-based eluants modify the ability of the adsorbent to bind the analyte as the hydrophobic interactions are modified when the hydrophobic and hydrophilic groups of the detergent are introduced. Hydrophobic interactions between the analyte and the adsorbent, and within the analyte, are modified and charge groups are introduced, e.g., protein denaturation with ionic detergents such as SDS.
Hydrophobicity-Based Eluants: Eluants which modify the selectivity of the adsorbent with respect to dielectric constant are those eluants which modify the selectivity of the adsorbent with respect to hydrophobic interaction. Examples of suitable eluants which function in this capacity include urea (0.1-8 M), organic solvents such as propanol, acetonitrile, ethylene glycol and glycerol, and detergents such as those mentioned above. Use of acetonitrile as eluant is typical in reverse phase chromatography. Inclusion of ethylene glycol in the eluant is effective in eluting immunoglobulins from salt-promoted interactions with thiophilic adsorbents.
Combinations of Eluants: Suitable eluants can be selected from any of the foregoing categories or can be combinations of two or more of the foregoing eluants. Eluants which comprise two or more of the foregoing eluants are capable of modifying the selectivity of the adsorbent for the analyte on the basis of multiple elution characteristics.
Variability of Two Parameters
The ability to provide different binding characteristics by selecting different adsorbents and the ability to provide different elution characteristics by washing with different eluants permits variance of two distinct parameters, each of which is capable of individually affecting the selectivity with which analytes are bound to the adsorbent. The fact that these two parameters can be varied widely assures a broad range of binding attraction and elution conditions so that the methods of the present invention can be useful for binding and thus detecting many different types of analytes.
The selection of adsorbents and eluants for use in analyzing a particular sample will depend on the nature of the sample, and the particular analyte or class of analytes to be characterized, even if the nature of the analytes is not known. Typically, it is advantageous to provide a system exhibiting a wide variety of binding characteristics and a wide variety of elution characteristics, particularly when the composition of the sample to be analyzed is unknown. By providing a system exhibiting broad ranges of selectivity characteristics, the likelihood that the analyte of interest will be retained by one or more of the adsorbents is significantly increased.
One skilled in the art of chemical or biochemical analysis is capable of determining the selectivity conditions useful for retaining a particular analyte by providing a system exhibiting a broad range of binding and elution characteristics and observing binding and elution characteristics which provide the best resolution of the analyte. Because the present invention provides for systems including broad ranges of selectivity conditions, the determination by one skilled in the art of the optimum binding and elution characteristics for a given analyte can be easily accomplished without the need for undue experimentation.
Analytes
The present invention permits the resolution of analytes based upon a variety of biological, chemical, or physio-chemical properties of the analyte by exploiting the properties of the analyte through the use of appropriate selectivity conditions. Among the many properties of analytes which can be exploited through the use of appropriate selectivity conditions are the hydrophobic index (or measure of hydrophobic residues in the analyte), the isoelectric point (i.e., the pH at which the analyte has no net charge), the hydrophobic moment (or measure of amphipathicity of an analyte or the extent of asymmetry in the distribution of polar and nonpolar residues), the lateral dipole moment (or measure of asymmetry in the distribution of charge in the analyte), a molecular structure factor (accounting for the variation in surface contour of the analyte molecule such as the distribution of bulky side chains along the backbone of the molecule), secondary structure components (e.g., helix, parallel and antiparallel sheets), disulfide bonds, solvent-exposed electron donor groups (e.g., His), aromaticity (or measure of pi-pi interaction among aromatic residues in the analyte) and the linear distance between charged atoms.
These are representative examples of the types of properties which can be exploited for the resolution of a given analyte from a sample by the selection of appropriate selectivity characteristics in the methods of the present invention. Other suitable properties of analytes which can form the basis for resolution of a particular analyte from the sample will be readily known and/or determinable by those skilled in the art and are contemplated by the instant invention.
The inventive method is not limited with respect to the types of samples which can be analyzed. Samples can be in the solid, liquid, or gaseous state, although typically the sample will be in a liquid state. Solid or gaseous samples are preferably solubilized in a suitable solvent to provide a liquid sample according to techniques well within the skill of those in the art. The sample can be a biological composition, non-biological organic composition, or inorganic composition. The technique of the present invention is particularly useful for resolving analytes in a biological sample, particularly biological fluids and extracts; and for resolving analytes in non-biological organic compositions, particularly compositions of small organic and inorganic molecules. The analytes may be molecules, multimeric molecular complexes, macromolecular assemblies, cells, subcellular organelles, viruses, molecular fragments, ions, or atoms. The analyte can be a single component of the sample or a class of structurally, chemically, biologically, or functionally related components having one or more characteristics (e.g., molecular weight, isoelectric point, ionic charge, hydrophobic/hydrophilic interaction, etc.) in common.
Specific examples of analytes which may be resolved using the retentate chromatography methods of the present invention include biological macromolecules such as peptides, proteins, enzymes, polynucleotides, oligonucleotides, nucleic acids, carbohydrates, oligosaccharides, polysaccharides; fragments of biological macromolecules set forth above, such as nucleic acid fragments, peptide fragments, and protein fragments; complexes of biological macromolecules set forth above, such as nucleic acid complexes, protein-DNA complexes, receptor-ligand complexes, enzyme-substrate, enzyme inhibitors, peptide complexes, protein complexes, carbohydrate complexes, and polysaccharide complexes; small biological molecules such as amino acids, nucleotides, nucleosides, sugars, steroids, lipids, metal ions, drugs, hormones, amides, amines, carboxylic acids, vitamins and coenzymes, alcohols, aldehydes, ketones, fatty acids, porphyrins, carotenoids, plant growth regulators, phosphate esters and nucleoside diphospho-sugars; synthetic small molecules such as pharmaceutically or therapeutically effective agents, monomers, peptide analogs, steroid analogs, inhibitors, mutagens, carcinogens, antimitotic drugs, antibiotics, ionophores, antimetabolites, amino acid analogs, antibacterial agents, transport inhibitors, surface-active agents (surfactants), mitochondrial and chloroplast function inhibitors, electron donors, carriers and acceptors, synthetic substrates for proteases, substrates for phosphatases, substrates for esterases and lipases and protein modification reagents; and synthetic polymers, oligomers, and copolymers such as polyalkylenes, polyamides, poly(meth)acrylates, polysulfones, polystyrenes, polyethers, polyvinyl ethers, polyvinyl esters, polycarbonates, polyvinyl halides, polysiloxanes, POMA, PEG, and copolymers of any two or more of the above.
Identifying the Polypeptide Encoded by the mRNA
Once the polypeptides are fractionated, a next step is identifying a polypeptide from among the fractionated polypeptides that corresponds to the polypeptide encoded by the selected mRNA. The polypeptides in the sample have been fractionated based on a known physio-chemical property of the encoded polypeptide. This information is useful in discovering the encoded polypeptide from among the fractionated polypeptides. For example, one may know that an encoded polypeptide has a negative charge at pH 7 and a mass of about 18 kD. Using a protein biochip comprising an anionic adsorbent spot, one could capture proteins having a negative charge at pH 7. Then, using a mass spectrometer, the captured proteins are fractionated based on molecular weight, providing a spectrum. Examining the spectrum at around 18 kD will provide one or more candidate proteins having the selected physio-chemical properties. The candidates can now be examined further by a variety of methods described herein to determine their identity and correlate them with the expressed polypeptide. Similarly, two-dimensional gel electrophoresis separates proteins based on pI and molecular weight. Knowing the predicted mass and pI of the expressed protein leads the investigator to a particular region of the gel expected to comprise the protein. The proteins in that spot are then examined to correlate them to the expressed protein using, e.g., tandem mass spectrometric analysis coupled with interrogation of a protein database.
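The step of examining a spectrum around a predicted mass can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the function name `candidates_near_mass`, the 1% mass window, and the spectrum values are all hypothetical, chosen only to show how candidate peaks near 18 kD might be shortlisted.

```python
def candidates_near_mass(peaks, predicted_mass, tolerance_fraction=0.01):
    """Return peaks within a fractional mass window of predicted_mass.

    peaks: list of (mass_in_daltons, signal_intensity) tuples.
    The 1% default window is an illustrative assumption, not a value
    taken from the text.
    """
    window = predicted_mass * tolerance_fraction
    hits = [(mass, signal) for mass, signal in peaks
            if abs(mass - predicted_mass) <= window]
    # Closest masses first, so the most plausible candidates lead the list.
    return sorted(hits, key=lambda peak: abs(peak[0] - predicted_mass))


# Hypothetical spectrum from one adsorbent spot: (mass, intensity).
spectrum = [
    (12040.0, 35.0),
    (17890.0, 80.0),
    (18150.0, 22.0),
    (18400.0, 15.0),
    (24500.0, 60.0),
]

hits = candidates_near_mass(spectrum, predicted_mass=18000.0)
print(hits)
```

Each shortlisted peak is only a candidate; confirming its identity would still require the further examination described in the text, such as fragment analysis and a database search.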
Identification of proteins fractionated by mass spectrometry
The data of a mass spectrum can be used to identify the proteins present in a sample by executing an algorithm with a programmable digital computer that compares the MS data to records in a database. Each molecule provides characteristic mass-spectrometric (MS) data (also referred to as a mass spectral "signature" or "fingerprint") when analyzed by MS methods. This data can be analyzed by comparing it to databases containing, inter alia, actual or theoretical MS data or biopolymer sequence information. Additionally, a molecule may be cleaved into fragments for MS analysis. Information obtained from the MS analysis of fragments is also compared to a database to identify polypeptides in the analyte (Yates, J. Mass Spec. 33: 1-19 (1998); Yates et al, U.S. Patent No. 5,538,897; Yates et al, U.S. Patent No. 6,017,693).
Further methods for identifying proteins detected by SELDI are described, e.g., in U.S. Patent 6,225,047; International Patent Application PCT/US00/28163, and USSN 60/277,677, filed March 20, 2001.
Data generated by desorption and detection of polypeptides can be analyzed using any suitable means. In one embodiment, data is analyzed with the use of a programmable digital computer. The computer program generally contains a readable medium that stores codes. Certain code can be devoted to memory that includes the location of each feature on a substrate, the identity of the adsorbent at that feature and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the substrate defining certain selectivity characteristics (e.g., types of adsorbent and eluants used). The computer also contains code that receives as input, data on the strength of the signal at various molecular masses received from a particular addressable location on the substrate. This data can indicate the number of polypeptides detected, optionally including the strength of the signal and the determined molecular mass for each polypeptide detected.
Data analysis can include the steps of determining signal strength (e.g., height of peaks) of a polypeptide detected and removing "outliers" (data deviating from a predetermined statistical distribution). The observed peaks can be normalized, a process whereby the height of each peak relative to some reference is calculated. For example, a reference can be background noise generated by instrument and chemicals (e.g., energy absorbing molecule) which is set as zero in the scale. Then the signal strength detected for each polypeptide or other substances can be displayed in the form of relative intensities in the scale desired (e.g., 100). Alternatively, a standard may be admitted with the sample so that a peak from the standard can be used as a reference to calculate relative intensities of the signals observed for each polypeptide or other polypeptides detected.
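The normalization described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function name `normalize_peaks` is hypothetical, the background level is supplied by the caller, and the tallest peak is used as the reference for the top of the scale, which is one possible choice among several.

```python
def normalize_peaks(peak_heights, baseline, scale=100.0):
    """Express peak heights as relative intensities on a 0..scale axis.

    peak_heights: dict mapping molecular mass -> raw peak height.
    baseline: background noise level, treated as zero on the new scale.
    Using the tallest peak as the reference is an assumption made for
    this sketch; a spiked-in standard could serve as reference instead.
    """
    # Subtract the background so that noise maps to zero.
    corrected = {mass: max(height - baseline, 0.0)
                 for mass, height in peak_heights.items()}
    tallest = max(corrected.values(), default=0.0)
    if tallest == 0.0:
        return {mass: 0.0 for mass in corrected}
    return {mass: scale * height / tallest
            for mass, height in corrected.items()}


# Hypothetical raw peak heights keyed by molecular mass in daltons.
raw = {12040.0: 45.0, 17890.0: 85.0, 24500.0: 25.0}
relative = normalize_peaks(raw, baseline=5.0)
print(relative)
```

Outlier removal, which the text also mentions, is left out here because it depends on the statistical distribution chosen for a given data set.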
In certain embodiments, MS data and information obtained from that data are compared to a database consisting of data and information relating to biopolymers. For example, the database may consist of sequences of nucleotides or amino acids. The database may consist of nucleotide or amino acid sequences of expressed sequence tags (ESTs). Alternatively, the database may consist of sequences of genes at the nucleotide or amino acid level. The database can include, without limitation, a collection of nucleotide sequences, amino acid sequences, or translations of nucleotide sequences included in the genome of any species.
A database of information relating to biopolymers, e.g., sequences of nucleotides or amino acids, is typically analyzed via a computer program or a search algorithm which is optionally performed by a computer. Information from sequence databases is searched for best matches with data and information obtained from the methods of the present invention (see e.g., Yates (1998) J. Mass Spec. 33: 1-19; Yates et al, U.S. Patent No. 5,538,897; Yates et al, U.S. Patent No. 6,017,693). Any appropriate algorithm or computer program useful for searching a database can be used. Search algorithms and databases are constantly updated, and such updated versions will be used in accordance with the present invention. Examples of programs or databases can be found on the World Wide Web (WWW) at http://base-peak.wiley.com/, http://mac-mann6.embl-heidelberg.de/MassSpec/Software.html, http://www.mann.embl-heidelberg.de/Services/PeptideSearch/PeptideSearchIntro.html, ftp://ftp.ebi.ac.uk/pub/databases/, and http://donatello.ucsf.edu. U.S. Patent Nos. 5,632,041; 5,964,860; 5,706,498; and 5,701,256 also describe algorithms or methods for sequence comparison.
In one embodiment, the database of protein, peptide, or nucleotide sequences is a combination of databases. Examples of databases include, but are not limited to, ProteinProspector at the UCSF web site (prospector.ucsf.edu), the Genpept database, the GenBank database (described in Burks et al. (1990) Methods in Enzymology 183: 3-22), EMBL data library (described in Kahn et al. (1990) Methods in Enzymology 183: 23-31), the Protein Sequence Database (described in Barker et al. (1990) Methods in Enzymology 183: 31-49), SWISS-PROT (described in Bairoch et al. (1993) Nucleic Acids Res., 21: 3093-3096), and PIR-International (described in (1993) Protein Seq. Data Anal. 5: 67-192).
In a further embodiment, novel databases are generated for comparison to mass spectrometrically determined MS data, e.g., mass or mass spectra of cleaved protein and peptide fragments. For example, a theoretical database of all the possible amino acid sequence combinations of the peptide masses being characterized is generated (Parekh et al, WO 98/53323). Then, the database is compared with the actual masses determined using mass spectrometry to determine the amino acid sequence of the peptides in the sample.
In some embodiments, the mass of a polypeptide derived from a mass spectrum is used to query a database for those masses of proteins, or of proteins predicted from nucleic acid sequences, that provide the closest fit. In this manner, an unknown protein can be rapidly identified without an amino acid sequence. In other embodiments of the invention, the masses of fragments of the polypeptide can be compared to the predicted mass spectra of a database of proteins, or of proteins predicted from nucleic acid sequences, to find those that provide the closest fit. An algorithm or computer program generates a theoretical cleavage of sequences in a database with the same cleavage agent used to cleave the biopolymer analyzed by MS methods. Sequences or simulated cleavage fragments from the sequence database that fall within a desired range of similar sequence homologies to sequences generated from the MS data of parent or fragment molecules are designated "matches" or "hits." In this manner, the identity of the polypeptide or fragments thereof can be rapidly determined. The investigator can customize or vary the range of acceptable sequence homology comparison values according to each particular analysis.
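The theoretical cleavage and mass comparison described above can be sketched in Python. The sketch makes several assumptions that are not in the text: trypsin is the cleavage agent, cleavage follows the simplified rule "after K or R unless the next residue is P" with no missed cleavages, masses are monoisotopic and unmodified, and a protein is scored by the number of observed masses it explains. The function names, the two-entry database, and the tolerance are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless followed by P (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER


def score_protein(sequence, observed_masses, tolerance=0.5):
    """Count observed masses matching any theoretical fragment mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(1 for observed in observed_masses
               if any(abs(observed - t) <= tolerance for t in theoretical))


def best_match(database, observed_masses, tolerance=0.5):
    """Return the database entry explaining the most observed masses."""
    return max(database, key=lambda name: score_protein(
        database[name], observed_masses, tolerance))


# Hypothetical two-entry sequence database and observed fragment masses.
database = {"protA": "MKTAYRPLGK", "protB": "GASKVLLR"}
observed = [361.20, 499.35]
print(best_match(database, observed))
```

A count of matched masses is the simplest possible score; the programs cited in the text use more elaborate scoring, and the acceptance threshold would be set by the investigator for each analysis.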
Detection of polypeptides using SELDI
Detection of analytes adsorbed to an adsorbent under particular elution conditions provides information about analytes in a sample and their chemical character. Adsorption depends, in part, upon the binding characteristics of the adsorbent: Analytes that bind to an adsorbent possess the characteristic that makes binding possible. For example, molecules that are cationic at a particular pH will bind to an anionic adsorbent under elution conditions that include that pH. Strongly cationic molecules will only be eluted from the adsorbent under very strong elution conditions. Molecules with hydrophobic regions will bind to hydrophobic adsorbents, while molecules with hydrophilic regions will bind to hydrophilic adsorbents. Again, the strength of the interaction will depend, in part, upon the extent to which an analyte contains hydrophobic or hydrophilic regions. Thus, the determination that certain analytes in a sample bind to an adsorbent under certain elution conditions not only resolves analytes in a mixture by separating them from each other and from analytes that do not possess the appropriate chemical character for binding, but also identifies a class of analytes or individual analytes having the particular chemical character. Collecting information about analyte retention on one or more particular adsorbents under a variety of elution conditions provides not only detailed resolution of analytes in a mixture, but also chemical information about the analytes themselves that can lead to their identification. This data is referred to as "retention data."
Data generated in retention assays is most easily analyzed with the use of a programmable digital computer. The computer program generally contains a readable medium that stores codes. Certain code is devoted to memory that includes the location of each feature on a substrate array, the identity of the adsorbent at that feature and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the array defining certain selectivity characteristics. The computer also contains code that receives as input, data on the strength of the signal at various molecular masses received from a particular addressable location on the probe. This data can indicate the number of analytes detected, optionally including for each analyte detected the strength of the signal and the determined molecular mass.
The computer also contains code that processes the data. This invention contemplates a variety of methods for processing the data. In one embodiment, this involves creating an analyte recognition profile. For example, data on the retention of a particular analyte identified by molecular mass can be sorted according to a particular binding characteristic, for example, binding to anionic adsorbents or hydrophobic adsorbents. This collected data provides a profile of the chemical properties of the particular analyte. Retention characteristics reflect analyte function which, in turn, reflects structure. For example, retention to coordinate covalent metal chelators can reflect the presence of histidine residues in a polypeptide analyte. Using data of the level of retention to a plurality of cationic and anionic adsorbents under elution at a variety of pH levels reveals information from which one can derive the isoelectric point of a protein. This, in turn, reflects the probable number of ionic amino acids in the protein. Accordingly, the computer can include code that transforms the binding information into structural information. Furthermore, secondary processing of the analyte (e.g., post-translational modifications) results in an altered recognition profile reflected by differences in binding or mass.
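One way to picture an analyte recognition profile is as retention records for a single molecular mass, grouped by binding characteristic. The Python sketch below is illustrative only: the record fields (`mass`, `adsorbent`, `eluant`, `signal`), the function name `recognition_profile`, the mass tolerance, and the data values are hypothetical and are not prescribed by the text.

```python
from collections import defaultdict


def recognition_profile(records, analyte_mass, mass_tolerance=50.0):
    """Group retention signals for one analyte by adsorbent, then eluant.

    records: iterable of dicts with keys 'mass', 'adsorbent', 'eluant'
    and 'signal'. Records whose mass lies within mass_tolerance of
    analyte_mass are treated as the same analyte (an assumption).
    """
    profile = defaultdict(dict)
    for record in records:
        if abs(record["mass"] - analyte_mass) <= mass_tolerance:
            profile[record["adsorbent"]][record["eluant"]] = record["signal"]
    return dict(profile)


# Hypothetical retention data collected from several array features.
records = [
    {"mass": 18010.0, "adsorbent": "anionic", "eluant": "pH 5", "signal": 12.0},
    {"mass": 18010.0, "adsorbent": "anionic", "eluant": "pH 9", "signal": 70.0},
    {"mass": 17995.0, "adsorbent": "hydrophobic",
     "eluant": "10% acetonitrile", "signal": 40.0},
    {"mass": 24500.0, "adsorbent": "anionic", "eluant": "pH 9", "signal": 55.0},
]

profile = recognition_profile(records, analyte_mass=18000.0)
print(profile)
```

Deriving structural information such as an isoelectric point from a profile of this kind would need a further model relating retention to pH, which this sketch does not attempt.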
In another embodiment, retention assays are performed under the same set of selectivity thresholds on two different cell types, and the retention data from the two assays is compared. Differences in the retention maps (e.g., presence or strength of signal at any feature) indicate analytes that are differentially expressed by the two cells. This can include, for example, generating a difference map indicating the difference in signal strength between two retention assays, thereby indicating which analytes are increasingly or decreasingly retained by the adsorbent in the two assays.
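A difference map of this kind can be sketched in Python as a per-feature subtraction of signal strengths. The representation is an assumption made for illustration: each retention map is a dict keyed by a hypothetical (feature position, molecular mass) pair, masses are taken as already aligned between the two assays, and a missing key is read as zero signal.

```python
def difference_map(retention_a, retention_b):
    """Return signal in assay B minus signal in assay A for every key.

    retention_a, retention_b: dicts mapping (feature, mass) -> signal.
    A key absent from one assay is treated as zero signal there.
    """
    keys = set(retention_a) | set(retention_b)
    return {key: retention_b.get(key, 0.0) - retention_a.get(key, 0.0)
            for key in keys}


# Hypothetical retention maps for two cell types under the same conditions.
cell_type_a = {("A1", 18000): 40.0, ("A2", 24500): 10.0}
cell_type_b = {("A1", 18000): 90.0, ("B3", 12000): 25.0}

diff = difference_map(cell_type_a, cell_type_b)
print(diff)
```

Positive values mark analytes retained more strongly in the second assay and negative values the reverse; real spectra would first need peak alignment so that the same analyte shares a key in both maps.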
The computer program also can include code that receives instructions from a programmer as input. The progressive and logical pathway for selective desorption of analytes from specified, predetermined locations in the array can be anticipated and programmed in advance.
The computer can transform the data into another format for presentation. Data analysis can include the steps of determining, e.g., signal strength as a function of feature position from the data collected, removing "outliers" (data deviating from a predetermined statistical distribution), and calculating the relative binding affinity of the analytes from the remaining data.
The resulting data can be displayed in a variety of formats. In one format, the strength of a signal is displayed on a graph as a function of molecular mass. In another format, referred to as "gel format," the strength of a signal is displayed along a linear axis as intensity of darkness, resulting in an appearance similar to bands on a gel. In another format, signals reaching a certain threshold are presented as vertical lines or bars on a horizontal axis representing molecular mass. Accordingly, each bar represents an analyte detected. Data also can be presented in graphs of signal strength for an analyte grouped according to binding characteristic and/or elution characteristic.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Claims (30)
- WHAT IS CLAIMED IS: 1. A method of correlating gene and protein expression in a biological sample, the method comprising the steps of: a) obtaining the biological sample; b) generating a gene expression profile of the sample, thereby identifying an mRNA expressed in the sample; c) identifying a physio-chemical property of a polypeptide encoded by the mRNA; d) fractionating polypeptides in the sample on the basis of the physio-chemical property; and e) identifying the polypeptide encoded by the mRNA from among the fractionated proteins, wherein the identified polypeptide comprises the physio-chemical property; thereby correlating gene and protein expression in the sample.
- 2. The method of claim 1, wherein the biological sample comprises a cell lysate from a healthy cell.
- 3. The method of claim 1, wherein the biological sample comprises a cell lysate from a pathological cell.
- 4. The method of claim 1, wherein the biological sample comprises a cell lysate from a cell contacted by a toxic compound.
- 5. The method of claim 1, wherein the biological sample comprises a cell lysate from a cell of a subject who responds to a drug treatment or a subject who does not respond to a drug treatment.
- 6. The method of claim 1, wherein the biological sample comprises a cell lysate from a cell exposed to heat, cold, or radiation.
- 7. The method of claim 1, wherein the biological sample comprises a human cell.
- 8. The method of claim 1, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an EST array.
- 9. The method of claim 1, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an oligonucleotide array.
- 10. The method of claim 1, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an mRNA array.
- 11. The method of claim 1, wherein the mRNA is differentially expressed in two biological samples.
- 12. The method of claim 11, wherein the two biological samples are derived from a normal cell and a pathologic cell.
- 13. The method of claim 12, wherein the pathologic cell is a cancer cell.
- 14. The method of claim 11, wherein the two biological samples are derived from a healthy cell and a cell exposed to a toxic compound.
- 15. The method of claim 1, wherein the step of identifying the physio-chemical property of the polypeptide encoded by the mRNA further comprises identifying a plurality of physio-chemical properties.
- 16. The method of claim 1, wherein the step of identifying a physio-chemical property comprises predicting the masses of proteolytic fragments generated by the polypeptide encoded by the mRNA upon degradation of the polypeptide by a selected proteolytic agent, and the step of identifying the polypeptide encoded by the mRNA comprises subjecting polypeptides in the sample to degradation by the agent and identifying actual proteolytic fragments in the sample having masses that correspond to the masses of the predicted fragments.
- 17. The method of claim 1, wherein the physio-chemical property is selected from the group consisting of: amino acid sequence, molecular weight, iso-electric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, ligand binding sequence, charge at a specified pH, and metal chelate binding.
- 18. The method of claim 1, wherein the step of fractionating the polypeptides in the sample comprises 2D-gel electrophoresis.
- 19. The method of claim 1, wherein the step of fractionating the polypeptides in the sample comprises mass spectrometry.
- 20. The method of claim 1, wherein the step of fractionating the polypeptides in the sample comprises surface enhanced laser desorption ionization, wherein the surface enhanced laser desorption ionization comprises fractionating by affinity retention on solid phase-bound adsorbent followed by fractionating retained polypeptides from the solid phase by gas phase ion spectrometry.
- 21. The method of claim 20, wherein the adsorbent is selected to have affinity for polypeptides possessing at least one physio-chemical property selected from the group consisting of: amino acid sequence, molecular weight, iso-electric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, ligand binding sequence, charge at a specified pH, and metal chelate binding.
- 22. The method of claim 1, wherein the step of identifying the polypeptide comprises selecting a polypeptide from among the fractionated polypeptides, which selected polypeptide comprises the physio-chemical property, identifying the selected polypeptide and correlating the identity of the selected polypeptide with the polypeptide encoded by the mRNA.
- 23. A method of correlating gene and protein expression in a biological sample, the method comprising the steps of: a) obtaining a biological sample; b) generating a gene expression profile of the sample using a nucleic acid array, thereby identifying an mRNA expressed in the sample; c) identifying a physio-chemical property of a polypeptide encoded by the mRNA; d) fractionating polypeptides in the sample on the basis of the physio-chemical property, using mass spectrometry; and e) identifying the polypeptide encoded by the mRNA from among the fractionated proteins, wherein the identified polypeptide comprises the physio-chemical property; thereby correlating gene and protein expression in the cell.
- 24. The method of claim 23, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an EST array.
- 25. The method of claim 23, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an oligonucleotide array.
- 26. The method of claim 23, wherein the step of generating the gene expression profile comprises identifying expressed mRNA with an mRNA array.
- 27. The method of claim 23, wherein the step of identifying the polypeptide encoded by the mRNA comprises fractionating polypeptides in the sample by surface enhanced laser desorption ionization, wherein the surface enhanced laser desorption ionization comprises fractionating by affinity retention on solid phase-bound adsorbent followed by fractionating retained polypeptides from the solid phase by gas phase ion spectrometry.
- 28. A method of correlating gene and protein expression in a biological sample, the method comprising the steps of: a) obtaining a biological sample; b) generating a gene expression profile of the sample using an oligonucleotide array, thereby identifying an mRNA expressed in the sample; c) identifying a physio-chemical property of a polypeptide encoded by the mRNA; d) fractionating polypeptides in the sample on the basis of the physio-chemical property with surface enhanced laser desorption ionization, wherein the surface enhanced laser desorption ionization comprises fractionating by affinity retention on solid phase-bound adsorbent followed by fractionating retained polypeptides from the solid phase by gas phase ion spectrometry; and e) identifying the polypeptide encoded by the mRNA from among the fractionated proteins, wherein the identified polypeptide comprises the physio-chemical property; thereby correlating gene and protein expression in the cell.
- 29. The method of claim 28, wherein the adsorbent is selected to have affinity for polypeptides possessing at least one physio-chemical property selected from the group consisting of: amino acid sequence, molecular weight, iso-electric point, hydrophobicity, hydrophilicity, glycosylation, phosphorylation, epitope sequence, ligand binding sequence, charge at a specified pH, and metal chelate binding.
- 30. The method of claim 28, wherein the step of identifying the physio-chemical property comprises predicting the masses of proteolytic fragments generated by the polypeptide encoded by the mRNA upon degradation of the polypeptide by a selected proteolytic agent, and the step of identifying the polypeptide encoded by the mRNA comprises subjecting polypeptides in the sample to degradation by the agent and identifying actual proteolytic fragments in the sample having masses that correspond to the masses of the predicted fragments.
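
The fragment-mass prediction recited in claims 16 and 30 can be illustrated with a minimal Python sketch. The claims leave the proteolytic agent open; trypsin is assumed here purely as an example (cleavage after K or R unless the next residue is P, no missed cleavages). The function names, the example sequence, and the 0.5 Da tolerance are hypothetical. Masses are neutral monoisotopic peptide masses computed from standard residue masses, not instrument m/z values.

```python
# Standard monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(seq):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def match_fragments(seq, observed_masses, tol=0.5):
    """List (peptide, predicted mass, matching observed masses) within tol Da."""
    matches = []
    for pep in tryptic_peptides(seq):
        predicted = peptide_mass(pep)
        hits = [m for m in observed_masses if abs(m - predicted) <= tol]
        if hits:
            matches.append((pep, predicted, hits))
    return matches


if __name__ == "__main__":
    # Hypothetical translated sequence and observed fragment masses.
    print(tryptic_peptides("GAKRPLKMR"))
    print(match_fragments("GAKRPLKMR", [274.2, 500.0]))
```

In practice the predicted list would come from translating the mRNA identified in the expression profile, and a polypeptide would be considered identified only when several of its predicted fragments find matches.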
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US26977201P | 2001-02-16 | 2001-02-16 | |
| US60/269,772 | 2001-02-16 | | |
| PCT/US2002/004467 WO2002079491A2 (en) | 2001-02-16 | 2002-02-15 | Method for correlating gene expression profiles with protein expression profiles |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| AU2002314715A1 (en) | 2003-04-03 |
| AU2002314715B2 (en) | 2006-07-27 |
Family
ID=23028598
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2002314715A Expired - Fee Related AU2002314715B2 (en) | 2001-02-16 | 2002-02-15 | Method for correlating gene expression profiles with protein expression profiles |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US7299134B2 (en) |
| JP (1) | JP2005507235A (en) |
| KR (1) | KR20040054609A (en) |
| CN (1) | CN1636068A (en) |
| AU (1) | AU2002314715B2 (en) |
| CA (1) | CA2438391A1 (en) |
| WO (1) | WO2002079491A2 (en) |
Families Citing this family (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| NZ516848A (en) * | 1997-06-20 | 2004-03-26 | Ciphergen Biosystems Inc | Retentate chromatography apparatus with applications in biology and medicine |
| JP2003536179A (en) * | 2000-06-19 | 2003-12-02 | Correlogic Systems, Inc. | Heuristic classification method |
| BR0112667A (en) | 2000-07-18 | 2006-05-09 | Correlogic Systems Inc | process of distinguishing between biological states based on hidden patterns of biological data |
| US7333896B2 (en) * | 2002-07-29 | 2008-02-19 | Correlogic Systems, Inc. | Quality assurance/quality control for high throughput bioassay process |
| AU2004261222A1 (en) * | 2003-08-01 | 2005-02-10 | Correlogic Systems, Inc. | Multiple high-resolution serum proteomic features for ovarian cancer detection |
| WO2005055812A2 (en) * | 2003-12-05 | 2005-06-23 | Ciphergen Biosystems, Inc. | Serum biomarkers for chagas disease |
| JP4774534B2 (en) * | 2003-12-11 | 2011-09-14 | Anguk Pharmaceutical Co., Ltd. | A diagnostic method for biological status through the use of a centralized adaptive model and remotely manipulated sample processing |
| US7749716B2 (en) * | 2004-05-18 | 2010-07-06 | Vermillion, Inc. | Methods of detecting a fragment of neurosecretory protein VGF for diagnosing Alzheimer's disease |
| US20060063161A1 (en) * | 2004-09-22 | 2006-03-23 | Ciphergen Biosystems, Inc. | Detecting RNAi using SELDI mass spectrometry |
| US20070003996A1 (en) * | 2005-02-09 | 2007-01-04 | Hitt Ben A | Identification of bacteria and spores |
| US20080312514A1 (en) * | 2005-05-12 | 2008-12-18 | Mansfield Brian C | Serum Patterns Predictive of Breast Cancer |
| US20080201095A1 (en) * | 2007-02-12 | 2008-08-21 | Yip Ping F | Method for Calibrating an Analytical Instrument |
| JP2010532484A (en) | 2007-06-29 | 2010-10-07 | Correlogic Systems, Inc. | Predictive markers for ovarian cancer |
| AU2009232353B2 (en) * | 2008-04-05 | 2015-04-30 | Single Cell Technology, Inc. | Method of screening single cells for the production of biologically active agents |
| US8975019B2 (en) * | 2009-10-19 | 2015-03-10 | University Of Massachusetts | Deducing exon connectivity by RNA-templated DNA ligation/sequencing |
| AU2015334841B2 (en) | 2014-10-24 | 2022-02-03 | Koninklijke Philips N.V. | Medical prognosis and prediction of treatment response using multiple cellular signaling pathway activities |
| JP6415712B2 (en) | 2014-10-24 | 2018-10-31 | Koninklijke Philips N.V. | Assessment of TGF-β cell signaling pathway activity using mathematical modeling of target gene expression |
| ES2838923T3 (en) | 2014-10-24 | 2021-07-02 | Koninklijke Philips Nv | Medical prognosis and prediction of response to treatment using multiple activities of the cell signaling pathway |
| ES2861400T3 (en) | 2015-08-14 | 2021-10-06 | Koninklijke Philips Nv | Evaluation of the activity of the NFkB cell signaling pathway using mathematical models of target gene expression |
| EP3461916A1 (en) | 2017-10-02 | 2019-04-03 | Koninklijke Philips N.V. | Assessment of jak-stat3 cellular signaling pathway activity using mathematical modelling of target gene expression |
| EP3461915A1 (en) | 2017-10-02 | 2019-04-03 | Koninklijke Philips N.V. | Assessment of jak-stat1/2 cellular signaling pathway activity using mathematical modelling of target gene expression |
| EP3502279A1 (en) | 2017-12-20 | 2019-06-26 | Koninklijke Philips N.V. | Assessment of mapk-ap 1 cellular signaling pathway activity using mathematical modelling of target gene expression |
| KR102871187B1 (en) * | 2017-12-29 | 2025-10-14 | Nautilus Subsidiary, Inc. | Decoding approaches for protein identification |
| CN110609078B (en) * | 2019-09-20 | 2022-03-11 | 南京谱利健生物技术有限公司 | A method for detecting the correlation between protein phosphorylation and acetylglucosamine |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6593084B2 (en) | 1998-10-13 | 2003-07-15 | Robert E. Bird | Carcinogen assay |
| AU2025400A (en) | 1998-11-17 | 2000-06-05 | Proteo Tools | Separation, screening, and identification of biological targets |
| AU2001233276A1 (en) | 2000-02-03 | 2001-08-14 | Immunomatrix Inc. | Method and apparatus for signal transduction pathway profiling |
-
2002
- 2002-02-15 KR KR10-2003-7010801A patent/KR20040054609A/en not_active Withdrawn
- 2002-02-15 CA CA002438391A patent/CA2438391A1/en not_active Abandoned
- 2002-02-15 AU AU2002314715A patent/AU2002314715B2/en not_active Expired - Fee Related
- 2002-02-15 JP JP2002578492A patent/JP2005507235A/en active Pending
- 2002-02-15 CN CNA028064798A patent/CN1636068A/en active Pending
- 2002-02-15 WO PCT/US2002/004467 patent/WO2002079491A2/en not_active Ceased
- 2002-02-15 US US10/076,967 patent/US7299134B2/en not_active Expired - Fee Related
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU2002314715B2 (en) | | Method for correlating gene expression profiles with protein expression profiles |
| AU2002314715A1 (en) | | Method for correlating gene expression profiles with protein expression profiles |
| EP1448760B1 (en) | | Methods for monitoring polypeptide production and purification using surface enhanced laser desorption/ionization mass spectrometry |
| James | | Protein identification in the post-genome era: the rapid rise of proteomics |
| JP4129540B2 (en) | | Retained chromatography and protein chip arrays applied to biology and medicine |
| US20060084059A1 (en) | | Serum biomarkers in hepatocellular carcinoma |
| US20040146937A1 (en) | | Protein interaction difference mapping |
| AU2001249275B2 (en) | | Prostate cancer markers |
| US20020060290A1 (en) | | Method for analysis of analytes by mass spectrometry |
| AU2003272924B2 (en) | | Plate for mass spectrometry, process for preparing the same and use thereof |
| US20030119063A1 (en) | | High accuracy protein identification |
| US20040096820A1 (en) | | Comparative proteomics of progressor and nonprogressor populations |
| WO2002074927A2 (en) | | High accuracy protein identification |
| Gupta et al. | | Mass spectrometry: An essential tool for genome and proteome analysis |
| WO2003102542A2 (en) | | Comparative proteomics of progressor and nonprogressor populations |