US20020042118A1 - Phenol-induced proteins of Thauera aromatica - Google Patents
Phenol-induced proteins of Thauera aromatica
- Publication number
- US20020042118A1 (application US09/870,162)
- Authority
- US
- United States
- Prior art keywords
- ala
- leu
- gly
- val
- glu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 199
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 title claims abstract description 152
- 241000608961 Thauera aromatica Species 0.000 title claims abstract description 60
- 102000004169 proteins and genes Human genes 0.000 title abstract description 95
- 150000007523 nucleic acids Chemical group 0.000 claims description 71
- 239000002773 nucleotide Substances 0.000 claims description 60
- 125000003729 nucleotide group Chemical group 0.000 claims description 60
- 108020004414 DNA Proteins 0.000 claims description 58
- 239000012634 fragment Substances 0.000 claims description 40
- 230000014509 gene expression Effects 0.000 claims description 38
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 36
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 24
- 241000588724 Escherichia coli Species 0.000 claims description 22
- 102000039446 nucleic acids Human genes 0.000 claims description 22
- 108020004707 nucleic acids Proteins 0.000 claims description 22
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 22
- 230000000694 effects Effects 0.000 claims description 20
- 229920001184 polypeptide Polymers 0.000 claims description 17
- 244000005700 microbiome Species 0.000 claims description 16
- 230000001105 regulatory effect Effects 0.000 claims description 14
- 230000000295 complement effect Effects 0.000 claims description 11
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 claims description 10
- 238000009396 hybridization Methods 0.000 claims description 9
- 239000000758 substrate Substances 0.000 claims description 7
- 238000011144 upstream manufacturing Methods 0.000 claims description 6
- -1 LEU2 Proteins 0.000 claims description 5
- 239000013604 expression vector Substances 0.000 claims description 4
- 241000589519 Comamonas Species 0.000 claims description 3
- 241000186216 Corynebacterium Species 0.000 claims description 3
- 241000235648 Pichia Species 0.000 claims description 3
- 241000589516 Pseudomonas Species 0.000 claims description 3
- 241000316848 Rhodococcus <scale insect> Species 0.000 claims description 3
- FQVLRGLGWNWPSS-BXBUPLCLSA-N (4r,7s,10s,13s,16r)-16-acetamido-13-(1h-imidazol-5-ylmethyl)-10-methyl-6,9,12,15-tetraoxo-7-propan-2-yl-1,2-dithia-5,8,11,14-tetrazacycloheptadecane-4-carboxamide Chemical compound N1C(=O)[C@@H](NC(C)=O)CSSC[C@@H](C(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@@H]1CC1=CN=CN1 FQVLRGLGWNWPSS-BXBUPLCLSA-N 0.000 claims description 2
- 101710163881 5,6-dihydroxyindole-2-carboxylic acid oxidase Proteins 0.000 claims description 2
- 102100034035 Alcohol dehydrogenase 1A Human genes 0.000 claims description 2
- 102100036826 Aldehyde oxidase Human genes 0.000 claims description 2
- 241000228257 Aspergillus sp. Species 0.000 claims description 2
- 241000099686 Azotobacter sp. Species 0.000 claims description 2
- 241000186312 Brevibacterium sp. Species 0.000 claims description 2
- 241000222120 Candida <Saccharomycetales> Species 0.000 claims description 2
- 241000873310 Citrobacter sp. Species 0.000 claims description 2
- 241000193464 Clostridium sp. Species 0.000 claims description 2
- 241001478312 Comamonas sp. Species 0.000 claims description 2
- 241000186249 Corynebacterium sp. Species 0.000 claims description 2
- 241001030162 Debaryomyces sp. Species 0.000 claims description 2
- 241001560459 Dunaliella sp. Species 0.000 claims description 2
- 241000147019 Enterobacter sp. Species 0.000 claims description 2
- 241000488157 Escherichia sp. Species 0.000 claims description 2
- 101150094690 GAL1 gene Proteins 0.000 claims description 2
- 101150038242 GAL10 gene Proteins 0.000 claims description 2
- 102100028501 Galanin peptides Human genes 0.000 claims description 2
- 102100024637 Galectin-10 Human genes 0.000 claims description 2
- 101000892220 Geobacillus thermodenitrificans (strain NG80-2) Long-chain-alcohol dehydrogenase 1 Proteins 0.000 claims description 2
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 claims description 2
- 101150009006 HIS3 gene Proteins 0.000 claims description 2
- 101100246753 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) pyrF gene Proteins 0.000 claims description 2
- 101000780443 Homo sapiens Alcohol dehydrogenase 1A Proteins 0.000 claims description 2
- 101000928314 Homo sapiens Aldehyde oxidase Proteins 0.000 claims description 2
- 101100121078 Homo sapiens GAL gene Proteins 0.000 claims description 2
- 101001046426 Homo sapiens cGMP-dependent protein kinase 1 Proteins 0.000 claims description 2
- 241000588754 Klebsiella sp. Species 0.000 claims description 2
- 241000170280 Kluyveromyces sp. Species 0.000 claims description 2
- 241000186610 Lactobacillus sp. Species 0.000 claims description 2
- 241001558145 Mucor sp. Species 0.000 claims description 2
- 101150012394 PHO5 gene Proteins 0.000 claims description 2
- 241000235061 Pichia sp. Species 0.000 claims description 2
- 241000589774 Pseudomonas sp. Species 0.000 claims description 2
- 241000589187 Rhizobium sp. Species 0.000 claims description 2
- 241000187562 Rhodococcus sp. Species 0.000 claims description 2
- 101100394989 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009) hisI gene Proteins 0.000 claims description 2
- 101100434411 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) ADH1 gene Proteins 0.000 claims description 2
- 241000235088 Saccharomyces sp. Species 0.000 claims description 2
- 241000607142 Salmonella Species 0.000 claims description 2
- 101001000154 Schistosoma mansoni Phosphoglycerate kinase Proteins 0.000 claims description 2
- 241000187180 Streptomyces sp. Species 0.000 claims description 2
- 101150050575 URA3 gene Proteins 0.000 claims description 2
- 241000235017 Zygosaccharomyces Species 0.000 claims description 2
- 101150102866 adc1 gene Proteins 0.000 claims description 2
- 102100022422 cGMP-dependent protein kinase 1 Human genes 0.000 claims description 2
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 claims description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 claims 4
- 125000003275 alpha amino acid group Chemical group 0.000 claims 3
- 108010076504 Protein Sorting Signals Proteins 0.000 claims 2
- 239000011780 sodium chloride Substances 0.000 claims 2
- 241000235058 Komagataella pastoris Species 0.000 claims 1
- 102000009097 Phosphorylases Human genes 0.000 claims 1
- 108010073135 Phosphorylases Proteins 0.000 claims 1
- 238000012423 maintenance Methods 0.000 claims 1
- 102000004190 Enzymes Human genes 0.000 abstract description 29
- 108090000790 Enzymes Proteins 0.000 abstract description 29
- FJKROLUGYXJWQN-UHFFFAOYSA-M 4-hydroxybenzoate Chemical compound OC1=CC=C(C([O-])=O)C=C1 FJKROLUGYXJWQN-UHFFFAOYSA-M 0.000 abstract description 28
- CMPQUABWPXYYSH-UHFFFAOYSA-N phenyl phosphate Chemical compound OP(O)(=O)OC1=CC=CC=C1 CMPQUABWPXYYSH-UHFFFAOYSA-N 0.000 abstract description 27
- 238000006243 chemical reaction Methods 0.000 abstract description 24
- 238000006473 carboxylation reaction Methods 0.000 abstract description 18
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 abstract description 17
- 230000021523 carboxylation Effects 0.000 abstract description 14
- 235000018102 proteins Nutrition 0.000 description 88
- 150000001413 amino acids Chemical group 0.000 description 65
- 210000004027 cell Anatomy 0.000 description 49
- 239000013615 primer Substances 0.000 description 27
- 238000003752 polymerase chain reaction Methods 0.000 description 26
- 239000000047 product Substances 0.000 description 25
- 238000000034 method Methods 0.000 description 24
- 235000001014 amino acid Nutrition 0.000 description 22
- 229940024606 amino acid Drugs 0.000 description 22
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 21
- 108091026890 Coding region Proteins 0.000 description 16
- 239000000523 sample Substances 0.000 description 15
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 14
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 14
- 230000004060 metabolic process Effects 0.000 description 14
- 101100038645 Streptomyces griseus rppA gene Proteins 0.000 description 13
- 108020004999 messenger RNA Proteins 0.000 description 13
- 101710100170 Unknown protein Proteins 0.000 description 12
- 108010050848 glycylleucine Proteins 0.000 description 12
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 12
- 238000004519 manufacturing process Methods 0.000 description 12
- 238000013519 translation Methods 0.000 description 12
- 238000004587 chromatography analysis Methods 0.000 description 11
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 11
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 10
- 108091034117 Oligonucleotide Proteins 0.000 description 10
- 108700026244 Open Reading Frames Proteins 0.000 description 10
- 239000002299 complementary DNA Substances 0.000 description 10
- 108010061238 threonyl-glycine Proteins 0.000 description 10
- 101100148606 Caenorhabditis elegans pst-1 gene Proteins 0.000 description 9
- 108020001019 DNA Primers Proteins 0.000 description 9
- 239000003155 DNA primer Substances 0.000 description 9
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 9
- 108010005233 alanylglutamic acid Proteins 0.000 description 9
- 230000004075 alteration Effects 0.000 description 9
- 238000003556 assay Methods 0.000 description 9
- 230000009466 transformation Effects 0.000 description 9
- 239000013598 vector Substances 0.000 description 9
- 108020004705 Codon Proteins 0.000 description 8
- 101100111747 Eupenicillium brefeldianum Bref-PKS gene Proteins 0.000 description 8
- 101100226895 Phomopsis amygdali PaP450-3 gene Proteins 0.000 description 8
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 8
- 108010087924 alanylproline Proteins 0.000 description 8
- 101150083238 bsc7 gene Proteins 0.000 description 8
- 108010049041 glutamylalanine Proteins 0.000 description 8
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 8
- 239000000543 intermediate Substances 0.000 description 8
- 238000012216 screening Methods 0.000 description 8
- 108010062110 water dikinase pyruvate Proteins 0.000 description 8
- 101100165658 Alternaria brassicicola bsc5 gene Proteins 0.000 description 7
- 101100165660 Alternaria brassicicola bsc6 gene Proteins 0.000 description 7
- 101100165663 Alternaria brassicicola bsc8 gene Proteins 0.000 description 7
- 101100499295 Bacillus subtilis (strain 168) disA gene Proteins 0.000 description 7
- 101100032924 Bacillus subtilis (strain 168) radA gene Proteins 0.000 description 7
- 101100492392 Didymella fabae pksAC gene Proteins 0.000 description 7
- 108010079364 N-glycylalanine Proteins 0.000 description 7
- 229910002651 NO3 Inorganic materials 0.000 description 7
- NHNBFGGVMKEFGY-UHFFFAOYSA-N Nitrate Chemical compound [O-][N+]([O-])=O NHNBFGGVMKEFGY-UHFFFAOYSA-N 0.000 description 7
- 101150007210 ORF6 gene Proteins 0.000 description 7
- 101100226894 Phomopsis amygdali PaGT gene Proteins 0.000 description 7
- 101100226896 Phomopsis amygdali PaMT gene Proteins 0.000 description 7
- 101100226893 Phomopsis amygdali PaP450-2 gene Proteins 0.000 description 7
- 101000870438 Streptococcus gordonii UDP-N-acetylglucosamine-peptide N-acetylglucosaminyltransferase stabilizing protein GtfB Proteins 0.000 description 7
- 101000645119 Vibrio campbellii (strain ATCC BAA-1116 / BB120) Nucleotide-binding protein VIBHAR_03667 Proteins 0.000 description 7
- 108010060035 arginylproline Proteins 0.000 description 7
- 239000012528 membrane Substances 0.000 description 7
- 239000013612 plasmid Substances 0.000 description 7
- 229910000030 sodium bicarbonate Inorganic materials 0.000 description 7
- 241000894006 Bacteria Species 0.000 description 6
- 229920002271 DEAE-Sepharose Polymers 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 6
- 238000000246 agarose gel electrophoresis Methods 0.000 description 6
- 229960000723 ampicillin Drugs 0.000 description 6
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 6
- 230000003321 amplification Effects 0.000 description 6
- 238000004458 analytical method Methods 0.000 description 6
- YCIMNLLNPGFGHC-UHFFFAOYSA-N catechol Chemical compound OC1=CC=CC=C1O YCIMNLLNPGFGHC-UHFFFAOYSA-N 0.000 description 6
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 6
- 229960005542 ethidium bromide Drugs 0.000 description 6
- 108010015792 glycyllysine Proteins 0.000 description 6
- 238000004128 high performance liquid chromatography Methods 0.000 description 6
- 108010078274 isoleucylvaline Proteins 0.000 description 6
- 108010034529 leucyl-lysine Proteins 0.000 description 6
- 108010057821 leucylproline Proteins 0.000 description 6
- 238000003199 nucleic acid amplification method Methods 0.000 description 6
- 108091033319 polynucleotide Proteins 0.000 description 6
- 102000040430 polynucleotide Human genes 0.000 description 6
- 239000002157 polynucleotide Substances 0.000 description 6
- 238000000746 purification Methods 0.000 description 6
- 238000012163 sequencing technique Methods 0.000 description 6
- 239000006228 supernatant Substances 0.000 description 6
- 238000012546 transfer Methods 0.000 description 6
- 101100437895 Alternaria brassicicola bsc3 gene Proteins 0.000 description 5
- 108020005544 Antisense RNA Proteins 0.000 description 5
- 101100512049 Arabidopsis thaliana LWD2 gene Proteins 0.000 description 5
- 108700010070 Codon Usage Proteins 0.000 description 5
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 5
- 101150073872 ORF3 gene Proteins 0.000 description 5
- 101100226897 Phomopsis amygdali PaAT-1 gene Proteins 0.000 description 5
- 101100226891 Phomopsis amygdali PaP450-1 gene Proteins 0.000 description 5
- 101100226885 Phomopsis amygdali PaP450-4 gene Proteins 0.000 description 5
- 101100220583 Rhizobium meliloti (strain 1021) cheD gene Proteins 0.000 description 5
- 101100350254 Streptomyces antibioticus oleV gene Proteins 0.000 description 5
- 102000004142 Trypsin Human genes 0.000 description 5
- 108090000631 Trypsin Proteins 0.000 description 5
- 230000000692 anti-sense effect Effects 0.000 description 5
- 108010013835 arginine glutamate Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 239000003184 complementary RNA Substances 0.000 description 5
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 5
- 230000012010 growth Effects 0.000 description 5
- 238000011534 incubation Methods 0.000 description 5
- 230000000865 phosphorylative effect Effects 0.000 description 5
- 108010029020 prolylglycine Proteins 0.000 description 5
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 239000012588 trypsin Substances 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 4
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 4
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 4
- 108020004635 Complementary DNA Proteins 0.000 description 4
- 102000053602 DNA Human genes 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- ZJSMFRTVYSLKQU-DJFWLOJKSA-N His-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ZJSMFRTVYSLKQU-DJFWLOJKSA-N 0.000 description 4
- PJLLMGWWINYQPB-PEFMBERDSA-N Ile-Asn-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PJLLMGWWINYQPB-PEFMBERDSA-N 0.000 description 4
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 4
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 4
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 4
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 4
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 4
- ZEDVFJPQNNBMST-CYDGBPFRSA-N Met-Arg-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZEDVFJPQNNBMST-CYDGBPFRSA-N 0.000 description 4
- 241001302042 Methanothermobacter thermautotrophicus Species 0.000 description 4
- 239000002033 PVDF binder Substances 0.000 description 4
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 4
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 4
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 4
- 239000007983 Tris buffer Substances 0.000 description 4
- XKVXSCHXGJOQND-ZOBUZTSGSA-N Val-Asp-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N XKVXSCHXGJOQND-ZOBUZTSGSA-N 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 239000000284 extract Substances 0.000 description 4
- 108010087823 glycyltyrosine Proteins 0.000 description 4
- 125000001165 hydrophobic group Chemical group 0.000 description 4
- 108010003700 lysyl aspartic acid Proteins 0.000 description 4
- BDAGIHXWWSANSR-UHFFFAOYSA-N methanoic acid Natural products OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- VLTRZXGMWDSKGL-UHFFFAOYSA-N perchloric acid Chemical compound OCl(=O)(=O)=O VLTRZXGMWDSKGL-UHFFFAOYSA-N 0.000 description 4
- DTBNBXWJWCWCIK-UHFFFAOYSA-K phosphonatoenolpyruvate Chemical compound [O-]C(=O)C(=C)OP([O-])([O-])=O DTBNBXWJWCWCIK-UHFFFAOYSA-K 0.000 description 4
- 229920002981 polyvinylidene fluoride Polymers 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- VWDWKYIASSYTQR-UHFFFAOYSA-N sodium nitrate Chemical compound [Na+].[O-][N+]([O-])=O VWDWKYIASSYTQR-UHFFFAOYSA-N 0.000 description 4
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- SBGXWWCLHIOABR-UHFFFAOYSA-N Ala Ala Gly Ala Chemical compound CC(N)C(=O)NC(C)C(=O)NCC(=O)NC(C)C(O)=O SBGXWWCLHIOABR-UHFFFAOYSA-N 0.000 description 3
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 3
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 3
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 3
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 3
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 3
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 3
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 3
- LFWOQHSQNCKXRU-UFYCRDLUSA-N Arg-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 LFWOQHSQNCKXRU-UFYCRDLUSA-N 0.000 description 3
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 3
- HDHZCEDPLTVHFZ-GUBZILKMSA-N Asn-Leu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O HDHZCEDPLTVHFZ-GUBZILKMSA-N 0.000 description 3
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 3
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 3
- 101100101481 Escherichia coli (strain K12) ubiD gene Proteins 0.000 description 3
- MLZRSFQRBDNJON-GUBZILKMSA-N Gln-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MLZRSFQRBDNJON-GUBZILKMSA-N 0.000 description 3
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 3
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 3
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 3
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 3
- ZVXMEWXHFBYJPI-LSJOCFKGSA-N Gly-Val-Ile Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZVXMEWXHFBYJPI-LSJOCFKGSA-N 0.000 description 3
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 3
- SHVFUCSSACPBTF-VGDYDELISA-N Ile-Ser-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SHVFUCSSACPBTF-VGDYDELISA-N 0.000 description 3
- 108091092195 Intron Proteins 0.000 description 3
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 3
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 3
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 3
- 229910021380 Manganese Chloride Inorganic materials 0.000 description 3
- GLFNIEUTAYBVOC-UHFFFAOYSA-L Manganese chloride Chemical compound Cl[Mn]Cl GLFNIEUTAYBVOC-UHFFFAOYSA-L 0.000 description 3
- OSOLWRWQADPDIQ-DCAQKATOSA-N Met-Asp-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OSOLWRWQADPDIQ-DCAQKATOSA-N 0.000 description 3
- DJDFBVNNDAUPRW-GUBZILKMSA-N Met-Glu-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O DJDFBVNNDAUPRW-GUBZILKMSA-N 0.000 description 3
- ZRACLHJYVRBJFC-ULQDDVLXSA-N Met-Lys-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZRACLHJYVRBJFC-ULQDDVLXSA-N 0.000 description 3
- VYXIKLFLGRTANT-HRCADAONSA-N Met-Tyr-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N VYXIKLFLGRTANT-HRCADAONSA-N 0.000 description 3
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 3
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 3
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 3
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 3
- 108091000080 Phosphotransferase Proteins 0.000 description 3
- 238000012300 Sequence Analysis Methods 0.000 description 3
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 3
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 3
- NIEWSKWFURSECR-FOHZUACHSA-N Thr-Gly-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NIEWSKWFURSECR-FOHZUACHSA-N 0.000 description 3
- YJCVECXVYHZOBK-KNZXXDILSA-N Thr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H]([C@@H](C)O)N YJCVECXVYHZOBK-KNZXXDILSA-N 0.000 description 3
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 3
- ZMANZCXQSJIPKH-UHFFFAOYSA-N Triethylamine Chemical compound CCN(CC)CC ZMANZCXQSJIPKH-UHFFFAOYSA-N 0.000 description 3
- PKUJMYZNJMRHEZ-XIRDDKMYSA-N Trp-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PKUJMYZNJMRHEZ-XIRDDKMYSA-N 0.000 description 3
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 3
- BZDNMSNFLZXRLY-UHFFFAOYSA-N [2-(diphenylphosphanylmethyl)phenyl]methyl-diphenylphosphane Chemical compound C=1C=CC=C(CP(C=2C=CC=CC=2)C=2C=CC=CC=2)C=1CP(C=1C=CC=CC=1)C1=CC=CC=C1 BZDNMSNFLZXRLY-UHFFFAOYSA-N 0.000 description 3
- 239000002253 acid Substances 0.000 description 3
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 3
- 108010044940 alanylglutamine Proteins 0.000 description 3
- 238000013459 approach Methods 0.000 description 3
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 3
- 108010068380 arginylarginine Proteins 0.000 description 3
- 108010038633 aspartylglutamate Proteins 0.000 description 3
- 238000002869 basic local alignment search tool Methods 0.000 description 3
- WPYMKLBDIGXBTP-UHFFFAOYSA-N benzoic acid Chemical compound OC(=O)C1=CC=CC=C1 WPYMKLBDIGXBTP-UHFFFAOYSA-N 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 239000000872 buffer Substances 0.000 description 3
- 239000004202 carbamide Substances 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 108091008053 gene clusters Proteins 0.000 description 3
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 3
- 108010089804 glycyl-threonine Proteins 0.000 description 3
- 108010077515 glycylproline Proteins 0.000 description 3
- 108010084389 glycyltryptophan Proteins 0.000 description 3
- 108010037850 glycylvaline Proteins 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 101150066555 lacZ gene Proteins 0.000 description 3
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 3
- 239000011565 manganese chloride Substances 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 230000000813 microbial effect Effects 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 230000008520 organization Effects 0.000 description 3
- 150000002989 phenols Chemical class 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 229930029653 phosphoenolpyruvate Natural products 0.000 description 3
- 102000020233 phosphotransferase Human genes 0.000 description 3
- 230000008488 polyadenylation Effects 0.000 description 3
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 3
- 108010070643 prolylglutamic acid Proteins 0.000 description 3
- 108010053725 prolylvaline Proteins 0.000 description 3
- CDBYLPFSWZWCQE-UHFFFAOYSA-L sodium carbonate Substances [Na+].[Na+].[O-]C([O-])=O CDBYLPFSWZWCQE-UHFFFAOYSA-L 0.000 description 3
- 229910000029 sodium carbonate Inorganic materials 0.000 description 3
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 3
- 238000010186 staining Methods 0.000 description 3
- 238000006467 substitution reaction Methods 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 108010084932 tryptophyl-proline Proteins 0.000 description 3
- YQUVCSBJEUQKSH-UHFFFAOYSA-N 3,4-dihydroxybenzoic acid Chemical compound OC(=O)C1=CC=C(O)C(O)=C1 YQUVCSBJEUQKSH-UHFFFAOYSA-N 0.000 description 2
- UMCMPZBLKLEWAF-BCTGSCMUSA-N 3-[(3-cholamidopropyl)dimethylammonio]propane-1-sulfonate Chemical compound C([C@H]1C[C@H]2O)[C@H](O)CC[C@]1(C)[C@@H]1[C@@H]2[C@@H]2CC[C@H]([C@@H](CCC(=O)NCCC[N+](C)(C)CCCS([O-])(=O)=O)C)[C@@]2(C)[C@@H](O)C1 UMCMPZBLKLEWAF-BCTGSCMUSA-N 0.000 description 2
- 108010075832 3-octaprenyl-4-hydroxybenzoate carboxy-lyase Proteins 0.000 description 2
- OSWFIVFLDKOXQC-UHFFFAOYSA-N 4-(3-methoxyphenyl)aniline Chemical compound COC1=CC=CC(C=2C=CC(N)=CC=2)=C1 OSWFIVFLDKOXQC-UHFFFAOYSA-N 0.000 description 2
- 229940086681 4-aminobenzoate Drugs 0.000 description 2
- ALYNCZNDIQEVRV-UHFFFAOYSA-N 4-aminobenzoic acid Chemical compound NC1=CC=C(C(O)=O)C=C1 ALYNCZNDIQEVRV-UHFFFAOYSA-N 0.000 description 2
- 125000005274 4-hydroxybenzoic acid group Chemical group 0.000 description 2
- OPIFSICVWOWJMJ-AEOCFKNESA-N 5-bromo-4-chloro-3-indolyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CNC2=CC=C(Br)C(Cl)=C12 OPIFSICVWOWJMJ-AEOCFKNESA-N 0.000 description 2
- 102100026449 AKT-interacting protein Human genes 0.000 description 2
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 2
- LGQPPBQRUBVTIF-JBDRJPRFSA-N Ala-Ala-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LGQPPBQRUBVTIF-JBDRJPRFSA-N 0.000 description 2
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 2
- GRPHQEMIFDPKOE-HGNGGELXSA-N Ala-His-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O GRPHQEMIFDPKOE-HGNGGELXSA-N 0.000 description 2
- QQACQIHVWCVBBR-GVARAGBVSA-N Ala-Ile-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QQACQIHVWCVBBR-GVARAGBVSA-N 0.000 description 2
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 2
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 2
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 2
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 2
- PAYRUJLWNCNPSJ-UHFFFAOYSA-N Aniline Chemical compound NC1=CC=CC=C1 PAYRUJLWNCNPSJ-UHFFFAOYSA-N 0.000 description 2
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 2
- XVLLUZMFSAYKJV-GUBZILKMSA-N Arg-Asp-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XVLLUZMFSAYKJV-GUBZILKMSA-N 0.000 description 2
- MFAMTAVAFBPXDC-LPEHRKFASA-N Arg-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O MFAMTAVAFBPXDC-LPEHRKFASA-N 0.000 description 2
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 2
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 2
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 2
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 2
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 2
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 2
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 2
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 2
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 2
- UZFHNLYQWMGUHU-DCAQKATOSA-N Asp-Lys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UZFHNLYQWMGUHU-DCAQKATOSA-N 0.000 description 2
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 2
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 2
- 101100462570 Bacillus subtilis (strain 168) bsdB gene Proteins 0.000 description 2
- 101100488070 Bacillus subtilis (strain 168) bsdC gene Proteins 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 108090000489 Carboxy-Lyases Proteins 0.000 description 2
- 102000004031 Carboxy-Lyases Human genes 0.000 description 2
- 101100497194 Comamonas testosteroni cpnA gene Proteins 0.000 description 2
- GRNOCLDFUNCIDW-ACZMJKKPSA-N Cys-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N GRNOCLDFUNCIDW-ACZMJKKPSA-N 0.000 description 2
- UXIYYUMGFNSGBK-XPUUQOCRSA-N Cys-Gly-Val Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O UXIYYUMGFNSGBK-XPUUQOCRSA-N 0.000 description 2
- YQEHNIKPAOPBNH-DCAQKATOSA-N Cys-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N YQEHNIKPAOPBNH-DCAQKATOSA-N 0.000 description 2
- 239000003298 DNA probe Substances 0.000 description 2
- 101000665898 Escherichia coli (strain K12) Replication initiation protein Proteins 0.000 description 2
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 2
- YGNPTRVNRUKVLA-DCAQKATOSA-N Gln-Met-Met Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N YGNPTRVNRUKVLA-DCAQKATOSA-N 0.000 description 2
- QFXNFFZTMFHPST-DZKIICNBSA-N Gln-Phe-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(=O)N)N QFXNFFZTMFHPST-DZKIICNBSA-N 0.000 description 2
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 2
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 2
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 2
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 2
- NNQDRRUXFJYCCJ-NHCYSSNCSA-N Glu-Pro-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O NNQDRRUXFJYCCJ-NHCYSSNCSA-N 0.000 description 2
- BDISFWMLMNBTGP-NUMRIWBASA-N Glu-Thr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O BDISFWMLMNBTGP-NUMRIWBASA-N 0.000 description 2
- NTHIHAUEXVTXQG-KKUMJFAQSA-N Glu-Tyr-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O NTHIHAUEXVTXQG-KKUMJFAQSA-N 0.000 description 2
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 2
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 2
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 2
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 2
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 2
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 2
- XLFHCWHXKSFVIB-BQBZGAKWSA-N Gly-Gln-Gln Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLFHCWHXKSFVIB-BQBZGAKWSA-N 0.000 description 2
- LXXANCRPFBSSKS-IUCAKERBSA-N Gly-Gln-Leu Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LXXANCRPFBSSKS-IUCAKERBSA-N 0.000 description 2
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 2
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 2
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- CUVBTVWFVIIDOC-YEPSODPASA-N Gly-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CN CUVBTVWFVIIDOC-YEPSODPASA-N 0.000 description 2
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 2
- IDNNYVGVSZMQTK-IHRRRGAJSA-N His-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N IDNNYVGVSZMQTK-IHRRRGAJSA-N 0.000 description 2
- UCDWNBFOZCZSNV-AVGNSLFASA-N His-Arg-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O UCDWNBFOZCZSNV-AVGNSLFASA-N 0.000 description 2
- 101000718065 Homo sapiens AKT-interacting protein Proteins 0.000 description 2
- HERITAGIPLEJMT-GVARAGBVSA-N Ile-Ala-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HERITAGIPLEJMT-GVARAGBVSA-N 0.000 description 2
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 2
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 2
- HOLOYAZCIHDQNS-YVNDNENWSA-N Ile-Gln-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HOLOYAZCIHDQNS-YVNDNENWSA-N 0.000 description 2
- WZDCVAWMBUNDDY-KBIXCLLPSA-N Ile-Glu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)O)N WZDCVAWMBUNDDY-KBIXCLLPSA-N 0.000 description 2
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 2
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 2
- AXNGDPAKKCEKGY-QPHKQPEJSA-N Ile-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N AXNGDPAKKCEKGY-QPHKQPEJSA-N 0.000 description 2
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 2
- YKZAMJXNJUWFIK-JBDRJPRFSA-N Ile-Ser-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)O)N YKZAMJXNJUWFIK-JBDRJPRFSA-N 0.000 description 2
- RQJUKVXWAKJDBW-SVSWQMSJSA-N Ile-Ser-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N RQJUKVXWAKJDBW-SVSWQMSJSA-N 0.000 description 2
- KXUKTDGKLAOCQK-LSJOCFKGSA-N Ile-Val-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O KXUKTDGKLAOCQK-LSJOCFKGSA-N 0.000 description 2
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 2
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 2
- 102000016600 Inosine-5'-monophosphate dehydrogenases Human genes 0.000 description 2
- 108050006182 Inosine-5'-monophosphate dehydrogenases Proteins 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 2
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 2
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 2
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- 101100399603 Lactiplantibacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1) lpdC gene Proteins 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 2
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 2
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 2
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 2
- KVMULWOHPPMHHE-DCAQKATOSA-N Leu-Glu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KVMULWOHPPMHHE-DCAQKATOSA-N 0.000 description 2
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 2
- VZBIUJURDLFFOE-IHRRRGAJSA-N Leu-His-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VZBIUJURDLFFOE-IHRRRGAJSA-N 0.000 description 2
- CPONGMJGVIAWEH-DCAQKATOSA-N Leu-Met-Ala Chemical compound CSCC[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O CPONGMJGVIAWEH-DCAQKATOSA-N 0.000 description 2
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 2
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 2
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 2
- 239000006391 Luria-Bertani Medium Substances 0.000 description 2
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 2
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 2
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 2
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 2
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 2
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 2
- 239000004472 Lysine Substances 0.000 description 2
- CSNNHWWHGAXBCP-UHFFFAOYSA-L Magnesium sulfate Chemical compound [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 2
- 108010052285 Membrane Proteins Proteins 0.000 description 2
- 102000018697 Membrane Proteins Human genes 0.000 description 2
- CTVJSFRHUOSCQQ-DCAQKATOSA-N Met-Arg-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O CTVJSFRHUOSCQQ-DCAQKATOSA-N 0.000 description 2
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 2
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 2
- LXCSZPUQKMTXNW-BQBZGAKWSA-N Met-Ser-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O LXCSZPUQKMTXNW-BQBZGAKWSA-N 0.000 description 2
- VOAKKHOIAFKOQZ-JYJNAYRXSA-N Met-Tyr-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=C(O)C=C1 VOAKKHOIAFKOQZ-JYJNAYRXSA-N 0.000 description 2
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 2
- UFWIBTONFRDIAS-UHFFFAOYSA-N Naphthalene Chemical compound C1=CC=CC2=CC=CC=C21 UFWIBTONFRDIAS-UHFFFAOYSA-N 0.000 description 2
- 102100023206 Neuromodulin Human genes 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 2
- 108090000854 Oxidoreductases Proteins 0.000 description 2
- 102000004316 Oxidoreductases Human genes 0.000 description 2
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 2
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 2
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 2
- CQZNGNCAIXMAIQ-UBHSHLNASA-N Pro-Ala-Phe Chemical compound C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O CQZNGNCAIXMAIQ-UBHSHLNASA-N 0.000 description 2
- CYQQWUPHIZVCNY-GUBZILKMSA-N Pro-Arg-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CYQQWUPHIZVCNY-GUBZILKMSA-N 0.000 description 2
- ILMLVTGTUJPQFP-FXQIFTODSA-N Pro-Asp-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ILMLVTGTUJPQFP-FXQIFTODSA-N 0.000 description 2
- CLJLVCYFABNTHP-DCAQKATOSA-N Pro-Leu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O CLJLVCYFABNTHP-DCAQKATOSA-N 0.000 description 2
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 2
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 2
- WWXNZNWZNZPDIF-SRVKXCTJSA-N Pro-Val-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 WWXNZNWZNZPDIF-SRVKXCTJSA-N 0.000 description 2
- IMNVAOPEMFDAQD-NHCYSSNCSA-N Pro-Val-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IMNVAOPEMFDAQD-NHCYSSNCSA-N 0.000 description 2
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 2
- 101710126986 Probable phosphoenolpyruvate synthase Proteins 0.000 description 2
- 101710132548 Protein F1 Proteins 0.000 description 2
- 108091034057 RNA (poly(A)) Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 2
- KYKKKSWGEPFUMR-NAKRPEOUSA-N Ser-Arg-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KYKKKSWGEPFUMR-NAKRPEOUSA-N 0.000 description 2
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 2
- IFPBAGJBHSNYPR-ZKWXMUAHSA-N Ser-Ile-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O IFPBAGJBHSNYPR-ZKWXMUAHSA-N 0.000 description 2
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 2
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 2
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 2
- 108010006785 Taq Polymerase Proteins 0.000 description 2
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 2
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- DEGCBBCMYWNJNA-RHYQMDGZSA-N Thr-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O DEGCBBCMYWNJNA-RHYQMDGZSA-N 0.000 description 2
- VUXIQSUQQYNLJP-XAVMHZPKSA-N Thr-Ser-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N)O VUXIQSUQQYNLJP-XAVMHZPKSA-N 0.000 description 2
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 2
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 2
- OSXNCKRGMSHWSQ-ACRUOGEOSA-N Tyr-His-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OSXNCKRGMSHWSQ-ACRUOGEOSA-N 0.000 description 2
- ZZDYJFVIKVSUFA-WLTAIBSBSA-N Tyr-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O ZZDYJFVIKVSUFA-WLTAIBSBSA-N 0.000 description 2
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 2
- 101001028862 Vaccinia virus (strain L-IVP) Protein F14 Proteins 0.000 description 2
- 101000875664 Vaccinia virus (strain L-IVP) Protein F15 Proteins 0.000 description 2
- 101000972564 Vaccinia virus (strain Western Reserve) Protein L2 Proteins 0.000 description 2
- 101000972553 Vaccinia virus (strain Western Reserve) Protein L3 Proteins 0.000 description 2
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 2
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 2
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 2
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 2
- BEGDZYNDCNEGJZ-XVKPBYJWSA-N Val-Gly-Gln Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O BEGDZYNDCNEGJZ-XVKPBYJWSA-N 0.000 description 2
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 2
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 2
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 2
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 2
- NHXZRXLFOBFMDM-AVGNSLFASA-N Val-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C NHXZRXLFOBFMDM-AVGNSLFASA-N 0.000 description 2
- DLRZGNXCXUGIDG-KKHAAJSZSA-N Val-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O DLRZGNXCXUGIDG-KKHAAJSZSA-N 0.000 description 2
- DVLWZWNAQUBZBC-ZNSHCXBVSA-N Val-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N)O DVLWZWNAQUBZBC-ZNSHCXBVSA-N 0.000 description 2
- DFQZDQPLWBSFEJ-LSJOCFKGSA-N Val-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DFQZDQPLWBSFEJ-LSJOCFKGSA-N 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 2
- 235000004279 alanine Nutrition 0.000 description 2
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 108010093581 aspartyl-proline Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 230000002759 chromosomal effect Effects 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- RMRCNWBMXRMIRW-BYFNXCQMSA-M cyanocobalamin Chemical compound N#C[Co+]N([C@]1([H])[C@H](CC(N)=O)[C@]\2(CCC(=O)NC[C@H](C)OP(O)(=O)OC3[C@H]([C@H](O[C@@H]3CO)N3C4=CC(C)=C(C)C=C4N=C3)O)C)C/2=C(C)\C([C@H](C/2(C)C)CCC(N)=O)=N\C\2=C\C([C@H]([C@@]/2(CC(N)=O)C)CCC(N)=O)=N\C\2=C(C)/C2=N[C@]1(C)[C@@](C)(CC(N)=O)[C@@H]2CCC(N)=O RMRCNWBMXRMIRW-BYFNXCQMSA-M 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 2
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 2
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 2
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 2
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 235000019253 formic acid Nutrition 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- 108010078144 glutaminyl-glycine Proteins 0.000 description 2
- 108010079547 glutamylmethionine Proteins 0.000 description 2
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 229960000310 isoleucine Drugs 0.000 description 2
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 108010000761 leucylarginine Proteins 0.000 description 2
- 238000005567 liquid scintillation counting Methods 0.000 description 2
- 229910001629 magnesium chloride Inorganic materials 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 108010085203 methionylmethionine Proteins 0.000 description 2
- 239000003471 mutagenic agent Substances 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 238000007899 nucleic acid hybridization Methods 0.000 description 2
- QWVGKYWNOKOFNN-UHFFFAOYSA-N o-cresol Chemical compound CC1=CC=CC=C1O QWVGKYWNOKOFNN-UHFFFAOYSA-N 0.000 description 2
- 239000002751 oligonucleotide probe Substances 0.000 description 2
- YNPNZTXNASCQKK-UHFFFAOYSA-N phenanthrene Chemical compound C1=CC=C2C3=CC=CC=C3C=CC2=C1 YNPNZTXNASCQKK-UHFFFAOYSA-N 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 239000011736 potassium bicarbonate Substances 0.000 description 2
- 229910000028 potassium bicarbonate Inorganic materials 0.000 description 2
- TYJJADVDDVDEDZ-UHFFFAOYSA-M potassium hydrogencarbonate Chemical compound [K+].OC([O-])=O TYJJADVDDVDEDZ-UHFFFAOYSA-M 0.000 description 2
- 239000002243 precursor Substances 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 241000894007 species Species 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 2
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 230000009261 transgenic effect Effects 0.000 description 2
- 238000000539 two dimensional gel electrophoresis Methods 0.000 description 2
- 101150085844 ubiD gene Proteins 0.000 description 2
- 101150003433 ubiX gene Proteins 0.000 description 2
- 239000004474 valine Substances 0.000 description 2
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 2
- GJLXVWOMRRWCIB-MERZOTPQSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-acetamido-5-(diaminomethylideneamino)pentanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanoyl]amino]-6-aminohexanamide Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(N)=O)C1=CC=C(O)C=C1 GJLXVWOMRRWCIB-MERZOTPQSA-N 0.000 description 1
- JNTMAZFVYNDPLB-PEDHHIEDSA-N (2S,3S)-2-[[[(2S)-1-[(2S,3S)-2-amino-3-methyl-1-oxopentyl]-2-pyrrolidinyl]-oxomethyl]amino]-3-methylpentanoic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JNTMAZFVYNDPLB-PEDHHIEDSA-N 0.000 description 1
- OFHXPCLWHLXQHT-JKQORVJESA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2,6-diaminohexanoyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoyl]amino]butanedioic acid Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN OFHXPCLWHLXQHT-JKQORVJESA-N 0.000 description 1
- MDNRBNZIOBQHHK-KWBADKCTSA-N (2s)-2-[[(2s)-2-[[2-[[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-3-carboxypropanoyl]amino]-3-methylbutanoic acid Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N MDNRBNZIOBQHHK-KWBADKCTSA-N 0.000 description 1
- SADYNMDJGAWAEW-JKQORVJESA-N (2s)-2-[[(2s)-3-carboxy-2-[[(2s)-2-[[(2s)-2,6-diaminohexanoyl]amino]-3-methylbutanoyl]amino]propanoyl]amino]-4-methylpentanoic acid Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN SADYNMDJGAWAEW-JKQORVJESA-N 0.000 description 1
- VWWKKDNCCLAGRM-GVXVVHGQSA-N (2s)-2-[[2-[[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]propanoyl]amino]acetyl]amino]-3-methylbutanoic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VWWKKDNCCLAGRM-GVXVVHGQSA-N 0.000 description 1
- DJXDNYKQOZYOFK-GUBZILKMSA-N (4s)-4-[[2-[[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-5-[[(1s)-1-carboxy-2-hydroxyethyl]amino]-5-oxopentanoic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O DJXDNYKQOZYOFK-GUBZILKMSA-N 0.000 description 1
- GHOKWGTUZJEAQD-ZETCQYMHSA-N (D)-(+)-Pantothenic acid Chemical compound OCC(C)(C)[C@@H](O)C(=O)NCCC(O)=O GHOKWGTUZJEAQD-ZETCQYMHSA-N 0.000 description 1
- YREROAPXUOXCGI-UHFFFAOYSA-N 2,5-dihydroxybenzoic acid Chemical compound OC(=O)C1=CC(O)=CC=C1O.OC(=O)C1=CC(O)=CC=C1O YREROAPXUOXCGI-UHFFFAOYSA-N 0.000 description 1
- HOLHYSJJBXSLMV-UHFFFAOYSA-N 2,6-dichlorophenol Chemical compound OC1=C(Cl)C=CC=C1Cl HOLHYSJJBXSLMV-UHFFFAOYSA-N 0.000 description 1
- PPINMSZPTPRQQB-NHCYSSNCSA-N 2-[[(2s)-1-[(2s)-2-[[(2s)-2-amino-3-methylbutanoyl]amino]propanoyl]pyrrolidine-2-carbonyl]amino]acetic acid Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PPINMSZPTPRQQB-NHCYSSNCSA-N 0.000 description 1
- ISPYQTSUDJAMAB-UHFFFAOYSA-N 2-chlorophenol Chemical compound OC1=CC=CC=C1Cl ISPYQTSUDJAMAB-UHFFFAOYSA-N 0.000 description 1
- BFSVOASYOCHEOV-UHFFFAOYSA-N 2-diethylaminoethanol Chemical compound CCN(CC)CCO BFSVOASYOCHEOV-UHFFFAOYSA-N 0.000 description 1
- KNKXKITYRFJDNF-UHFFFAOYSA-N 2-methylphenol Chemical compound CC1=CC=CC=C1O.CC1=CC=CC=C1O KNKXKITYRFJDNF-UHFFFAOYSA-N 0.000 description 1
- YQUVCSBJEUQKSH-UHFFFAOYSA-M 3,4-dihydroxybenzoate Chemical compound OC1=CC=C(C([O-])=O)C=C1O YQUVCSBJEUQKSH-UHFFFAOYSA-M 0.000 description 1
- LTFHNKUKQYVHDX-UHFFFAOYSA-N 4-hydroxy-3-methylbenzoic acid Chemical compound CC1=CC(C(O)=O)=CC=C1O LTFHNKUKQYVHDX-UHFFFAOYSA-N 0.000 description 1
- 108010060957 4-hydroxybenzoate coenzyme A ligase Proteins 0.000 description 1
- QYSWQKNSPMUKBU-UHFFFAOYSA-N 4-hydroxybenzoic acid;2-methylbuta-1,3-diene Chemical class CC(=C)C=C.OC(=O)C1=CC=C(O)C=C1 QYSWQKNSPMUKBU-UHFFFAOYSA-N 0.000 description 1
- 108030000355 4-hydroxybenzoyl-CoA reductases Proteins 0.000 description 1
- 101710182094 8-oxo-dGTP diphosphatase Proteins 0.000 description 1
- 101000977065 Acidithiobacillus ferridurans Uncharacterized 11.6 kDa protein in mobS 3'region Proteins 0.000 description 1
- 101000787133 Acidithiobacillus ferridurans Uncharacterized 12.3 kDa protein in mobL 3'region Proteins 0.000 description 1
- 101000708563 Acidithiobacillus ferrooxidans Uncharacterized 9.0 kDa protein in mobE 3'region Proteins 0.000 description 1
- 241000589158 Agrobacterium Species 0.000 description 1
- DKJPOZOEBONHFS-ZLUOBGJFSA-N Ala-Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 1
- UWQJHXKARZWDIJ-ZLUOBGJFSA-N Ala-Ala-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(O)=O UWQJHXKARZWDIJ-ZLUOBGJFSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- VBDMWOKJZDCFJM-FXQIFTODSA-N Ala-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N VBDMWOKJZDCFJM-FXQIFTODSA-N 0.000 description 1
- PIPTUBPKYFRLCP-NHCYSSNCSA-N Ala-Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PIPTUBPKYFRLCP-NHCYSSNCSA-N 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- UGLPMYSCWHTZQU-AUTRQRHGSA-N Ala-Ala-Tyr Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UGLPMYSCWHTZQU-AUTRQRHGSA-N 0.000 description 1
- IMMKUCQIKKXKNP-DCAQKATOSA-N Ala-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCN=C(N)N IMMKUCQIKKXKNP-DCAQKATOSA-N 0.000 description 1
- FSBCNCKIQZZASN-GUBZILKMSA-N Ala-Arg-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O FSBCNCKIQZZASN-GUBZILKMSA-N 0.000 description 1
- YAXNATKKPOWVCP-ZLUOBGJFSA-N Ala-Asn-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O YAXNATKKPOWVCP-ZLUOBGJFSA-N 0.000 description 1
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 1
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 1
- BTYTYHBSJKQBQA-GCJQMDKQSA-N Ala-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N)O BTYTYHBSJKQBQA-GCJQMDKQSA-N 0.000 description 1
- KRHRBKYBJXMYBB-WHFBIAKZSA-N Ala-Cys-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O KRHRBKYBJXMYBB-WHFBIAKZSA-N 0.000 description 1
- NJIFPLAJSVUQOZ-JBDRJPRFSA-N Ala-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C)N NJIFPLAJSVUQOZ-JBDRJPRFSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- JPGBXANAQYHTLA-DRZSPHRISA-N Ala-Gln-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JPGBXANAQYHTLA-DRZSPHRISA-N 0.000 description 1
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 1
- KXEVYGKATAMXJJ-ACZMJKKPSA-N Ala-Glu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KXEVYGKATAMXJJ-ACZMJKKPSA-N 0.000 description 1
- IXTPACPAXIOCRG-ACZMJKKPSA-N Ala-Glu-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N IXTPACPAXIOCRG-ACZMJKKPSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 1
- XYTNPQNAZREREP-XQXXSGGOSA-N Ala-Glu-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XYTNPQNAZREREP-XQXXSGGOSA-N 0.000 description 1
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 1
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- JDIQCVUDDFENPU-ZKWXMUAHSA-N Ala-His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CNC=N1 JDIQCVUDDFENPU-ZKWXMUAHSA-N 0.000 description 1
- FDAZDMAFZYTHGS-XVYDVKMFSA-N Ala-His-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FDAZDMAFZYTHGS-XVYDVKMFSA-N 0.000 description 1
- JEPNLGMEZMCFEX-QSFUFRPTSA-N Ala-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N JEPNLGMEZMCFEX-QSFUFRPTSA-N 0.000 description 1
- CBCCCLMNOBLBSC-XVYDVKMFSA-N Ala-His-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O CBCCCLMNOBLBSC-XVYDVKMFSA-N 0.000 description 1
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 1
- NYDBKUNVSALYPX-NAKRPEOUSA-N Ala-Ile-Arg Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NYDBKUNVSALYPX-NAKRPEOUSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 1
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 1
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 1
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- DPNZTBKGAUAZQU-DLOVCJGASA-N Ala-Leu-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DPNZTBKGAUAZQU-DLOVCJGASA-N 0.000 description 1
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 1
- UWIQWPWWZUHBAO-ZLIFDBKOSA-N Ala-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)CC(C)C)C(O)=O)=CNC2=C1 UWIQWPWWZUHBAO-ZLIFDBKOSA-N 0.000 description 1
- LDLSENBXQNDTPB-DCAQKATOSA-N Ala-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LDLSENBXQNDTPB-DCAQKATOSA-N 0.000 description 1
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 1
- REAQAWSENITKJL-DDWPSWQVSA-N Ala-Met-Asp-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O REAQAWSENITKJL-DDWPSWQVSA-N 0.000 description 1
- PVQLRJRPUTXFFX-CIUDSAMLSA-N Ala-Met-Gln Chemical compound CSCC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O PVQLRJRPUTXFFX-CIUDSAMLSA-N 0.000 description 1
- KYDYGANDJHFBCW-DRZSPHRISA-N Ala-Phe-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N KYDYGANDJHFBCW-DRZSPHRISA-N 0.000 description 1
- CNQAFFMNJIQYGX-DRZSPHRISA-N Ala-Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 CNQAFFMNJIQYGX-DRZSPHRISA-N 0.000 description 1
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 1
- CYBJZLQSUJEMAS-LFSVMHDDSA-N Ala-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C)N)O CYBJZLQSUJEMAS-LFSVMHDDSA-N 0.000 description 1
- IHMCQESUJVZTKW-UBHSHLNASA-N Ala-Phe-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 IHMCQESUJVZTKW-UBHSHLNASA-N 0.000 description 1
- VQAVBBCZFQAAED-FXQIFTODSA-N Ala-Pro-Asn Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N VQAVBBCZFQAAED-FXQIFTODSA-N 0.000 description 1
- DYJJJCHDHLEFDW-FXQIFTODSA-N Ala-Pro-Cys Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N DYJJJCHDHLEFDW-FXQIFTODSA-N 0.000 description 1
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- YHBDGLZYNIARKJ-GUBZILKMSA-N Ala-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N YHBDGLZYNIARKJ-GUBZILKMSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- DYXOFPBJBAHWFY-JBDRJPRFSA-N Ala-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N DYXOFPBJBAHWFY-JBDRJPRFSA-N 0.000 description 1
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 1
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 1
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 1
- VNFSAYFQLXPHPY-CIQUZCHMSA-N Ala-Thr-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNFSAYFQLXPHPY-CIQUZCHMSA-N 0.000 description 1
- LTTLSZVJTDSACD-OWLDWWDNSA-N Ala-Thr-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LTTLSZVJTDSACD-OWLDWWDNSA-N 0.000 description 1
- PXAFZDXYEIIUTF-LKTVYLICSA-N Ala-Trp-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O PXAFZDXYEIIUTF-LKTVYLICSA-N 0.000 description 1
- MTDDMSUUXNQMKK-BPNCWPANSA-N Ala-Tyr-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MTDDMSUUXNQMKK-BPNCWPANSA-N 0.000 description 1
- YEBZNKPPOHFZJM-BPNCWPANSA-N Ala-Tyr-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O YEBZNKPPOHFZJM-BPNCWPANSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- CLOMBHBBUKAUBP-LSJOCFKGSA-N Ala-Val-His Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N CLOMBHBBUKAUBP-LSJOCFKGSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- XKHLBBQNPSOGPI-GUBZILKMSA-N Ala-Val-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N XKHLBBQNPSOGPI-GUBZILKMSA-N 0.000 description 1
- NLXLAEXVIDQMFP-UHFFFAOYSA-N Ammonium chloride Chemical compound [NH4+].[Cl-] NLXLAEXVIDQMFP-UHFFFAOYSA-N 0.000 description 1
- 241000205046 Archaeoglobus Species 0.000 description 1
- SGYSTDWPNPKJPP-GUBZILKMSA-N Arg-Ala-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SGYSTDWPNPKJPP-GUBZILKMSA-N 0.000 description 1
- DFCIPNHFKOQAME-FXQIFTODSA-N Arg-Ala-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFCIPNHFKOQAME-FXQIFTODSA-N 0.000 description 1
- MCYJBCKCAPERSE-FXQIFTODSA-N Arg-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N MCYJBCKCAPERSE-FXQIFTODSA-N 0.000 description 1
- HULHGJZIZXCPLD-FXQIFTODSA-N Arg-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N HULHGJZIZXCPLD-FXQIFTODSA-N 0.000 description 1
- KWKQGHSSNHPGOW-BQBZGAKWSA-N Arg-Ala-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)NCC(O)=O KWKQGHSSNHPGOW-BQBZGAKWSA-N 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- KJGNDQCYBNBXDA-GUBZILKMSA-N Arg-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N KJGNDQCYBNBXDA-GUBZILKMSA-N 0.000 description 1
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 1
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- KWTVWJPNHAOREN-IHRRRGAJSA-N Arg-Asn-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KWTVWJPNHAOREN-IHRRRGAJSA-N 0.000 description 1
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 1
- OCOZPTHLDVSFCZ-BPUTZDHNSA-N Arg-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N OCOZPTHLDVSFCZ-BPUTZDHNSA-N 0.000 description 1
- OZNSCVPYWZRQPY-CIUDSAMLSA-N Arg-Asp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OZNSCVPYWZRQPY-CIUDSAMLSA-N 0.000 description 1
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- YSUVMPICYVWRBX-VEVYYDQMSA-N Arg-Asp-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YSUVMPICYVWRBX-VEVYYDQMSA-N 0.000 description 1
- QQJSJIBESHAJPM-IHRRRGAJSA-N Arg-Cys-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QQJSJIBESHAJPM-IHRRRGAJSA-N 0.000 description 1
- VDBKFYYIBLXEIF-GUBZILKMSA-N Arg-Gln-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VDBKFYYIBLXEIF-GUBZILKMSA-N 0.000 description 1
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 1
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 1
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 1
- NXDXECQFKHXHAM-HJGDQZAQSA-N Arg-Glu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NXDXECQFKHXHAM-HJGDQZAQSA-N 0.000 description 1
- HQIZDMIGUJOSNI-IUCAKERBSA-N Arg-Gly-Arg Chemical compound N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O HQIZDMIGUJOSNI-IUCAKERBSA-N 0.000 description 1
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 1
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 1
- KRQSPVKUISQQFS-FJXKBIBVSA-N Arg-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N KRQSPVKUISQQFS-FJXKBIBVSA-N 0.000 description 1
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 1
- YBIAYFFIVAZXPK-AVGNSLFASA-N Arg-His-Arg Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YBIAYFFIVAZXPK-AVGNSLFASA-N 0.000 description 1
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 1
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 1
- GXXWTNKNFFKTJB-NAKRPEOUSA-N Arg-Ile-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O GXXWTNKNFFKTJB-NAKRPEOUSA-N 0.000 description 1
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 1
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- IIAXFBUTKIDDIP-ULQDDVLXSA-N Arg-Leu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IIAXFBUTKIDDIP-ULQDDVLXSA-N 0.000 description 1
- RTDZQOFEGPWSJD-AVGNSLFASA-N Arg-Leu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O RTDZQOFEGPWSJD-AVGNSLFASA-N 0.000 description 1
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 1
- QBQVKUNBCAFXSV-ULQDDVLXSA-N Arg-Lys-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QBQVKUNBCAFXSV-ULQDDVLXSA-N 0.000 description 1
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 1
- JOADBFCFJGNIKF-GUBZILKMSA-N Arg-Met-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O JOADBFCFJGNIKF-GUBZILKMSA-N 0.000 description 1
- VVJTWSRNMJNDPN-IUCAKERBSA-N Arg-Met-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O VVJTWSRNMJNDPN-IUCAKERBSA-N 0.000 description 1
- NYDIVDKTULRINZ-AVGNSLFASA-N Arg-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NYDIVDKTULRINZ-AVGNSLFASA-N 0.000 description 1
- DPLFNLDACGGBAK-KKUMJFAQSA-N Arg-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N DPLFNLDACGGBAK-KKUMJFAQSA-N 0.000 description 1
- UULLJGQFCDXVTQ-CYDGBPFRSA-N Arg-Pro-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UULLJGQFCDXVTQ-CYDGBPFRSA-N 0.000 description 1
- YCYXHLZRUSJITQ-SRVKXCTJSA-N Arg-Pro-Pro Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 YCYXHLZRUSJITQ-SRVKXCTJSA-N 0.000 description 1
- KXOPYFNQLVUOAQ-FXQIFTODSA-N Arg-Ser-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KXOPYFNQLVUOAQ-FXQIFTODSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- KMFPQTITXUKJOV-DCAQKATOSA-N Arg-Ser-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O KMFPQTITXUKJOV-DCAQKATOSA-N 0.000 description 1
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 1
- LRPZJPMQGKGHSG-XGEHTFHBSA-N Arg-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)O LRPZJPMQGKGHSG-XGEHTFHBSA-N 0.000 description 1
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 1
- SYFHFLGAROUHNT-VEVYYDQMSA-N Arg-Thr-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SYFHFLGAROUHNT-VEVYYDQMSA-N 0.000 description 1
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 1
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 1
- JKRPBTQDPJSQIT-RCWTZXSCSA-N Arg-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O JKRPBTQDPJSQIT-RCWTZXSCSA-N 0.000 description 1
- ZPWMEWYQBWSGAO-ZJDVBMNYSA-N Arg-Thr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZPWMEWYQBWSGAO-ZJDVBMNYSA-N 0.000 description 1
- XRNXPIGJPQHCPC-RCWTZXSCSA-N Arg-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)O)C(O)=O XRNXPIGJPQHCPC-RCWTZXSCSA-N 0.000 description 1
- QNYWYYNQSXANBL-WDSOQIARSA-N Arg-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QNYWYYNQSXANBL-WDSOQIARSA-N 0.000 description 1
- POZKLUIXMHIULG-FDARSICLSA-N Arg-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCCN=C(N)N)N POZKLUIXMHIULG-FDARSICLSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 1
- XMZZGVGKGXRIGJ-JYJNAYRXSA-N Arg-Tyr-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O XMZZGVGKGXRIGJ-JYJNAYRXSA-N 0.000 description 1
- QTAIIXQCOPUNBQ-QXEWZRGKSA-N Arg-Val-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QTAIIXQCOPUNBQ-QXEWZRGKSA-N 0.000 description 1
- FTMRPIVPSDVGCC-GUBZILKMSA-N Arg-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FTMRPIVPSDVGCC-GUBZILKMSA-N 0.000 description 1
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- IARGXWMWRFOQPG-GCJQMDKQSA-N Asn-Ala-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IARGXWMWRFOQPG-GCJQMDKQSA-N 0.000 description 1
- CIBWFJFMOBIFTE-CIUDSAMLSA-N Asn-Arg-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N CIBWFJFMOBIFTE-CIUDSAMLSA-N 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- JRVABKHPWDRUJF-UBHSHLNASA-N Asn-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N JRVABKHPWDRUJF-UBHSHLNASA-N 0.000 description 1
- BHQQRVARKXWXPP-ACZMJKKPSA-N Asn-Asp-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N BHQQRVARKXWXPP-ACZMJKKPSA-N 0.000 description 1
- UGXVKHRDGLYFKR-CIUDSAMLSA-N Asn-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(N)=O UGXVKHRDGLYFKR-CIUDSAMLSA-N 0.000 description 1
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 1
- ULRPXVNMIIYDDJ-ACZMJKKPSA-N Asn-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N ULRPXVNMIIYDDJ-ACZMJKKPSA-N 0.000 description 1
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 1
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 1
- LTZIRYMWOJHRCH-GUDRVLHUSA-N Asn-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N LTZIRYMWOJHRCH-GUDRVLHUSA-N 0.000 description 1
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- MYVBTYXSWILFCG-BQBZGAKWSA-N Asn-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N MYVBTYXSWILFCG-BQBZGAKWSA-N 0.000 description 1
- RLHANKIRBONJBK-IHRRRGAJSA-N Asn-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)N)N RLHANKIRBONJBK-IHRRRGAJSA-N 0.000 description 1
- MVXJBVVLACEGCG-PCBIJLKTSA-N Asn-Phe-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVXJBVVLACEGCG-PCBIJLKTSA-N 0.000 description 1
- YUUIAUXBNOHFRJ-IHRRRGAJSA-N Asn-Phe-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O YUUIAUXBNOHFRJ-IHRRRGAJSA-N 0.000 description 1
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 1
- HPNDKUOLNRVRAY-BIIVOSGPSA-N Asn-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N)C(=O)O HPNDKUOLNRVRAY-BIIVOSGPSA-N 0.000 description 1
- SNYCNNPOFYBCEK-ZLUOBGJFSA-N Asn-Ser-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O SNYCNNPOFYBCEK-ZLUOBGJFSA-N 0.000 description 1
- DAYDURRBMDCCFL-AAEUAGOBSA-N Asn-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N DAYDURRBMDCCFL-AAEUAGOBSA-N 0.000 description 1
- CBWCQCANJSGUOH-ZKWXMUAHSA-N Asn-Val-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O CBWCQCANJSGUOH-ZKWXMUAHSA-N 0.000 description 1
- WQAOZCVOOYUWKG-LSJOCFKGSA-N Asn-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CC(=O)N)N WQAOZCVOOYUWKG-LSJOCFKGSA-N 0.000 description 1
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 1
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 1
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 1
- GVPSCJQLUGIKAM-GUBZILKMSA-N Asp-Arg-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GVPSCJQLUGIKAM-GUBZILKMSA-N 0.000 description 1
- HMQDRBKQMLRCCG-GMOBBJLQSA-N Asp-Arg-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HMQDRBKQMLRCCG-GMOBBJLQSA-N 0.000 description 1
- NYLBGYLHBDFRHL-VEVYYDQMSA-N Asp-Arg-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NYLBGYLHBDFRHL-VEVYYDQMSA-N 0.000 description 1
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- HOQGTAIGQSDCHR-SRVKXCTJSA-N Asp-Asn-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HOQGTAIGQSDCHR-SRVKXCTJSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- FRSGNOZCTWDVFZ-ACZMJKKPSA-N Asp-Asp-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O FRSGNOZCTWDVFZ-ACZMJKKPSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- IJHUZMGJRGNXIW-CIUDSAMLSA-N Asp-Glu-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IJHUZMGJRGNXIW-CIUDSAMLSA-N 0.000 description 1
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 1
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 1
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 1
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- RQYMKRMRZWJGHC-BQBZGAKWSA-N Asp-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N RQYMKRMRZWJGHC-BQBZGAKWSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- WSXDIZFNQYTUJB-SRVKXCTJSA-N Asp-His-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O WSXDIZFNQYTUJB-SRVKXCTJSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- NHSDEZURHWEZPN-SXTJYALSSA-N Asp-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CC(=O)O)N NHSDEZURHWEZPN-SXTJYALSSA-N 0.000 description 1
- YFSLJHLQOALGSY-ZPFDUUQYSA-N Asp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N YFSLJHLQOALGSY-ZPFDUUQYSA-N 0.000 description 1
- SPKCGKRUYKMDHP-GUDRVLHUSA-N Asp-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N SPKCGKRUYKMDHP-GUDRVLHUSA-N 0.000 description 1
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 1
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 1
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- IVPNEDNYYYFAGI-GARJFASQSA-N Asp-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N IVPNEDNYYYFAGI-GARJFASQSA-N 0.000 description 1
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 1
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 1
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 1
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 1
- FQHBAQLBIXLWAG-DCAQKATOSA-N Asp-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N FQHBAQLBIXLWAG-DCAQKATOSA-N 0.000 description 1
- DONWIPDSZZJHHK-HJGDQZAQSA-N Asp-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)O DONWIPDSZZJHHK-HJGDQZAQSA-N 0.000 description 1
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 1
- KPSHWSWFPUDEGF-FXQIFTODSA-N Asp-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(O)=O KPSHWSWFPUDEGF-FXQIFTODSA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- KESWRFKUZRUTAH-FXQIFTODSA-N Asp-Pro-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O KESWRFKUZRUTAH-FXQIFTODSA-N 0.000 description 1
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 1
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 1
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 1
- HRVQDZOWMLFAOD-BIIVOSGPSA-N Asp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N)C(=O)O HRVQDZOWMLFAOD-BIIVOSGPSA-N 0.000 description 1
- KBJVTFWQWXCYCQ-IUKAMOBKSA-N Asp-Thr-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KBJVTFWQWXCYCQ-IUKAMOBKSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- VHUKCUHLFMRHOD-MELADBBJSA-N Asp-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O VHUKCUHLFMRHOD-MELADBBJSA-N 0.000 description 1
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 1
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 1
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 1
- GGBQDSHTXKQSLP-NHCYSSNCSA-N Asp-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N GGBQDSHTXKQSLP-NHCYSSNCSA-N 0.000 description 1
- 101000666833 Autographa californica nuclear polyhedrosis virus Uncharacterized 20.8 kDa protein in FGF-VUBI intergenic region Proteins 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 101000977023 Azospirillum brasilense Uncharacterized 17.8 kDa protein in nodG 5'region Proteins 0.000 description 1
- 101000977027 Azospirillum brasilense Uncharacterized protein in nodG 5'region Proteins 0.000 description 1
- 101000827603 Bacillus phage SPP1 Uncharacterized 10.2 kDa protein in GP2-GP6 intergenic region Proteins 0.000 description 1
- 101000765604 Bacillus subtilis (strain 168) FlaA locus 22.9 kDa protein Proteins 0.000 description 1
- 101000765606 Bacillus subtilis (strain 168) FlaA locus uncharacterized protein YlxG Proteins 0.000 description 1
- 101000962005 Bacillus thuringiensis Uncharacterized 23.6 kDa protein Proteins 0.000 description 1
- 101000961984 Bacillus thuringiensis Uncharacterized 30.3 kDa protein Proteins 0.000 description 1
- 102100021277 Beta-secretase 2 Human genes 0.000 description 1
- 101710150190 Beta-secretase 2 Proteins 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-M Bicarbonate Chemical compound OC([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-M 0.000 description 1
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 1
- 101100512078 Caenorhabditis elegans lys-1 gene Proteins 0.000 description 1
- 101100505161 Caenorhabditis elegans mel-32 gene Proteins 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 101000964402 Caldicellulosiruptor saccharolyticus Uncharacterized protein in xynC 3'region Proteins 0.000 description 1
- ACTIUHUUMQJHFO-UHFFFAOYSA-N Coenzyme Q10 Natural products COC1=C(OC)C(=O)C(CC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)C)=C(C)C1=O ACTIUHUUMQJHFO-UHFFFAOYSA-N 0.000 description 1
- 101100497181 Comamonas sp. (strain NCIMB 9872) cpnB gene Proteins 0.000 description 1
- 101000861180 Cupriavidus necator (strain ATCC 17699 / DSM 428 / KCTC 22496 / NCIMB 10442 / H16 / Stanier 337) Uncharacterized protein H16_B0147 Proteins 0.000 description 1
- 101000861181 Cupriavidus necator (strain ATCC 17699 / DSM 428 / KCTC 22496 / NCIMB 10442 / H16 / Stanier 337) Uncharacterized protein H16_B0148 Proteins 0.000 description 1
- AEJSNWMRPXAKCW-WHFBIAKZSA-N Cys-Ala-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AEJSNWMRPXAKCW-WHFBIAKZSA-N 0.000 description 1
- SZQCDCKIGWQAQN-FXQIFTODSA-N Cys-Arg-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O SZQCDCKIGWQAQN-FXQIFTODSA-N 0.000 description 1
- SBMGKDLRJLYZCU-BIIVOSGPSA-N Cys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CS)N)C(=O)O SBMGKDLRJLYZCU-BIIVOSGPSA-N 0.000 description 1
- VZKXOWRNJDEGLZ-WHFBIAKZSA-N Cys-Asp-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O VZKXOWRNJDEGLZ-WHFBIAKZSA-N 0.000 description 1
- BDWIZLQVVWQMTB-XKBZYTNZSA-N Cys-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N)O BDWIZLQVVWQMTB-XKBZYTNZSA-N 0.000 description 1
- GCDLPNRHPWBKJJ-WDSKDSINSA-N Cys-Gly-Glu Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GCDLPNRHPWBKJJ-WDSKDSINSA-N 0.000 description 1
- UQHYQYXOLIYNSR-CUJWVEQBSA-N Cys-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CS)N)O UQHYQYXOLIYNSR-CUJWVEQBSA-N 0.000 description 1
- ABLJDBFJPUWQQB-DCAQKATOSA-N Cys-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N ABLJDBFJPUWQQB-DCAQKATOSA-N 0.000 description 1
- HKALUUKHYNEDRS-GUBZILKMSA-N Cys-Leu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HKALUUKHYNEDRS-GUBZILKMSA-N 0.000 description 1
- XXDATQFUGMAJRV-XIRDDKMYSA-N Cys-Leu-Trp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XXDATQFUGMAJRV-XIRDDKMYSA-N 0.000 description 1
- SMEYEQDCCBHTEF-FXQIFTODSA-N Cys-Pro-Ala Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O SMEYEQDCCBHTEF-FXQIFTODSA-N 0.000 description 1
- CNAMJJOZGXPDHW-IHRRRGAJSA-N Cys-Pro-Phe Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O CNAMJJOZGXPDHW-IHRRRGAJSA-N 0.000 description 1
- CMYVIUWVYHOLRD-ZLUOBGJFSA-N Cys-Ser-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CMYVIUWVYHOLRD-ZLUOBGJFSA-N 0.000 description 1
- YWEHYKGJWHPGPY-XGEHTFHBSA-N Cys-Thr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N)O YWEHYKGJWHPGPY-XGEHTFHBSA-N 0.000 description 1
- XKDHARKYRGHLKO-QEJZJMRPSA-N Cys-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N XKDHARKYRGHLKO-QEJZJMRPSA-N 0.000 description 1
- 235000000638 D-biotin Nutrition 0.000 description 1
- 239000011665 D-biotin Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 101710088194 Dehydrogenase Proteins 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 102000016680 Dioxygenases Human genes 0.000 description 1
- 108010028143 Dioxygenases Proteins 0.000 description 1
- 101000644901 Drosophila melanogaster Putative 115 kDa protein in type-1 retrotransposable element R1DM Proteins 0.000 description 1
- 101000785191 Drosophila melanogaster Uncharacterized 50 kDa protein in type I retrotransposable element R1DM Proteins 0.000 description 1
- 101000747704 Enterobacteria phage N4 Uncharacterized protein Gp1 Proteins 0.000 description 1
- 101000747702 Enterobacteria phage N4 Uncharacterized protein Gp2 Proteins 0.000 description 1
- 101000861206 Enterococcus faecalis (strain ATCC 700802 / V583) Uncharacterized protein EF_A0048 Proteins 0.000 description 1
- 101100180873 Escherichia coli (strain K12) kdsC gene Proteins 0.000 description 1
- 101000769180 Escherichia coli Uncharacterized 11.1 kDa protein Proteins 0.000 description 1
- 101000758599 Escherichia coli Uncharacterized 14.7 kDa protein Proteins 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- INKFLNZBTSNFON-CIUDSAMLSA-N Gln-Ala-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O INKFLNZBTSNFON-CIUDSAMLSA-N 0.000 description 1
- REJJNXODKSHOKA-ACZMJKKPSA-N Gln-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N REJJNXODKSHOKA-ACZMJKKPSA-N 0.000 description 1
- XXLBHPPXDUWYAG-XQXXSGGOSA-N Gln-Ala-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XXLBHPPXDUWYAG-XQXXSGGOSA-N 0.000 description 1
- KZKBJEUWNMQTLV-XDTLVQLUSA-N Gln-Ala-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZKBJEUWNMQTLV-XDTLVQLUSA-N 0.000 description 1
- KWUSGAIFNHQCBY-DCAQKATOSA-N Gln-Arg-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O KWUSGAIFNHQCBY-DCAQKATOSA-N 0.000 description 1
- DLOHWQXXGMEZDW-CIUDSAMLSA-N Gln-Arg-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O DLOHWQXXGMEZDW-CIUDSAMLSA-N 0.000 description 1
- LJEPDHWNQXPXMM-NHCYSSNCSA-N Gln-Arg-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O LJEPDHWNQXPXMM-NHCYSSNCSA-N 0.000 description 1
- WMOMPXKOKASNBK-PEFMBERDSA-N Gln-Asn-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WMOMPXKOKASNBK-PEFMBERDSA-N 0.000 description 1
- GMGKDVVBSVVKCT-NUMRIWBASA-N Gln-Asn-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GMGKDVVBSVVKCT-NUMRIWBASA-N 0.000 description 1
- SOIAHPSKKUYREP-CIUDSAMLSA-N Gln-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N SOIAHPSKKUYREP-CIUDSAMLSA-N 0.000 description 1
- UICOTGULOUGGLC-NUMRIWBASA-N Gln-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UICOTGULOUGGLC-NUMRIWBASA-N 0.000 description 1
- QFJPFPCSXOXMKI-BPUTZDHNSA-N Gln-Gln-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N QFJPFPCSXOXMKI-BPUTZDHNSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- VSXBYIJUAXPAAL-WDSKDSINSA-N Gln-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O VSXBYIJUAXPAAL-WDSKDSINSA-N 0.000 description 1
- IKFZXRLDMYWNBU-YUMQZZPRSA-N Gln-Gly-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N IKFZXRLDMYWNBU-YUMQZZPRSA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 1
- RWCBJYUPAUTWJD-NHCYSSNCSA-N Gln-Met-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O RWCBJYUPAUTWJD-NHCYSSNCSA-N 0.000 description 1
- JNVGVECJCOZHCN-DRZSPHRISA-N Gln-Phe-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O JNVGVECJCOZHCN-DRZSPHRISA-N 0.000 description 1
- AQPZYBSRDRZBAG-AVGNSLFASA-N Gln-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N AQPZYBSRDRZBAG-AVGNSLFASA-N 0.000 description 1
- SWDSRANUCKNBLA-AVGNSLFASA-N Gln-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SWDSRANUCKNBLA-AVGNSLFASA-N 0.000 description 1
- DRNMNLKUUKKPIA-HTUGSXCWSA-N Gln-Phe-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CCC(N)=O)C(O)=O DRNMNLKUUKKPIA-HTUGSXCWSA-N 0.000 description 1
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 1
- SXFPZRRVWSUYII-KBIXCLLPSA-N Gln-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N SXFPZRRVWSUYII-KBIXCLLPSA-N 0.000 description 1
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 1
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 1
- VYOILACOFPPNQH-UMNHJUIQSA-N Gln-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N VYOILACOFPPNQH-UMNHJUIQSA-N 0.000 description 1
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 1
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 1
- UTKICHUQEQBDGC-ACZMJKKPSA-N Glu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UTKICHUQEQBDGC-ACZMJKKPSA-N 0.000 description 1
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- KBKGRMNVKPSQIF-XDTLVQLUSA-N Glu-Ala-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KBKGRMNVKPSQIF-XDTLVQLUSA-N 0.000 description 1
- AVZHGSCDKIQZPQ-CIUDSAMLSA-N Glu-Arg-Ala Chemical compound C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AVZHGSCDKIQZPQ-CIUDSAMLSA-N 0.000 description 1
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 1
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 1
- FLQAKQOBSPFGKG-CIUDSAMLSA-N Glu-Cys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLQAKQOBSPFGKG-CIUDSAMLSA-N 0.000 description 1
- CLROYXHHUZELFX-FXQIFTODSA-N Glu-Gln-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CLROYXHHUZELFX-FXQIFTODSA-N 0.000 description 1
- PVBBEKPHARMPHX-DCAQKATOSA-N Glu-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O PVBBEKPHARMPHX-DCAQKATOSA-N 0.000 description 1
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 1
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- QYPKJXSMLMREKF-BPUTZDHNSA-N Glu-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N QYPKJXSMLMREKF-BPUTZDHNSA-N 0.000 description 1
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 1
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 1
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 1
- HILMIYALTUQTRC-XVKPBYJWSA-N Glu-Gly-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HILMIYALTUQTRC-XVKPBYJWSA-N 0.000 description 1
- XIKYNVKEUINBGL-IUCAKERBSA-N Glu-His-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O XIKYNVKEUINBGL-IUCAKERBSA-N 0.000 description 1
- ZGKXAUIVGIBISK-SZMVWBNQSA-N Glu-His-Trp Chemical compound N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O ZGKXAUIVGIBISK-SZMVWBNQSA-N 0.000 description 1
- YVYVMJNUENBOOL-KBIXCLLPSA-N Glu-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N YVYVMJNUENBOOL-KBIXCLLPSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 1
- QXDXIXFSFHUYAX-MNXVOIDGSA-N Glu-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O QXDXIXFSFHUYAX-MNXVOIDGSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 1
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 1
- NPMSEUWUMOSEFM-CIUDSAMLSA-N Glu-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N NPMSEUWUMOSEFM-CIUDSAMLSA-N 0.000 description 1
- LKOAAMXDJGEYMS-ZPFDUUQYSA-N Glu-Met-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKOAAMXDJGEYMS-ZPFDUUQYSA-N 0.000 description 1
- XEKAJTCACGEBOK-KKUMJFAQSA-N Glu-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XEKAJTCACGEBOK-KKUMJFAQSA-N 0.000 description 1
- HQOGXFLBAKJUMH-CIUDSAMLSA-N Glu-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N HQOGXFLBAKJUMH-CIUDSAMLSA-N 0.000 description 1
- QJVZSVUYZFYLFQ-CIUDSAMLSA-N Glu-Pro-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O QJVZSVUYZFYLFQ-CIUDSAMLSA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 1
- WIKMTDVSCUJIPJ-CIUDSAMLSA-N Glu-Ser-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WIKMTDVSCUJIPJ-CIUDSAMLSA-N 0.000 description 1
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 1
- JDAYMLXPUJRSDJ-XIRDDKMYSA-N Glu-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 JDAYMLXPUJRSDJ-XIRDDKMYSA-N 0.000 description 1
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 1
- UUTGYDAKPISJAO-JYJNAYRXSA-N Glu-Tyr-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 UUTGYDAKPISJAO-JYJNAYRXSA-N 0.000 description 1
- BKMOHWJHXQLFEX-IRIUXVKKSA-N Glu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N)O BKMOHWJHXQLFEX-IRIUXVKKSA-N 0.000 description 1
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 1
- KCCNSVHJSMMGFS-NRPADANISA-N Glu-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N KCCNSVHJSMMGFS-NRPADANISA-N 0.000 description 1
- ZALGPUWUVHOGAE-GVXVVHGQSA-N Glu-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZALGPUWUVHOGAE-GVXVVHGQSA-N 0.000 description 1
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 1
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- UPOJUWHGMDJUQZ-IUCAKERBSA-N Gly-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UPOJUWHGMDJUQZ-IUCAKERBSA-N 0.000 description 1
- OGCIHJPYKVSMTE-YUMQZZPRSA-N Gly-Arg-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OGCIHJPYKVSMTE-YUMQZZPRSA-N 0.000 description 1
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 1
- XZRZILPOZBVTDB-GJZGRUSLSA-N Gly-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)CN)C(O)=O)=CNC2=C1 XZRZILPOZBVTDB-GJZGRUSLSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 1
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 1
- FZQLXNIMCPJVJE-YUMQZZPRSA-N Gly-Asp-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FZQLXNIMCPJVJE-YUMQZZPRSA-N 0.000 description 1
- UEGIPZAXNBYCCP-NKWVEPMBSA-N Gly-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)CN)C(=O)O UEGIPZAXNBYCCP-NKWVEPMBSA-N 0.000 description 1
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 1
- GNPVTZJUUBPZKW-WDSKDSINSA-N Gly-Gln-Ser Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GNPVTZJUUBPZKW-WDSKDSINSA-N 0.000 description 1
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 1
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 1
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- YNIMVVJTPWCUJH-KBPBESRZSA-N Gly-His-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YNIMVVJTPWCUJH-KBPBESRZSA-N 0.000 description 1
- KGVHCTWYMPWEGN-FSPLSTOPSA-N Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CN KGVHCTWYMPWEGN-FSPLSTOPSA-N 0.000 description 1
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 1
- VIIBEIQMLJEUJG-LAEOZQHASA-N Gly-Ile-Gln Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O VIIBEIQMLJEUJG-LAEOZQHASA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- AAHSHTLISQUZJL-QSFUFRPTSA-N Gly-Ile-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AAHSHTLISQUZJL-QSFUFRPTSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 1
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 1
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 1
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 1
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 1
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 1
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- LOEANKRDMMVOGZ-YUMQZZPRSA-N Gly-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O LOEANKRDMMVOGZ-YUMQZZPRSA-N 0.000 description 1
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- ZWRDOVYMQAAISL-UWVGGRQHSA-N Gly-Met-Lys Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCCN ZWRDOVYMQAAISL-UWVGGRQHSA-N 0.000 description 1
- IBYOLNARKHMLBG-WHOFXGATSA-N Gly-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IBYOLNARKHMLBG-WHOFXGATSA-N 0.000 description 1
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 1
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 1
- IXHQLZIWBCQBLQ-STQMWFEESA-N Gly-Pro-Phe Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IXHQLZIWBCQBLQ-STQMWFEESA-N 0.000 description 1
- ISSDODCYBOWWIP-GJZGRUSLSA-N Gly-Pro-Trp Chemical compound [H]NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ISSDODCYBOWWIP-GJZGRUSLSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- FKYQEVBRZSFAMJ-QWRGUYRKSA-N Gly-Ser-Tyr Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FKYQEVBRZSFAMJ-QWRGUYRKSA-N 0.000 description 1
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 1
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- WSWWTQYHFCBKBT-DVJZZOLTSA-N Gly-Thr-Trp Chemical compound C[C@@H](O)[C@H](NC(=O)CN)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O WSWWTQYHFCBKBT-DVJZZOLTSA-N 0.000 description 1
- UMRIXLHPZZIOML-OALUTQOASA-N Gly-Trp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)CN UMRIXLHPZZIOML-OALUTQOASA-N 0.000 description 1
- GWNIGUKSRJBIHX-STQMWFEESA-N Gly-Tyr-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)CN)O GWNIGUKSRJBIHX-STQMWFEESA-N 0.000 description 1
- WRFOZIJRODPLIA-QWRGUYRKSA-N Gly-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O WRFOZIJRODPLIA-QWRGUYRKSA-N 0.000 description 1
- LYZYGGWCBLBDMC-QWHCGFSZSA-N Gly-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)CN)C(=O)O LYZYGGWCBLBDMC-QWHCGFSZSA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- SYOJVRNQCXYEOV-XVKPBYJWSA-N Gly-Val-Glu Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SYOJVRNQCXYEOV-XVKPBYJWSA-N 0.000 description 1
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 1
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 1
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 1
- VPZXBVLAVMBEQI-VKHMYHEASA-N Glycyl-alanine Chemical compound OC(=O)[C@H](C)NC(=O)CN VPZXBVLAVMBEQI-VKHMYHEASA-N 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- BIAKMWKJMQLZOJ-ZKWXMUAHSA-N His-Ala-Ala Chemical compound C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)Cc1cnc[nH]1)C(O)=O BIAKMWKJMQLZOJ-ZKWXMUAHSA-N 0.000 description 1
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 1
- AWASVTXPTOLPPP-MBLNEYKQSA-N His-Ala-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AWASVTXPTOLPPP-MBLNEYKQSA-N 0.000 description 1
- GMIWMPUGTFQFHK-KCTSRDHCSA-N His-Ala-Trp Chemical compound C[C@H](NC(=O)[C@@H](N)Cc1cnc[nH]1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O GMIWMPUGTFQFHK-KCTSRDHCSA-N 0.000 description 1
- HXKZJLWGSWQKEA-LSJOCFKGSA-N His-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CN=CN1 HXKZJLWGSWQKEA-LSJOCFKGSA-N 0.000 description 1
- YPLYIXGKCRQZGW-SRVKXCTJSA-N His-Arg-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O YPLYIXGKCRQZGW-SRVKXCTJSA-N 0.000 description 1
- JBJNKUOMNZGQIM-PYJNHQTQSA-N His-Arg-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JBJNKUOMNZGQIM-PYJNHQTQSA-N 0.000 description 1
- UOAVQQRILDGZEN-SRVKXCTJSA-N His-Asp-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UOAVQQRILDGZEN-SRVKXCTJSA-N 0.000 description 1
- AASLOGQZZKZWKH-SRVKXCTJSA-N His-Cys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AASLOGQZZKZWKH-SRVKXCTJSA-N 0.000 description 1
- CMQOGWZUKPHLHL-DCAQKATOSA-N His-Cys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CN=CN1)N CMQOGWZUKPHLHL-DCAQKATOSA-N 0.000 description 1
- BQFGKVYHKCNEMF-DCAQKATOSA-N His-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 BQFGKVYHKCNEMF-DCAQKATOSA-N 0.000 description 1
- AKEDPWJFQULLPE-IUCAKERBSA-N His-Glu-Gly Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O AKEDPWJFQULLPE-IUCAKERBSA-N 0.000 description 1
- FIMNVXRZGUAGBI-AVGNSLFASA-N His-Glu-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FIMNVXRZGUAGBI-AVGNSLFASA-N 0.000 description 1
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 1
- CZXKZMQKXQZDEX-YUMQZZPRSA-N His-Gly-Cys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N CZXKZMQKXQZDEX-YUMQZZPRSA-N 0.000 description 1
- SKYULSWNBYAQMG-IHRRRGAJSA-N His-Leu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SKYULSWNBYAQMG-IHRRRGAJSA-N 0.000 description 1
- AIPUZFXMXAHZKY-QWRGUYRKSA-N His-Leu-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AIPUZFXMXAHZKY-QWRGUYRKSA-N 0.000 description 1
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 1
- GJMHMDKCJPQJOI-IHRRRGAJSA-N His-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 GJMHMDKCJPQJOI-IHRRRGAJSA-N 0.000 description 1
- WPUAVVXYEJAWIV-KKUMJFAQSA-N His-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N WPUAVVXYEJAWIV-KKUMJFAQSA-N 0.000 description 1
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 1
- BSVLMPMIXPQNKC-KBPBESRZSA-N His-Phe-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O BSVLMPMIXPQNKC-KBPBESRZSA-N 0.000 description 1
- QCBYAHHNOHBXIH-UWVGGRQHSA-N His-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CN=CN1 QCBYAHHNOHBXIH-UWVGGRQHSA-N 0.000 description 1
- FLXCRBXJRJSDHX-AVGNSLFASA-N His-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O FLXCRBXJRJSDHX-AVGNSLFASA-N 0.000 description 1
- XHQYFGPIRUHQIB-PBCZWWQYSA-N His-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CN=CN1 XHQYFGPIRUHQIB-PBCZWWQYSA-N 0.000 description 1
- ISQOVWDWRUONJH-YESZJQIVSA-N His-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CN=CN3)N)C(=O)O ISQOVWDWRUONJH-YESZJQIVSA-N 0.000 description 1
- 101000740205 Homo sapiens Sal-like protein 1 Proteins 0.000 description 1
- GRSZFWQUAKGDAV-KQYNXXCUSA-N IMP Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(O)=O)O[C@H]1N1C(NC=NC2=O)=C2N=C1 GRSZFWQUAKGDAV-KQYNXXCUSA-N 0.000 description 1
- JXUGDUWBMKIJDC-NAKRPEOUSA-N Ile-Ala-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JXUGDUWBMKIJDC-NAKRPEOUSA-N 0.000 description 1
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 1
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 1
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 1
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 1
- TZCGZYWNIDZZMR-NAKRPEOUSA-N Ile-Arg-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C)C(=O)O)N TZCGZYWNIDZZMR-NAKRPEOUSA-N 0.000 description 1
- ASCFJMSGKUIRDU-ZPFDUUQYSA-N Ile-Arg-Gln Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O ASCFJMSGKUIRDU-ZPFDUUQYSA-N 0.000 description 1
- ATXGFMOBVKSOMK-PEDHHIEDSA-N Ile-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N ATXGFMOBVKSOMK-PEDHHIEDSA-N 0.000 description 1
- VZIFYHYNQDIPLI-HJWJTTGWSA-N Ile-Arg-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N VZIFYHYNQDIPLI-HJWJTTGWSA-N 0.000 description 1
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 1
- RSDHVTMRXSABSV-GHCJXIJMSA-N Ile-Asn-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N RSDHVTMRXSABSV-GHCJXIJMSA-N 0.000 description 1
- XENGULNPUDGALZ-ZPFDUUQYSA-N Ile-Asn-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N XENGULNPUDGALZ-ZPFDUUQYSA-N 0.000 description 1
- RPZFUIQVAPZLRH-GHCJXIJMSA-N Ile-Asp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)O)N RPZFUIQVAPZLRH-GHCJXIJMSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 1
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 1
- GYAFMRQGWHXMII-IUKAMOBKSA-N Ile-Asp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N GYAFMRQGWHXMII-IUKAMOBKSA-N 0.000 description 1
- JRYQSFOFUFXPTB-RWRJDSDZSA-N Ile-Gln-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N JRYQSFOFUFXPTB-RWRJDSDZSA-N 0.000 description 1
- JDAWAWXGAUZPNJ-ZPFDUUQYSA-N Ile-Glu-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JDAWAWXGAUZPNJ-ZPFDUUQYSA-N 0.000 description 1
- KIMHKBDJQQYLHU-PEFMBERDSA-N Ile-Glu-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KIMHKBDJQQYLHU-PEFMBERDSA-N 0.000 description 1
- UBHUJPVCJHPSEU-GRLWGSQLSA-N Ile-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N UBHUJPVCJHPSEU-GRLWGSQLSA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- OEQKGSPBDVKYOC-ZKWXMUAHSA-N Ile-Gly-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N OEQKGSPBDVKYOC-ZKWXMUAHSA-N 0.000 description 1
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 1
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 1
- UAQSZXGJGLHMNV-XEGUGMAKSA-N Ile-Gly-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N UAQSZXGJGLHMNV-XEGUGMAKSA-N 0.000 description 1
- AMSYMDIIIRJRKZ-HJPIBITLSA-N Ile-His-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AMSYMDIIIRJRKZ-HJPIBITLSA-N 0.000 description 1
- APDIECQNNDGFPD-PYJNHQTQSA-N Ile-His-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N APDIECQNNDGFPD-PYJNHQTQSA-N 0.000 description 1
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 1
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 1
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 1
- RQQCJTLBSJMVCR-DSYPUSFNSA-N Ile-Leu-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N RQQCJTLBSJMVCR-DSYPUSFNSA-N 0.000 description 1
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- PARSHQDZROHERM-NHCYSSNCSA-N Ile-Lys-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N PARSHQDZROHERM-NHCYSSNCSA-N 0.000 description 1
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 1
- ZUPJCJINYQISSN-XUXIUFHCSA-N Ile-Met-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUPJCJINYQISSN-XUXIUFHCSA-N 0.000 description 1
- WYUHAXJAMDTOAU-IAVJCBSLSA-N Ile-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WYUHAXJAMDTOAU-IAVJCBSLSA-N 0.000 description 1
- RENBRDSDKPSRIH-HJWJTTGWSA-N Ile-Phe-Met Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O RENBRDSDKPSRIH-HJWJTTGWSA-N 0.000 description 1
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 1
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 1
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 1
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 1
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 1
- YBHKCXNNNVDYEB-SPOWBLRKSA-N Ile-Trp-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CO)C(=O)O)N YBHKCXNNNVDYEB-SPOWBLRKSA-N 0.000 description 1
- MITYXXNZSZLHGG-OBAATPRFSA-N Ile-Trp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N MITYXXNZSZLHGG-OBAATPRFSA-N 0.000 description 1
- IPFKIGNDTUOFAF-CYDGBPFRSA-N Ile-Val-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IPFKIGNDTUOFAF-CYDGBPFRSA-N 0.000 description 1
- 108091029795 Intergenic region Proteins 0.000 description 1
- 239000007836 KH2PO4 Substances 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- 101000768930 Lactococcus lactis subsp. cremoris Uncharacterized protein in pepC 5'region Proteins 0.000 description 1
- 101000976301 Leptospira interrogans Uncharacterized 35 kDa protein in sph 3'region Proteins 0.000 description 1
- 101000976302 Leptospira interrogans Uncharacterized protein in sph 3'region Proteins 0.000 description 1
- 101000778886 Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai (strain 56601) Uncharacterized protein LA_2151 Proteins 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 1
- HASRFYOMVPJRPU-SRVKXCTJSA-N Leu-Arg-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HASRFYOMVPJRPU-SRVKXCTJSA-N 0.000 description 1
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 1
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 1
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 1
- ZDSNOSQHMJBRQN-SRVKXCTJSA-N Leu-Asp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZDSNOSQHMJBRQN-SRVKXCTJSA-N 0.000 description 1
- MMEDVBWCMGRKKC-GARJFASQSA-N Leu-Asp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N MMEDVBWCMGRKKC-GARJFASQSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- HUEBCHPSXSQUGN-GARJFASQSA-N Leu-Cys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N HUEBCHPSXSQUGN-GARJFASQSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- QJUWBDPGGYVRHY-YUMQZZPRSA-N Leu-Gly-Cys Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N QJUWBDPGGYVRHY-YUMQZZPRSA-N 0.000 description 1
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 1
- YWYQSLOTVIRCFE-SRVKXCTJSA-N Leu-His-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O YWYQSLOTVIRCFE-SRVKXCTJSA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 1
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- OVZLLFONXILPDZ-VOAKCMCISA-N Leu-Lys-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OVZLLFONXILPDZ-VOAKCMCISA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- POMXSEDNUXYPGK-IHRRRGAJSA-N Leu-Met-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N POMXSEDNUXYPGK-IHRRRGAJSA-N 0.000 description 1
- JVTYXRRFZCEPPK-RHYQMDGZSA-N Leu-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(C)C)N)O JVTYXRRFZCEPPK-RHYQMDGZSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 1
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 1
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- XWEVVRRSIOBJOO-SRVKXCTJSA-N Leu-Pro-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O XWEVVRRSIOBJOO-SRVKXCTJSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- YUTNOGOMBNYPFH-XUXIUFHCSA-N Leu-Pro-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YUTNOGOMBNYPFH-XUXIUFHCSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 1
- FGZVGOAAROXFAB-IXOXFDKPSA-N Leu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N)O FGZVGOAAROXFAB-IXOXFDKPSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 1
- BGGTYDNTOYRTTR-MEYUZBJRSA-N Leu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(C)C)N)O BGGTYDNTOYRTTR-MEYUZBJRSA-N 0.000 description 1
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 1
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 1
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 1
- LMDVGHQPPPLYAR-IHRRRGAJSA-N Leu-Val-His Chemical compound N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O LMDVGHQPPPLYAR-IHRRRGAJSA-N 0.000 description 1
- 101000977779 Lymantria dispar multicapsid nuclear polyhedrosis virus Uncharacterized 33.9 kDa protein in PE 3'region Proteins 0.000 description 1
- 101000977786 Lymantria dispar multicapsid nuclear polyhedrosis virus Uncharacterized 9.7 kDa protein in PE 3'region Proteins 0.000 description 1
- VHFFQUSNFFIZBT-CIUDSAMLSA-N Lys-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N VHFFQUSNFFIZBT-CIUDSAMLSA-N 0.000 description 1
- UWKNTTJNVSYXPC-CIUDSAMLSA-N Lys-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN UWKNTTJNVSYXPC-CIUDSAMLSA-N 0.000 description 1
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 1
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 1
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- ABHIXYDMILIUKV-CIUDSAMLSA-N Lys-Asn-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ABHIXYDMILIUKV-CIUDSAMLSA-N 0.000 description 1
- PXHCFKXNSBJSTQ-KKUMJFAQSA-N Lys-Asn-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)O PXHCFKXNSBJSTQ-KKUMJFAQSA-N 0.000 description 1
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 1
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 1
- ZAWOJFFMBANLGE-CIUDSAMLSA-N Lys-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N ZAWOJFFMBANLGE-CIUDSAMLSA-N 0.000 description 1
- VSRXPEHZMHSFKU-IUCAKERBSA-N Lys-Gln-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VSRXPEHZMHSFKU-IUCAKERBSA-N 0.000 description 1
- PGBPWPTUOSCNLE-JYJNAYRXSA-N Lys-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N PGBPWPTUOSCNLE-JYJNAYRXSA-N 0.000 description 1
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 1
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 1
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 1
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 1
- PBLLTSKBTAHDNA-KBPBESRZSA-N Lys-Gly-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PBLLTSKBTAHDNA-KBPBESRZSA-N 0.000 description 1
- HAUUXTXKJNVIFY-ONGXEEELSA-N Lys-Gly-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAUUXTXKJNVIFY-ONGXEEELSA-N 0.000 description 1
- ZMMDPRTXLAEMOD-BZSNNMDCSA-N Lys-His-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZMMDPRTXLAEMOD-BZSNNMDCSA-N 0.000 description 1
- OIYWBDBHEGAVST-BZSNNMDCSA-N Lys-His-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OIYWBDBHEGAVST-BZSNNMDCSA-N 0.000 description 1
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 1
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 1
- YWJQHDDBFAXNIR-MXAVVETBSA-N Lys-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N YWJQHDDBFAXNIR-MXAVVETBSA-N 0.000 description 1
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 1
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 1
- ZCWWVXAXWUAEPZ-SRVKXCTJSA-N Lys-Met-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZCWWVXAXWUAEPZ-SRVKXCTJSA-N 0.000 description 1
- DAHQKYYIXPBESV-UWVGGRQHSA-N Lys-Met-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O DAHQKYYIXPBESV-UWVGGRQHSA-N 0.000 description 1
- JPYPRVHMKRFTAT-KKUMJFAQSA-N Lys-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N JPYPRVHMKRFTAT-KKUMJFAQSA-N 0.000 description 1
- IPTUBUUIFRZMJK-ACRUOGEOSA-N Lys-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 IPTUBUUIFRZMJK-ACRUOGEOSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- WQDKIVRHTQYJSN-DCAQKATOSA-N Lys-Ser-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N WQDKIVRHTQYJSN-DCAQKATOSA-N 0.000 description 1
- DNWBUCHHMRQWCZ-GUBZILKMSA-N Lys-Ser-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DNWBUCHHMRQWCZ-GUBZILKMSA-N 0.000 description 1
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 1
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 1
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 1
- 108091027974 Mature messenger RNA Proteins 0.000 description 1
- GAELMDJMQDUDLJ-BQBZGAKWSA-N Met-Ala-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O GAELMDJMQDUDLJ-BQBZGAKWSA-N 0.000 description 1
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 1
- ULNXMMYXQKGNPG-LPEHRKFASA-N Met-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N ULNXMMYXQKGNPG-LPEHRKFASA-N 0.000 description 1
- XMMWDTUFTZMQFD-GMOBBJLQSA-N Met-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCSC XMMWDTUFTZMQFD-GMOBBJLQSA-N 0.000 description 1
- DNDVVILEHVMWIS-LPEHRKFASA-N Met-Asp-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DNDVVILEHVMWIS-LPEHRKFASA-N 0.000 description 1
- NCVJJAJVWILAGI-SRVKXCTJSA-N Met-Gln-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N NCVJJAJVWILAGI-SRVKXCTJSA-N 0.000 description 1
- KLFPZIUIXZNEKY-DCAQKATOSA-N Met-Gln-Met Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O KLFPZIUIXZNEKY-DCAQKATOSA-N 0.000 description 1
- HHCOOFPGNXKFGR-HJGDQZAQSA-N Met-Gln-Thr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HHCOOFPGNXKFGR-HJGDQZAQSA-N 0.000 description 1
- PQPMMGQTRQFSDA-SRVKXCTJSA-N Met-Glu-His Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O PQPMMGQTRQFSDA-SRVKXCTJSA-N 0.000 description 1
- SLQDSYZHHOKQSR-QXEWZRGKSA-N Met-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCSC SLQDSYZHHOKQSR-QXEWZRGKSA-N 0.000 description 1
- LQMHZERGCQJKAH-STQMWFEESA-N Met-Gly-Phe Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LQMHZERGCQJKAH-STQMWFEESA-N 0.000 description 1
- LRALLISKBZNSKN-BQBZGAKWSA-N Met-Gly-Ser Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LRALLISKBZNSKN-BQBZGAKWSA-N 0.000 description 1
- GETCJHFFECHWHI-QXEWZRGKSA-N Met-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCSC)N GETCJHFFECHWHI-QXEWZRGKSA-N 0.000 description 1
- ODFBIJXEWPWSAN-CYDGBPFRSA-N Met-Ile-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O ODFBIJXEWPWSAN-CYDGBPFRSA-N 0.000 description 1
- UROWNMBTQGGTHB-DCAQKATOSA-N Met-Leu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UROWNMBTQGGTHB-DCAQKATOSA-N 0.000 description 1
- HGAJNEWOUHDUMZ-SRVKXCTJSA-N Met-Leu-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O HGAJNEWOUHDUMZ-SRVKXCTJSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- WPTHAGXMYDRPFD-SRVKXCTJSA-N Met-Lys-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O WPTHAGXMYDRPFD-SRVKXCTJSA-N 0.000 description 1
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 1
- LLKWSEXLNFBKIF-CYDGBPFRSA-N Met-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCSC LLKWSEXLNFBKIF-CYDGBPFRSA-N 0.000 description 1
- CRVSHEPROQHVQT-AVGNSLFASA-N Met-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N CRVSHEPROQHVQT-AVGNSLFASA-N 0.000 description 1
- XIGAHPDZLAYQOS-SRVKXCTJSA-N Met-Pro-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 XIGAHPDZLAYQOS-SRVKXCTJSA-N 0.000 description 1
- WRXOPYNEKGZWAZ-FXQIFTODSA-N Met-Ser-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O WRXOPYNEKGZWAZ-FXQIFTODSA-N 0.000 description 1
- LHXFNWBNRBWMNV-DCAQKATOSA-N Met-Ser-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LHXFNWBNRBWMNV-DCAQKATOSA-N 0.000 description 1
- CIIJWIAORKTXAH-FJXKBIBVSA-N Met-Thr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O CIIJWIAORKTXAH-FJXKBIBVSA-N 0.000 description 1
- WXJLBSXNUHIGSS-OSUNSFLBSA-N Met-Thr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WXJLBSXNUHIGSS-OSUNSFLBSA-N 0.000 description 1
- QQPMHUCGDRJFQK-RHYQMDGZSA-N Met-Thr-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QQPMHUCGDRJFQK-RHYQMDGZSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241000202974 Methanobacterium Species 0.000 description 1
- 101100301239 Myxococcus xanthus recA1 gene Proteins 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 229910004616 Na2MoO4.2H2O Inorganic materials 0.000 description 1
- 101000827630 Narcissus mosaic virus Uncharacterized 10 kDa protein Proteins 0.000 description 1
- 101000658690 Neisseria meningitidis serogroup B Transposase for insertion sequence element IS1106 Proteins 0.000 description 1
- PVNIIMVLHYAWGP-UHFFFAOYSA-N Niacin Chemical compound OC(=O)C1=CC=CN=C1 PVNIIMVLHYAWGP-UHFFFAOYSA-N 0.000 description 1
- 108091092724 Noncoding DNA Proteins 0.000 description 1
- 102000004020 Oxygenases Human genes 0.000 description 1
- 108090000417 Oxygenases Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- CYZBFPYMSJGBRL-DRZSPHRISA-N Phe-Ala-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CYZBFPYMSJGBRL-DRZSPHRISA-N 0.000 description 1
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 1
- MPGJIHFJCXTVEX-KKUMJFAQSA-N Phe-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O MPGJIHFJCXTVEX-KKUMJFAQSA-N 0.000 description 1
- GNUCSNWOCQFMMC-UFYCRDLUSA-N Phe-Arg-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 GNUCSNWOCQFMMC-UFYCRDLUSA-N 0.000 description 1
- HTKNPQZCMLBOTQ-XVSYOHENSA-N Phe-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O HTKNPQZCMLBOTQ-XVSYOHENSA-N 0.000 description 1
- JOXIIFVCSATTDH-IHPCNDPISA-N Phe-Asn-Trp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JOXIIFVCSATTDH-IHPCNDPISA-N 0.000 description 1
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 1
- PBXYXOAEQQUVMM-ULQDDVLXSA-N Phe-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=CC=C2)N PBXYXOAEQQUVMM-ULQDDVLXSA-N 0.000 description 1
- KBVJZCVLQWCJQN-KKUMJFAQSA-N Phe-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KBVJZCVLQWCJQN-KKUMJFAQSA-N 0.000 description 1
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 1
- BNRFQGLWLQESBG-YESZJQIVSA-N Phe-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O BNRFQGLWLQESBG-YESZJQIVSA-N 0.000 description 1
- RTUWVJVJSMOGPL-KKUMJFAQSA-N Phe-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RTUWVJVJSMOGPL-KKUMJFAQSA-N 0.000 description 1
- UXQFHEKRGHYJRA-STQMWFEESA-N Phe-Met-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O UXQFHEKRGHYJRA-STQMWFEESA-N 0.000 description 1
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 1
- GPLWGAYGROGDEN-BZSNNMDCSA-N Phe-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GPLWGAYGROGDEN-BZSNNMDCSA-N 0.000 description 1
- CKJACGQPCPMWIT-UFYCRDLUSA-N Phe-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CKJACGQPCPMWIT-UFYCRDLUSA-N 0.000 description 1
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 description 1
- ODGNUUUDJONJSC-UFYCRDLUSA-N Phe-Pro-Tyr Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O ODGNUUUDJONJSC-UFYCRDLUSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- MVIJMIZJPHQGEN-IHRRRGAJSA-N Phe-Ser-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@H](CO)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 MVIJMIZJPHQGEN-IHRRRGAJSA-N 0.000 description 1
- FGWUALWGCZJQDJ-URLPEUOOSA-N Phe-Thr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGWUALWGCZJQDJ-URLPEUOOSA-N 0.000 description 1
- NWVMQNAELALJFW-RNXOBYDBSA-N Phe-Trp-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NWVMQNAELALJFW-RNXOBYDBSA-N 0.000 description 1
- 108010064851 Plant Proteins Proteins 0.000 description 1
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 1
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 1
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 1
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 1
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 1
- OOLOTUZJUBOMAX-GUBZILKMSA-N Pro-Ala-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O OOLOTUZJUBOMAX-GUBZILKMSA-N 0.000 description 1
- XROLYVMNVIKVEM-BQBZGAKWSA-N Pro-Asn-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O XROLYVMNVIKVEM-BQBZGAKWSA-N 0.000 description 1
- HQVPQXMCQKXARZ-FXQIFTODSA-N Pro-Cys-Ser Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O HQVPQXMCQKXARZ-FXQIFTODSA-N 0.000 description 1
- SNIPWBQKOPCJRG-CIUDSAMLSA-N Pro-Gln-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O SNIPWBQKOPCJRG-CIUDSAMLSA-N 0.000 description 1
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 1
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 1
- VPEVBAUSTBWQHN-NHCYSSNCSA-N Pro-Glu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O VPEVBAUSTBWQHN-NHCYSSNCSA-N 0.000 description 1
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 1
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 1
- WSRWHZRUOCACLJ-UWVGGRQHSA-N Pro-Gly-His Chemical compound C([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H]1NCCC1)C1=CN=CN1 WSRWHZRUOCACLJ-UWVGGRQHSA-N 0.000 description 1
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 1
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 1
- FDINZVJXLPILKV-DCAQKATOSA-N Pro-His-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O FDINZVJXLPILKV-DCAQKATOSA-N 0.000 description 1
- GBRUQFBAJOKCTF-DCAQKATOSA-N Pro-His-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O GBRUQFBAJOKCTF-DCAQKATOSA-N 0.000 description 1
- XQHGISDMVBTGAL-ULQDDVLXSA-N Pro-His-Phe Chemical compound C([C@@H](C(=O)[O-])NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H]1[NH2+]CCC1)C1=CC=CC=C1 XQHGISDMVBTGAL-ULQDDVLXSA-N 0.000 description 1
- LXLFEIHKWGHJJB-XUXIUFHCSA-N Pro-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 LXLFEIHKWGHJJB-XUXIUFHCSA-N 0.000 description 1
- FKVNLUZHSFCNGY-RVMXOQNASA-N Pro-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 FKVNLUZHSFCNGY-RVMXOQNASA-N 0.000 description 1
- UREQLMJCKFLLHM-NAKRPEOUSA-N Pro-Ile-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UREQLMJCKFLLHM-NAKRPEOUSA-N 0.000 description 1
- YXHYJEPDKSYPSQ-AVGNSLFASA-N Pro-Leu-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 YXHYJEPDKSYPSQ-AVGNSLFASA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 1
- ZLXKLMHAMDENIO-DCAQKATOSA-N Pro-Lys-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLXKLMHAMDENIO-DCAQKATOSA-N 0.000 description 1
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 1
- WFIVLLFYUZZWOD-RHYQMDGZSA-N Pro-Lys-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WFIVLLFYUZZWOD-RHYQMDGZSA-N 0.000 description 1
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 1
- XZBYTHCRAVAXQQ-DCAQKATOSA-N Pro-Met-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O XZBYTHCRAVAXQQ-DCAQKATOSA-N 0.000 description 1
- ANESFYPBAJPYNJ-SDDRHHMPSA-N Pro-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ANESFYPBAJPYNJ-SDDRHHMPSA-N 0.000 description 1
- AJBQTGZIZQXBLT-STQMWFEESA-N Pro-Phe-Gly Chemical compound C([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 AJBQTGZIZQXBLT-STQMWFEESA-N 0.000 description 1
- MHBSUKYVBZVQRW-HJWJTTGWSA-N Pro-Phe-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MHBSUKYVBZVQRW-HJWJTTGWSA-N 0.000 description 1
- SPLBRAKYXGOFSO-UNQGMJICSA-N Pro-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@@H]2CCCN2)O SPLBRAKYXGOFSO-UNQGMJICSA-N 0.000 description 1
- DWPXHLIBFQLKLK-CYDGBPFRSA-N Pro-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 DWPXHLIBFQLKLK-CYDGBPFRSA-N 0.000 description 1
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 1
- VVAWNPIOYXAMAL-KJEVXHAQSA-N Pro-Thr-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VVAWNPIOYXAMAL-KJEVXHAQSA-N 0.000 description 1
- FYXCBXDAMPEHIQ-FHWLQOOXSA-N Pro-Trp-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CCCCN)C(=O)O FYXCBXDAMPEHIQ-FHWLQOOXSA-N 0.000 description 1
- LEBTWGWVUVJNTA-FKBYEOEOSA-N Pro-Trp-Phe Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC4=CC=CC=C4)C(=O)O LEBTWGWVUVJNTA-FKBYEOEOSA-N 0.000 description 1
- SNSYSBUTTJBPDG-OKZBNKHCSA-N Pro-Trp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N4CCC[C@@H]4C(=O)O SNSYSBUTTJBPDG-OKZBNKHCSA-N 0.000 description 1
- FZXSYIPVAFVYBH-KKUMJFAQSA-N Pro-Tyr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O FZXSYIPVAFVYBH-KKUMJFAQSA-N 0.000 description 1
- AWJGUZSYVIVZGP-YUMQZZPRSA-N Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 AWJGUZSYVIVZGP-YUMQZZPRSA-N 0.000 description 1
- OOZJHTXCLJUODH-QXEWZRGKSA-N Pro-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 OOZJHTXCLJUODH-QXEWZRGKSA-N 0.000 description 1
- STGVYUTZKGPRCI-GUBZILKMSA-N Pro-Val-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 STGVYUTZKGPRCI-GUBZILKMSA-N 0.000 description 1
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 1
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 1
- 102000001253 Protein Kinase Human genes 0.000 description 1
- 241000589776 Pseudomonas putida Species 0.000 description 1
- 101000748660 Pseudomonas savastanoi Uncharacterized 21 kDa protein in iaaL 5'region Proteins 0.000 description 1
- LCTONWCANYUPML-UHFFFAOYSA-M Pyruvate Chemical compound CC(=O)C([O-])=O LCTONWCANYUPML-UHFFFAOYSA-M 0.000 description 1
- 108010052388 RGES peptide Proteins 0.000 description 1
- 108020004518 RNA Probes Proteins 0.000 description 1
- 239000003391 RNA probe Substances 0.000 description 1
- 230000006819 RNA synthesis Effects 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 101000584469 Rice tungro bacilliform virus (isolate Philippines) Protein P1 Proteins 0.000 description 1
- 101001121571 Rice tungro bacilliform virus (isolate Philippines) Protein P2 Proteins 0.000 description 1
- 101001113905 Rice tungro bacilliform virus (isolate Philippines) Protein P4 Proteins 0.000 description 1
- 241000235070 Saccharomyces Species 0.000 description 1
- 102100037204 Sal-like protein 1 Human genes 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- WTWGOQRNRFHFQD-JBDRJPRFSA-N Ser-Ala-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WTWGOQRNRFHFQD-JBDRJPRFSA-N 0.000 description 1
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 1
- HBZBPFLJNDXRAY-FXQIFTODSA-N Ser-Ala-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O HBZBPFLJNDXRAY-FXQIFTODSA-N 0.000 description 1
- RZEQTVHJZCIUBT-WDSKDSINSA-N Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N RZEQTVHJZCIUBT-WDSKDSINSA-N 0.000 description 1
- NLQUOHDCLSFABG-GUBZILKMSA-N Ser-Arg-Arg Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NLQUOHDCLSFABG-GUBZILKMSA-N 0.000 description 1
- QWZIOCFPXMAXET-CIUDSAMLSA-N Ser-Arg-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QWZIOCFPXMAXET-CIUDSAMLSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- UGJRQLURDVGULT-LKXGYXEUSA-N Ser-Asn-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UGJRQLURDVGULT-LKXGYXEUSA-N 0.000 description 1
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 1
- KNZQGAUEYZJUSQ-ZLUOBGJFSA-N Ser-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N KNZQGAUEYZJUSQ-ZLUOBGJFSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- HVKMTOIAYDOJPL-NRPADANISA-N Ser-Gln-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVKMTOIAYDOJPL-NRPADANISA-N 0.000 description 1
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 1
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 1
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 1
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 1
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 1
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 1
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 1
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 1
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 1
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 1
- JAWGSPUJAXYXJA-IHRRRGAJSA-N Ser-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=CC=C1 JAWGSPUJAXYXJA-IHRRRGAJSA-N 0.000 description 1
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- STIAINRLUUKYKM-WFBYXXMGSA-N Ser-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CO)=CNC2=C1 STIAINRLUUKYKM-WFBYXXMGSA-N 0.000 description 1
- AXKJPUBALUNJEO-UBHSHLNASA-N Ser-Trp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O AXKJPUBALUNJEO-UBHSHLNASA-N 0.000 description 1
- SDFUZKIAHWRUCS-QEJZJMRPSA-N Ser-Trp-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N SDFUZKIAHWRUCS-QEJZJMRPSA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 1
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 1
- HNDMFDBQXYZSRM-IHRRRGAJSA-N Ser-Val-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HNDMFDBQXYZSRM-IHRRRGAJSA-N 0.000 description 1
- SIEBDTCABMZCLF-XGEHTFHBSA-N Ser-Val-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SIEBDTCABMZCLF-XGEHTFHBSA-N 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 101000818096 Spirochaeta aurantia Uncharacterized 15.5 kDa protein in trpE 3'region Proteins 0.000 description 1
- 101000818098 Spirochaeta aurantia Uncharacterized protein in trpE 3'region Proteins 0.000 description 1
- 101000766081 Streptomyces ambofaciens Uncharacterized HTH-type transcriptional regulator in unstable DNA locus Proteins 0.000 description 1
- 101001026590 Streptomyces cinnamonensis Putative polyketide beta-ketoacyl synthase 2 Proteins 0.000 description 1
- 101000804403 Synechococcus elongatus (strain PCC 7942 / FACHB-805) Uncharacterized HIT-like protein Synpcc7942_1390 Proteins 0.000 description 1
- 101000750910 Synechococcus elongatus (strain PCC 7942 / FACHB-805) Uncharacterized HTH-type transcriptional regulator Synpcc7942_2319 Proteins 0.000 description 1
- 101000750896 Synechococcus elongatus (strain PCC 7942 / FACHB-805) Uncharacterized protein Synpcc7942_2318 Proteins 0.000 description 1
- 101000644897 Synechococcus sp. (strain ATCC 27264 / PCC 7002 / PR-6) Uncharacterized protein SYNPCC7002_B0001 Proteins 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- NFMPFBCXABPALN-OWLDWWDNSA-N Thr-Ala-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O NFMPFBCXABPALN-OWLDWWDNSA-N 0.000 description 1
- GZYNMZQXFRWDFH-YTWAJWBKSA-N Thr-Arg-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O GZYNMZQXFRWDFH-YTWAJWBKSA-N 0.000 description 1
- JNQZPAWOPBZGIX-RCWTZXSCSA-N Thr-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N JNQZPAWOPBZGIX-RCWTZXSCSA-N 0.000 description 1
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 1
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 1
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 1
- JXKMXEBNZCKSDY-JIOCBJNQSA-N Thr-Asp-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O JXKMXEBNZCKSDY-JIOCBJNQSA-N 0.000 description 1
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- ZUUDNCOCILSYAM-KKHAAJSZSA-N Thr-Asp-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZUUDNCOCILSYAM-KKHAAJSZSA-N 0.000 description 1
- MMTOHPRBJKEZHT-BWBBJGPYSA-N Thr-Cys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O MMTOHPRBJKEZHT-BWBBJGPYSA-N 0.000 description 1
- DSLHSTIUAPKERR-XGEHTFHBSA-N Thr-Cys-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O DSLHSTIUAPKERR-XGEHTFHBSA-N 0.000 description 1
- WLDUCKSCDRIVLJ-NUMRIWBASA-N Thr-Gln-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O WLDUCKSCDRIVLJ-NUMRIWBASA-N 0.000 description 1
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- KBLYJPQSNGTDIU-LOKLDPHHSA-N Thr-Glu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O KBLYJPQSNGTDIU-LOKLDPHHSA-N 0.000 description 1
- SLUWOCTZVGMURC-BFHQHQDPSA-N Thr-Gly-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O SLUWOCTZVGMURC-BFHQHQDPSA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- UBDDORVPVLEECX-FJXKBIBVSA-N Thr-Gly-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O UBDDORVPVLEECX-FJXKBIBVSA-N 0.000 description 1
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 1
- NQVDGKYAUHTCME-QTKMDUPCSA-N Thr-His-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O NQVDGKYAUHTCME-QTKMDUPCSA-N 0.000 description 1
- XSTGOZBBXFKGHA-YJRXYDGGSA-N Thr-His-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O XSTGOZBBXFKGHA-YJRXYDGGSA-N 0.000 description 1
- JRAUIKJSEAKTGD-TUBUOCAGSA-N Thr-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N JRAUIKJSEAKTGD-TUBUOCAGSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 1
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 1
- QHUWWSQZTFLXPQ-FJXKBIBVSA-N Thr-Met-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QHUWWSQZTFLXPQ-FJXKBIBVSA-N 0.000 description 1
- GUHLYMZJVXUIPO-RCWTZXSCSA-N Thr-Met-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O GUHLYMZJVXUIPO-RCWTZXSCSA-N 0.000 description 1
- KZURUCDWKDEAFZ-XVSYOHENSA-N Thr-Phe-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O KZURUCDWKDEAFZ-XVSYOHENSA-N 0.000 description 1
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 1
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 1
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 1
- BEZTUFWTPVOROW-KJEVXHAQSA-N Thr-Tyr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O BEZTUFWTPVOROW-KJEVXHAQSA-N 0.000 description 1
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 1
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 1
- KPMIQCXJDVKWKO-IFFSRLJSSA-N Thr-Val-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KPMIQCXJDVKWKO-IFFSRLJSSA-N 0.000 description 1
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- BRBCKMMXKONBAA-KWBADKCTSA-N Trp-Ala-Ala Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 BRBCKMMXKONBAA-KWBADKCTSA-N 0.000 description 1
- MJBBMTOGSOSAKJ-HJXMPXNTSA-N Trp-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MJBBMTOGSOSAKJ-HJXMPXNTSA-N 0.000 description 1
- FOAJSVIXYCLTSC-PJODQICGSA-N Trp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N FOAJSVIXYCLTSC-PJODQICGSA-N 0.000 description 1
- VZBWRZGNEPBRDE-HZUKXOBISA-N Trp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N VZBWRZGNEPBRDE-HZUKXOBISA-N 0.000 description 1
- IUFQHOCOKQIOMC-XIRDDKMYSA-N Trp-Asn-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N IUFQHOCOKQIOMC-XIRDDKMYSA-N 0.000 description 1
- PXQPYPMSLBQHJJ-WFBYXXMGSA-N Trp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N PXQPYPMSLBQHJJ-WFBYXXMGSA-N 0.000 description 1
- LHHDBONOFZDWMW-AAEUAGOBSA-N Trp-Asp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LHHDBONOFZDWMW-AAEUAGOBSA-N 0.000 description 1
- HQJOVVWAPQPYDS-ZFWWWQNUSA-N Trp-Gly-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O HQJOVVWAPQPYDS-ZFWWWQNUSA-N 0.000 description 1
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 1
- IQXWAJUIAQLZNX-IHPCNDPISA-N Trp-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N IQXWAJUIAQLZNX-IHPCNDPISA-N 0.000 description 1
- UPNRACRNHISCAF-SZMVWBNQSA-N Trp-Lys-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 UPNRACRNHISCAF-SZMVWBNQSA-N 0.000 description 1
- NLWCSMOXNKBRLC-WDSOQIARSA-N Trp-Lys-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLWCSMOXNKBRLC-WDSOQIARSA-N 0.000 description 1
- VUMCLPHXCBIJJB-PMVMPFDFSA-N Trp-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC3=CNC4=CC=CC=C43)N VUMCLPHXCBIJJB-PMVMPFDFSA-N 0.000 description 1
- RNDWCRUOGGQDKN-UBHSHLNASA-N Trp-Ser-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RNDWCRUOGGQDKN-UBHSHLNASA-N 0.000 description 1
- HIZDHWHVOLUGOX-BPUTZDHNSA-N Trp-Ser-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O HIZDHWHVOLUGOX-BPUTZDHNSA-N 0.000 description 1
- JTMZSIRTZKLBOA-NWLDYVSISA-N Trp-Thr-Gln Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O JTMZSIRTZKLBOA-NWLDYVSISA-N 0.000 description 1
- DTPWXZXGFAHEKL-NWLDYVSISA-N Trp-Thr-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DTPWXZXGFAHEKL-NWLDYVSISA-N 0.000 description 1
- DVLHKUWLNKDINO-PMVMPFDFSA-N Trp-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DVLHKUWLNKDINO-PMVMPFDFSA-N 0.000 description 1
- MXKUGFHWYYKVDV-SZMVWBNQSA-N Trp-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(C)C)C(O)=O MXKUGFHWYYKVDV-SZMVWBNQSA-N 0.000 description 1
- VCXWRWYFJLXITF-AUTRQRHGSA-N Tyr-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 VCXWRWYFJLXITF-AUTRQRHGSA-N 0.000 description 1
- XLMDWQNAOKLKCP-XDTLVQLUSA-N Tyr-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N XLMDWQNAOKLKCP-XDTLVQLUSA-N 0.000 description 1
- AKFLVKKWVZMFOT-IHRRRGAJSA-N Tyr-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AKFLVKKWVZMFOT-IHRRRGAJSA-N 0.000 description 1
- ADBDQGBDNUTRDB-ULQDDVLXSA-N Tyr-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O ADBDQGBDNUTRDB-ULQDDVLXSA-N 0.000 description 1
- GFZQWWDXJVGEMW-ULQDDVLXSA-N Tyr-Arg-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GFZQWWDXJVGEMW-ULQDDVLXSA-N 0.000 description 1
- JBBYKPZAPOLCPK-JYJNAYRXSA-N Tyr-Arg-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O JBBYKPZAPOLCPK-JYJNAYRXSA-N 0.000 description 1
- MNMYOSZWCKYEDI-JRQIVUDYSA-N Tyr-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MNMYOSZWCKYEDI-JRQIVUDYSA-N 0.000 description 1
- CKHQKYHIZCRTAP-SOUVJXGZSA-N Tyr-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O CKHQKYHIZCRTAP-SOUVJXGZSA-N 0.000 description 1
- UNUZEBFXGWVAOP-DZKIICNBSA-N Tyr-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UNUZEBFXGWVAOP-DZKIICNBSA-N 0.000 description 1
- OLWFDNLLBWQWCP-STQMWFEESA-N Tyr-Gly-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O OLWFDNLLBWQWCP-STQMWFEESA-N 0.000 description 1
- KEANSLVUGJADPN-LKTVYLICSA-N Tyr-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N KEANSLVUGJADPN-LKTVYLICSA-N 0.000 description 1
- ADECJAKCRKPSOR-ULQDDVLXSA-N Tyr-His-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O ADECJAKCRKPSOR-ULQDDVLXSA-N 0.000 description 1
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 1
- GGXUDPQWAWRINY-XEGUGMAKSA-N Tyr-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 GGXUDPQWAWRINY-XEGUGMAKSA-N 0.000 description 1
- ARJASMXQBRNAGI-YESZJQIVSA-N Tyr-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N ARJASMXQBRNAGI-YESZJQIVSA-N 0.000 description 1
- BJCILVZEZRDIDR-PMVMPFDFSA-N Tyr-Leu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=C(O)C=C1 BJCILVZEZRDIDR-PMVMPFDFSA-N 0.000 description 1
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 1
- XYNFFTNEQDWZNY-ULQDDVLXSA-N Tyr-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N XYNFFTNEQDWZNY-ULQDDVLXSA-N 0.000 description 1
- SZEIFUXUTBBQFQ-STQMWFEESA-N Tyr-Pro-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SZEIFUXUTBBQFQ-STQMWFEESA-N 0.000 description 1
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 1
- IEWKKXZRJLTIOV-AVGNSLFASA-N Tyr-Ser-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O IEWKKXZRJLTIOV-AVGNSLFASA-N 0.000 description 1
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 1
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 1
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 1
- KSGKJSFPWSMJHK-JNPHEJMOSA-N Tyr-Tyr-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSGKJSFPWSMJHK-JNPHEJMOSA-N 0.000 description 1
- NWEGIYMHTZXVBP-JSGCOSHPSA-N Tyr-Val-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O NWEGIYMHTZXVBP-JSGCOSHPSA-N 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- FZSPNKUFROZBSG-ZKWXMUAHSA-N Val-Ala-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O FZSPNKUFROZBSG-ZKWXMUAHSA-N 0.000 description 1
- UUYCNAXCCDNULB-QXEWZRGKSA-N Val-Arg-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O UUYCNAXCCDNULB-QXEWZRGKSA-N 0.000 description 1
- KKHRWGYHBZORMQ-NHCYSSNCSA-N Val-Arg-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKHRWGYHBZORMQ-NHCYSSNCSA-N 0.000 description 1
- PFNZJEPSCBAVGX-CYDGBPFRSA-N Val-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C(C)C)N PFNZJEPSCBAVGX-CYDGBPFRSA-N 0.000 description 1
- DNOOLPROHJWCSQ-RCWTZXSCSA-N Val-Arg-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DNOOLPROHJWCSQ-RCWTZXSCSA-N 0.000 description 1
- LNYOXPDEIZJDEI-NHCYSSNCSA-N Val-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LNYOXPDEIZJDEI-NHCYSSNCSA-N 0.000 description 1
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 1
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- SCBITHMBEJNRHC-LSJOCFKGSA-N Val-Asp-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N SCBITHMBEJNRHC-LSJOCFKGSA-N 0.000 description 1
- FRUYSSRPJXNRRB-GUBZILKMSA-N Val-Cys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N FRUYSSRPJXNRRB-GUBZILKMSA-N 0.000 description 1
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 1
- HURRXSNHCCSJHA-AUTRQRHGSA-N Val-Gln-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HURRXSNHCCSJHA-AUTRQRHGSA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- BRPKEERLGYNCNC-NHCYSSNCSA-N Val-Glu-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N BRPKEERLGYNCNC-NHCYSSNCSA-N 0.000 description 1
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 1
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 1
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 1
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 1
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 1
- MDYSKHBSPXUOPV-JSGCOSHPSA-N Val-Gly-Phe Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N MDYSKHBSPXUOPV-JSGCOSHPSA-N 0.000 description 1
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 1
- XXROXFHCMVXETG-UWVGGRQHSA-N Val-Gly-Val Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXROXFHCMVXETG-UWVGGRQHSA-N 0.000 description 1
- WJVLTYSHNXRCLT-NHCYSSNCSA-N Val-His-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WJVLTYSHNXRCLT-NHCYSSNCSA-N 0.000 description 1
- HLBHFAWNMAQGNO-AVGNSLFASA-N Val-His-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCSC)C(=O)O)N HLBHFAWNMAQGNO-AVGNSLFASA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- JZWZACGUZVCQPS-RNJOBUHISA-N Val-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N JZWZACGUZVCQPS-RNJOBUHISA-N 0.000 description 1
- DJQIUOKSNRBTSV-CYDGBPFRSA-N Val-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N DJQIUOKSNRBTSV-CYDGBPFRSA-N 0.000 description 1
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 1
- BZOSBRIDWSSTFN-AVGNSLFASA-N Val-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](C(C)C)N BZOSBRIDWSSTFN-AVGNSLFASA-N 0.000 description 1
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 1
- RFKJNTRMXGCKFE-FHWLQOOXSA-N Val-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC(C)C)C(O)=O)=CNC2=C1 RFKJNTRMXGCKFE-FHWLQOOXSA-N 0.000 description 1
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 1
- KTEZUXISLQTDDQ-NHCYSSNCSA-N Val-Lys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KTEZUXISLQTDDQ-NHCYSSNCSA-N 0.000 description 1
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 1
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 1
- WSUWDIVCPOJFCX-TUAOUCFPSA-N Val-Met-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N WSUWDIVCPOJFCX-TUAOUCFPSA-N 0.000 description 1
- YDVDTCJGBBJGRT-GUBZILKMSA-N Val-Met-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N YDVDTCJGBBJGRT-GUBZILKMSA-N 0.000 description 1
- QPPZEDOTPZOSEC-RCWTZXSCSA-N Val-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N)O QPPZEDOTPZOSEC-RCWTZXSCSA-N 0.000 description 1
- MJFSRZZJQWZHFQ-SRVKXCTJSA-N Val-Met-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N MJFSRZZJQWZHFQ-SRVKXCTJSA-N 0.000 description 1
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 1
- JMCOXFSCTGKLLB-FKBYEOEOSA-N Val-Phe-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JMCOXFSCTGKLLB-FKBYEOEOSA-N 0.000 description 1
- YTNGABPUXFEOGU-SRVKXCTJSA-N Val-Pro-Arg Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O YTNGABPUXFEOGU-SRVKXCTJSA-N 0.000 description 1
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 1
- QZKVWWIUSQGWMY-IHRRRGAJSA-N Val-Ser-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QZKVWWIUSQGWMY-IHRRRGAJSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 1
- GTACFKZDQFTVAI-STECZYCISA-N Val-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 GTACFKZDQFTVAI-STECZYCISA-N 0.000 description 1
- RTJPAGFXOWEBAI-SRVKXCTJSA-N Val-Val-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RTJPAGFXOWEBAI-SRVKXCTJSA-N 0.000 description 1
- ZHWZDZFWBXWPDW-GUBZILKMSA-N Val-Val-Cys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(O)=O ZHWZDZFWBXWPDW-GUBZILKMSA-N 0.000 description 1
- 101000916321 Xenopus laevis Transposon TX1 uncharacterized 149 kDa protein Proteins 0.000 description 1
- 101000916336 Xenopus laevis Transposon TX1 uncharacterized 82 kDa protein Proteins 0.000 description 1
- 101001000760 Zea mays Putative Pol polyprotein from transposon element Bs1 Proteins 0.000 description 1
- 101000760088 Zymomonas mobilis subsp. mobilis (strain ATCC 10988 / DSM 424 / LMG 404 / NCIMB 8938 / NRRL B-806 / ZM1) 20.9 kDa protein Proteins 0.000 description 1
- 101000678262 Zymomonas mobilis subsp. mobilis (strain ATCC 10988 / DSM 424 / LMG 404 / NCIMB 8938 / NRRL B-806 / ZM1) 65 kDa protein Proteins 0.000 description 1
- LJSAJMXWXGSVNA-UHFFFAOYSA-N a805044 Chemical compound OC1=CC=C(O)C=C1.OC1=CC=C(O)C=C1 LJSAJMXWXGSVNA-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 229960005305 adenosine Drugs 0.000 description 1
- 108010087049 alanyl-alanyl-prolyl-valine Proteins 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 1
- 108010078114 alanyl-tryptophyl-alanine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 1
- 230000009604 anaerobic growth Effects 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 108010052670 arginyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010091818 arginyl-glycyl-aspartyl-valine Proteins 0.000 description 1
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 1
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 108010077245 asparaginyl-proline Proteins 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 230000003796 beauty Effects 0.000 description 1
- TUCIXUDAQRPDCG-UHFFFAOYSA-N benzene-1,2-diol Chemical compound OC1=CC=CC=C1O.OC1=CC=CC=C1O TUCIXUDAQRPDCG-UHFFFAOYSA-N 0.000 description 1
- VEVJTUNLALKRNO-TYHXJLICSA-N benzoyl-CoA Chemical compound O=C([C@H](O)C(C)(COP(O)(=O)OP(O)(=O)OC[C@@H]1[C@H]([C@@H](O)[C@@H](O1)N1C2=NC=NC(N)=C2N=C1)OP(O)(O)=O)C)NCCC(=O)NCCSC(=O)C1=CC=CC=C1 VEVJTUNLALKRNO-TYHXJLICSA-N 0.000 description 1
- 108010091694 benzoyl-coenzyme A reductase (dearomatizing) Proteins 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- KGBXLFKZBHKPEV-UHFFFAOYSA-N boric acid Chemical compound OB(O)O KGBXLFKZBHKPEV-UHFFFAOYSA-N 0.000 description 1
- 239000001110 calcium chloride Substances 0.000 description 1
- 229910001628 calcium chloride Inorganic materials 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 239000013599 cloning vector Substances 0.000 description 1
- 235000017471 coenzyme Q10 Nutrition 0.000 description 1
- ACTIUHUUMQJHFO-UPTCCGCDSA-N coenzyme Q10 Chemical compound COC1=C(OC)C(=O)C(C\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CC\C=C(/C)CCC=C(C)C)=C(C)C1=O ACTIUHUUMQJHFO-UPTCCGCDSA-N 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- MPTQRFCYZCXJFQ-UHFFFAOYSA-L copper(II) chloride dihydrate Chemical compound O.O.[Cl-].[Cl-].[Cu+2] MPTQRFCYZCXJFQ-UHFFFAOYSA-L 0.000 description 1
- 229960002104 cyanocobalamin Drugs 0.000 description 1
- 235000000639 cyanocobalamin Nutrition 0.000 description 1
- 239000011666 cyanocobalamin Substances 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 108010069495 cysteinyltyrosine Proteins 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 238000006114 decarboxylation reaction Methods 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 108010009297 diglycyl-histidine Proteins 0.000 description 1
- ZPWVASYFFYYZEW-UHFFFAOYSA-L dipotassium hydrogen phosphate Chemical compound [K+].[K+].OP([O-])([O-])=O ZPWVASYFFYYZEW-UHFFFAOYSA-L 0.000 description 1
- 229910000396 dipotassium phosphate Inorganic materials 0.000 description 1
- 108010054812 diprotin A Proteins 0.000 description 1
- GRWZHXKQBITJKP-UHFFFAOYSA-L dithionite(2-) Chemical compound [O-]S(=O)S([O-])=O GRWZHXKQBITJKP-UHFFFAOYSA-L 0.000 description 1
- 230000001214 effect on cellular process Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 239000006167 equilibration buffer Substances 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 108010085059 glutamyl-arginyl-proline Proteins 0.000 description 1
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 108010040856 glutamyl-cysteinyl-alanine Proteins 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 1
- 108010081985 glycyl-cystinyl-aspartic acid Proteins 0.000 description 1
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 1
- 108010028188 glycyl-histidyl-serine Proteins 0.000 description 1
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010083327 glycyl-prolyl-arginyl-valine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010010147 glycylglutamine Proteins 0.000 description 1
- 108010020688 glycylhistidine Proteins 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- XLYOFNOQVPJJNP-ZSJDYOACSA-N heavy water Substances [2H]O[2H] XLYOFNOQVPJJNP-ZSJDYOACSA-N 0.000 description 1
- 108010092114 histidylphenylalanine Proteins 0.000 description 1
- 108010085325 histidylproline Proteins 0.000 description 1
- 125000004356 hydroxy functional group Chemical group O* 0.000 description 1
- 230000033444 hydroxylation Effects 0.000 description 1
- 238000005805 hydroxylation reaction Methods 0.000 description 1
- 230000014726 immortalization of host cell Effects 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000007901 in situ hybridization Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 229910052500 inorganic mineral Inorganic materials 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- PGLTVOMIXTUURA-UHFFFAOYSA-N iodoacetamide Chemical compound NC(=O)CI PGLTVOMIXTUURA-UHFFFAOYSA-N 0.000 description 1
- WSSMOXHYUFMBLS-UHFFFAOYSA-L iron dichloride tetrahydrate Chemical compound O.O.O.O.[Cl-].[Cl-].[Fe+2] WSSMOXHYUFMBLS-UHFFFAOYSA-L 0.000 description 1
- 230000002427 irreversible effect Effects 0.000 description 1
- 238000001155 isoelectric focusing Methods 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- 101150109249 lacI gene Proteins 0.000 description 1
- 108010009932 leucyl-alanyl-glycyl-valine Proteins 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 1
- 238000007834 ligase chain reaction Methods 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 239000012139 lysis buffer Substances 0.000 description 1
- 108010043322 lysyl-tryptophyl-alpha-lysine Proteins 0.000 description 1
- 108010010679 lysyl-valyl-leucyl-aspartic acid Proteins 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108010038320 lysylphenylalanine Proteins 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 101150023497 mcrA gene Proteins 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 108010056582 methionylglutamic acid Proteins 0.000 description 1
- 108010005942 methionylglycine Proteins 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 235000013336 milk Nutrition 0.000 description 1
- 239000008267 milk Substances 0.000 description 1
- 210000004080 milk Anatomy 0.000 description 1
- 239000011707 mineral Substances 0.000 description 1
- 235000010755 mineral Nutrition 0.000 description 1
- 229910000402 monopotassium phosphate Inorganic materials 0.000 description 1
- 235000001968 nicotinic acid Nutrition 0.000 description 1
- 239000011664 nicotinic acid Substances 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 238000001216 nucleic acid method Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 229940031826 phenolate Drugs 0.000 description 1
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 1
- 108010072637 phenylalanyl-arginyl-phenylalanine Proteins 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010018625 phenylalanylarginine Proteins 0.000 description 1
- 108010012581 phenylalanylglutamate Proteins 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- LFGREXWGYUGZLY-UHFFFAOYSA-N phosphoryl Chemical group [P]=O LFGREXWGYUGZLY-UHFFFAOYSA-N 0.000 description 1
- 235000021118 plant-derived protein Nutrition 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- GNSKLFRGEWLPPA-UHFFFAOYSA-M potassium dihydrogen phosphate Chemical compound [K+].OP(O)([O-])=O GNSKLFRGEWLPPA-UHFFFAOYSA-M 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000019525 primary metabolic process Effects 0.000 description 1
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 1
- 108010020432 prolyl-prolylisoleucine Proteins 0.000 description 1
- 108010079317 prolyl-tyrosine Proteins 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010015796 prolylisoleucine Proteins 0.000 description 1
- 108060006633 protein kinase Proteins 0.000 description 1
- NPCOQXAVBJJZBQ-UHFFFAOYSA-N reduced coenzyme Q9 Natural products COC1=C(O)C(C)=C(CC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)C)C(O)=C1OC NPCOQXAVBJJZBQ-UHFFFAOYSA-N 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- JQXXHWHPUNPDRT-WLSIYKJHSA-N rifampicin Chemical compound O([C@](C1=O)(C)O/C=C/[C@@H]([C@H]([C@@H](OC(C)=O)[C@H](C)[C@H](O)[C@H](C)[C@@H](O)[C@@H](C)\C=C\C=C(C)/C(=O)NC=2C(O)=C3C([O-])=C4C)C)OC)C4=C1C3=C(O)C=2\C=N\N1CC[NH+](C)CC1 JQXXHWHPUNPDRT-WLSIYKJHSA-N 0.000 description 1
- 229960001225 rifampicin Drugs 0.000 description 1
- 239000012723 sample buffer Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- FDEIWTXVNPKYDL-UHFFFAOYSA-N sodium molybdate dihydrate Chemical compound O.O.[Na+].[Na+].[O-][Mo]([O-])(=O)=O FDEIWTXVNPKYDL-UHFFFAOYSA-N 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- DPJRMOMPQZCRJU-UHFFFAOYSA-M thiamine hydrochloride Chemical compound Cl.[Cl-].CC1=C(CCO)SC=[N+]1CC1=CN=C(C)N=C1N DPJRMOMPQZCRJU-UHFFFAOYSA-M 0.000 description 1
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 239000011573 trace mineral Substances 0.000 description 1
- 235000013619 trace mineral Nutrition 0.000 description 1
- 108010029384 tryptophyl-histidine Proteins 0.000 description 1
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 1
- 108010020532 tyrosyl-proline Proteins 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 229940035936 ubiquinone Drugs 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- 108010072644 valyl-alanyl-prolyl-glycine Proteins 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 108700026220 vif Genes Proteins 0.000 description 1
- 239000011782 vitamin Substances 0.000 description 1
- 229940088594 vitamin Drugs 0.000 description 1
- 235000013343 vitamin Nutrition 0.000 description 1
- 229930003231 vitamin Natural products 0.000 description 1
- 150000003722 vitamin derivatives Chemical class 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 108010000998 wheylin-2 peptide Proteins 0.000 description 1
- 239000011592 zinc chloride Substances 0.000 description 1
- JIAARYAFYJHUJI-UHFFFAOYSA-L zinc dichloride Chemical compound [Cl-].[Cl-].[Zn+2] JIAARYAFYJHUJI-UHFFFAOYSA-L 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
Definitions
- This invention is in the field of molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding phenol-induced proteins of the denitrifying bacterium Thauera aromatica.
- Phenolic compounds are basic chemicals of high interest to the chemical and pharmaceutical industries. Phenolic compounds are important plant constituents and phenol is formed from a variety of natural and synthetic substrates by the activity of microorganisms. The aerobic metabolism of phenol has been studied extensively; in all aerobic metabolic pathways oxygenases initiate the degradation of phenol by hydroxylation to catechol. Catechol can be oxygenolytically cleaved by dioxygenases, either by ortho- or meta-cleavage.
- Products are 4-hydroxybenzoate, 4-aminobenzoate, 4-hydroxy-3-methylbenzoate, gentisate (2,5-dihydroxybenzoate), and protocatechuate (3,4-dihydroxybenzoate) (Heider et al., Eur. J. Biochem. 243:577-596 (1997)).
- Consortia of fermenting bacteria convert phenol to benzoate and decarboxylate 4-hydroxybenzoate to phenol (Winter et al., Appl. Microbiol. Biotechnol. 25:384-391 (1987); He et al., Eur. J. Biochem. 229:77-82 (1995); He et al., J. Bacteriol.
- Five phenol-induced proteins from Thauera aromatica have been isolated. Three dominant phenol-induced proteins called F1, F2, and F3 were purified and sequenced in an attempt to purify the enzyme(s) that catalyze the ¹⁴CO₂:4-hydroxybenzoate isotope exchange reaction and the carboxylation of phenylphosphate. The N-terminal amino acid sequences of these proteins as well as the N-terminus of the phenol-induced proteins F4 and F5 were determined. Internal sequences of F2 were obtained by trypsin digest. All of these sequences have application in industrial processes that involve the use of phenol or its intermediates. The instant invention provides a means to manipulate phenol metabolism and to produce various phenol intermediates in recombinant microorganisms. The approach is based on the observation that anoxic growth with phenol and nitrate induces novel proteins that are lacking in cells grown with 4-hydroxybenzoate and nitrate.
- SEQ ID NO:1 is the deduced amino acid sequence of protein F1 and is coded by orf6.
- SEQ ID NO:2 is the nucleotide sequence of orf6 that codes for protein F1.
- SEQ ID NO:3 is the deduced amino acid sequence of protein F2 and is coded by orf4.
- SEQ ID NO:4 is the nucleotide sequence of orf4 that codes for protein F2.
- SEQ ID NO:5 is the deduced amino acid sequence of protein F3 and is coded by orf1.
- SEQ ID NO:6 is the nucleotide sequence of orf1 that codes for protein F3.
- SEQ ID NO:7 is the deduced amino acid sequence of protein F4 and is coded by orf5.
- SEQ ID NO:8 is the nucleotide sequence of orf5 that codes for protein F4.
- SEQ ID NO:9 is the deduced amino acid sequence of protein F5 and is coded by orf8.
- SEQ ID NO:10 is the nucleotide sequence of orf8 that codes for protein F5.
- SEQ ID NO:11 is the deduced amino acid sequence of orf2.
- SEQ ID NO:12 is the nucleotide sequence of orf2 that codes for an unknown protein.
- SEQ ID NO:13 is the deduced amino acid sequence of orf3.
- SEQ ID NO:14 is the nucleotide sequence of orf3 that codes for an unknown protein.
- SEQ ID NO:15 is the deduced amino acid sequence of orf7.
- SEQ ID NO:16 is the nucleotide sequence of orf7 that codes for an unknown protein.
- SEQ ID NO:17 is the deduced amino acid sequence of orf9.
- SEQ ID NO:18 is the nucleotide sequence of orf9 that codes for an unknown protein.
- SEQ ID NO:19 is the deduced amino acid sequence of orf10.
- SEQ ID NO:20 is the nucleotide sequence of orf10 that codes for an unknown protein.
- SEQ ID NO:21 is the deduced amino acid sequence of orf-1.
- SEQ ID NO:22 is the nucleotide sequence of orf-1 that codes for an unknown protein.
- SEQ ID NO:23 is the nucleotide sequence containing two gene clusters that are involved in phenol metabolism.
- SEQ ID NO:24 is the N-terminal amino acid sequence of F1 (experimentally determined).
- SEQ ID NO:25 is the N-terminal amino acid sequence of F1 (deduced from the genes).
- SEQ ID NO:26 is the N-terminal amino acid sequence of F2 (experimentally determined).
- SEQ ID NO:27 is the N-terminal amino acid sequence of F2 (deduced from the genes).
- SEQ ID NO:28 is the N-terminal amino acid sequence of F3 (experimentally determined).
- SEQ ID NO:29 is the N-terminal amino acid sequence of F3 (deduced from the genes).
- SEQ ID NO:30 is the amino acid sequence of an internal fragment of F2 that was obtained by trypsin-digest.
- SEQ ID NO:31 is the amino acid sequence of an internal fragment of F2 that was obtained by trypsin-digest.
- SEQ ID NO:32 is the primer of F2-forward (N-terminus).
- SEQ ID NO:33 is the primer of F2T6-reverse.
- SEQ ID NO:34 is the primer of F2T43-reverse.
- SEQ ID NO:35 is the primer T7.
- SEQ ID NO:36 is the primer T3.
- SEQ ID NO:37 is the primer designated breib31.
- SEQ ID NO:38 is the primer designated breib07r3.
- SEQ ID NO:39 is the primer of ⁇ 15-forward.
- SEQ ID NO:40 is the primer of ⁇ 15-reverse.
- SEQ ID NO:41 is the N-terminal amino acid sequence of F4 (experimentally determined).
- SEQ ID NO:42 is the N-terminal amino acid sequence of F4 (deduced from the genes).
- SEQ ID NO:43 is the N-terminal amino acid sequence of F5 (experimentally determined).
- SEQ ID NO:44 is the N-terminal amino acid sequence of F5 (deduced from the genes).
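- The correspondence between the five phenol-induced proteins, their open reading frames, and the sequence identifiers listed above can be summarized as a small lookup table. The Python sketch below is illustrative only: the names `SEQ_IDS` and `describe` are hypothetical and are not part of the disclosure; the values are taken from the descriptions of SEQ ID NO:1 through SEQ ID NO:10.

```python
# Lookup table built from the SEQ ID descriptions above.
# Keys are protein names; each value gives the ORF and the SEQ ID NOs of the
# deduced amino acid sequence and of the coding nucleotide sequence.
# The structure and names here are hypothetical, chosen for illustration.
SEQ_IDS = {
    "F1": {"orf": "orf6", "protein": 1, "nucleotide": 2},
    "F2": {"orf": "orf4", "protein": 3, "nucleotide": 4},
    "F3": {"orf": "orf1", "protein": 5, "nucleotide": 6},
    "F4": {"orf": "orf5", "protein": 7, "nucleotide": 8},
    "F5": {"orf": "orf8", "protein": 9, "nucleotide": 10},
}


def describe(protein: str) -> str:
    """Return a one-line summary of the ORF and SEQ ID NOs for a protein."""
    entry = SEQ_IDS[protein]
    return (
        f"{protein} is encoded by {entry['orf']} "
        f"(protein SEQ ID NO:{entry['protein']}, "
        f"nucleotide SEQ ID NO:{entry['nucleotide']})"
    )
```

For instance, `describe("F2")` reports orf4 together with SEQ ID NO:3 and SEQ ID NO:4.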
- FIG. 1 shows phenol metabolism in Thauera aromatica.
- The enzymes active in this pathway are phenylphosphate synthase (E1); phenylphosphate carboxylase (Mn²⁺, K⁺) (E2); 4-hydroxybenzoate-CoA ligase (E3); 4-hydroxybenzoyl-CoA reductase (Mo, FAD, Fe/S) (E4); and benzoyl-CoA reductase (Fe/S, FAD) (E5).
- FIG. 2 shows SDS-PAGE (12.5%) with fractions after chromatography of the soluble fraction of K172 (grown anaerobically on phenol) on DEAE sepharose fast flow. See Example 4.
- FIG. 3 shows clone 8 (pKSBam2.7). See Example 8.
- FIG. 4 shows clone 9 (pKSEco5.25). See Example 8.
- FIG. 5 shows clone 19 (pKSBam4). See Example 8.
- FIG. 6 shows clone 2 (pKSBam9).
- FIG. 7 shows clone 7 (pKSPst3.7). See Example 8.
- FIG. 8 shows phagemid-vector—clone 1 (pBK-CMV).
- FIG. 9 shows the expression of F1-F5 in E. coli. See Example 9.
- FIG. 10 shows the two-dimensional gel electrophoresis of the 100,000 × g supernatant of Thauera aromatica grown anaerobically on 4-hydroxybenzoate (A) and phenol (B), respectively. Phenol-induced proteins are indicated by triangles.
- FIG. 11 shows the organization of the genes possibly involved in anaerobic phenol metabolism of Thauera aromatica and their homologies to known proteins.
- FIG. 12 shows the map of the orientation of the clones in the whole sequence of 14272 bp.
- FIG. 13 shows the organization of the genes, with restriction sites, involved in phenol metabolism of Thauera aromatica.
- Applicants have succeeded in identifying the genes coding for phenol-induced proteins.
- Five phenol-induced proteins from Thauera aromatica have been isolated.
- Three dominant phenol-induced proteins called F1, F2, and F3 were purified and sequenced to obtain the enzyme(s) that catalyze the ¹⁴CO₂:4-hydroxybenzoate isotope exchange reaction and the carboxylation of phenylphosphate.
- The N-terminal amino acid sequences of these proteins as well as the N-terminus of the phenol-induced proteins F4 and F5 were determined. Internal sequences of F2 were obtained by trypsin digest. All of these sequences have utility in industrial processes.
- The instant invention provides a means to manipulate phenol metabolism and specifically the carboxylation of phenylphosphate.
- Transformation of host cells with at least one copy of the identified genes under the control of appropriate promoters will provide the ability to produce various intermediates in phenol metabolism.
- The approach is based on the observation that anoxic growth with phenol and nitrate induces novel proteins that are lacking in cells grown with 4-hydroxybenzoate and nitrate.
- “ORF” means open reading frame.
- “PCR” means polymerase chain reaction.
- “HPLC” means high performance liquid chromatography.
- “ca” means approximately.
- “dcw” means dry cell weight.
- “O.D.” means optical density at the designated wavelength.
- “IU” means International Units.
- “SCR” means sample channels ratio.
- F1 refers to the protein encoded by orf6.
- F2 refers to the protein encoded by orf4.
- F3 refers to the protein encoded by orf1.
- F4 refers to the protein encoded by orf5.
- F5 refers to the protein encoded by orf8.
- E1 refers to the phenol-phosphorylating enzyme, phenol kinase, or phenylphosphate synthase. The terms “phenol phosphorylating” and “phenol kinase” are used interchangeably by those skilled in the art.
- E2 refers to phenylphosphate carboxylase.
- “Isolated nucleic acid fragment” or “isolated nucleic acid molecule” refers to a polymer of mononucleotides (RNA or DNA) that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases.
- An isolated nucleic acid fragment or an isolated nucleic acid molecule in the form of a polymer of mononucleotides may be comprised of one or more segments of cDNA, genomic DNA, or synthetic DNA.
- “Host cell” and “host microorganism” refer to a cell capable of receiving foreign or heterologous genes and expressing those genes to produce an active gene product.
- “Suitable host cells” encompasses microorganisms such as bacteria and fungi, and also includes plant cells.
- “Fragment” refers to a DNA or amino acid sequence comprising a subsequence of the nucleic acid sequence or protein of the instant invention.
- An active fragment of the instant invention comprises a sufficient portion of the protein to maintain activity.
- “Gene cluster” refers to genes organized in a single expression unit or in close proximity to each other on the chromosome.
- “Substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology.
- “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotide bases that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate alteration of gene expression by antisense or co-suppression technology or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary sequences.
- A codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue (such as glycine) or a more hydrophobic residue (such as valine, leucine, or isoleucine).
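- The kind of codon substitution described here can be illustrated with a short Python sketch. It uses a deliberately partial table of the standard genetic code (alanine, glycine, and valine codons only, written in DNA letters); the names `CODON_TABLE` and `substitute_base` are hypothetical, and the choice of GCC as the starting codon is an arbitrary assumption made for illustration.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
}


def substitute_base(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + base + codon[position + 1:]


original = "GCC"                            # an alanine codon
to_gly = substitute_base(original, 1, "G")  # second base C -> G gives GGC
to_val = substitute_base(original, 1, "T")  # second base C -> T gives GTC

for codon in (original, to_gly, to_val):
    print(codon, CODON_TABLE[codon])
```

In this sketch a single change at the second codon position converts an alanine codon into a glycine or a valine codon. Whether such a change leaves the encoded protein functionally equivalent has to be established experimentally.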
- Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are at least 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are at least 95% identical to the DNA sequence of the nucleic acid fragments reported herein.
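- The identity thresholds above can be made concrete with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical columns over the full alignment length. This denominator is an assumption, since alignment programs differ in how they report percent identity. The function names `percent_identity` and `preference_tier` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same base in both sequences.

    Both inputs must already be aligned to the same length; '-' marks a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the tiers stated in the text."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below the stated thresholds"
```

For example, two aligned 10-base sequences that differ at one position are 90% identical and fall in the "more preferred" tier.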
- a nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength.
- Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual , Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization.
- low stringency hybridization conditions corresponding to a Tm of 55°
- Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40-45% formamide, with 5× or 6× SSC.
- Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.
- the appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art.
- The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51).
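The text cites Sambrook for the Tm equations without reproducing them. As a hedged sketch, the function below uses the empirical formula commonly quoted for DNA:DNA hybrids longer than about 100 nucleotides; the coefficients, the function name `estimate_tm`, and its parameters are assumptions introduced here, not values stated in this specification.

```python
import math


def estimate_tm(na_molar: float, gc_percent: float, length_nt: int,
                formamide_percent: float = 0.0) -> float:
    """Approximate Tm in degrees C for a long DNA:DNA hybrid.

    Assumed formula (commonly quoted form, not taken from this document):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C)
             - 0.63*(%formamide) - 600/length
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    if length_nt <= 0:
        raise ValueError("hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)
```

Under this formula, adding formamide or lowering the salt concentration reduces the estimated Tm, which is consistent with the way the text ties stringency to temperature and ionic strength.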
- the length for a hybridizable nucleic acid is at least about 10 nucleotides.
- a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides.
- the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
- a “substantial portion” refers to an amino acid or nucleotide sequence which comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford putative identification of that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 215:403-410 (1990); see also www.ncbi.nlm.nih.gov/BLAST/).
- a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene.
- gene-specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques).
- short oligonucleotides generally 12 bases or longer may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers.
- a “substantial portion” of a nucleotide sequence comprises enough of the sequence to afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence.
- the instant specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular plant proteins.
- the skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for the purpose known to those skilled in the art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.
- antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by nucleic acid fragments that do not share 100% identity with the gene to be suppressed.
- alterations in a gene that result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are well known in the art.
- a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
- changes which result in substitution of one negatively charged residue for another such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product.
- Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein.
- substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1× SSC, 0.1% SDS, 65° C.) or moderately stringent conditions, with the sequences exemplified herein.
- Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are 95% identical to the DNA sequence of the nucleic acid fragments reported herein.
- nucleotide bases that are capable of hybridizing to one another.
- adenosine is complementary to thymine and cytosine is complementary to guanine.
- the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
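The base-pairing rule above (A with T, C with G) is enough to derive the complementary strand of any DNA fragment. A minimal sketch follows; the function name `reverse_complement` is an assumption made for illustration, and the code handles only the four unambiguous DNA bases.

```python
# Translation table implementing A<->T and C<->G pairing (both letter cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    allowed = set("ACGTacgt")
    if not set(seq) <= allowed:
        raise ValueError("sequence may contain only A, C, G, and T")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATGC")` yields `"GCAT"`, the strand that would anneal to `ATGC` in antiparallel orientation.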
- identity is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences.
- identity also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences.
- Identity and similarity can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and Genome Projects (Smith, D.
- the BLASTX program is publicly available from NCBI and other sources ( BLAST Manual , Altschul et al., Natl. Cent. Biotechnol. Inf., Natl. Library Med.
- (NCBI NLM) NIH, Bethesda, Md. 20894; Altschul et al., J. Mol. Biol. 215:403-410 (1990); Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”, Nucleic Acids Res. 25:3389-3402 (1997)).
- the method to determine percent identity preferred in the instant invention is by the method of DNASTAR protein alignment protocol using the Jotun-Hein algorithm (Hein et al., Methods Enzymol. 183:626-645 (1990)).
- the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence.
- to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence.
- These mutations of the reference sequence may occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence.
- by a polypeptide having an amino acid sequence having at least 95% identity to a reference amino acid sequence, it is intended that the amino acid sequence of the polypeptide is identical to the reference sequence except that the polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the reference amino acid sequence.
- up to 5% of the amino acid residues in the reference sequence may be deleted or substituted with another amino acid, or a number of amino acids up to 5% of the total amino acid residues in the reference sequence may be inserted into the reference sequence.
- These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
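The "up to five alterations per 100 positions" reading of 95% identity can be sketched as follows. This is a simple column-counting illustration over two sequences that are assumed to be already aligned, with `-` marking gaps; it is not the DNASTAR Jotun-Hein protocol that the text names as the preferred method, and the function names are hypothetical.

```python
import math


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Inputs must be pre-aligned to equal length; '-' marks a gap. Columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def max_alterations(reference_length: int, min_identity_percent: float) -> int:
    """Largest number of altered positions compatible with the identity floor."""
    if reference_length <= 0:
        raise ValueError("reference length must be positive")
    return math.floor(reference_length * (100.0 - min_identity_percent) / 100.0)
```

With a 100-residue reference and a 95% floor, `max_alterations` returns 5, matching the "five per 100" wording in the text; the alterations may fall anywhere in the sequence, singly or in contiguous groups.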
- percent homology refers to the extent of amino acid sequence identity between polypeptides. When a first amino acid sequence is identical to a second amino acid sequence, then the first and second amino acid sequences exhibit 100% homology.
- the homology between any two polypeptides is a direct function of the total number of matching amino acids at a given position in either sequence, e.g., if half of the total number of amino acids in either of the two sequences are the same then the two sequences are said to exhibit 50% homology.
- Codon degeneracy refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence encoding the instant Thauera aromatica proteins as set forth in SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5.
- the skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell to use nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
- “Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments that are then enzymatically assembled to construct the entire gene. “Chemically synthesized”, as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determining preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
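The idea of surveying host genes to find preferred codons, then designing a synthetic gene around them, can be sketched as below. This is a simplified illustration under stated assumptions: it uses the standard genetic code, picks the single most frequent codon per amino acid, and ignores everything else that affects expression. The names `codon_usage`, `back_translate`, and `translate` are hypothetical, and the host sequence in the usage note is made up.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    protein = []
    dna = dna.upper()
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def codon_usage(coding_seqs):
    """Count how often each codon is used per amino acid in host coding sequences."""
    counts = defaultdict(Counter)
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            residue = CODON_TABLE.get(codon)
            if residue is not None and residue != "*":
                counts[residue][codon] += 1
    return counts


def back_translate(protein: str, usage) -> str:
    """Encode a protein using the host's most frequent codon for each residue."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no host codon observed for residue {residue!r}")
        codons.append(usage[residue].most_common(1)[0][0])
    return "".join(codons)
```

Given a made-up host gene `"ATGGCGGCGGCTAAA"`, the survey finds GCG used more often than GCT for alanine, so `back_translate("MAK", usage)` produces `"ATGGCGAAA"`, which still translates to `MAK`. Codon degeneracy is what makes this substitution silent at the protein level.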
- Gene refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence.
- “Native gene” refers to a gene as found in nature with its own regulatory sequences.
- “Chimeric gene” refers to any gene, not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature.
- Endogenous gene refers to a native gene in its natural location in the genome of an organism.
- a “foreign” gene refers to a gene not normally found in the host organism, but which is introduced into the host organism by gene transfer.
- Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes.
- a “transgene” is a gene that has been introduced into the genome by a transformation procedure.
- Coding sequence refers to a DNA sequence that codes for a specific amino acid sequence.
- Regulatory sequences refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
- Promoter refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA.
- a coding sequence is located 3′ to a promoter sequence.
- the promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers.
- An “enhancer” is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments.
- promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, ( Biochemistry of Plants 15:1-82 (1989)). It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
- the “translation leader sequence” refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence.
- the translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence.
- the translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner et al., Mol. Biotech. 3:225 (1995)).
- the “3′ non-coding sequences” refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression.
- the polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor.
- the use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al., Plant Cell 1:671-680 (1989).
- RNA transcript refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript. Alternatively, it may be an RNA sequence derived from posttranscriptional processing of the primary transcript, in which case it is referred to as the mature RNA. “Messenger RNA” (mRNA) refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a double-stranded DNA that is complementary to and derived from mRNA. “Sense” RNA refers to an RNA transcript that includes the mRNA and so can be translated into protein by the cell.
- Antisense RNA refers to a RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065).
- the complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, introns, or the coding sequence.
- “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated, yet has an effect on cellular processes.
- operably-linked refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other.
- a promoter is operably-linked with a coding sequence when it affects the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter).
- Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation.
- expression refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.
- Antisense inhibition refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein.
- Overexpression refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms.
- Co-suppression refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).
- altered levels refers to the production of gene product(s) in organisms in amounts or proportions that differ from that of normal or non-transformed organisms.
- Transformation refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al., Meth. Enzymol. 143:277 (1987)) and particle-accelerated or “gene gun” transformation technology (Klein et al., Nature, London 327:70-73 (1987); U.S. Pat. No. 4,945,050).
- Plasmid refers to an extra chromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules.
- Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell.
- Transformation cassette refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell.
- Expression cassette refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.
- Novel phenol-induced proteins F1, F2, and F3 have been isolated. Comparison of their random cDNA sequences to the GenBank database using the BLAST algorithms, well known to those skilled in the art, revealed that F3 (orf1) and orf2 are proteins homologous to phosphoenolpyruvate synthase (PEP) of E. coli and are likely to represent the phenol phosphorylating enzyme E 1 (FIG. 1).
- the nucleotide sequences of the F1, F2, and F3 genomic DNA are provided in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, and their deduced amino acid sequences are provided in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively.
- F1, F2, and F3 genes from other bacteria can now be identified by comparison of random cDNA sequences to the F1, F2, and F3 sequences provided herein.
- the nucleic acid fragments of the instant invention may be used to isolate cDNAs and genes encoding homologous F1, F2, and F3 phenol-induced proteins from the same or other plant or fungal species. Isolating homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR) or ligase chain reaction).
- F1, F2, and F3 genes could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired bacteria using methodology well known to those skilled in the art.
- Specific oligonucleotide probes based upon the instant F1, F2, and F3 sequences can be designed and synthesized by methods known in the art (Sambrook, supra).
- entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primers, DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems.
- primers can be designed and used to amplify a part of or full-length of the instant sequences.
- the resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length cDNA or genomic fragments under conditions of appropriate stringency.
- two short segments of the instant ORF's may be used in PCR protocols to amplify longer nucleic acid fragments encoding homologous F1, F2, F3, F4, and F5 genes from DNA or RNA.
- the polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3′ end of the mRNA precursor encoding bacterial F1, F2, F3, F4, and F5.
- the second primer sequence may be based upon sequences derived from the cloning vector.
- the skilled artisan can follow the RACE protocol (Frohman et al., Proc. Natl. Acad. Sci., USA 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al., Proc. Natl. Acad. Sci., USA 86:5673 (1989); Loh et al., Science 243:217 (1989)). Products generated by the 3′ and 5′ RACE procedures can be combined to generate full-length cDNAs (Frohman et al., Techniques 1:165 (1989)).
- Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries.
- Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner et al., Adv. Immunol. 36:1 (1984); Sambrook, supra).
- the enzymes and gene products of the instant ORF's may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to the resulting proteins by methods well known to those skilled in the art.
- the antibodies are useful for detecting the proteins in situ in cells or in vitro in cell extracts.
- Preferred heterologous host cells for production of the instant enzymes are microbial hosts and include those selected from the following: Comamonas sp., Corynebacterium sp., Brevibacterium sp., Rhodococcus sp., Azotobacter sp., Citrobacter sp., Enterobacter sp., Clostridium sp., Klebsiella sp., Salmonella sp., Lactobacillus sp., Aspergillus sp., Saccharomyces sp., Zygosaccharomyces sp., Pichia sp., Kluyveromyces sp., Candida sp., Hansenula sp., Dunaliella sp., Debaryomyces sp., Mucor sp., Torulopsis sp., Methylobacteria sp., Bacillus sp., Escherichia s
- Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for production of any of the gene products of the instant ORF's. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high level expression of the enzymes.
- chimeric genes will be effective in altering the properties of the host bacteria. It is expected, for example, that introduction of chimeric genes encoding one or more of the ORF's 1-10 under the control of the appropriate promoters, into a host cell comprising at least one copy of these genes will demonstrate the ability to produce various intermediates in phenol metabolism.
- the appropriately regulated ORF 1 and ORF 2 would be expected to express an enzyme capable of phosphorylating phenol (phenylphosphate synthase—FIG. 1).
- ORF 4, ORF 6, ORF 7 and ORF 8 would be expected to express an enzyme capable of carboxylating phenylphosphate to afford 4-hydroxybenzoate (phenylphosphate carboxylase; FIG. 1).
- expression of SEQ ID NO:23 in a single recombinant organism will be expected to effect the conversion of phenol to 4-hydroxybenzoate in a transformed host (FIG. 1).
- Vectors or cassettes useful for the transformation of suitable host cells are well known in the art.
- the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration.
- Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.
- Initiation control regions or promoters which are useful to drive expression of the instant ORF's in the desired host cell are numerous and familiar to those skilled in the art.
- a promoter capable of driving these genes is suitable for the present invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, trp, λPL, λPR, T7, tac, and trc (useful for expression in Escherichia coli).
- Useful strong promoters may also be used from Corynebacterium, Comamonas, Pseudomonas, and Rhodococcus.
- Termination control regions may also be derived from various genes native to the preferred hosts. A termination site may be unnecessary; however, it is most preferred if included.
- Thauera aromatica phenol carboxylation proceeds in two steps and involves formation of phenylphosphate as the first intermediate (Equation 1).
- Cells grown with phenol were simultaneously adapted to growth with 4-hydroxybenzoate, whereas, vice-versa, 4-hydroxybenzoate-grown cells did not metabolize phenol.
- Induction of the capacity to metabolize phenol required several hours.
- Phenylphosphate is the substrate of a second enzyme E 2 , phenylphosphate carboxylase. It requires K + and Mn 2+ and catalyzes the carboxylation of phenylphosphate to 4-hydroxybenzoate (Equation 7).
- E 2 -phenolate intermediate (Equations 9 and 10) which is formed in a presumably exergonic reaction (Equation 11) followed by the reversible carboxylation (Equation 12).
- the actual substrate is CO 2 rather than bicarbonate, and the carboxylating enzyme was not inhibited by avidin; both results suggest that biotin is not involved in carboxylation.
- the enzyme E 2 is termed phenylphosphate carboxylase.
- Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989 (hereinafter “Sambrook”); by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).
- Thauera aromatica phenol carboxylation proceeds in two steps and involves formation of phenylphosphate as the first intermediate (FIG. 1).
- the enzyme system not only acts on 4-hydroxy-benzoate/phenol (100%), but also on protocatechuate/catechol (30%), o-cresol (30%), 2-chlorophenol (75%) and 2,6-dichlorophenol (30%).
- the enzyme specifically catalyzes a para-carboxylation, and anaerobic growth of the organism on phenolic compounds and nitrate requires CO 2 .
- Thauera aromatica was cultured anaerobically at 30° C. in a mineral salt medium (1.08 g/L KH2PO4, 5.6 g/L K2HPO4, 0.54 g/L NH4Cl) supplemented with 0.1 mM CaCl2, 0.8 mM MgSO4, 1 mL/L vitamin solution (cyanocobalamin 100 mg/L, pyridoxamin-2 HCl 300 mg/L, Ca-D(+)-pantothenate 100 mg/L, thiamindichloride 200 mg/L, nicotinate 200 mg/L, 4-aminobenzoate 80 mg/L, D(+)-biotin 20 mg/L) and 1 mL/L of a solution of trace elements (25% HCl 10 mL/L, FeCl2·4H2O 1.5 g/L, ZnCl2 70 mg/L, MnCl2·4H
- the assay conditions were as follows: 20 mM imidazole/HCl (pH 6.5), 20 mM KCl, 0.5 mM MnCl2, 2 mM 4-hydroxybenzoate, 50 µmol CO2 (50 µL 1 M NaHCO3 per 1 mL assay), 25 µL soluble fraction (see Example 4) per 1 mL assay.
- the reaction was started by addition of 10 µL ¹⁴C-Na2CO3 (7 kBq; specific radioactivity 80 nCi/mmol). After 5 min incubation at 30° C. the reaction was stopped by the addition of 30 µL 3 M perchloric acid per 250 µL sample.
- the precipitated proteins were centrifuged down and the supernatant was acidified with 150 µL 10 M formic acid.
- the mixture was incubated under steady flow of CO2 (10 mL/min) to remove all the ¹⁴CO2 which was not fixed in the reaction. After 15 min, 150 µL 1 M KHCO3 was added and the mixture was incubated another 15 min under steady flow of CO2 (10 mL/min).
- the amount of non-volatile labeled product formed (4-hydroxybenzoate:¹⁴CO2) was analyzed by liquid scintillation counting.
- Phenylphosphate is the substrate of the second enzyme E 2 , phenylphosphate carboxylase. It requires K + and Mn 2+ and catalyzes the carboxylation of phenylphosphate to 4-hydroxybenzoate.
- The assay conditions were as follows: 20 mM imidazole/HCl (pH 6.5), 20 mM KCl, 0.5 mM MnCl2, 2 mM phenylphosphate, 25 μmol CO2 (25 μL 1 M NaHCO3 per 1 mL assay), 25 μL soluble fraction (see Example 4) per 1 mL assay.
- The reaction was started by addition of 205 μL 14C-Na2CO3 (14 kBq; specific radioactivity 250 nCi/mmol). After 5 min incubation at 30° C. the reaction was stopped by the addition of 30 μL 3 M perchloric acid per 250 μL sample. The precipitated proteins were centrifuged down and the supernatant was acidified with 150 μL 10 M formic acid. The mixture was incubated under steady flow of CO2 (10 mL/min) to remove all the 14CO2 which was not fixed in the reaction. After 15 min 150 μL of 1.0 M KHCO3 was added and incubated another 15 min under steady flow of CO2 (10 mL/min). The formed amount of non-volatile labeled product was analyzed by liquid scintillation counting.
- The carboxylase activity was calculated as described in Example 2, taking into account the fact that 3923 Bq (235380 dpm) corresponds to 25 μmol incorporated 14CO2 per 1 mL assay.
- the specific activity was determined to be 10 nmol/min/mg.
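The conversion from counted radioactivity to a specific activity can be sketched as follows. This is a minimal illustration, assuming that the fraction of label recovered as non-volatile product equals the fraction of the total CO2 pool that was fixed. The function names, the example count, and the protein amount are hypothetical and are not taken from the specification; dilution by the stop and acidification reagents is ignored.

```python
def bq_to_dpm(becquerel):
    """Convert becquerel (decays per second) to decays per minute."""
    return becquerel * 60.0


def specific_activity_nmol_per_min_mg(fixed_dpm, total_dpm, co2_pool_umol,
                                      minutes, protein_mg):
    """Estimate specific activity in nmol CO2 fixed per min per mg protein.

    fixed_dpm     -- label recovered in the non-volatile product
    total_dpm     -- label corresponding to the whole CO2 pool in the assay
    co2_pool_umol -- total CO2 (bicarbonate) in the assay, in micromoles
    minutes       -- incubation time
    protein_mg    -- protein in the assay (hypothetical input)
    """
    if total_dpm <= 0 or minutes <= 0 or protein_mg <= 0:
        raise ValueError("total_dpm, minutes and protein_mg must be positive")
    fraction_fixed = fixed_dpm / total_dpm
    fixed_nmol = fraction_fixed * co2_pool_umol * 1000.0  # umol -> nmol
    return fixed_nmol / (minutes * protein_mg)


# 3923 Bq per 25 umol CO2 is the figure quoted in the text.
total_dpm = bq_to_dpm(3923)

# Hypothetical measurement: 0.1% of the label fixed in 5 min by 0.5 mg protein.
example = specific_activity_nmol_per_min_mg(
    fixed_dpm=0.001 * total_dpm,
    total_dpm=total_dpm,
    co2_pool_umol=25.0,
    minutes=5.0,
    protein_mg=0.5,
)
```

Keeping the unit conversion separate from the activity calculation makes it easy to substitute the label and CO2 amounts of a different assay.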
- Thauera aromatica (K 172) was cultured anaerobically at 30° C. with 0.5 mM phenol and 10 mM NaHCO 3 as sole source of carbon and energy, as well as 2 mM NaNO 3 as the terminal electron acceptor.
- The bacterial cells were harvested and 20 g of the bacterial cells were resuspended in 20 mL 20 mM imidazole/HCl (pH 6.5), 10% glycerol, 0.5 mM dithionite and traces of DNase I, disrupted (French Press, 137.6 MPa) and ultracentrifuged (100 000 × g).
- The supernatant with the soluble protein fraction contained all the 4-hydroxybenzoate:14CO2 exchange activity (383 nmol min−1 mg−1) and phenylphosphate carboxylase activity (10 nmol min−1 mg−1).
- the supernatant was loaded on a DEAE Sepharose fast flow chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden).
- FIG. 2 shows the results of SDS-PAGE (12.5%) with fractions after chromatography of the soluble fraction of K172 (grown anaerobically on phenol). A total amount of 20 μg protein was loaded per lane.
- Lane 1: K172 grown on 4-hydroxybenzoate/NO3− (100 000 × g supernatant); Lane 2: K172 grown on phenol/NO3− (100 000 × g supernatant). Lanes 1 and 2 show that three dominant phenol-induced proteins F1, F2, and F3 were separated. F1, F2, and F3 were identified by molecular weight: F1, 60 kDa; F2, 58 kDa; F3, 67 kDa. Lane 3: pooled fractions containing F1; Lane 4: pooled fractions containing F2; Lanes 5-7: fractions 17-19; Lanes 8-10: fractions 53-55; Lane 11: proteins that did not bind to DEAE; and Lane 12: fraction 84 containing F3.
- the fractions containing F2 were subjected to peptide and N-terminal sequencing.
- For peptide sequencing, the fractions containing F2 after chromatography on DEAE Sepharose were pooled and loaded on a Blue Sepharose chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). Then the fractions containing F2 were pooled and digested with modified trypsin (Promega, Mannheim, Germany).
- The trypsin digest was done according to the following procedure: 500 μg protein in 200 μL of 20 mM Tris/HCl, pH 7.5, was adjusted to pH 8 with 3 μL of triethylamine.
- On the basis of the N-terminal amino acid sequences of F1, F2, and F3 and of the internal fragments of F2 (Example 4), degenerate oligonucleotides were designed.
- the oligonucleotides F2-forward (N-terminus) (SEQ ID NO:32; ATG-GA T C -CT G C -CG C G -TAC-TTC-ATC), F2T6-reverse (SEQ ID NO:33; TT- G A TC- G A TC- G C AG-CAT-CTG-CAT) and F2T43-reverse (SEQ ID NO:34; CAT- C G AG-GAA- T C TC-GCGC-CTG-CTG) (both internal fragments) were used as primers in a polymerase chain reaction (PCR) with genomic DNA of Thauera aromatica as target.
- PCR polymerase chain reaction
- PCR conditions were as follows: 100 ng target, 200 nM each primer, 200 μM each of dATP, dCTP, dTTP, dGTP, 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris/HCl (pH 9.0), 1 unit Taq-DNA-Polymerase (Amersham Pharmacia Biotech, Uppsala, Sweden).
- PCR parameters were as follows: 95° C. 30 sec, 40° C. 1 min, 72° C. 2.5 min, 30 cycles.
- the PCR products were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification.
- the purified PCR product (F2-forward/F2T43-reverse) in a size of approximately 750 bp was sequenced and confirmed to be the N-terminus of F2.
- The PCR product was labeled with [32P]-dCTP and used as a probe for screening a λ EMBL3 gene library of Thauera aromatica.
- One positive phage of about 11 kb was detected, prepared and restricted with BamHI, EcoRI and Pst1.
- the digests were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification of the restriction fragments.
- The purified fragments were ligated in the corresponding pBluescript vector KS(+) [Apr, lacZ, f1, ori] restricted with BamHI, EcoRI and Pst1, respectively. Ligation mix was used to transform competent E. coli XL 1-Blue and plated onto LB plates supplemented with IPTG, X-Gal and 50 μg/mL ampicillin. Plasmid DNA was prepared from several white colonies (clones 8, 9, and 19; FIGS. 3, 4, and 5, respectively) and sequenced by dideoxy termination protocol using T7 and T3 primers (SEQ ID NO:35; 3′ CGGGATATCACTCAGCATAATG 5′ and SEQ ID NO:36; 5′ AATTAACCCTCACTAAAGGG 3′, respectively). Nucleotide sequence analysis confirmed that the amino acid sequences deduced from the genes corresponded to the N termini of F1, F2, and F3.
- oligonucleotide designated breib31 SEQ ID NO:37; 5′ GACAACTTCGTCGTCAA 3′
- oligonucleotide designated breib07r3 SEQ ID NO:38; 5′ GTGGATATTGGCTTCGGAAA 3′
- PCR conditions were as described in Example 5.
- the PCR product was subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification.
- The purified PCR product in a size of approximately 500 bp was labeled with [32P]-dCTP and used as a probe for screening a λ EMBL3 gene library of Thauera aromatica.
- Two positive phages could be detected.
- the phage DNA was prepared and restricted with BamHI, EcoRI and Pst1.
- the digests were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification of the restriction fragments.
- the purified fragments were ligated in the corresponding pBluescript vector KS(+) [Apr, lacZ, f1, ori] restricted with BamHI, EcoRI and Pst1, respectively. Ligation mix was used to transform competent E.
- Plasmid DNA was prepared from several white colonies (clone 2 with a 9 kb BamHI insert and clone 7 with a 3.7 kb Pst1 insert as described in FIGS. 6 and 7) and sequenced by dideoxy termination protocol using T3 primer (SEQ ID NO:36). DNA sequences upstream of the known sequences were revealed by DNA analysis (FIG. 12).
- oligonucleotide designated λ15-forward SEQ ID NO:39; 5′ TCGCCGGCGACGACGCCG 3′
- oligonucleotide designated λ15-reverse SEQ ID NO:40; 5′ CCGCGCGCTGCGCCGCCG 3′
- PCR conditions were as follows: 100 ng target, 200 nM each primer, 200 μM each of dATP, dCTP, dTTP, dGTP, (NH4)2SO4, KCl, 4.5 mM MgCl2, 10 mM Tris/HCl (pH 8.7), 1× Q solution, 1 unit Taq-DNA-Polymerase (Qiagen, Hilden, Germany).
- PCR parameters were as follows: 95° C. 30 sec, 45° C. 1 min, 72° C. 2.5 min, 30 cycles.
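The non-degenerate primers listed above can be characterized with a short script. The sketch below computes length, GC content, and a rough melting temperature using the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C). That rule is an assumption introduced here for illustration; the specification does not state how annealing temperatures were chosen. The dictionary keys are shortened labels for SEQ ID NO:37 to NO:40.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "breib31": "GACAACTTCGTCGTCAA",            # SEQ ID NO:37
    "breib07r3": "GTGGATATTGGCTTCGGAAA",       # SEQ ID NO:38
    "lambda15-forward": "TCGCCGGCGACGACGCCG",  # SEQ ID NO:39
    "lambda15-reverse": "CCGCGCGCTGCGCCGCCG",  # SEQ ID NO:40
}


def gc_count(seq):
    """Number of G and C bases in a sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return gc_count(seq) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C for short oligonucleotides."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def summarize(primers):
    """Return {name: (length, GC fraction, Wallace Tm)} for each primer."""
    return {name: (len(s), gc_fraction(s), wallace_tm(s))
            for name, s in primers.items()}


if __name__ == "__main__":
    for name, (length, gc, tm) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC {gc:.0%}, Wallace Tm {tm} C")
```

A summary like this makes the very high GC content of the λ15 primer pair visible, which is consistent with the use of an additive such as Q solution in that reaction, although the text does not give that rationale.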
- the PCR product was subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification.
- The purified PCR product in a size of approximately 600 bp was labeled with [32P]-dCTP and used as a probe for screening a λ ZAP Express gene library (Stratagene, Heidelberg, Germany) of Thauera aromatica.
- One positive clone was detected.
- the phagemid was prepared according to the manufacturer's protocol and restricted with Sal1/EcoRI. After ethidium bromide agarose gel electrophoresis of the digest, the DNA insert was estimated to be 9 kb in size (clone 1—FIG. 8).
- The restricted DNA was blotted and hybridized with the [32P]-labeled probe described above. A fragment of approximately 1 kb could be detected. DNA sequences downstream of the known sequences were revealed by DNA analysis (FIG. 12).
- a 3.7-kb Pst1 fragment, a 2.7-kb BamHI fragment, a 4.0-kb BamHI fragment, a 5.25-kb EcoRI fragment and a 9 kb BamHI fragment were each ligated to the corresponding pBluescript KS(+) [Apr, lacZ, f1, ori] vector restricted with BamHI, Pst1 and EcoRI, respectively (FIGS. 7, 3, 5 , and 4 , respectively).
- the plasmids were transformed into competent E. coli XL 1-blue.
- Plasmid DNA purified by alkaline lysis method was sequenced by dideoxy termination protocol using T7 and T3 primers (SEQ ID NO:35 and SEQ ID NO:36, respectively) and then by primer walking. About 14 kb (SEQ ID NO:23) were sequenced which contained two gene clusters that appear to be involved in phenol metabolism.
- nucleotide sequences of F1, F2, and F3 are provided in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively, and their deduced amino acid sequences are provided in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively.
- Nucleotide and amino acid sequences were analyzed using the PC/gene software package (Genofit). Homologous sequences were identified using the BLAST (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 215:403-410 (1990)) search using the TBLASTN algorithm provided by the National Center for Biotechnology Information (Table 4 and FIG. 13).
- F3 shows homology to phosphoenolpyruvate (PEP) synthase.
- PEP phosphoenolpyruvate
- the reaction catalyzed by this enzyme is shown in FIG. 11.
- PEP synthase is phosphorylated by ATP, with AMP and Pi being the products.
- The phosphorylated enzyme transfers the β-phosphoryl group of ATP to pyruvate. This reaction may be similar to the proposed reaction mechanism of the phenol kinase, whereby phenol ultimately becomes phosphorylated.
- F1, F2, and F5 show good homology to ubiD, a gene which codes for 3-octaprenyl-4-hydroxybenzoate decarboxylase. This enzyme is involved in the biosynthesis of ubiquinone. The reaction catalyzed is shown in FIG. 11. This reaction is analogous to the reverse reaction of the postulated carboxylation of phenol.
- a 3.7-kb Pst1 fragment contains: orf1 (SEQ ID NO:6) which codes for F3 protein (SEQ ID NO:5) and orf2 (SEQ ID NO:12) which codes for unknown protein (SEQ ID NO:11).
- a 2.7-kb BamHI fragment contains: orf3 (SEQ ID NO:14) which codes for unknown protein (SEQ ID NO:13) and orf4 (SEQ ID NO:4) which codes for F2 protein (SEQ ID NO:3).
- a 4.0-kb BamHI fragment contains: orf5 (SEQ ID NO:8) which codes for F4 protein (SEQ ID NO:7), orf6 (SEQ ID NO:2) which codes for F1 protein (SEQ ID NO:1), and orf7 (SEQ ID NO:16) which codes for unknown protein (SEQ ID NO:15).
- a 5.25-kb EcoRI fragment contains: orf7 (SEQ ID NO:16) which codes for unknown protein (SEQ ID NO:15), orf8 (SEQ ID NO:10) which codes for F5 protein (SEQ ID NO:9), orf9 (SEQ ID NO:18) which codes for unknown protein (SEQ ID NO:17), and orf10 (SEQ ID NO:20) which codes for unknown protein (SEQ ID NO:19).
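The fragment, ORF, and protein assignments listed above can be held in a small lookup structure. The sketch below only restates the mapping given in the text; the variable and function names are hypothetical.

```python
# Restriction fragments and the ORFs they contain, in the order listed.
FRAGMENT_ORFS = {
    "3.7-kb Pst1": ["orf1", "orf2"],
    "2.7-kb BamHI": ["orf3", "orf4"],
    "4.0-kb BamHI": ["orf5", "orf6", "orf7"],
    "5.25-kb EcoRI": ["orf7", "orf8", "orf9", "orf10"],
}

# ORFs whose products were matched to phenol-induced proteins.
ORF_PROTEIN = {
    "orf1": "F3",
    "orf4": "F2",
    "orf5": "F4",
    "orf6": "F1",
    "orf8": "F5",
}

# (nucleotide SEQ ID NO, amino acid SEQ ID NO) for each ORF.
ORF_SEQ_IDS = {
    "orf1": (6, 5), "orf2": (12, 11), "orf3": (14, 13), "orf4": (4, 3),
    "orf5": (8, 7), "orf6": (2, 1), "orf7": (16, 15), "orf8": (10, 9),
    "orf9": (18, 17), "orf10": (20, 19),
}


def proteins_on(fragment):
    """Protein names encoded on a fragment; unmatched ORFs are 'unknown'."""
    return [ORF_PROTEIN.get(orf, "unknown") for orf in FRAGMENT_ORFS[fragment]]


def fragments_containing(orf):
    """All fragments that carry the given ORF."""
    return [name for name, orfs in FRAGMENT_ORFS.items() if orf in orfs]
```

Querying `fragments_containing("orf7")` shows that orf7 is listed on both the 4.0-kb BamHI and the 5.25-kb EcoRI fragment, which is the overlap linking those two clones.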
- Each restriction fragment was ligated into pBluescript SK.
- the recombinant plasmids were transformed into E. coli K38 containing the plasmid pGP 1-2 [kan r , cI857 T7Gen1(RNA Polymerase)] (Tabor and Richardson, 1985). Cells were grown in 1 mL Luria-Bertani medium plus ampicillin and kanamycin at 30° C. to an absorbance of 0.5 at 600 nm, washed in Maschinenman minimal medium (Fraenkel and Neidhardt, 1961) and resuspended in 5 mL Maschinenman minimal medium containing 0.01% (mass/volume) amino acids besides cysteine and methionine.
- FIG. 9 shows the experimentally determined molecular masses of the proteins. Expression of F1-F5 in E. coli (T7 experiment). 25 μL were loaded on each lane.
- Lanes 1, 4, 7 marker proteins
- Lane 2 Proteins (F3 & unknown) coded by 3.7 kb Pst1 fragment containing orf1 and orf2 respectively
- Lane 3 Proteins (unknown & F2) coded by 2.7 kb BamHI fragment containing orf3 and orf4 respectively
- Lane 5 Proteins (F5 and 3 unknowns) coded by 5.25 kb EcoRI fragment containing orf8, orf7, orf9 and orf10 respectively
- Lane 6 Proteins (F1, F4 and unknown) coded by 4.0 kb BamHI fragment containing orf6, orf5 and orf7.
- the predicted molecular masses agreed reasonably well with the experimentally determined molecular masses of FIG. 9.
- the horizontal isoelectric focussing was run overnight (15 h, 1400 V). After the first dimension the Immobiline Dry Strips were equilibrated twice for 15 min in equilibration buffer (0.05 M Tris/HCl pH 8.8, 6 M urea, 30% (w/v) glycerol, 2% (w/v) SDS, traces of bromophenol blue and 10 mg/mL DTT or 48 mg/mL iodoacetamide, respectively). The second dimension was a vertical SDS polyacrylamide gel electrophoresis (11.5% polyacrylamide) indicating phenol-induced proteins (FIG. 10). The proteins were blotted to a PVDF membrane and stained with Coomassie Blue.
- ORF Finder Open Reading Frame Finder
- The nucleotide sequence of an ORF is automatically translated into an amino acid sequence by the ORF Finder. Comparison of the deduced amino acid sequences of orf1-10 and orf-1 (see FIG. 11) with the experimentally determined N-terminal amino acid sequences of phenol-induced proteins and the internal sequences revealed that the following ORFs coded for known proteins: orf1 (SEQ ID NO:6) for F3, orf4 (SEQ ID NO:4) for F2, orf5 (SEQ ID NO:8) for F4, orf6 (SEQ ID NO:2) for F1 and orf8 (SEQ ID NO:10) for F5. The predicted molecular masses agreed reasonably well with the experimentally determined masses (FIG. 10).
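The idea of scanning a nucleotide sequence for open reading frames and translating them can be sketched in a few lines. This is not the ORF Finder tool itself; it is a simplified illustration that assumes the standard genetic code, ATG as the only start codon, and forward-strand reading frames only. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon.

    Codons containing characters other than T, C, A, G become 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_orfs(dna, min_codons=2):
    """Return (start, end, protein) for ATG...stop ORFs on the forward strand.

    start and end are 0-based, end-exclusive, and end includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i <= len(dna) - 3:
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk in frame from the start codon to the next stop codon.
            j = i
            while j <= len(dna) - 3 and CODON_TABLE.get(dna[j:j + 3]) != "*":
                j += 3
            if j > len(dna) - 3:
                break  # no in-frame stop codon: not a complete ORF
            if (j - i) // 3 >= min_codons:
                orfs.append((i, j + 3, translate(dna[i:j])))
            i = j + 3
    return orfs
```

A deduced protein returned by `find_orfs` can then be compared with an experimentally determined N-terminal sequence, for example with `protein.startswith(n_terminal_sequence)`.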
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
This invention pertains to genes coding for phenol-induced proteins. Five phenol-induced proteins were isolated from Thauera aromatica. Three dominant phenol-induced proteins, called F1, F2, and F3, were purified and sequenced to obtain the enzyme(s) that catalyze the 14CO2:4-hydroxybenzoate isotope exchange reaction and the carboxylation of phenylphosphate. The N-terminal amino acid sequences of these proteins as well as the N-termini of the phenol-induced proteins F4 and F5 were also determined.
Description
- This invention is in the field of molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding phenol-induced proteins of the denitrifying bacterium Thauera aromatica.
- Phenolic compounds are basic chemicals of high interest to the chemical and pharmaceutical industries. Phenolic compounds are important plant constituents and phenol is formed from a variety of natural and synthetic substrates by the activity of microorganisms. The aerobic metabolism of phenol has been studied extensively; in all aerobic metabolic pathways oxygenases initiate the degradation of phenol by hydroxylation to catechol. Catechol can be oxygenolytically cleaved by dioxygenases, either by ortho- or meta-cleavage.
- Anaerobic metabolism of phenol, aniline, o-cresol (2-methylphenol), hydroquinone (1,4-dihydroxybenzene), catechol (1,2-dihydroxybenzene), naphthalene and phenanthrene (Zhang et al., Appl. Environ. Microbiol. 63:4759-4764 (1997)) by denitrifying and sulfate-reducing bacteria involves carboxylation of the aromatic ring ortho or para to the hydroxy or amino substituent. Products are 4-hydroxybenzoate, 4-aminobenzoate, 4-hydroxy-3-methylbenzoate, gentisate (2,5-dihydroxybenzoate), and protocatechuate (3,4-dihydroxybenzoate) (Heider et al., Eur. J. Biochem. 243:577-596 (1997)). Consortia of fermenting bacteria convert phenol to benzoate and decarboxylate 4-hydroxybenzoate to phenol (Winter et al., Appl. Microbiol. Biotechnol. 25:384-391 (1987); He et al., Eur. J. Biochem. 229:77-82 (1995); He et al., J. Bacteriol. 178:3539-3543 (1996); Van Schie et al., Appl. Environ. Microbiol. 64:2432-2438 (1998)). They also catalyze an isotope exchange between D2O and the proton at C4 of the aromatic ring of 4-hydroxybenzoate. Phenol carboxylation to 4-hydroxybenzoate in the denitrifying bacterium Thauera aromatica is the best studied of these carboxylation reactions and is a paradigm for this new type of carboxylation reaction (Tschech et al., Arch. Microbiol. 148:213-217 (1987); Lack et al., Eur. J. Biochem. 197:473-479 (1991); Lack et al., J. Bacteriol. 174:3629-3636 (1992); Lack et al., Arch. Microbiol. 161:132-139 (1994)).
- Without an isolated gene and corresponding sequence of the coding sequence, there remains a need for a convenient way to produce various intermediates in phenol metabolism with a transformed microorganism.
- Five phenol-induced proteins from Thauera aromatica have been isolated. Three dominant phenol-induced proteins called F1, F2, and F3 were purified and sequenced in an attempt to purify the enzyme(s) that catalyze the 14CO2:4-hydroxybenzoate isotope exchange reaction and the carboxylation of phenylphosphate. The N-terminal amino acid sequences of these proteins as well as the N-terminus of the phenol-induced proteins F4 and F5 were determined. Internal sequences of F2 were obtained by trypsin digest. All of these sequences have application in industrial processes that involve the use of phenol or its intermediates. The instant invention provides a means to manipulate phenol metabolism and to produce various phenol intermediates in recombinant microorganisms. The approach is based on the observation that anoxic growth with phenol and nitrate induces novel proteins that are lacking in cells grown with 4-hydroxybenzoate and nitrate.
- The following 44 sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures—the Sequence Rules”) and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The Sequence Descriptions contain the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219(2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822. The present invention utilizes Wisconsin Package Version 9.0 software from Genetics Computer Group (GCG), Madison, Wis.
- SEQ ID NO:1 is the deduced amino acid sequence of protein F1 and is coded by orf6.
- SEQ ID NO:2 is the nucleotide sequence of orf6 that codes for protein F1.
- SEQ ID NO:3 is the deduced amino acid sequence of protein F2 and is coded by orf4.
- SEQ ID NO:4 is the nucleotide sequence of orf4 that codes for protein F2.
- SEQ ID NO:5 is the deduced amino acid sequence of protein F3 and is coded by orf1.
- SEQ ID NO:6 is the nucleotide sequence of orf1 that codes for protein F3.
- SEQ ID NO:7 is the deduced amino acid sequence of protein F4 and is coded by orf5.
- SEQ ID NO:8 is the nucleotide sequence of orf5 that codes for protein F4.
- SEQ ID NO:9 is the deduced amino acid sequence of protein F5 and is coded by orf8.
- SEQ ID NO:10 is the nucleotide sequence of orf8 that codes for protein F5.
- SEQ ID NO:11 is the deduced amino acid sequence of orf2.
- SEQ ID NO:12 is the nucleotide sequence of orf2 that codes for an unknown protein.
- SEQ ID NO:13 is the deduced amino acid sequence of orf3.
- SEQ ID NO:14 is the nucleotide sequence of orf3 that codes for an unknown protein.
- SEQ ID NO:15 is the deduced amino acid sequence of orf7.
- SEQ ID NO:16 is the nucleotide sequence of orf7 that codes for an unknown protein.
- SEQ ID NO:17 is the deduced amino acid sequence of orf9.
- SEQ ID NO:18 is the nucleotide sequence of orf9 that codes for an unknown protein.
- SEQ ID NO:19 is the deduced amino acid sequence of orf10.
- SEQ ID NO:20 is the nucleotide sequence of orf10 that codes for an unknown protein.
- SEQ ID NO:21 is the deduced amino acid sequence of orf-1.
- SEQ ID NO:22 is the nucleotide sequence of orf-1 that codes for an unknown protein.
- SEQ ID NO:23 is the nucleotide sequence containing two gene clusters that are involved in phenol metabolism.
- SEQ ID NO:24 is the N-terminal amino acid sequence of F1 (experimentally determined).
- SEQ ID NO:25 is the N-terminal amino acid sequence of F1 (deduced from the genes).
- SEQ ID NO:26 is the N-terminal amino acid sequence of F2 (experimentally determined).
- SEQ ID NO:27 is the N-terminal amino acid sequence of F2 (deduced from the genes).
- SEQ ID NO:28 is the N-terminal amino acid sequence of F3 (experimentally determined).
- SEQ ID NO:29 is the N-terminal amino acid sequence of F3 (deduced from the genes).
- SEQ ID NO:30 is the amino acid sequence of an internal fragment of F2 that was obtained by trypsin-digest.
- SEQ ID NO:31 is the amino acid sequence of an internal fragment of F2 that was obtained by trypsin-digest.
- SEQ ID NO:32 is the primer of F2-forward (N-terminus).
- SEQ ID NO:33 is the primer of F2T6-reverse.
- SEQ ID NO:34 is the primer of F2T43-reverse.
- SEQ ID NO:35 is the primer T7.
- SEQ ID NO:36 is the primer T3.
- SEQ ID NO:37 is the primer designated breib31.
- SEQ ID NO:38 is the primer designated breib07r3.
- SEQ ID NO:39 is the primer of λ15-forward.
- SEQ ID NO:40 is the primer of λ15-reverse.
- SEQ ID NO:41 is the N-terminal amino acid sequence of F4 (experimentally determined).
- SEQ ID NO:42 is the N-terminal amino acid sequence of F4 (deduced from the genes).
- SEQ ID NO:43 is the N-terminal amino acid sequence of F5 (experimentally determined).
- SEQ ID NO:44 is the N-terminal amino acid sequence of F5 (deduced from the genes).
- FIG. 1 shows phenol metabolism in Thauera aromatica. The enzymes active in this pathway are Phenylphosphate synthase (E1); Phenylphosphate carboxylase (Mn2+, K+) (E2); 4-Hydroxybenzoate-CoA Ligase (E3); 4-Hydroxybenzoyl-CoA reductase (Mo, FAD, Fe/S) (E4); Benzoyl-CoA reductase (Fe/S, FAD) (E5).
- FIG. 2 shows SDS-PAGE (12.5%) with fractions after chromatography of the soluble fraction of K172 (grown anaerobically on phenol) on DEAE sepharose fast flow. See Example 4.
- FIG. 3 shows clone 8 (pKSBam2.7). See Example 8.
- FIG. 4 shows clone 9 (pKSEco5.25). See Example 8.
- FIG. 5 shows clone 19 (pKSBam4). See Example 8.
- FIG. 6 shows clone 2 (pKSBam9).
- FIG. 7 shows clone 7 (pKSPst3.7). See Example 8.
- FIG. 8 shows phagemid-vector—clone 1 (pBK-CMV).
- FIG. 9 shows the expression of F1-F5 in E. coli. See Example 9.
- FIG. 10 shows the two-dimensional gel electrophoresis of 100 000 × g supernatant of Thauera aromatica anaerobically grown on 4-hydroxybenzoate (A) and phenol (B), respectively. Phenol-induced proteins are indicated by triangles.
- FIG. 11 shows the organization of the genes possibly involved in anaerobic phenol metabolism of Thauera aromatica and their homologies to known proteins.
- FIG. 12 shows the map of the orientation of the clones in the whole sequence of 14272 bp.
- FIG. 13 shows the organization of the genes, with restriction sites, involved in phenol metabolism of Thauera aromatica.
- Applicants have succeeded in identifying the genes coding for phenol-induced proteins. Five phenol-induced proteins from Thauera aromatica have been isolated. Three dominant phenol-induced proteins called F1, F2, and F3 were purified and sequenced to obtain the enzyme(s) that catalyze the 14CO2:4-hydroxybenzoate isotope exchange reaction and the carboxylation of phenylphosphate. The N-terminal amino acid sequences of these proteins as well as the N-terminus of the phenol-induced proteins F4 and F5 were determined. Internal sequences of F2 were obtained by trypsin digest. All of these sequences have utility in industrial processes. The instant invention provides a means to manipulate phenol metabolism and specifically the carboxylation of phenyl phosphate. Transformation of host cells with at least one copy of the identified genes under the control of appropriate promoters will provide the ability to produce various intermediates in phenol metabolism. The approach is based on the observation that anoxic growth with phenol and nitrate induces novel proteins that are lacking in cells grown with 4-hydroxybenzoate and nitrate.
- The following definitions are provided for the full understanding of terms and abbreviations used in this specification.
- The abbreviations in the specification correspond to units of measure, techniques, properties, or compounds as follows: “sec” means second(s), “min” means minute(s), “h” means hour(s), “d” means day(s), “μL” means microliter(s), “mL” means milliliter(s), “L” means liter(s), “mM” means millimolar, “M” means molar, “mmol” means millimole(s), “Ampr” means ampicillin resistance, “Amps” means ampicillin sensitivity, “kb” means kilobase(s), “kd” means kilodalton(s), “nm” means nanometer(s), and “wt” means weight. “ORF” means open reading frame, “PCR” means polymerase chain reaction, “HPLC” means high performance liquid chromatography, “ca” means approximately, “dcw” means dry cell weight, “O.D.” means optical density at the designated wavelength, “IU” means International Units.
- “Polymerase chain reaction” is abbreviated PCR.
- “Open reading frame” is abbreviated ORF.
- “Sample channels ratio” is abbreviated SCR.
- “High performance liquid chromatography” is abbreviated HPLC.
- The term “F1” refers to the protein encoded by orf6.
- The term “F2” refers to the protein encoded by orf4.
- The term “F3” refers to the protein encoded by orf1.
- The term “F4” refers to the protein encoded by orf5.
- The term “F5” refers to the protein encoded by orf8.
- The term “E 1” refers to phenol phosphorylating, phenol kinase or phenylphosphate synthase. Phenol phosphorylating and phenol kinase are used interchangeably by those skilled in the art.
- The term “E 2” refers to phenylphosphate carboxylase.
- The terms “isolated nucleic acid fragment” or “isolated nucleic acid molecule” refer to a polymer of mononucleotides (RNA or DNA) that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment or an isolated nucleic acid molecule in the form of a polymer of mononucleotides may be comprised of one or more segments of cDNA, genomic DNA, or synthetic DNA.
- The terms “host cell” and “host microorganism” refer to a cell capable of receiving foreign or heterologous genes and expressing those genes to produce an active gene product. The term “suitable host cells” encompasses microorganisms such as bacteria and fungi, and also includes plant cells.
- The term “fragment” refers to a DNA or amino acid sequence comprising a subsequence of the nucleic acid sequence or protein of the instant invention. However, an active fragment of the instant invention comprises a sufficient portion of the protein to maintain activity.
- The term “gene cluster” refers to genes organized in a single expression unit or in close proximity to each other on the chromosome.
- The term “substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotide bases that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate alteration of gene expression by antisense or co-suppression technology or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary sequences.
- For example, it is well known in the art that alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, and yet do not affect the functional properties of the encoded protein, are common. For example, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue (such as glycine) or a more hydrophobic residue (such as valine, leucine, or isoleucine). Similarly, changes which result in substitution of one negatively charged residue for another (such as aspartic acid for glutamic acid) or one positively charged residue for another (such as lysine for arginine) can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determining what biological activity of the encoded products is retained. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1× SSC, 0.1% SDS, 65° C. and washed with 2× SSC, 0.1% SDS followed by 0.1× SSC, 0.1% SDS), with the sequences exemplified herein. Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are at least 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are at least 95% identical to the DNA sequence of the nucleic acid fragments reported herein.
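The 80%, 90%, and 95% identity thresholds can be illustrated with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and counts identity over all aligned columns. The specification does not state which alignment method or identity definition is intended, so both are assumptions here, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions between two pre-aligned sequences.

    Columns where both sequences have a gap are skipped; a gap opposite
    a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def preference_tier(identity):
    """Map a percent identity onto the tiers named in the text, or None."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return None
```

For instance, two aligned 10-base sequences that differ at a single position are 90% identical and fall into the "more preferred" tier under this definition.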
- A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly
Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a Tm of 55°, can be used, e.g., 5× SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5× SSC, 0.5% SDS. Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40-45% formamide, with 5× or 6× SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment, the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably, a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. 
Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
- A “substantial portion” refers to an amino acid or nucleotide sequence which comprises enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford putative identification of that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 215:403-410 (1990); see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides (generally 12 bases or longer) may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises enough of the sequence to afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence. The instant specification teaches partial or complete amino acid and nucleotide sequences encoding one or more particular bacterial proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in the art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.
- For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by nucleic acid fragments that do not share 100% identity with the gene to be suppressed. Moreover, alterations in a gene that result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1× SSC, 0.1% SDS, 65° C.) or moderately stringent conditions, with the sequences exemplified herein. Preferred substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are 80% identical to the DNA sequence of the nucleic acid fragments reported herein. More preferred nucleic acid fragments are 90% identical to the DNA sequence of the nucleic acid fragments reported herein. Most preferred are nucleic acid fragments that are 95% identical to the DNA sequence of the nucleic acid fragments reported herein.
- The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
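By way of illustration only, the base-pairing relationship just defined can be sketched in a few lines. The function name `reverse_complement` is hypothetical, and the sketch assumes an unambiguous DNA sequence written 5′ to 3′.

```python
# Watson-Crick pairing for DNA: A pairs with T, and C pairs with G.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # prints GCAT
```

The sequence is reversed as well as complemented because the two strands of a duplex are antiparallel.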
- The term “percent identity” is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, New York (1993); Computer Analysis of Sequence Data Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, New Jersey (1994); Sequence Analysis in Molecular Biology (von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Stockton Press, New York (1991). Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Preferred computer program methods to determine identity and similarity between two sequences include, but are not limited to, the GCG Pileup program found in the GCG program package, using the Needleman and Wunsch algorithm with their standard default values of gap creation penalty=12 and gap extension penalty=4 (Devereux et al., Nucleic Acids Res. 12:387-395 (1984)), BLASTP, BLASTN, and FASTA (Pearson et al., Proc. Natl. Acad. Sci. USA 85:2444-2448 (1988)). The BLASTX program is publicly available from NCBI and other sources (BLAST Manual, Altschul et al., Natl. Cent. Biotechnol. Inf., Natl. Library Med. (NCBI NLM) NIH, Bethesda, Md. 20894; Altschul et al., J. Mol. Biol. 
215:403-410 (1990); Altschul et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”, Nucleic Acids Res. 25:3389-3402 (1997)). The method to determine percent identity preferred in the instant invention is by the method of DNASTAR protein alignment protocol using the Jotun-Hein algorithm (Hein et al., Methods Enzymol. 183:626-645 (1990)). Default parameters used for the Jotun-Hein method for alignments are: for multiple alignments, gap penalty=11, gap length penalty=3; for pairwise alignments ktuple=2. As an illustration, for a polynucleotide having a nucleotide sequence with at least 95% “identity” to a reference nucleotide sequence, it is intended that the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence. These mutations of the reference sequence may occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence. Analogously, for a polypeptide having an amino acid sequence having at least 95% identity to a reference amino acid sequence, it is intended that the amino acid sequence of the polypeptide is identical to the reference sequence except that the polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the reference amino acid. 
In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a reference amino acid sequence, up to 5% of the amino acid residues in the reference sequence may be deleted or substituted with another amino acid, or a number of amino acids up to 5% of the total amino acid residues in the reference sequence may be inserted into the reference sequence. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
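The 95% illustration above can be made concrete with a short sketch. It assumes the two sequences have already been aligned (for example by one of the programs named above) with gaps written as “-”, and it divides matches by the number of alignment columns; the function name `percent_identity` and that choice of denominator are assumptions for illustration, not the DNASTAR or GCG calculation.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b)
               for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    reference = "ACGT" * 25                 # hypothetical 100 nt reference
    variant = list(reference)
    for position in (0, 20, 40, 60, 80):    # five point substitutions (A to T)
        variant[position] = "T"
    print(percent_identity(reference, "".join(variant)))
```

With five substitutions in a 100-nucleotide reference, the sketch reports 95% identity, matching the "up to five point mutations per each 100 nucleotides" reading given above.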
- The term “percent homology” refers to the extent of amino acid sequence identity between polypeptides. When a first amino acid sequence is identical to a second amino acid sequence, then the first and second amino acid sequences exhibit 100% homology. The homology between any two polypeptides is a direct function of the total number of matching amino acids at a given position in either sequence, e.g., if half of the total number of amino acids in either of the two sequences are the same then the two sequences are said to exhibit 50% homology.
- “Codon degeneracy” refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence of the instant Thauera aromatica proteins as set forth in SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell to use nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
- “Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments that are then enzymatically assembled to construct the entire gene. “Chemically synthesized”, as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determining preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
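The survey-based approach to codon bias described above might be sketched as follows. The sketch assumes the standard genetic code and a list of in-frame host coding sequences supplied by the user; the function names `preferred_codons` and `back_translate` and the example sequence are hypothetical, and choosing the single most frequent codon for every residue is a simplification of how codon usage is matched in practice.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(host_coding_sequences):
    """Map each amino acid to the codon seen most often in the host survey."""
    counts = Counter()
    for cds in host_coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    best = {}
    for codon, aa in CODON_TABLE.items():
        # Ties (including unseen codons) keep the first codon in table order.
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein, best):
    """Encode a protein sequence using the host-preferred codon for each residue."""
    return "".join(best[aa] for aa in protein.upper())


if __name__ == "__main__":
    survey = ["ATGGCGGCGCTGCTGAAATAA"]       # hypothetical host gene
    print(back_translate("MALK", preferred_codons(survey)))
```

A real design would survey many highly expressed host genes and would usually also screen the resulting sequence for unwanted restriction sites or secondary structure.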
- “Gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene, not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism, but which is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.
- “Coding sequence” refers to a DNA sequence that codes for a specific amino acid sequence. “Regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
- “Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. An “enhancer” is a DNA sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg, ( Biochemistry of Plants 15:1-82 (1989)). It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
- The “translation leader sequence” refers to a DNA sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner et al., Mol. Biotech. 3:225 (1995)).
- The “3′ non-coding sequences” refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al., Plant Cell 1:671-680 (1989).
- “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript. Alternatively, the RNA transcript may be an RNA sequence derived from posttranscriptional processing of the primary transcript, in which case it is referred to as the mature RNA. “Messenger RNA” (mRNA) refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a double-stranded DNA that is complementary to and derived from mRNA. “Sense” RNA refers to RNA transcript that includes the mRNA and so can be translated into protein by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, introns, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated, yet has an effect on cellular processes.
- The term “operably-linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably-linked with a coding sequence when it affects the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation.
- The term “expression” refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide. “Antisense inhibition” refers to the production of antisense RNA transcripts capable of suppressing the expression of the target protein. “Overexpression” refers to the production of a gene product in transgenic organisms that exceeds levels of production in normal or non-transformed organisms. “Co-suppression” refers to the production of sense RNA transcripts capable of suppressing the expression of identical or substantially similar foreign or endogenous genes (U.S. Pat. No. 5,231,020).
- “Altered levels” refers to the production of gene product(s) in organisms in amounts or proportions that differ from that of normal or non-transformed organisms.
- “Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al., Meth. Enzymol. 143:277 (1987)) and particle-accelerated or “gene gun” transformation technology (Klein et al., Nature, London 327:70-73 (1987); U.S. Pat. No. 4,945,050).
- The terms “plasmid”, “vector” and “cassette” refer to an extra chromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.
- Novel phenol-induced proteins, F1, F2, and F3, have been isolated. Comparison of their random cDNA sequences to the GenBank database using the BLAST algorithms, well known to those skilled in the art, revealed that F3 (orf1) and orf2 are proteins homologous to phosphoenolpyruvate (PEP) synthase of E. coli and are likely to represent the phenol phosphorylating enzyme E1 (FIG. 1). The nucleotide sequences of the F1, F2, and F3 genomic DNA are provided in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, and their deduced amino acid sequences are provided in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively. F1, F2, and F3 genes from other bacteria can now be identified by comparison of random cDNA sequences to the F1, F2, and F3 sequences provided herein.
- The nucleic acid fragments of the instant invention may be used to isolate cDNAs and genes encoding homologous F1, F2, and F3 phenol-induced proteins from the same or other bacterial species. Isolating homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR) or ligase chain reaction).
- For example, other F1, F2, and F3 genes, either as cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired bacteria using methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant F1, F2, and F3 sequences can be designed and synthesized by methods known in the art (Sambrook, supra). Moreover, entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primers, DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part of or full-length of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length cDNA or genomic fragments under conditions of appropriate stringency.
- In addition, two short segments of the instant ORF's may be used in PCR protocols to amplify longer nucleic acid fragments encoding homologous F1, F2, F3, F4, and F5 genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3′ end of the mRNA precursor encoding bacterial F1, F2, F3, F4, and F5. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., Proc. Natl. Acad. Sci., USA 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al., Proc. Natl. Acad. Sci., USA 86:5673 (1989); Loh et al., Science 243:217 (1989)). Products generated by the 3′ and 5′ RACE procedures can be combined to generate full-length cDNAs (Frohman et al., Techniques 1:165 (1989)).
- Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner et al., Adv. Immunol. 36:1 (1984); Sambrook, supra).
- The enzymes and gene products of the instant ORF's may be produced in heterologous host cells, particularly in the cells of microbial hosts, and can be used to prepare antibodies to the resulting proteins by methods well known to those skilled in the art. The antibodies are useful for detecting the proteins in situ in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the instant enzymes are microbial hosts and include those selected from the following: Comamonas sp., Corynebacterium sp., Brevibacterium sp., Rhodococcus sp., Azotobacter sp., Citrobacter sp., Enterobacter sp., Clostridium sp., Klebsiella sp., Salmonella sp., Lactobacillus sp., Aspergillus sp., Saccharomyces sp., Zygosaccharomyces sp., Pichia sp., Kluyveromyces sp., Candida sp., Hansenula sp., Dunaliella sp., Debaryomyces sp., Mucor sp., Torulopsis sp., Methylobacteria sp., Bacillus sp., Escherichia sp., Pseudomonas sp., Rhizobium sp., and Streptomyces sp. Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for production of any of the gene products of the instant ORF's. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high level expression of the enzymes.
- Additionally, chimeric genes will be effective in altering the properties of the host bacteria. It is expected, for example, that introduction of chimeric genes encoding one or more of the ORF's 1-10 under the control of the appropriate promoters, into a host cell comprising at least one copy of these genes will demonstrate the ability to produce various intermediates in phenol metabolism. For example, the appropriately regulated ORF 1 and ORF 2 would be expected to express an enzyme capable of phosphorylating phenol (phenylphosphate synthase; FIG. 1). Similarly, ORF 4, ORF 6, ORF 7 and ORF 8 would be expected to express an enzyme capable of carboxylating phenylphosphate to afford 4-hydroxybenzoate (phenylphosphate carboxylase; FIG. 1). Finally, expression of SEQ ID NO:23 in a single recombinant organism will be expected to effect the conversion of phenol to 4-hydroxybenzoate in a transformed host (FIG. 1).
- Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.
- Initiation control regions or promoters, which are useful to drive expression of the instant ORF's in the desired host cell are numerous and familiar to those skilled in the art. A promoter capable of driving these genes is suitable for the present invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, trp, λPL, λPR, T7, tac, and trc (useful for expression in Escherichia coli). Useful strong promoters may also be used from Corynebacterium, Comamonas, Pseudomonas, and Rhodococcus.
- Termination control regions may also be derived from various genes native to the preferred hosts. A termination site may be unnecessary; however, it is most preferred that one be included.
- In the denitrifying bacterium Thauera aromatica phenol carboxylation proceeds in two steps and involves formation of phenylphosphate as the first intermediate (Equation 1). Cells grown with phenol were simultaneously adapted to growth with 4-hydroxybenzoate, whereas, vice-versa, 4-hydroxybenzoate-grown cells did not metabolize phenol. Induction of the capacity to metabolize phenol required several hours.
- An enzyme activity catalyzing an isotope exchange of the phenyl moiety of phenylphosphate with free 14C-phenol was identified in extracts of phenol-grown cells (Equation 2), and was lacking in 4-hydroxybenzoate-grown cells. Free 32P-phosphate did not exchange with phenylphosphate. This suggests a phosphorylated enzyme E1 (Equations 3 and 4) which becomes phosphorylated in an essentially irreversible step (Equation 5). The phosphorylated enzyme transforms phenol to phenylphosphate in a reversible reaction (Equation 6). The whole reaction is understood as the sum of Equation 5 and Equation 6. The phosphoryl donor X~P is unknown so far. The enzyme E1 is termed phenol kinase.
- Phenylphosphate is the substrate of a second enzyme E2, phenylphosphate carboxylase. It requires K+ and Mn2+ and catalyzes the carboxylation of phenylphosphate to 4-hydroxybenzoate (Equation 7). An enzyme activity catalyzing an isotope exchange between the carboxyl of 4-hydroxybenzoate and free 14CO2 (Equation 8) was present in phenol-grown cells. Free 14C-phenol did not exchange. This suggests an enzyme E2-phenolate intermediate (Equations 9 and 10) which is formed in a presumably exergonic reaction (Equation 11) followed by the reversible carboxylation (Equation 12). The actual substrate is CO2 rather than bicarbonate, and the carboxylating enzyme was not inhibited by avidin; both results suggest that biotin is not involved in carboxylation. The enzyme E2 is termed phenylphosphate carboxylase.
- The present invention is further defined in the following Examples, in which all parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usage and conditions.
- Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989 (hereinafter “Sambrook”); by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).
- Manipulations of genetic sequences were accomplished using the suite of programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.) and PC/Gene©: the nucleic acid and protein sequence analysis software system, A. Bairoch, University of Geneva, Switzerland, Intelligenetics™ Inc. Serial Number IGI2626/Version 6.70; programs used were as follows: REFORM—sequence file conversion program, Version 4.3, February 1991; RESTRI—restriction site analysis; NMANIP—simple nucleic acid sequence manipulations (inverse and complement the sequence); HAIRPIN—search for hairpin loops in a nucleotide sequence; default parameters: minimum stem size: 5, lower range of number of unpaired bases: 3, upper range of number of unpaired bases: 20, allowed basepairs: G-C, A-T (A-U).
- In the denitrifying bacterium Thauera aromatica, phenol carboxylation proceeds in two steps and involves formation of phenylphosphate as the first intermediate (FIG. 1). Cells grown with phenol were simultaneously adapted to growth with 4-hydroxybenzoate, whereas, conversely, 4-hydroxybenzoate-grown cells did not metabolize phenol. Induction of the capacity to metabolize phenol required several hours. The enzyme system acts not only on 4-hydroxybenzoate/phenol (100%), but also on protocatechuate/catechol (30%), o-cresol (30%), 2-chlorophenol (75%) and 2,6-dichlorophenol (30%). The enzyme specifically catalyzes a para-carboxylation, and anaerobic growth of the organism on phenolic compounds and nitrate requires CO2.
- Both the phosphorylating and the carboxylating enzymes (E1 and E2, respectively) are strictly regulated. All activities were present only after anoxic growth of cells on phenol and were lacking after growth on 4-hydroxybenzoate. Further metabolism of 4-hydroxybenzoate proceeds via benzoyl-CoA in two steps, as shown in FIG. 1.
- Thauera aromatica (K 172) was cultured anaerobically at 30° C. in a mineral salt medium (1.08 g/L KH2PO4, 5.6 g/L K2HPO4, 0.54 g/L NH4Cl) supplemented with 0.1 mM CaCl2, 0.8 mM MgSO4, 1 mL/L vitamin solution (cyanocobalamin 100 mg/L, pyridoxamine-2 HCl 300 mg/L, Ca-D(+)-pantothenate 100 mg/L, thiamine dichloride 200 mg/L, nicotinate 200 mg/L, 4-aminobenzoate 80 mg/L, D(+)-biotin 20 mg/L) and 1 mL/L of a solution of trace elements (25% HCl 10 mL/L, FeCl2.4H2O 1.5 g/L, ZnCl2 70 mg/L, MnCl2.4H2O 100 mg/L, CoCl2.6H2O 100 mg/L, CuCl2.2H2O 2 mg/L, NiCl2.6H2O 24 mg/L, Na2MoO4.2H2O 36 mg/L, H3BO3 6 mg/L). Phenol (0.5 mM) and NaHCO3 (10 mM) were added as sole source of carbon and energy, and 2 mM NaNO3 was added as the terminal electron acceptor. Note: All media, supplements and substrates were strictly anaerobic.
- Escherichia coli strains XL1-Blue [(F′, proAB, lacIqZΔM15, Tn10, tetR), gyrA96, hsdR17, recA1, relA1, thi-1, Δ(lac), Lambda-], K38 [hfrC, ompF267, phoA4, pit-10, relA1] and P2392 [hsdR514, supE44, supF58, lacY1, galK2, galT22, metB1, trpR55, mcrA, P2 lysogen] were cultured in Luria-Bertani medium at 37° C. (Sambrook). Antibiotics were added to E. coli cultures to the following final concentrations: kanamycin 50 μg/mL, ampicillin 50 μg/mL and tetracycline 20 μg/mL.
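The bulk salts of the mineral salt medium are given in g/L, whereas the supplements are given in mM. The following is a minimal sketch of the conversion, not part of the original protocol; the molar masses are standard literature values supplied here as an assumption, and the names `MOLAR_MASS_G_PER_MOL` and `to_millimolar` are illustrative.

```python
# Molar masses (g/mol); standard values, not taken from the source text.
MOLAR_MASS_G_PER_MOL = {"KH2PO4": 136.09, "K2HPO4": 174.18, "NH4Cl": 53.49}

# Concentrations of the mineral salt medium as stated in the text (g/L).
MEDIUM_G_PER_L = {"KH2PO4": 1.08, "K2HPO4": 5.6, "NH4Cl": 0.54}


def to_millimolar(grams_per_litre, molar_mass):
    """Convert a mass concentration (g/L) to a molar concentration (mM)."""
    return grams_per_litre / molar_mass * 1000.0


if __name__ == "__main__":
    for salt, g_per_l in MEDIUM_G_PER_L.items():
        mm = to_millimolar(g_per_l, MOLAR_MASS_G_PER_MOL[salt])
        print(f"{salt}: {g_per_l} g/L is about {mm:.1f} mM")
```

Under these assumptions the phosphate buffer comes out at roughly 40 mM total phosphate with about 10 mM ammonium.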
- The assay conditions were as follows: 20 mM imidazole/HCl (pH 6.5), 20 mM KCl, 0.5 mM MnCl2, 2 mM 4-hydroxybenzoate, 50 μmol CO2 (50 μL 1 M NaHCO3 per 1 mL assay), 25 μL soluble fraction (see Example 4) per 1 mL assay. The reaction was started by addition of 10 μL 14C-Na2CO3 (7 kBq; specific radioactivity 80 nCi/mmol). After 5 min incubation at 30° C. the reaction was stopped by the addition of 30 μL 3 M perchloric acid per 250 μL sample. The precipitated proteins were centrifuged down and the supernatant was acidified with 150 μL 10 M formic acid. The mixture was incubated under a steady flow of CO2 (10 mL/min) to remove all the 14CO2 that was not fixed in the reaction. After 15 min, 150 μL 1 M KHCO3 was added and the mixture was incubated for another 15 min under a steady flow of CO2 (10 mL/min). The amount of non-volatile labeled product formed (4-hydroxybenzoate:14CO2) was analyzed by liquid scintillation counting.
- Measurement of the 4-hydroxybenzoate:14CO2 isotope exchange in the soluble fraction of cells grown on phenol or on 4-hydroxybenzoate, respectively, was performed using the assay described below:
50 mM MnCl2                   10 μL
2 M KCl                       10 μL
1 M NaHCO3                    50 μL
0.2 M 4-hydroxybenzoate       10 μL
20 mM imidazole/HCl pH 6.5    895 μL
soluble fraction              25 μL
14C-Na2CO3                    10 μL (≅3923 Bq)
- Following incubation for 4 min at 30° C., 3.0 mL of scintillation cocktail was added to a 200 μL sample treated as described above, and the amount of 14C was counted in a liquid scintillation counter for 5 min. The output of the scintillation counter was:
sample                           cpmA   cpmB   scr**   dpmA   dpmB   % A*   % B*
Phenol grown cells               276    1659   0.168   0      1900   .00    87.32
4-hydroxybenzoate grown cells    6      20     0.318   0      25     .00    79.44
no cell extract (control)        5      11     0.386   0      15     .00    75.97
- Calculation of the activity (nmol min−1 mg−1): total incorporation of 14CO2 would result in a value of 235380 dpm (disintegrations per minute, 60×3923 Bq) per 50 μmol NaHCO3 in 1 mL assay. 1900 dpm (see table, dpmB) correspond to 32 Bq, which means 382 nmol/4 min×200 μL sample. A 200 μL sample contains about 5 μL soluble fraction. The protein concentration of the soluble fraction of phenol-grown cells is about 62 mg/mL. Therefore, a 200 μL sample corresponds to 310 μg soluble fraction. The specific activity was determined to be 308 nmol/min/mg protein.
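The arithmetic of the activity calculation can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the applicants' own worksheet: the function names are hypothetical, and simple proportionality between counted dpm and exchanged CO2 is assumed. Note that direct proportionality applied to 1900 dpm gives about 404 nmol, somewhat above the 382 nmol quoted in the text; the final step therefore uses the quoted 382 nmol to reproduce the reported 308 nmol/min/mg.

```python
BQ_ADDED = 3923            # 14C added per 1 mL assay, from the pipetting scheme
DPM_TOTAL = 60 * BQ_ADDED  # 235380 dpm, as stated in the text
CO2_POOL_NMOL = 50_000     # 50 umol NaHCO3 per 1 mL assay


def nmol_exchanged(dpm_sample, dpm_total=DPM_TOTAL, pool_nmol=CO2_POOL_NMOL):
    """CO2 exchanged, assuming label incorporation is proportional to dpm."""
    return dpm_sample / dpm_total * pool_nmol


def protein_mg(volume_ul, conc_mg_per_ml):
    """Protein mass (mg) contained in a given volume of soluble fraction."""
    return volume_ul / 1000.0 * conc_mg_per_ml


def specific_activity(nmol, minutes, mg_protein):
    """Specific activity in nmol per min per mg protein."""
    return nmol / (minutes * mg_protein)


if __name__ == "__main__":
    mg = protein_mg(5, 62)  # 5 uL soluble fraction at 62 mg/mL gives 0.31 mg
    print(f"proportional estimate: {nmol_exchanged(1900):.0f} nmol")
    print(f"specific activity from quoted 382 nmol: "
          f"{specific_activity(382, 4, mg):.0f} nmol/min/mg")
```

The same three functions apply to the phenylphosphate carboxylase assay if the CO2 pool is set to 25 μmol.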
- Phenylphosphate is the substrate of the second enzyme E2, phenylphosphate carboxylase. It requires K+ and Mn2+ and catalyzes the carboxylation of phenylphosphate to 4-hydroxybenzoate. The assay conditions were as follows: 20 mM imidazole/HCl (pH 6.5), 20 mM KCl, 0.5 mM MnCl2, 2 mM phenylphosphate, 25 μmol CO2 (25 μL 1 M NaHCO3 per 1 mL assay), 25 μL soluble fraction (see Example 4) per 1 mL assay. The reaction was started by addition of 205 μL 14C-Na2CO3 (14 kBq; specific radioactivity 250 nCi/mmol). After 5 min incubation at 30° C. the reaction was stopped by the addition of 30 μL 3 M perchloric acid per 250 μL sample. The precipitated proteins were centrifuged down and the supernatant was acidified with 150 μL 10 M formic acid. The mixture was incubated under a steady flow of CO2 (10 mL/min) to remove all the 14CO2 that was not fixed in the reaction. After 15 min, 150 μL of 1.0 M KHCO3 was added and the mixture was incubated for another 15 min under a steady flow of CO2 (10 mL/min). The amount of non-volatile labeled product formed was analyzed by liquid scintillation counting.
- See the description in Example 2, with the difference that 0.2 M phenylphosphate was used instead of 4-hydroxybenzoate and 25 μL of 1 M NaHCO3 was used instead of 50 μL. The output of the scintillation counter was:
sample               cpmA   cpmB   scr**   dpmA   dpmB   % A*   % B*
phenol               21     114    0.199   0      134    .00    85.65
4-hydroxybenzoate    7      19     0.360   0      24     .00    77.28
no extract           5      11     0.386   0      15     .00    75.97
- The carboxylase activity was calculated as described in Example 2, taking into account the fact that 3923 Bq (235380 dpm)≅25 μmol incorporated 14CO2 per 1 mL assay. The specific activity was determined to be 10 nmol/min/mg.
- Thauera aromatica (K 172) was cultured anaerobically at 30° C. with 0.5 mM phenol and 10 mM NaHCO3 as sole source of carbon and energy, and with 2 mM NaNO3 as the terminal electron acceptor. The bacterial cells were harvested, and 20 g of cells were resuspended in 20 mL of 20 mM imidazole/HCl (pH 6.5), 10% glycerol, 0.5 mM dithionite and traces of DNase I, disrupted (French Press, 137.6 MPa) and ultracentrifuged (100 000× g). The supernatant with the soluble protein fraction contained all the 4-hydroxybenzoate:14CO2-exchange activity (383 nmol min−1 mg−1) and phenylphosphate carboxylase activity (10 nmol min−1 mg−1). The supernatant was loaded on a DEAE Sepharose fast flow chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). FIG. 2 shows the results of SDS-PAGE (12.5%) of fractions obtained after chromatography of the soluble fraction of K172 (grown anaerobically on phenol). A total of 20 μg protein was loaded per lane. Lane 1: K172 grown on 4-hydroxybenzoate/NO3− (100 000× g supernatant); Lane 2: K172 grown on phenol/NO3− (100 000× g supernatant). Lanes 1 and 2 show that three dominant phenol-induced proteins, F1, F2, and F3, were separated. F1, F2, and F3 were identified by molecular weight: F1≈60 kDa, F2≈58 kDa, F3≈67 kDa. Lane 3: pooled fractions containing F1; Lane 4: pooled fractions containing F2; Lanes 5-7: fractions 17-19; Lanes 8-10: fractions 53-55; Lane 11: proteins that did not bind to DEAE; and Lane 12: fraction 84 containing F3.
- The fractions containing F1 after chromatography on DEAE Sepharose were pooled and loaded on a MonoQ chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). The fractions containing F1 were then pooled and blotted to an Immobilon-PSQ transfer membrane (Millipore, Bedford, Mass.). After staining of the PVDF membrane with Coomassie Blue, the F1 band was excised and sequenced using an Applied Biosystems 473A sequencer (Table 1).
- The fractions containing F2 were subjected to peptide and N-terminal sequencing. For peptide sequencing, the fractions containing F2 after chromatography on DEAE Sepharose were pooled and loaded on a Blue Sepharose chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). The fractions containing F2 were then pooled and digested with modified trypsin (Promega, Mannheim, Germany). The trypsin digest was done according to the following procedure: 500 μg protein in 200 μL of 20 mM Tris/HCl, pH 7.5, was adjusted to pH 8 with 3 μL of triethylamine. 10 μg trypsin in 10 μL H2O (Promega sequencing grade modified, catalog #V5111) were added. The digest was carried out at 37° C. for 4 h. The reaction was stopped by heating for 5 min to 100° C. After centrifugation, 5 μL, 70 μL and 100 μL, respectively, were applied to the HPLC. The peptides generated were separated on a reverse phase C-18 Superpac-Sephasil high performance liquid chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). Fractions containing well resolved peptides were sequenced (Table 2).
- For N-terminal sequencing, the pooled fractions containing F2 after chromatography on DEAE Sepharose were loaded on a MonoQ chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). The fractions containing F2 were then pooled and blotted to an Immobilon-PSQ transfer membrane (Millipore, Bedford, Mass.). After staining of the PVDF membrane with Coomassie Blue, the F2 band was excised and sequenced using an Applied Biosystems 473A sequencer (Table 1).
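The tryptic digest described above can be mimicked in silico. The sketch below assumes the common simplified trypsin rule (cleavage C-terminal to Lys or Arg unless the next residue is Pro) and is applied to residues 155-176 of the deduced F2 sequence (SEQ ID NO:3); the function name `trypsin_digest` is illustrative and real digests also produce missed cleavages.

```python
def trypsin_digest(protein):
    """Split a protein after every K or R that is not followed by P.

    This is the usual simplified cleavage rule; missed cleavages and
    other enzyme-specific effects are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


if __name__ == "__main__":
    # Residues 155-176 of SEQ ID NO:3 in one-letter code.
    f2_segment = "GTYRMQMLDDKRCGVQILPGKR"
    print(trypsin_digest(f2_segment))
```

One of the predicted fragments, MQMLDDK, matches the sequenced peptide listed as .MQMLD DK. in Table 2.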
- After chromatography on DEAE Sepharose, the pooled fractions containing F3 were loaded on a MonoQ chromatography column (Amersham Pharmacia Biotech, Uppsala, Sweden). The fractions containing F3 were pooled and blotted to an Immobilon-PSQ transfer membrane (Millipore, Bedford, Mass.). After staining of the PVDF membrane with Coomassie Blue, the F3 band was excised and sequenced using an Applied Biosystems 473A sequencer (Table 1).
TABLE 1
Columns: protein; N-terminal amino acid sequence (Applied Biosystems 473A Sequencer)*; N-terminal amino acid sequence deduced from the genes
F1; gKISA PKNNR EFIEA sVKSG DAVRI RQEVD WDNEA GAIVr PA (SEQ ID NO: 24); MGKIS APKNN REFIE ACVKS GDAVR I (SEQ ID NO: 25)
F2; MDLRY FINQX AEAHE LKRIT TEVDW NLEIS HVsKL XXe (SEQ ID NO: 26); MDLRY FINQC AEAHE LKRIT TEVDW NLEIS HVSKL TEE (SEQ ID NO: 27)
F3; MKFPV PHDIQ AKTIP GTEGw ERMYP XXXAF VXd (SEQ ID NO: 28); MKFPV PHDIQ AKTIP GTEGW ERMYP YHYQF VTD (SEQ ID NO: 29)
TABLE 2
Internal Fragments of F2 by Trypsin Digest: Amino Acid Sequence
.FHEGG gg.
.MQMLD DK. (SEQ ID NO: 30)
.QVADA VIASN TGSYg M.
.FWSVV DER.
.IXTEV DWNLE ISXV.
.TATLW TELEQ MR.
.YIGTM VSVVL YDPET GR.
.GQQAE FLMAX XXXXP VXAGA EIVLE XGI. (SEQ ID NO: 31)
.GQQAE FLM..
- On the basis of the N-terminal amino acid sequences of F1, F2, and F3 and of the internal fragments of F2 (Example 4), degenerate oligonucleotides were designed. The oligonucleotides F2-forward (N-terminus) (SEQ ID NO:32; ATG-GA T C-CTG C-CGC G-TAC-TTC-ATC), F2T6-reverse (SEQ ID NO:33; TT-G ATC-G ATC-G CAG-CAT-CTG-CAT) and F2T43-reverse (SEQ ID NO:34; CAT-C GAG-GAA-T CTC-GCGC-CTG-CTG) (both internal fragments) were used as primers in a polymerase chain reaction (PCR) with genomic DNA of Thauera aromatica as target. PCR conditions were as follows: 100 ng target, 200 nM each primer, 200 μM each of dATP, dCTP, dTTP, dGTP, 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris/HCl (pH 9.0), 1 unit Taq-DNA-Polymerase (Amersham Pharmacia Biotech, Uppsala, Sweden). PCR parameters were as follows: 95° C. 30 sec, 40° C. 1 min, 72° C. 2.5 min, 30 cycles. The PCR products were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification.
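A reverse primer derived from an internal peptide is the reverse complement of the coding strand for that peptide. The sketch below illustrates this for the tryptic peptide MQMLDDK; the particular codons chosen are an illustrative assumption (the actual F2T6-reverse primer is degenerate at several positions), and `reverse_complement` is a hypothetical helper name.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # One possible coding sequence for M-Q-M-L-D-D-K (last codon truncated
    # to its first two bases); the codon choice is an assumption.
    coding = "ATG" "CAG" "ATG" "CTG" "GAC" "GAC" "AA"
    primer = reverse_complement(coding)
    print(primer)  # read 5' to 3'
```

With this codon choice the result, TTGTCGTCCAGCATCTGCAT, is consistent with the non-degenerate positions of F2T6-reverse as listed (TT-...-CAG-CAT-CTG-CAT).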
- The purified PCR product (F2-forward/F2T43-reverse), approximately 750 bp in size, was sequenced and confirmed to correspond to the N-terminus of F2. The PCR product was labeled with [32P]-dCTP and used as a probe for screening a λEMBL3 gene library of Thauera aromatica. One positive phage of about 11 kb was detected, prepared and restricted with BamHI, EcoRI and PstI. The digests were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification of the restriction fragments. The purified fragments were ligated into the corresponding pBluescript vector KS(+) [Apr, lacZ, f1, ori] restricted with BamHI, EcoRI and PstI, respectively. The ligation mix was used to transform competent E. coli XL1-Blue, which was plated onto LB plates supplemented with IPTG, X-Gal and 50 μg/mL ampicillin. Plasmid DNA was prepared from several white colonies (clones 8, 9, and 19; FIGS. 3, 4, and 5, respectively) and sequenced by the dideoxy termination protocol using T7 and T3 primers (SEQ ID NO:35: 3′ CGGGATATCACTCAGCATAATG 5′ and SEQ ID NO:36: 5′ AATTAACCCTCACTAAAGGG 3′, respectively). Nucleotide sequence analysis confirmed that the amino acid sequences deduced from the genes corresponded to the N termini of F1, F2, and F3.
- The oligonucleotide designated breib31 (SEQ ID NO:37; 5′ GACAACTTCGTCGTCAA 3′) and the oligonucleotide designated breib07r3 (SEQ ID NO:38; 5′ GTGGATATTGGCTTCGGAAA 3′) were used as primers in a PCR with genomic DNA of Thauera aromatica as target. PCR conditions were as described in Example 5. The PCR product was subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification. The purified PCR product, approximately 500 bp in size, was labeled with [32P]-dCTP and used as a probe for screening a λEMBL3 gene library of Thauera aromatica. Two positive phages were detected. The phage DNA was prepared and restricted with BamHI, EcoRI and PstI. The digests were subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification of the restriction fragments. The purified fragments were ligated into the corresponding pBluescript vector KS(+) [Apr, lacZ, f1, ori] restricted with BamHI, EcoRI and PstI, respectively. The ligation mix was used to transform competent E. coli XL1-Blue, which was plated onto LB plates supplemented with IPTG, X-Gal and 50 μg/mL ampicillin. Plasmid DNA was prepared from several white colonies (clone 2 with a 9 kb BamHI insert and clone 7 with a 3.7 kb PstI insert, as described in FIGS. 6 and 7) and sequenced by the dideoxy termination protocol using the T3 primer (SEQ ID NO:36). DNA sequences upstream of the known sequences were revealed by DNA analysis (FIG. 12).
- The oligonucleotide designated λ15-forward (SEQ ID NO:39; 5′ TCGCCGGCGACGACGCCG 3′) and the oligonucleotide designated λ15-reverse (SEQ ID NO:40; 5′ CCGCGCGCTGCGCCGCCG 3′) were used as primers in a PCR with genomic DNA of Thauera aromatica as target. PCR conditions were as follows: 100 ng target, 200 nM each primer, 200 μM each of dATP, dCTP, dTTP, dGTP, (NH4)2SO4, KCl, 4.5 mM MgCl2, 10 mM Tris/HCl (pH 8.7), 1× Q solution, 1 unit Taq-DNA-Polymerase (Qiagen, Hilden, Germany). PCR parameters were as follows: 95° C. 30 sec, 45° C. 1 min, 72° C. 2.5 min, 30 cycles. The PCR product was subjected to ethidium bromide agarose gel electrophoresis followed by excision and purification. The purified PCR product, approximately 600 bp in size, was labeled with [32P]-dCTP and used as a probe for screening a λZAP Express gene library (Stratagene, Heidelberg, Germany) of Thauera aromatica. One positive clone was detected. The phagemid was prepared according to the manufacturer's protocol and restricted with SalI/EcoRI. After ethidium bromide agarose gel electrophoresis of the digest, the DNA insert was estimated to be 9 kb in size (clone 1, FIG. 8). The restricted DNA was blotted and hybridized with the [32P]-labeled probe described above. A fragment of approximately 1 kb was detected. DNA sequences downstream of the known sequences were revealed by DNA analysis (FIG. 12).
- A 3.7-kb PstI fragment, a 2.7-kb BamHI fragment, a 4.0-kb BamHI fragment, a 5.25-kb EcoRI fragment and a 9 kb BamHI fragment were each ligated to the corresponding pBluescript KS(+) [Apr, lacZ, f1, ori] vector restricted with BamHI, PstI and EcoRI, respectively (FIGS. 7, 3, 5, and 4, respectively). The plasmids were transformed into competent E. coli XL1-Blue. Plasmid DNA purified by the alkaline lysis method was sequenced by the dideoxy termination protocol using T7 and T3 primers (SEQ ID NO:35 and SEQ ID NO:36, respectively) and then by primer walking.
About 14 kb (SEQ ID NO:23) were sequenced which contained two gene clusters that appear to be involved in phenol metabolism.
- The nucleotide sequences of F1, F2, and F3 are provided in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, respectively, and their deduced amino acid sequences are provided in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, respectively. Nucleotide and amino acid sequences were analyzed using the PC/gene software package (Genofit). Homologous sequences were identified using the BLAST (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 215:403-410 (1990)) search using the TBLASTN algorithm provided by the National Center for Biotechnology Information (Table 4 and FIG. 13).
- F3 shows homology to phosphoenolpyruvate (PEP) synthase. The reaction catalyzed by this enzyme is shown in FIG. 11. First, PEP synthase is phosphorylated by ATP, with AMP and Pi being the products. In a second step, the phosphorylated enzyme transfers the β-phosphoryl group of ATP to pyruvate. This reaction may be similar to the proposed reaction mechanism of the phenol kinase, whereby phenol ultimately becomes phosphorylated.
- F1, F2, and F5 show good homology to ubiD, a gene that codes for 3-octaprenyl-4-hydroxybenzoate decarboxylase. This enzyme is involved in the biosynthesis of ubiquinone. The reaction catalyzed is shown in FIG. 11. This reaction is analogous to the reverse reaction of the postulated carboxylation of phenol.
- A 3.7-kb PstI fragment contains: orf1 (SEQ ID NO:6), which codes for F3 protein (SEQ ID NO:5), and orf2 (SEQ ID NO:12), which codes for an unknown protein (SEQ ID NO:11). A 2.7-kb BamHI fragment contains: orf3 (SEQ ID NO:14), which codes for an unknown protein (SEQ ID NO:13), and orf4 (SEQ ID NO:4), which codes for F2 protein (SEQ ID NO:3). A 4.0-kb BamHI fragment contains: orf5 (SEQ ID NO:8), which codes for F4 protein (SEQ ID NO:7), orf6 (SEQ ID NO:2), which codes for F1 protein (SEQ ID NO:1), and orf7 (SEQ ID NO:16), which codes for an unknown protein (SEQ ID NO:15). A 5.25-kb EcoRI fragment contains: orf7 (SEQ ID NO:16), which codes for an unknown protein (SEQ ID NO:15), orf8 (SEQ ID NO:10), which codes for F5 protein (SEQ ID NO:9), orf9 (SEQ ID NO:18), which codes for an unknown protein (SEQ ID NO:17), and orf10 (SEQ ID NO:20), which codes for an unknown protein (SEQ ID NO:19). Each restriction fragment was ligated into pBluescript SK.
- For expression of the genes, the recombinant plasmids were transformed into E. coli K38 containing the plasmid pGP1-2 [kanr, cI857 T7Gen1(RNA Polymerase)] (Tabor and Richardson, 1985). Cells were grown in 1 mL Luria-Bertani medium plus ampicillin and kanamycin at 30° C. to an absorbance of 0.5 at 600 nm, washed in Werkman minimal medium (Fraenkel and Neidhardt, 1961) and resuspended in 5 mL Werkman minimal medium containing 0.01% (mass/volume) amino acids except cysteine and methionine. After incubation for 1-2 h at 30° C. the temperature was shifted to 42° C. to induce expression of T7 polymerase. After 15 min, E. coli RNA synthesis was stopped by addition of 200 μg rifampicin/mL. The cells were incubated for 10 min at 42° C. and for a further 20 min at 30° C. to ensure degradation of E. coli mRNA. Aliquots of 1 mL of the induced culture were subsequently pulse-labeled with 10 μCi [35S]methionine (Amersham) for 5 min at 30° C. Cells were centrifuged, resuspended in 120 μL sample buffer and lysed by 5 min incubation at 95° C. Labeled proteins were separated by sodium dodecyl sulfate gel electrophoresis and localized by autoradiography. FIG. 9 shows the experimentally determined molecular masses of the proteins: expression of F1-F5 in E. coli (T7 experiment). 25 μL were loaded on each lane. Lanes 1, 4, 7: marker proteins; Lane 2: proteins (F3 and unknown) coded by the 3.7 kb PstI fragment containing orf1 and orf2, respectively; Lane 3: proteins (unknown and F2) coded by the 2.7 kb BamHI fragment containing orf3 and orf4, respectively; Lane 5: proteins (F5 and 3 unknowns) coded by the 5.25 kb EcoRI fragment containing orf8, orf7, orf9 and orf10, respectively; and Lane 6: proteins (F1, F4 and unknown) coded by the 4.0 kb BamHI fragment containing orf6, orf5 and orf7. The predicted molecular masses agreed reasonably well with the experimentally determined molecular masses of FIG. 9.
- 120 μg of the soluble fraction of cells that were grown on phenol/nitrate and of cells grown on 4-hydroxybenzoate, respectively, were lysed in 10 μL lysis buffer (9.5 M urea, 2% (w/v) CHAPS, 0.8% (w/v) ampholytes pH 3-10 (40% (w/v); Biorad), 1% (w/v) DTT, traces of bromophenol blue) and applied to a rehydrated Immobiline Dry Strip (linear pH gradient 3-10; Pharmacia) according to the manufacturer's protocol (rehydration buffer: 8 M urea, 0.5% (w/v) CHAPS, 15 mM DTT, 0.2% (w/v) ampholytes pH 3-10 (40% (w/v); Biorad)). The horizontal isoelectric focusing was run overnight (15 h, 1400 V). After the first dimension, the Immobiline Dry Strips were equilibrated twice for 15 min in equilibration buffer (0.05 M Tris/HCl pH 8.8, 6 M urea, 30% (w/v) glycerol, 2% (w/v) SDS, traces of bromophenol blue and 10 mg/mL DTT or 48 mg/mL iodoacetamide, respectively). The second dimension was a vertical SDS polyacrylamide gel electrophoresis (11.5% polyacrylamide) indicating phenol-induced proteins (FIG. 10). The proteins were blotted to a PVDF membrane and stained with Coomassie Blue. The phenol-induced proteins F4 and F5 were excised and N-terminally sequenced using an Applied Biosystems 473A sequencer (Table 3). Analysis of the amino acid sequence and translation into nucleotide sequence confirmed the genes encoding F4 and F5.
Furthermore, the predicted molecular masses agreed reasonably well with the experimentally determined masses.
TABLE 3
Columns: protein; N-terminal amino acid sequence (Applied Biosystems 473A Sequencer); N-terminal amino acid sequence deduced from the genes
F4; MEQAK NIKLV (SEQ ID NO: 42); MEQAK NIKLV (SEQ ID NO: 41)
F5; MRIVV GMXGA (SEQ ID NO: 44); MRIVV GMSGA (SEQ ID NO: 43)
- About 14 kb of the λEMBL3 gene library were sequenced (SEQ ID NO:23). The nucleotide sequence was analyzed with the ORF Finder (Open Reading Frame Finder) (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) to find the open reading frames (ORFs). Eleven ORFs were detected (orfs1-10 and orf-1) as shown in FIG. 11.
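The principle of an open reading frame search can be sketched as follows. This is a minimal illustration and not the algorithm of the NCBI ORF Finder: it assumes ATG as the only start codon and the three standard stop codons, scans all six reading frames, and reports coordinates on the scanned strand. The names `find_orfs` and `min_codons` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase, unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, sequence) for each ATG...stop ORF.

    Coordinates are 0-based, end-exclusive, and refer to the strand that
    was scanned ("+" is the input, "-" its reverse complement).
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        found.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return found


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

A real analysis of a bacterial sequence would normally also allow alternative start codons and a much larger minimum ORF length.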
- Analysis of the sequence revealed 10 ORFs that were transcribed in the same direction. The first six ORFs were separated by less than 65 bp and totaled 7210 bp. This cluster of putative genes was followed by a 658 bp non-coding region containing putative secondary structures.
- Another cluster of putative genes followed, which also showed intergenic regions of less than 40 bp. Downstream of orf10, 470 bp were sequenced; however, this region appeared not to code for proteins. Upstream of orf1 and transcribed in the opposite direction, another putative gene was found, which was separated by 428 bp from orf1.
- The nucleotide sequence of an ORF is automatically translated into an amino acid sequence by the ORF Finder. Comparison of the deduced amino acid sequences of orf1-10 and orf-1 (see FIG. 11) with the experimentally determined N-terminal amino acid sequences of phenol-induced proteins and the internal sequences revealed that the following ORFs coded for known proteins: orf1 (SEQ ID NO:6) for F3, orf4 (SEQ ID NO:4) for F2, orf5 (SEQ ID NO:8) for F4, orf6 (SEQ ID NO:2) for F1 and orf8 (SEQ ID NO:10) for F5. The predicted molecular masses agreed reasonably well with the experimentally determined masses (FIG. 10).
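The comparison between a deduced and an experimentally determined N-terminus can be illustrated with the first 60 nucleotides of orf6 (SEQ ID NO:2) and the Edman read of F1 from Table 1. The sketch assumes the standard genetic code; the helper names are illustrative, and the Edman read is simply upper-cased because the meaning of the lowercase letters is not defined in this excerpt.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna):
    """Translate DNA in frame 0, stopping before the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def count_mismatches(a, b):
    """Count differing positions over the shared length of two strings."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    orf6_start = "atgggaaagatttcagcaccgaaaaacaaccgtgaattcatcgaggcatgcgtcaagtcc"
    deduced = translate(orf6_start)
    edman_f1 = "gKISAPKNNREFIEAsVKSG".upper()  # read starts at residue 2
    print(deduced)
    print(count_mismatches(deduced[1:], edman_f1))
```

The deduced sequence begins MGKISAPKNNREFIEACVKS, and over the compared stretch the Edman read differs from it at a single position (the residue deduced as Cys).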
- The deduced amino acid sequences of the ORFs were analyzed by using the BLAST search (Basic Local Alignment Search Tool; Altschul et al., J. Mol. Biol. 215:403-410 (1990)) using the BLASTP 2.0.8 algorithm (http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast) provided by the National Center for Biotechnology Information and by using the BLAST+BEAUTY searches using the NCBI BLAST Server (http://dot.imgen.bcm.tmc.edu:9331/seq-search/Options/beauty_pp.html) (Tables 4 and 5). Table 4 contains homologous hits and Table 5 contains hits with the highest homology.
- orf1 (SEQ ID NO:6) and orf2 (SEQ ID NO:12) are likely to encode the phenol-phosphorylating enzyme E1. This conclusion is deduced from the high similarity of the genes with the domains of PEP synthase of E. coli. PEP synthase catalyzes a similar phosphorylation reaction (FIGS. 1 and 11).
- orf4 (SEQ ID NO:4), orf6 (SEQ ID NO:2), orf7 (SEQ ID NO:16) and orf8 (SEQ ID NO:10) are likely to represent the carboxylating enzyme E2. This conclusion is deduced from the high similarity of the genes with two enzymes of E. coli that catalyze the decarboxylation of a 4-hydroxybenzoate isoprene derivative to the corresponding phenolic product (ubiD and ubiX). This reaction is formally equal to the phenol carboxylation reaction (FIGS. 1 and 11).
- The functions of the proteins encoded by orf3 (SEQ ID NO:14), orf5 (SEQ ID NO:8), orf9 (SEQ ID NO:18) and orf10 (SEQ ID NO:20) are unknown, and these proteins have low homology to other known sequences.
TABLE 4
Columns: ORF (size); similarity identified (size); SEQ ID nucleotide; SEQ ID amino acid; % identity(a); % similarity(b); E-value(c)
−1 (582 aa); gnl|PID|d1010531 (D63814) pheR [Pseudomonas putida] (563 aa); 22; 21; 47.2; 72.3; 1e-20
1 (612 aa); gi|147146 (M69116) PEP synthase [E. coli] (793 aa); 6; 5; 16.7; 39.3; 4e-10
2 (233 aa); gi|147146 (M69116) PEP synthase [E. coli] (793 aa); 12; 11; 21.8; 34.5; 1e-63
3 (223 aa); gi|2621183 (AE000803) inosine-5′-monophosphate dehydrogenase [Methanobacterium thermoautotrophicum] (484 aa); 14; 13; 14.5; 30.2; 1e-8
4 (472 aa); gi|549586|sp|P26615| yigC [E. coli] (497 aa); 4; 3; 30.8; 58.95; 5e-47
5 (169 aa); gi|2851406|sp|P45396| yrbI [E. coli] (188 aa); 8; 7; 38.8; 63.8; 2e-25
6 (485 aa); gi|549586|sp|P26615| yigC [E. coli] (497 aa); 2; 1; 29.4; 57.1; 1e-31
7 (357 aa); gi|549586|sp|P26615| yigC [E. coli] (497 aa); 16; 15; 24.7; 47.5; 7e-25
8 (194 aa); gi|2507150|sp|P09550| ubiX [E. coli] (189 aa); 10; 9; 60.3; 86.8; 5e-56
9 (143 aa); gi|2622617 (AE000910) conserved protein [Methanobacterium thermo.] (122 aa); 18; 17; 40; 64.8; 8e-13
10 (182 aa); gi|2129134|pir|D64443| mutator protein mutT [Methanococcus jann.] (169 aa); 20; 19; 36.1; 62.7; 2e-9
TABLE 5
Columns: name; gene; direction; range; amino acid size; top hit
PheR; Transcriptional regulator; ←; 688-2479; 582; gi|3445531 (AF026065) positive phenol-degradative gene regulator
F3; PEP Synthase; →; 2864-4703; 612; sp|O29548|PPSA_ARCFU PROBABLE PHOSPHOENOLPYRUVATE SYNTHASE
-; PEP Synthase; →; 4707-5841; 374; sp|P46893|PPSA_STAMA PROBABLE PHOSPHOENOLPYRUVATE SYNTHASE (PYRUVATE, WATER DIKINASE) (PEP SYNTHASE)
-; inosine-5′-monophosphate dehydrogenase; →; 5853-6525; 223; gi|2621183 (AE000803) inosine-5′-monophosphate dehydrogenase [Methanobacterium thermoautotrophicum]
F2; hypothetical protein (oxidoreductase); →; 6587-8006; 472; gi|2650432 (AE001091) conserved hypothetical protein [Archaeoglobus fulgidus]
F4; YRBI_ECOLI HYPOTHETICAL; →; 8070-8580; 169; sp|P45396|YRBI_ECOLI HYPOTHETICAL 20.0 KD PROTEIN IN MURA-RPON INTERGENIC REGION
F1; probable membrane protein; →; 8589-10074; 485; pir||S62018 probable membrane protein YDR539w - yeast [Saccharomyces cerevisiae]
-; Conserved Hypothetical (oxidoreductase?); →; 10773-11805; 357; gi|2622505 (AE000902) conserved protein [Methanobacterium thermoautotrophicum]
F5; Decarboxylase; →; 11819-12404; 194; sp|P09550|UBIX_ECOLI 3-OCTAPRENYL-4-HYDROXYBENZOATE CARBOXY-LYASE (POLYPRENYL P-HYDROXYBENZOATE DECARBOXYLASE)
-; conserved protein; →; 12414-12846; 143; gi|2622617 (AE000910) conserved protein [Methanobacterium thermoautotrophicum]
-; mutator MutT protein; →; 12884-13433; 182; gi|2622420 (AE000895) mutator MutT protein [Methanobacterium thermoautotrophicum]
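The clustering described in the text can be checked against the Range column of Table 5. The sketch below is illustrative only: the assignment of orf numbers to table rows by their order is an assumption, and a gap is taken as the next start minus the previous end, a convention chosen because it reproduces the 7210 bp quoted for the first six ORFs. With these coordinates the gap between orf6 and orf7 comes out near 700 bp, somewhat larger than the 658 bp given in the text.

```python
# (label, start, end) from the Range column of Table 5; labels assigned
# by row order, which is an assumption.
ORFS = [
    ("orf1", 2864, 4703), ("orf2", 4707, 5841), ("orf3", 5853, 6525),
    ("orf4", 6587, 8006), ("orf5", 8070, 8580), ("orf6", 8589, 10074),
    ("orf7", 10773, 11805), ("orf8", 11819, 12404), ("orf9", 12414, 12846),
    ("orf10", 12884, 13433),
]


def intergenic_gaps(orfs):
    """Return (upstream, downstream, gap_bp) for each adjacent ORF pair."""
    return [(a[0], b[0], b[1] - a[2]) for a, b in zip(orfs, orfs[1:])]


def cluster_span(orfs, first, last):
    """Distance from the start of ORF `first` to the end of ORF `last`."""
    by_name = {name: (start, end) for name, start, end in orfs}
    return by_name[last][1] - by_name[first][0]


if __name__ == "__main__":
    for upstream, downstream, gap in intergenic_gaps(ORFS):
        print(f"{upstream} -> {downstream}: {gap} bp")
    print("orf1-orf6 span:", cluster_span(ORFS, "orf1", "orf6"), "bp")
```

Under this convention the first five gaps are all below 65 bp and the last three below 40 bp, in line with the two gene clusters described in the text.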
1 44 1 485 PRT Thauera aromatica 1 Met Gly Lys Ile Ser Ala Pro Lys Asn Asn Arg Glu Phe Ile Glu Ala 1 5 10 15 Cys Val Lys Ser Gly Asp Ala Val Arg Ile Arg Gln Glu Val Asp Trp 20 25 30 Asp Asn Glu Ala Gly Ala Ile Val Arg Arg Ala Cys Glu Leu Ala Glu 35 40 45 Ala Ala Pro Phe Met Glu Asn Ile Lys Asp Tyr Pro Gly Phe Ser Tyr 50 55 60 Phe Gly Ala Pro Leu Ser Thr Tyr Arg Arg Met Ala Ile Ser Leu Gly 65 70 75 80 Met Asp Pro Ala Ser Thr Leu Pro Gln Ile Gly Ala Glu Tyr Leu Lys 85 90 95 Arg Thr Asn Ser Glu Pro Val Ala Pro Val Ile Val Asp Lys Arg Asp 100 105 110 Ala Pro Cys Lys Glu Asn Ile Leu Leu Gly Ala Asp Val Asp Leu Thr 115 120 125 Lys Leu Pro Val Pro Leu Val His Asp Gly Asp Gly Gly Arg Tyr Val 130 135 140 Gly Thr Trp His Ala Val Ile Thr Lys His Pro Val Arg Gly Asp Val 145 150 155 160 Asn Trp Gly Met Tyr Arg Gln Met Met Trp Asp Gly Arg Thr Met Ser 165 170 175 Gly Ala Val Phe Pro Phe Ser Asp Leu Gly Lys Ala Leu Thr Glu Tyr 180 185 190 Tyr Leu Pro Arg Gly Glu Gly Cys Pro Phe Ala Thr Ala Ile Gly Leu 195 200 205 Ser Pro Leu Ala Ala Met Ala Ala Cys Ala Pro Ser Pro Ile Pro Glu 210 215 220 Pro Glu Leu Thr Gly Met Leu Ala Gly Glu Pro Val Arg Leu Val Lys 225 230 235 240 Cys Glu Thr Asn Asp Leu Glu Val Pro Ala Asp Ala Glu Ile Ile Ile 245 250 255 Glu Gly Val Ile Leu Pro Asp Tyr Lys Val Glu Glu Gly Pro Phe Gly 260 265 270 Glu Tyr Thr Gly Tyr Arg Thr Ser Pro Arg Asp Phe Arg Val Thr Phe 275 280 285 Arg Val Asp Ala Ile Thr Tyr Arg Asn Asn Ala Thr Met Thr Ile Ser 290 295 300 Asn Met Gly Val Pro Gln Asp Glu Gly Gln Leu Leu Arg Ser Phe Ser 305 310 315 320 Leu Gly Leu Glu Leu Glu Lys Leu Leu Lys Ser Gln Gly Ile Pro Val 325 330 335 Thr Gly Val Tyr Met His Pro Arg Ser Thr His His Met Met Ile Val 340 345 350 Gly Val Lys Pro Thr Tyr Ala Gly Ile Ala Met Gln Ile Ala Gln Leu 355 360 365 Ala Phe Gly Ser Lys Leu Gly Pro Trp Phe His Met Val Met Val Val 370 375 380 Asp Asp Gln Thr Asp Ile Phe Asn Trp Asp Glu Val Tyr His Ala Phe 385 390 395 400 Cys Thr Arg Cys Asn Pro Glu Arg Gly Ile His Val Phe Lys Asn 
Thr 405 410 415 Thr Gly Thr Ala Leu Tyr Pro His Ala Thr Pro His Asp Arg Lys Tyr 420 425 430 Ser Ile Gly Ser Gln Val Leu Phe Asp Cys Leu Trp Pro Val Asp Trp 435 440 445 Asp Lys Thr Asn Asp Val Pro Thr Leu Val Ser Phe Lys Asn Val Tyr 450 455 460 Pro Lys Asp Ile Gln Glu Lys Val Thr Asn Asn Trp Thr Asp Tyr Gly 465 470 475 480 Phe Lys Pro Val Lys 485 2 1458 DNA Thauera aromatica 2 atgggaaaga tttcagcacc gaaaaacaac cgtgaattca tcgaggcatg cgtcaagtcc 60 ggcgatgcgg tccggatcag acaggaagtg gactgggaca acgaggccgg cgccatcgtg 120 cgccgcgcct gcgagctcgc cgaagccgcc ccgttcatgg agaacatcaa ggactacccc 180 ggcttcagct acttcggcgc gccgctgtcg acctaccgcc gcatggcgat ctcgctcggc 240 atggacccgg catcgacctt gccgcagatc ggcgccgagt acctcaaacg taccaacagc 300 gagcccgtgg cgccggtgat cgtcgacaaa cgggacgccc cgtgcaagga gaacatcctg 360 ctcggcgccg acgtcgatct gaccaagctg ccggtaccgc tggtccatga cggcgacggc 420 ggccgctacg tcggcacctg gcacgcggtg atcaccaagc acccggtgcg cggcgacgtg 480 aactggggca tgtaccggca gatgatgtgg gacggccgca cgatgtcggg cgccgtgttc 540 ccgttctcgg atctgggcaa ggcgctcacc gagtactacc tgccgcgcgg cgagggctgc 600 ccgttcgcga ccgcgatcgg cctgtcgccg ctcgccgcga tggccgcctg cgcgccctct 660 ccgatccccg agcccgagct caccggcatg ctcgccggcg agccggtgcg cctggtgaag 720 tgcgagacca acgacctcga agtcccggcc gatgccgaga tcatcatcga gggcgtgatc 780 ctgcccgact acaaggtcga ggaaggcccg ttcggcgaat acaccggcta ccgcaccagc 840 ccgcgcgact tccgcgtcac cttccgcgtc gatgcgatca cctatcgcaa caacgcgacg 900 atgacgatct cgaacatggg cgtgccgcag gacgagggcc agctgctgcg ctcgttctcg 960 ctcgggctcg aactcgagaa gctgctgaag agccagggta tcccggtgac cggcgtgtac 1020 atgcacccgc gctcgaccca ccacatgatg atcgtcggcg tgaagccgac ctacgccggc 1080 atcgcgatgc agatcgcgca gctcgcgttc ggctccaagc tcgggccgtg gttccacatg 1140 gtgatggtgg tcgacgacca gaccgacatc ttcaactggg acgaggtcta tcacgcgttc 1200 tgcacgcgct gcaatccgga gcgcggcatc cacgtgttca agaacaccac cggcaccgcc 1260 ctctatccgc acgccacccc gcacgaccgc aagtactcga tcggctcgca ggtgctgttc 1320 gattgcctgt ggccggtcga ttgggacaag accaacgacg tgccgacgct cgtcagcttc 1380 
aagaacgtct atccgaagga catccaggaa aaggtcacga acaactggac cgactacggc 1440 ttcaagccgg tgaaataa 1458 3 472 PRT Thauera aromatica 3 Met Asp Leu Arg Tyr Phe Ile Asn Gln Cys Ala Glu Ala His Glu Leu 1 5 10 15 Lys Arg Ile Thr Thr Glu Val Asp Trp Asn Leu Glu Ile Ser His Val 20 25 30 Ser Lys Leu Thr Glu Glu Lys Lys Gly Pro Ala Leu Leu Phe Glu Ser 35 40 45 Ile Lys Gly Tyr Asp Thr Pro Val Phe Thr Gly Ala Phe Ala Thr Thr 50 55 60 Lys Arg Leu Ala Val Met Leu Gly Leu Pro His Asn Leu Ser Leu Cys 65 70 75 80 Glu Ser Ala Gln Gln Trp Met Lys Lys Thr Ile Thr Ser Glu Gly Leu 85 90 95 Ile Lys Ala Lys Glu Val Lys Asp Gly Pro Val Leu Glu Asn Val Leu 100 105 110 Ser Gly Asp Lys Val Asp Leu Asn Met Phe Pro Val Pro Lys Phe Phe 115 120 125 Pro Leu Asp Gly Gly Arg Tyr Ile Gly Thr Met Val Ser Val Val Leu 130 135 140 Arg Asp Pro Glu Thr Gly Glu Val Asn Leu Gly Thr Tyr Arg Met Gln 145 150 155 160 Met Leu Asp Asp Lys Arg Cys Gly Val Gln Ile Leu Pro Gly Lys Arg 165 170 175 Gly Glu Arg Ile Met Lys Lys Tyr Ala Lys Met Gly Lys Lys Met Pro 180 185 190 Ala Ala Ala Ile Ile Gly Cys Asp Pro Leu Ile Phe Met Ser Gly Thr 195 200 205 Leu Met His Lys Gly Ala Ser Asp Phe Asp Ile Thr Gly Thr Val Arg 210 215 220 Gly Gln Gln Ala Glu Phe Leu Met Ala Pro Leu Thr Gly Leu Pro Val 225 230 235 240 Pro Ala Gly Ala Glu Ile Val Leu Glu Gly Glu Ile Asp Pro Asn Ala 245 250 255 Phe Leu Pro Glu Gly Pro Phe Ala Glu Tyr Thr Gly Tyr Tyr Thr Asp 260 265 270 Glu Leu His Lys Pro Ile Pro Lys Pro Val Leu Glu Val Gln Gln Ile 275 280 285 Leu His Arg Asn Ser Pro Ile Leu Trp Ala Thr Gly Gln Gly Arg Pro 290 295 300 Val Thr Asp Val His Met Leu Leu Ala Phe Thr Arg Thr Ala Thr Leu 305 310 315 320 Trp Thr Glu Leu Glu Gln Met Arg Ile Pro Gly Ile Gln Ser Val Cys 325 330 335 Val Met Pro Glu Ser Thr Gly Arg Phe Trp Ser Val Val Ser Val Lys 340 345 350 Gln Ala Tyr Pro Gly His Ser Arg Gln Val Ala Asp Ala Val Ile Ala 355 360 365 Ser Asn Thr Gly Ser Tyr Gly Met Lys Gly Val Ile Thr Val Asp Glu 370 375 380 Asp Ile Gln Ala Asp Asp Leu Gln Arg Val Phe Trp Ala 
Leu Ser Cys 385 390 395 400 Arg Tyr Asp Pro Ala Arg Gly Thr Glu Leu Ile Lys Arg Gly Arg Ser 405 410 415 Thr Pro Leu Asp Pro Ala Leu Asp Pro Asn Gly Asp Lys Leu Thr Thr 420 425 430 Ser Arg Ile Leu Met Asp Ala Cys Ile Pro Tyr Glu Trp Lys Gln Lys 435 440 445 Pro Val Glu Ala Arg Met Asp Glu Glu Met Leu Ala Lys Ile Arg Ala 450 455 460 Arg Trp His Glu Tyr Gly Ile Asp 465 470 4 1419 DNA Thauera aromatica 4 atggacctgc gctacttcat caaccagtgt gccgaagccc acgaactgaa gagaatcacc 60 accgaggtcg attggaatct ggagatttcc catgtttcca agctgaccga agagaaaaaa 120 ggcccggcgc tgctgttcga aagcatcaag ggctacgaca cgccggtgtt caccggggcc 180 ttcgcgacca ccaagcgcct cgccgtcatg ctcggcctgc cgcacaacct gtcgctgtgc 240 gaatccgccc agcaatggat gaagaaaacg atcacctccg aagggctgat caaggcgaag 300 gaagtgaagg acggcccggt gctggaaaac gtgctcagcg gcgacaaggt cgatctcaac 360 atgttcccgg tgccgaagtt cttccccctc gacggcgggc gctacatcgg cacgatggta 420 tcggtggtgc tgcgtgatcc ggagacgggc gaggtcaacc tcggcaccta ccgcatgcag 480 atgctcgacg acaagcgctg cggggtgcag atcctgcccg ggaagcgcgg cgaacggatc 540 atgaaaaagt acgccaagat gggcaaaaag atgcccgccg cggcgatcat cggctgcgat 600 ccgctgatct tcatgtccgg cacgctgatg cacaagggcg ccagcgactt cgacattacc 660 ggcaccgtgc gcggccagca ggccgagttc ctgatggcgc cgctgaccgg gctgccggtg 720 ccggccgggg ccgagatcgt gctcgaaggc gagatcgatc cgaacgcctt cctgcccgaa 780 ggcccgttcg ccgaatacac cggctactac accgacgaac tgcacaagcc gatcccgaaa 840 ccggtgctcg aagtgcagca gatcctgcac cgcaacagcc cgatcctgtg ggccaccggc 900 cagggccgcc cggtgaccga cgtccatatg ctgctcgcct tcacccggac cgcgaccttg 960 tggaccgagc tcgagcagat gcgcattccc ggcatccagt cggtgtgcgt gatgccggaa 1020 tcgaccgggc gcttctggtc ggtggtgtcg gtcaagcagg cctacccggg gcactcgcgc 1080 caggtggccg acgcggtgat cgccagcaac accggctcgt acggcatgaa gggtgtgatc 1140 acggtcgatg aggacatcca ggccgacgat ctgcagcgcg tgttctgggc gctgtcgtgc 1200 cgctacgacc cggcgcgcgg caccgagctg atcaagcgcg gccgctcgac gccgctcgat 1260 ccggcgctcg acccgaacgg cgacaagctc accacgtcgc ggatcctgat ggacgcctgc 1320 atcccctacg agtggaagca gaagccggtc gaagcgcgca 
tggacgaaga gatgctggcg 1380 aagatccgcg cccgctggca cgagtacggc atcgactga 1419 5 612 PRT Thauera aromatica 5 Met Lys Phe Pro Val Pro His Asp Ile Gln Ala Lys Thr Ile Pro Gly 1 5 10 15 Thr Glu Gly Trp Glu Arg Met Tyr Pro Tyr His Tyr Gln Phe Val Thr 20 25 30 Asp Asp Pro Gln Arg Asn Gln Tyr Glu Lys Glu Thr Phe Trp Phe Tyr 35 40 45 Asp Gly Leu His Tyr Pro Glu Pro Leu Tyr Pro Phe Asp Thr Ile Trp 50 55 60 Asp Glu Ala Trp Tyr Leu Ala Leu Ser Gln Phe Asn Asn Arg Ile Phe 65 70 75 80 Gln Val Pro Pro Val Arg Gly Val Asp His Arg Ile Ile Asn Gly Tyr 85 90 95 Val Tyr Ile Ser Pro Val Pro Ile Lys Asp Pro Asp Glu Ile Gly Lys 100 105 110 Arg Val Pro Asn Phe Met Glu Arg Ala Gly Phe Tyr Tyr Lys Asn Trp 115 120 125 Asp Glu Leu Glu Ala Lys Trp Lys Val Lys Met Glu Ala Thr Ile Ala 130 135 140 Glu Leu Glu Ala Leu Glu Val Pro Arg Leu Pro Asp Ala Glu Asp Met 145 150 155 160 Ser Val Val Thr Glu Gly Val Gly Glu Ser Lys Ala Tyr His Leu Leu 165 170 175 Lys Asn Tyr Asp Asp Leu Ile Asn Leu Gly Ile Lys Cys Trp Gln Tyr 180 185 190 His Phe Glu Phe Leu Asn Leu Gly Tyr Ala Ala Tyr Val Phe Phe Met 195 200 205 Asp Phe Ala Gln Lys Leu Phe Pro Ser Ile Pro Leu Gln Arg Val Thr 210 215 220 Gln Met Val Ser Gly Ile Asp Val Ile Met Tyr Arg Pro Asp Asp Glu 225 230 235 240 Leu Lys Glu Leu Ala Lys Lys Ala Val Ser Leu Glu Val Asp Glu Ile 245 250 255 Val Thr Gly His Arg Glu Trp Ser Asp Val Lys Ala Ala Leu Ser Ala 260 265 270 His Arg His Gly Ala Glu Trp Leu Glu Ala Phe Glu Lys Ser Arg Tyr 275 280 285 Pro Trp Phe Asn Ile Ser Thr Gly Thr Gly Trp Phe His Thr Asp Arg 290 295 300 Ser Trp Asn Asp Asn Leu Asn Ile Pro Leu Asp Gly Ile Gln Thr Tyr 305 310 315 320 Ile Gly Lys Leu His Ala Gly Val Ala Ile Glu Arg Pro Met Glu Ala 325 330 335 Val Arg Ala Glu Arg Asp Arg Ile Thr Ala Glu Tyr Arg Asp Leu Ile 340 345 350 Asp Ser Asp Glu Asp Arg Lys Gln Phe Asp Glu Leu Leu Gly Cys Ala 355 360 365 Arg Thr Val Phe Pro Tyr Val Glu Asn His Leu Phe Tyr Val Glu His 370 375 380 Trp Phe His Ser Val Phe Trp Asn Lys Met Arg Glu Val Ala Ala Ile 385 390 
395 400 Met Lys Glu His Cys Met Ile Asp Asp Ile Glu Asp Ile Trp Tyr Leu 405 410 415 Arg Arg Asp Glu Ile Lys Gln Ala Leu Trp Asp Leu Val Thr Ala Trp 420 425 430 Ala Thr Gly Val Thr Pro Arg Gly Thr Ala Thr Trp Pro Ala Glu Ile 435 440 445 Glu Trp Arg Lys Gly Val Met Gln Lys Phe Arg Glu Trp Ser Pro Pro 450 455 460 Pro Ala Ile Gly Ile Ala Pro Glu Val Ile Gln Glu Pro Phe Thr Ile 465 470 475 480 Val Leu Trp Gly Val Thr Asn Ser Ser Leu Ser Ala Trp Ala Ala Val 485 490 495 Gln Glu Ile Asp Asp Pro Asp Ser Ile Thr Glu Leu Lys Gly Phe Ala 500 505 510 Ala Ser Pro Gly Thr Val Glu Gly Lys Ala Arg Val Cys Arg Ser Ala 515 520 525 Glu Asp Ile Arg Asp Leu Lys Glu Gly Glu Ile Leu Val Ala Pro Thr 530 535 540 Thr Ser Pro Ser Trp Ala Pro Ala Phe Ala Lys Ile Lys Ala Cys Val 545 550 555 560 Thr Asp Val Gly Gly Val Met Ser His Ala Ala Ile Val Cys Arg Glu 565 570 575 Tyr Gly Met Pro Ala Val Val Gly Thr Gly Leu Ser Thr Arg Val Val 580 585 590 Arg Thr Gly Met Thr Leu Arg Val Asp Gly Ser Ser Gly Leu Ile Thr 595 600 605 Ile Ile Thr Asp 610 6 1839 DNA Thauera aromatica 6 atgaagtttc ctgttccgca cgacatccag gccaagacga ttccggggac cgaaggctgg 60 gagcggatgt acccgtacca ctaccagttc gtcaccgacg atccgcagcg taaccagtac 120 gagaaagaaa ccttctggtt ttacgacgga ttgcattacc cggagccgct ttatccgttc 180 gacacgatct gggacgaggc ctggtatctc gccctgtcgc aattcaacaa tcgaattttc 240 caggtgccgc cggtgcgcgg cgtcgatcac cggatcatca acggttacgt ctatatctcg 300 ccggttccga tcaaggaccc cgatgaaatc ggcaagcgcg tgcccaattt catggagcgc 360 gccggtttct attacaagaa ctgggacgag ctcgaggcga aatggaaagt gaagatggag 420 gcgacgatcg ccgagctcga agcgctcgag gttccgcgcc tgcccgacgc cgaagacatg 480 tcggtggtga ccgaaggagt cggtgaatcg aaggcctacc acctgctcaa gaattacgac 540 gacctgatca acctcggcat caagtgctgg caataccact tcgaattcct caatcttggc 600 tatgccgcct acgttttctt catggatttc gcgcagaagc tgtttccgag cattccgctc 660 cagcgcgtca cccagatggt gtcggggatc gacgtcatca tgtaccgccc ggacgacgaa 720 ctgaaggaac tggcaaagaa ggccgtttca ctcgaagtcg atgaaatcgt caccggccat 780 cgggagtgga gcgacgtcaa ggcggcgctt 
tcggcacacc gccacggtgc cgaatggctc 840 gaagcattcg agaaatcccg ctacccgtgg ttcaacattt cgaccggcac gggatggttc 900 cataccgacc gcagctggaa cgacaacctc aacattccgc tcgacggcat ccagacctat 960 atcggcaagc ttcacgccgg cgtcgccatc gagcggccga tggaagcggt ccgtgccgag 1020 cgcgaccgga tcaccgccga gtaccgcgat ctgatcgaca gcgacgagga ccgcaagcag 1080 ttcgacgaac tgctcggctg cgcccggacg gtgttcccct acgtcgagaa ccatctgttc 1140 tacgtcgagc actggttcca ctcggtgttc tggaacaaga tgcgcgaagt cgctgcgatc 1200 atgaaagaac actgcatgat cgacgacatt gaagacatct ggtatctgcg ccgcgatgaa 1260 atcaagcagg cgctgtggga tctggtcacc gcctgggcaa ccggcgtcac ccctcgcggc 1320 accgccacct ggccggccga aatcgaatgg cgcaaggggg tgatgcagaa gttccgcgaa 1380 tggagcccgc cgccggccat cggcatcgca ccggaagtga tccaggagcc cttcaccatc 1440 gtgctctggg gggtcaccaa cagctcgctc tcggcctggg ccgccgtcca ggaaatcgac 1500 gaccccgaca gcatcaccga gctgaaaggc ttcgccgcca gcccgggcac ggtcgaaggc 1560 aaggcgcgcg tgtgccgcag cgccgaagac atccgcgacc tgaaggaggg cgaaattctc 1620 gtcgccccga ccacctcgcc ttcgtgggcg ccggccttcg ccaagatcaa ggcctgcgtc 1680 accgatgtcg gcggcgtcat gagccatgcc gcgatcgtat gccgcgaata cggcatgccg 1740 gcggtggtgg gcaccgggct atcgacccgt gtggtccgca ccggcatgac gctgcgggtc 1800 gatggttcga gcgggctgat cacgatcatc acggattga 1839 7 169 PRT Thauera aromatica 7 Met Glu Gln Ala Lys Asn Ile Lys Leu Val Ile Leu Asp Val Asp Gly 1 5 10 15 Val Met Thr Asp Gly Arg Ile Val Ile Asn Asp Glu Gly Ile Glu Ser 20 25 30 Arg Asn Phe Asp Ile Lys Asp Gly Met Gly Val Ile Val Leu Gln Leu 35 40 45 Cys Gly Val Glu Val Ala Ile Ile Thr Ser Lys Lys Ser Gly Ala Val 50 55 60 Arg His Arg Ala Glu Glu Leu Lys Ile Lys Arg Phe His Glu Gly Ile 65 70 75 80 Lys Lys Lys Thr Glu Pro Tyr Ala Gln Met Leu Glu Glu Met Asn Ile 85 90 95 Ser Asp Ala Glu Val Cys Tyr Val Gly Asp Asp Leu Val Asp Leu Ser 100 105 110 Met Met Lys Arg Val Gly Leu Ala Val Ala Val Gly Asp Ala Val Ala 115 120 125 Asp Val Lys Glu Val Ala Ala Tyr Val Thr Thr Ala Arg Gly Gly His 130 135 140 Gly Ala Val Arg Glu Val Ala Glu Leu Ile Leu Lys Ala Gln Gly Lys 145 150 155 160 
Trp Asp Ala Met Leu Ser Lys Ile His 165 8 510 DNA Thauera aromatica 8 atggaacagg cgaagaacat caagctggtg atcctcgacg tcgatggcgt gatgaccgac 60 gggcgcatcg tgatcaatga cgaaggcatc gagtcgcgca acttcgacat caaggacggc 120 atgggcgtga tcgtgctgca actgtgcggc gtcgaggtcg cgatcatcac ctcgaagaaa 180 tccggcgcgg tgcgccatcg cgccgaggag ctgaagatca agcgcttcca cgagggcatc 240 aagaagaaga ccgagcccta cgcgcagatg ctcgaggaga tgaacatctc cgatgccgaa 300 gtctgctacg tcggcgacga cctcgtcgat ctgtcgatga tgaagcgcgt cggcctggcc 360 gtggcggtcg gtgacgccgt ggccgacgtc aaggaagtgg ccgcttatgt gacgactgcg 420 cgcggcgggc acggcgcggt gcgcgaagtc gcggagctga tcctgaaagc gcagggcaag 480 tgggacgcga tgctctcgaa gatccattga 510 9 194 PRT Thauera aromatica 9 Met Arg Ile Val Val Gly Met Ser Gly Ala Ser Gly Ala Ile Tyr Gly 1 5 10 15 Ile Arg Ile Leu Glu Ala Leu Gln Arg Ile Gly Val Glu Thr Asp Leu 20 25 30 Val Met Ser Asp Ser Ala Lys Arg Thr Ile Ala Tyr Glu Thr Asp Tyr 35 40 45 Ser Ile Ser Asp Leu Lys Gly Leu Ala Thr Cys Val His Asp Ile Asn 50 55 60 Asp Val Gly Ala Ser Ile Ala Ser Gly Ser Phe Arg His Ala Gly Met 65 70 75 80 Ile Ile Ala Pro Cys Ser Ile Lys Thr Leu Ser Ala Val Ala Asn Ser 85 90 95 Phe Asn Thr Asn Leu Leu Ile Arg Ala Ala Asp Val Ala Leu Lys Glu 100 105 110 Arg Arg Lys Leu Val Leu Met Leu Arg Glu Thr Pro Leu His Leu Gly 115 120 125 His Leu Arg Leu Met Thr Gln Ala Thr Glu Asn Gly Ala Val Leu Leu 130 135 140 Pro Pro Leu Pro Ala Phe Tyr His Arg Pro Lys Thr Leu Asp Asp Ile 145 150 155 160 Ile Asn Gln Ser Val Thr Lys Val Leu Asp Gln Phe Asp Leu Asp Val 165 170 175 Asp Leu Phe Gly Arg Trp Thr Gly Asn Glu Glu Arg Glu Leu Ala Lys 180 185 190 Ser Arg 10 585 DNA Thauera aromatica 10 atgagaatcg tcgtcggaat gtccggtgcc agcggtgcga tctacggcat ccggatcctc 60 gaggcactac agcgcatcgg tgtcgaaacc gacctggtga tgtcggattc ggccaagcgg 120 accatcgcat acgaaacgga ctattcgatc agcgacttga agggactcgc gacctgcgtc 180 catgacatca atgatgtcgg ggcgtcgatc gccagcggct cgttccgcca tgccggcatg 240 atcatcgcgc cctgttcgat caagaccctg tccgcagtcg ccaactcgtt caacacgaat 300 ctgttgatcc 
gcgccgccga cgtcgcgttg aaggagcggc gcaagctcgt gctgatgctg 360 cgcgagacgc cgctgcacct gggccacctg cgcctgatga cccaggccac ggagaacggc 420 gcggttctcc tccctcccct gcccgcgttc taccaccgcc ccaagacgct cgacgacatc 480 atcaaccagt cggtgacgaa agtgctcgac cagttcgatc tcgacgtcga tctcttcggg 540 cggtggacgg gcaacgaaga acgcgaactg gcgaaatccc gatag 585 11 374 PRT Thauera aromatica 11 Met Gly Ser Ile Val Ser Thr Val Ala Leu Ser Ala Ala Thr Ala Asp 1 5 10 15 Ser Thr Ser Pro Lys Val Cys Pro Phe Glu Ala Cys Gly Lys Asp Ser 20 25 30 Val Pro Leu Val Gly Gly Lys Cys Ala Ser Leu Gly Glu Leu Ile Asn 35 40 45 Ala Gly Val Arg Val Pro Pro Gly Phe Ala Leu Thr Thr Ser Gly Tyr 50 55 60 Ala Gln Phe Met Arg Glu Ala Gly Ile Gln Ala Asp Ile Gly Ala Leu 65 70 75 80 Leu Glu Gly Leu Asp His Gln Asp Met Asp Lys Leu Glu Glu Ala Ser 85 90 95 Arg Ala Ile Arg Glu Met Ile Glu Ser Arg Pro Met Pro Ile Glu Leu 100 105 110 Glu Asp Leu Ile Ala Glu Ala Tyr Arg Lys Leu Ser Val Arg Cys Tyr 115 120 125 Leu Pro Ala Ala Pro Val Ala Val Arg Ser Ser Ala Thr Ala Glu Asp 130 135 140 Leu Pro Gly Ala Ser Phe Ala Gly Gln Gln Asp Thr Tyr Leu Trp Ile 145 150 155 160 Arg Gly Val Asp Asp Leu Ile His His Val Arg Arg Cys Ile Ser Ser 165 170 175 Leu Tyr Thr Gly Arg Ala Ile Ala Tyr Arg Met Lys Met Gly Phe Pro 180 185 190 His Glu Gln Val Ala Ile Ser Val Gly Val Gln Met Met Ala Asn Ala 195 200 205 Tyr Thr Ala Gly Val Met Phe Thr Ile His Pro Gly Thr Gly Asp Arg 210 215 220 Ser Val Ile Val Ile Asp Ser Asn Phe Gly Phe Gly Glu Ser Val Val 225 230 235 240 Ser Gly Glu Val Thr Pro Asp Asn Phe Val Val Asn Lys Val Thr Leu 245 250 255 Asp Ile Ile Glu Arg Thr Ile Ser Thr Lys Glu Leu Cys His Thr Val 260 265 270 Asp Leu Lys Thr Gln Lys Ser Val Ala Leu Pro Val Pro Ala Glu Arg 275 280 285 Gln Asn Ile Gln Ser Ile Thr Asp Asp Glu Ile Ser Glu Leu Ala Trp 290 295 300 Ala Ala Lys Lys Ile Glu Lys His Tyr Gly Arg Pro Met Asp Ile Glu 305 310 315 320 Trp Ala Ile Asp Lys Asn Leu Pro Ala Asp Gly Asn Ile Phe Ile Leu 325 330 335 Gln Ala Arg Pro Glu Thr Ile Trp Ser Asn Arg Gln 
Lys Ala Ser Ala 340 345 350 Thr Thr Gly Ser Thr Ser Ala Met Asp Tyr Ile Val Ser Ser Leu Ile 355 360 365 Thr Gly Lys Arg Leu Gly 370 12 1125 DNA Thauera aromatica 12 atgggaagta tcgtttccac cgtagccctg tccgcggcca ccgccgacag cacttcgccg 60 aaggtctgcc cgttcgaggc ctgcggcaag gactcggtcc cgctggtggg cggcaagtgc 120 gcgtccctgg gcgaactgat caacgccggc gtacgggtgc cgccgggctt tgccctgacc 180 accagcggct atgcccagtt catgcgtgaa gccggcatcc aggcggacat cggcgcgctg 240 ctcgaaggcc tcgaccacca ggacatggac aagctcgagg aagcatcgag ggcgatccgc 300 gaaatgatcg aatcgcgccc gatgccgatc gagctcgaag acctgatcgc cgaggcctac 360 cgcaagctgt cggtccgctg ctatctgccc gcggcgccgg tggcggtgcg ttcgagcgcg 420 accgccgagg acctgcccgg tgcgagcttt gccggccagc aggataccta cctgtggatc 480 cgcggcgtcg atgacctcat ccaccacgtc cggcgctgca tctccagcct ctacaccggc 540 cgggcgatcg cctaccggat gaagatgggc ttcccgcacg agcaggtcgc gatcagcgtc 600 ggcgtccaga tgatggcgaa cgcctacacc gcgggggtga tgttcacgat ccatccgggc 660 accggcgacc gctcggtgat cgtcatcgat tcgaatttcg gcttcggtga atccgtggtg 720 tcgggcgaag tcacgccgga caacttcgtc gtcaacaagg tcaccctcga catcatcgag 780 cgcacgattt cgacgaagga gctgtgccac accgtcgatc tgaagaccca gaaatcagtc 840 gcacttccgg tccctgccga gcgccagaac atccagtcga ttaccgatga cgaaatcagc 900 gaactcgcct gggccgccaa gaagatcgaa aagcattacg gccgcccgat ggacatcgaa 960 tgggcgatcg acaagaacct gcccgcggac ggaaacattt tcatcctcca ggcccggccc 1020 gaaacgatct ggagcaaccg ccagaaagcc agcgcgacga ccggcagcac gtcggcgatg 1080 gattacatcg tatcgagcct gatcacgggc aagcggctcg gctag 1125 13 223 PRT Thauera aromatica 13 Met Ile Val Arg Asn Trp Met Gln Thr Asn Pro Ile Val Leu Thr Gly 1 5 10 15 Asp Thr Leu Leu Ser Glu Ala Lys Arg Ile Phe Ser Glu Ala Asn Ile 20 25 30 His Ala Leu Pro Val Val Asp Asp Gly Arg Leu Arg Gly Leu Ile Thr 35 40 45 Arg Ala Gly Cys Leu Arg Ala Ala His Ala Ala Leu Arg Thr Gln Asp 50 55 60 Thr Asp Glu Leu Asn Tyr Phe Ser Asn Arg Val Lys Val Lys Asp Ile 65 70 75 80 Met Val Arg Asn Pro Ala Thr Ile Asp Ala Asp Asp Thr Met Glu His 85 90 95 Cys Leu Gln Val Gly Gln Glu His Gly Val Gly 
Gln Leu Pro Val Met 100 105 110 Asp Lys Gly Asn Val Val Gly Ile Ile Ser Ala Ile Glu Met Phe Ser 115 120 125 Leu Ala Ala His Phe Leu Gly Ala Trp Glu Lys Arg Ser Gly Val Thr 130 135 140 Leu Ala Pro Ile Asp Leu Lys Gln Gly Thr Met Gly Arg Ile Ile Asp 145 150 155 160 Thr Val Glu Ala Ala Gly Ala Glu Val His Ala Ile Tyr Pro Ile Ser 165 170 175 Ala His Asp Arg Glu Ser Ala Ser Ala Arg Arg Glu Arg Lys Val Ile 180 185 190 Ile Arg Phe His Ala Ala Asn Val Ala Ala Val Ile Glu Ala Leu Ala 195 200 205 His Ala Gly Tyr Glu Val Ile Glu Ala Val Gln Ala Ala Ala His 210 215 220 14 672 DNA Thauera aromatica 14 atgatcgtac gcaactggat gcagaccaat ccgatcgtgc tcaccgggga caccttgctg 60 tccgaagcga agcggatctt ttccgaagcc aatatccacg cattaccggt cgtcgatgac 120 ggccgcctgc gcggactcat cacccgcgcc ggctgcctgc gggccgcgca tgccgcgctg 180 cggacccagg acaccgacga gctcaactac ttctcgaacc gggtcaaggt caaggacatc 240 atggtccgca acccggccac catcgatgcc gacgacacga tggaacactg cctgcaggtc 300 ggccaggaac acggcgtcgg ccaattgccg gtgatggaca aaggcaatgt cgtcggaatc 360 atttcggcaa tcgaaatgtt ctcgctggcg gcgcatttcc ttggtgcctg ggaaaagcgc 420 agcggcgtca ccctggcccc gatcgatctc aagcagggaa ccatgggccg catcatcgac 480 accgtcgaag ccgccggcgc cgaggtgcac gcgatctacc cgatctcggc ccatgacagg 540 gagtccgcct cggccaggcg ggagcggaaa gtgatcatcc gcttccacgc cgcgaacgtc 600 gcggcagtca tcgaggcgct cgcccacgcc ggctacgaag tcatcgaggc cgttcaagcc 660 gcagcgcatt ga 672 15 357 PRT Thauera aromatica 15 Leu His Arg Ser Arg Arg Gly Thr Arg Pro Arg Ser Lys Glu Val Ile 1 5 10 15 His Arg His Pro Asp Asp Leu Leu Ser Leu Leu Pro Ile Leu Thr His 20 25 30 His Glu Lys Asp Ala Ala Pro Phe Ile Thr Thr Gly Val Val Leu Cys 35 40 45 Thr Asp Pro Glu Thr Gly Arg Arg Gly Met Gly Ile His Arg Met Met 50 55 60 Val Lys Gly Gly Arg Arg Leu Gly Ile Leu Leu Ala Asn Pro Pro Ile 65 70 75 80 Pro His Phe Leu Ala Lys Ala Glu Ala Ala Gly Lys Pro Leu Asp Val 85 90 95 Ala Ile Ala Leu Gly Leu Glu Pro Ala Thr Leu Leu Ser Ser Val Val 100 105 110 Lys Val Gly Pro Arg Val Pro Asp Lys Met Ala Ala Ala Gly Ala Leu 
115 120 125 Arg Gly Glu Pro Val Glu Leu Val Arg Ala Glu Thr Val Asp Val Asp 130 135 140 Ile Pro Ala Arg Ala Glu Ile Val Ile Glu Gly Arg Ile Leu Pro Gly 145 150 155 160 Val Arg Glu Leu Glu Gly Pro Phe Gly Glu Asn Thr Gly His Tyr Phe 165 170 175 Ser Asn Val Ser Pro Val Ile Glu Ile Ser Ala Val Thr His Arg Asp 180 185 190 Asn Phe Ile Tyr Pro Gly Leu Cys Pro Trp Ser Pro Glu Val Asp Ala 195 200 205 Leu Leu Ser Leu Ala Ala Gly Ala Glu Leu Leu Gly Gln Leu Gln Gly 210 215 220 Leu Ile Asp Gly Val Val Asp Leu Glu Met Ala Gly Gly Thr Ser Gly 225 230 235 240 Phe Ser Val Val Val Ala Val His Arg Thr Thr Ala Ala Asp Val Arg 245 250 255 Arg Leu Val Met Leu Ala Leu Asn Leu Asp Arg Arg Leu Lys Thr Ile 260 265 270 Thr Val Val Asp Asp Asp Val Asp Ile Arg Asp Pro Arg Glu Val Ala 275 280 285 Trp Ala Met Ala Thr Arg Tyr Gln Pro Ala Arg Asp Thr Val Val Ile 290 295 300 His Gly Cys Glu Ala Tyr Val Ile Asp Pro Ser Ala Thr Gly Asp Gly 305 310 315 320 Thr Ser Lys Val Gly Phe Ile Ala Thr Arg Ala Ser Gly Ala Asp Ser 325 330 335 Asp Arg Ile Thr Leu Pro Pro Ala Ala Leu Ala Lys Ala Arg Ala Ile 340 345 350 Ile Ala Arg Leu His 355 16 1074 DNA Thauera aromatica 16 ttgcaccgat ccaggcgcgg gacgcggccc cggtcaaagg aagtgatcca ccgccatccg 60 gacgatctgc tgtcgctgct gccgatcctg acccaccacg aaaaggatgc ggcccccttc 120 atcaccaccg gcgtggtgtt gtgcaccgac cccgagaccg gccggcgcgg catgggcatc 180 caccgcatga tggtcaaggg cgggcgccgg ctcggcatcc tgctcgccaa tccgccgatt 240 ccgcatttcc tcgccaaggc cgaagcggcc ggcaagccgc tcgatgtcgc catcgcgctc 300 ggtctcgaac ccgccaccct gctgtcgtcg gtggtcaagg tcggcccgcg ggtgcccgac 360 aagatggccg ctgccggcgc cctgcgtggc gaaccggtcg agctggtgcg cgccgaaacg 420 gtggatgtgg acatcccggc gcgcgccgaa atcgtcatcg aaggccggat tctgccgggc 480 gtgcgcgaac tcgagggccc gttcggggag aacaccgggc actatttttc caacgtcagc 540 ccggtcatcg agatcagcgc cgtcacccat cgcgacaact tcatctaccc gggcctgtgc 600 ccatggtcgc ccgaggtcga tgcgctgctg tcgctggcgg ccggtgccga attgctcggc 660 cagttgcagg ggctgatcga cggcgtcgtc gatctggaga tggccggcgg caccagcggc 720 ttttccgtgg 
ttgtcgcagt ccatcggacc actgcggccg acgtcagacg gctggtcatg 780 ctcgcgctca atctcgaccg ccgcctgaag acgatcaccg tcgtcgacga cgacgtcgac 840 atccgcgacc cgcgcgaagt cgcctgggcc atggctaccc gctaccagcc cgcccgggac 900 acggtcgtga tccacggctg cgaagcctat gtcatcgatc cttcggcgac cggggacggc 960 acatcgaaag tcgggttcat cgccacccgt gccagcggcg cggactcgga ccgcatcacc 1020 ctgccgccgg cagcgctcgc gaaggcgcgc gccatcatcg ccagactgca ttga 1074 17 143 PRT Thauera aromatica 17 Met Pro Pro Ile Ala Leu Pro Leu Ser Leu Glu Gly Val Val Cys Thr 1 5 10 15 Gly Leu Gly Ala Gly Ala Gln Phe Thr Thr Leu Asp Trp Val Val Asp 20 25 30 Glu Cys Arg Glu Lys Leu Gly Phe Ile Pro Trp Pro Gly Thr Phe Asn 35 40 45 Val Arg Thr Gln Gly Ala Leu Ala Gly Val Asp Arg Thr Arg Leu Leu 50 55 60 Arg Ser Gly Tyr Ser Ile Arg Ile Arg Pro Ala Pro Gly Tyr Cys Ala 65 70 75 80 Ala Glu Cys Leu Val Val Asn Ile Ala Gly Arg Ile Ser Gly Ala Val 85 90 95 Leu Phe Pro Glu Val Pro Gly Tyr Pro Asp Gly Gln Leu Glu Ile Ile 100 105 110 Ala Pro Val Pro Val Arg Arg Thr Leu Gly Leu Asn Asp Gly Asp Arg 115 120 125 Val Asn Leu Ser Ile Gly Ile Ser Thr Ser Leu Phe Cys Arg Ala 130 135 140 18 432 DNA Thauera aromatica 18 atgccaccga tcgcccttcc cctgtcactc gaaggcgtcg tctgcacggg actcggtgca 60 ggcgcgcagt tcaccaccct cgactgggtc gtcgatgaat gccgggaaaa gctcggcttc 120 atcccctggc ccggcacctt caacgtgagg acgcagggcg cgcttgcggg cgtggaccgc 180 acccgcctcc tgcgctcggg atacagcatc cgcatccggc cggcgcccgg ctactgtgcc 240 gcggaatgcc tcgtggtcaa catcgcgggg cggatctccg gcgcggtgct attcccagag 300 gtgcccggct acccggacgg ccagctcgaa atcatcgctc cggtgccggt acgaagaacc 360 ctcggcctca atgacggcga ccgggtcaac ctctccatcg gcatcagcac ctcccttttc 420 tgccgggcct ga 432 19 182 PRT Thauera aromatica 19 Met Ala Pro Lys Phe Cys Pro Gln Cys Gly Thr Ala Leu Val Leu Ala 1 5 10 15 Thr Ile His Gly Arg Glu Arg Glu Thr Cys Pro Ala Cys Gly Glu Thr 20 25 30 Phe Phe His Lys Pro Ala Pro Val Val Leu Ala Val Ile Glu His Ala 35 40 45 Gly Gln Leu Val Leu Ile Arg Arg Lys Leu Asp Pro Leu Ala Gly Tyr 50 55 60 Trp Ala Pro Pro Gly Gly Tyr Val 
Glu Arg Gly Glu Ser Leu Glu Glu 65 70 75 80 Ala Val Val Arg Glu Ala Arg Glu Glu Ser Gly Leu Glu Val Ala Val 85 90 95 Asp Glu Leu Ile Gly Val Tyr Ser Gln Ala Asp Val Arg Ala Val Ile 100 105 110 Leu Ala Tyr Arg Ala His Ser Ile Gly Gly Glu Pro Val Ala Gly Asp 115 120 125 Asp Ala Gly Glu Ile Cys Leu Val Ala Pro Gly Gln Leu Pro Val Gln 130 135 140 Arg Pro Pro Gln Ser Gly Ile Pro Ile Glu His Trp Phe Phe Ser Val 145 150 155 160 Val Glu Glu Val Thr Asp Pro Trp Lys Trp Gly Arg Arg Asn Ser Ala 165 170 175 Lys Lys Met Met Arg Arg 180 20 549 DNA Thauera aromatica 20 atggcaccga agttctgccc gcaatgcggc accgccctgg tcctggcgac gatccatggg 60 cgcgaacgtg aaacctgtcc ggcctgtggc gaaacctttt tccacaagcc cgcgcccgtc 120 gtgctggcgg tgatcgagca cgccgggcaa ctcgtgctga tccgccgcaa gctcgatccg 180 ctcgccggct actgggcacc gccgggcggc tacgtcgaac gcggcgaatc gctcgaggag 240 gcggtcgtac gcgaggcgcg cgaggaaagc ggactcgagg tcgccgtcga tgaactgatc 300 ggcgtgtatt cgcaggccga cgtgcgcgcg gtgatcctcg cctaccgcgc gcactcgatc 360 ggcggcgaac cggtcgccgg cgacgacgcc ggcgagatct gcctcgtcgc cccgggccag 420 ctgccggtgc agcgcccgcc gcagagcggc ataccgatcg aacactggtt tttcagcgta 480 gtggaggaag tcaccgatcc atggaagtgg gggcgccgca acagcgccaa gaaaatgatg 540 aggagatag 549 21 582 PRT Thauera aromatica 21 Met Ala Lys Leu His Asp Met Ser Cys Ile Asp Gly Gly Asp Leu Arg 1 5 10 15 Ser Arg Ile His Phe Cys Ala Asp Thr Gly Gln Ile Trp Leu His Glu 20 25 30 His Arg Met Leu Leu Val His Ala Glu Ala Gln Ala Ala Leu Arg Lys 35 40 45 Glu Leu Ile Asp Thr Leu Gly Met Ala Arg Ala Arg Gly Leu Leu Leu 50 55 60 Arg Met Gly Phe Ala Ser Gly Ala Arg Asp Ala Glu Leu Ala Gln Thr 65 70 75 80 Arg Ile Arg Thr Gly Asp Asp Leu Ala Ala Phe Met Thr Gly Pro Gln 85 90 95 Leu His Ala Leu Glu Gly Ile Val Gly Val Ile Pro Leu Gln Leu Glu 100 105 110 Phe Asp Arg Ala Ala Gly Thr Phe Asn Ala Glu Phe Arg Trp Ile Asn 115 120 125 Ser Trp Glu Gly Gln Ser His Lys Arg His Phe Gly Thr Cys Ser Glu 130 135 140 Pro Val Cys Trp Thr Gln Ile Gly Tyr Ala Cys Gly Tyr Ser Thr Ala 145 150 155 160 Phe Met Gly 
Arg Pro Ile Leu Tyr Lys Glu Ala Glu Cys Ala Gly Met 165 170 175 Gly Ala Glu His Cys His Ile Val Gly Lys Pro Ala Glu Glu Trp Pro 180 185 190 Asp Ala Glu Glu Tyr Arg Arg Leu Phe Ala Pro Glu Ser Ile Ala Glu 195 200 205 Gln Leu Ile Asp Leu Gln Ala Gln Val Glu Gln Leu Arg Ser Thr Ile 210 215 220 Asp Glu Arg Ala Arg Leu Pro Gly Asp Met Ile Gly Asp Ser Pro Gly 225 230 235 240 Phe Arg Phe Ala Leu Ser Leu Leu Gln Gln Ala Ala Gly Ser Ser Ile 245 250 255 Ala Ile Leu Leu Leu Gly Glu Thr Gly Val Gly Lys Glu Leu Phe Thr 260 265 270 Arg Ala Leu His Glu Met Ser Ala Arg Arg Asp Arg Pro Leu Val Ala 275 280 285 Ile Asn Cys Ala Ala Ile Pro His Asp Leu Val Glu Ala Glu Leu Phe 290 295 300 Gly Val Glu Lys Gly Ala Tyr Thr Gly Ala Leu Ala Ala Arg Pro Gly 305 310 315 320 Arg Phe Glu Arg Ala Asn Gly Gly Thr Leu Phe Leu Asp Glu Ile Gly 325 330 335 Asp Leu Pro Leu Thr Ala Gln Ser Lys Leu Leu Arg Val Leu Gln Glu 340 345 350 Gly Glu Val Glu Arg Leu Gly Asp Asp Lys Thr Arg Arg Ile Asp Val 355 360 365 Arg Leu Val Ala Ala Thr Asn Ala Ser Leu Ala Gln Leu Val Lys Glu 370 375 380 Gly Arg Phe Arg Ala Asp Leu Tyr Tyr Arg Leu Asn Ala Phe Gln Ile 385 390 395 400 Asp Ile Pro Pro Leu Arg Gln Arg Arg Glu Asp Ile Ser Pro Leu Ala 405 410 415 Lys His Phe Leu Arg Lys Tyr Ala Ala Ile Asn Gly Lys Lys Leu Leu 420 425 430 Gly Phe Ser Asp Lys Ala Lys Lys Ala Leu Val Gly His Ala Trp Pro 435 440 445 Gly Asn Ile Arg Glu Leu Gln Asn Thr Val Glu Arg Gly Val Ile Leu 450 455 460 Ala Pro Asn Gly Gly Arg Val Glu Val Asp His Leu Phe Leu Ser Gly 465 470 475 480 Ala His Ile Glu Asp Glu Asp Gly Phe Gly Leu Gly Pro Asn Gly Lys 485 490 495 Ile Asp Thr Glu Gln Asp Ser Leu Ala Arg Ser Leu Cys Ser Ala Val 500 505 510 Cys Asp Gly Ala Leu Thr Leu Glu Gln Ile Glu Thr Thr Leu Leu Glu 515 520 525 Thr Ala Leu Asp Lys Ala Arg Gly Asn Leu Ser Ser Ala Ala Arg Met 530 535 540 Leu Gly Leu Thr Arg Pro Gln Phe Ala Tyr Arg Leu Lys Arg Leu Arg 545 550 555 560 Gly Glu Glu Ser Gly Ala Gly Pro Gly Ala Asp Val Thr Asp Thr Leu 565 570 575 Ser Gly Arg Ala 
His Ala 580 22 1749 DNA Thauera aromatica 22 tcatgcgtgc gccctcccgg acagggtgtc ggtcacgtca gctccgggac cggcaccact 60 ttcttcaccg cgcagacgct tgaggcggta ggcgaattgc ggccgggtca ggccgagcat 120 gcgcgccgcc gaagacaggt tgccgcgcgc cttgtcgagc gcggtttcga gcagggtggt 180 ctcgatctgc tcgagggtca gggcaccatc gcacaccgcg ctgcacaggc tgcgcgccag 240 gctgtcctgt tcggtgtcga tctttccgtt cggcccgagg ccgaacccgt cttcatcctc 300 gatgtgcgca ccggacagga aaaggtggtc cacttcgacc cggccgccgt tcggcgcaag 360 gatcaccccg cgttccaccg tgttctgcag ttcgcggatg ttgcccggcc aggcatggcc 420 gaccagcgcc ttcttcgcct tgtcggaaaa tccgagcagc ttcttgccgt tgatcgccgc 480 atatttcctg aggaaatgct tggccagagg ggagatgtcc tccctgcgct ggcgcagcgg 540 cggaatgtcg atctggaaag cattgagacg gtagtacagg tcggccctga aacgcccttc 600 cttcaccaac tgggcgaggc tggcattggt cgcggcgacg aggcggacgt cgatacggcg 660 ggtcttgtca tcgcccaaac gctcgacctc gccttcctgg agcacccgca gcagcttgct 720 ctgcgccgtc agcggcagat cgccgatttc gtccaggaac agggtgccgc cgttggcgcg 780 ctcgaacctg cccgggcggg ctgccagcgc gccggtgtat gccccttttt ccacgccgaa 840 aagctcggcc tccacgaggt cgtgcggaat cgcggcgcag ttgatcgcaa ccagcgggcg 900 atcgcggcgg gcgctcattt cgtgcagcgc gcgcgtgaac agttccttgc cgacccccgt 960 ttcgccgagc agcaaaatgg cgatgctgct gcccgcggcc tgctgcagca agctgagcgc 1020 gaaccggaac ccgggcgagt cgccgatcat gtcgccaggc agcctggcgc gttcatcgat 1080 cgtggagcgc agctgttcca cctgggcctg caggtcgatc agttgctcgg cgatcgattc 1140 gggggcgaac aggcgtctgt attcctcggc atccggccat tcctcggccg gcttgccgac 1200 gatgtggcaa tgctcggcac ccatgcccgc gcactcggct tccttgtaca ggatcggtcg 1260 ccccatgaag gccgtggagt agccgcaggc atagccgatc tgggtccagc acaccggttc 1320 cgagcaggtt ccgaagtggc gcttgtgcga ctgcccctcc cacgaattga tccagcggaa 1380 ctcggcattg aaggtgccgg cggcgcggtc gaattccagc tggagcggga tgacgccgac 1440 aatgccctcg agcgcgtgca gctgcggccc ggtcatgaat gccgcaaggt cgtcgccggt 1500 cctgatccgt gtctgcgcga gctccgcatc acgggcaccg gatgcgaacc ccatgcgcag 1560 cagcaacccg cgcgcgcgcg ccatgccgag cgtatcgatc agctccttgc gcaaggccgc 1620 ctgcgcctcg gcgtgcacga gcagcatccg atgctcatga agccagatct 
gcccggtatc 1680 ggcgcagaaa tggatgcgcg accggagatc accgccgtct atgcagctca tatcgtgaag 1740 cttggccat 1749 23 14272 DNA Thauera aromatica 23 cggtcgcggt gatgaagcgg accttgttcc tgggcgtgta cgcggcaggc ctgcttgtgg 60 cgctcggatc ggtcatcggg gtgcctccgg gcagaaagcc gtgcctcccc gtaatcctag 120 agattccgcc ccgccttcgc caccgctgtc gcggcggacg cgcacggcgc gcggaatgcg 180 gcgcgccggc atccgggggc ggcgcccggc gcggcgcgga tcatggcctg ccgtcgcggc 240 agtcgatctc gtcccggtgg ccgaagccgc gcgagttgtc gatgaaatac agccgttcgg 300 gcacgaaacg gtaccagtgc accttcgcca gggcctgcag gatcgcggcc ggcgcacccc 360 ccagcgtgcc gacgacgggg tatttcgccc cgtagagcgc gcgcgcctgc cctgcggcat 420 cgccggacag ttccacgaca tggccttcgg cctggatgcc cttgacctca cgccagtcgg 480 agcagtcctc ctggatggtc accgcggcac gcccatcgcg cgcgatgttg ctgctgtggc 540 gggcgcctgg cttggacagg aagtacaggt cgaaaccgtc gctggcgtaa aacaccgccg 600 ccgcccacac cccctgctcg ccctgcgtcg ccagcgtcat cgtgtggtgc gcgcgcagcc 660 agtcgaggac atgggcctgg tgcccgttca tgcgtgcgcc ctcccggaca gggtgtcggt 720 cacgtcagct ccgggaccgg caccactttc ttcaccgcgc agacgcttga ggcggtaggc 780 gaattgcggc cgggtcaggc cgagcatgcg cgccgccgaa gacaggttgc cgcgcgcctt 840 gtcgagcgcg gtttcgagca gggtggtctc gatctgctcg agggtcaggg caccatcgca 900 caccgcgctg cacaggctgc gcgccaggct gtcctgttcg gtgtcgatct ttccgttcgg 960 cccgaggccg aacccgtctt catcctcgat gtgcgcaccg gacaggaaaa ggtggtccac 1020 ttcgacccgg ccgccgttcg gcgcaaggat caccccgcgt tccaccgtgt tctgcagttc 1080 gcggatgttg cccggccagg catggccgac cagcgccttc ttcgccttgt cggaaaatcc 1140 gagcagcttc ttgccgttga tcgccgcata tttcctgagg aaatgcttgg ccagagggga 1200 gatgtcctcc ctgcgctggc gcagcggcgg aatgtcgatc tggaaagcat tgagacggta 1260 gtacaggtcg gccctgaaac gcccttcctt caccaactgg gcgaggctgg cattggtcgc 1320 ggcgacgagg cggacgtcga tacggcgggt cttgtcatcg cccaaacgct cgacctcgcc 1380 ttcctggagc acccgcagca gcttgctctg cgccgtcagc ggcagatcgc cgatttcgtc 1440 caggaacagg gtgccgccgt tggcgcgctc gaacctgccc gggcgggctg ccagcgcgcc 1500 ggtgtatgcc cctttttcca cgccgaaaag ctcggcctcc acgaggtcgt gcggaatcgc 1560 ggcgcagttg atcgcaacca gcgggcgatc 
gcggcgggcg ctcatttcgt gcagcgcgcg 1620 cgtgaacagt tccttgccga cccccgtttc gccgagcagc aaaatggcga tgctgctgcc 1680 cgcggcctgc tgcagcaagc tgagcgcgaa ccggaacccg ggcgagtcgc cgatcatgtc 1740 gccaggcagc ctggcgcgtt catcgatcgt ggagcgcagc tgttccacct gggcctgcag 1800 gtcgatcagt tgctcggcga tcgattcggg ggcgaacagg cgtctgtatt cctcggcatc 1860 cggccattcc tcggccggct tgccgacgat gtggcaatgc tcggcaccca tgcccgcgca 1920 ctcggcttcc ttgtacagga tcggtcgccc catgaaggcc gtggagtagc cgcaggcata 1980 gccgatctgg gtccagcaca ccggttccga gcaggttccg aagtggcgct tgtgcgactg 2040 cccctcccac gaattgatcc agcggaactc ggcattgaag gtgccggcgg cgcggtcgaa 2100 ttccagctgg agcgggatga cgccgacaat gccctcgagc gcgtgcagct gcggcccggt 2160 catgaatgcc gcaaggtcgt cgccggtcct gatccgtgtc tgcgcgagct ccgcatcacg 2220 ggcaccggat gcgaacccca tgcgcagcag caacccgcgc gcgcgcgcca tgccgagcgt 2280 atcgatcagc tccttgcgca aggccgcctg cgcctcggcg tgcacgagca gcatccgatg 2340 ctcatgaagc cagatctgcc cggtatcggc gcagaaatgg atgcgcgacc ggagatcacc 2400 gccgtctatg cagctcatat cgtgaagctt ggccatcacc cttcctcctg aactggtcct 2460 tttacgcgca gccaccacgg gtcgtattga cgtgcgtcaa acggcccggc gcgcgactgc 2520 gcagcgccgg aaacgaagag aagcccctgc gttcatctaa tggtcaatcc tgcagccggc 2580 cggaaggaga actgatcatt tgatgaatcg catccaatgg ccgctttttc caattacccg 2640 gcacaaacgc cccgccagaa atttattttt tgcaactgca tgaaatgctc gaaaggcctg 2700 cacaacgggc aaacagcgct cccggcgtat gcgcccgaag gctgaattgc tgctctgccg 2760 caattaatcg tggcacaccc tttgcattgg atgcctggca ggcgtcgtcc aacaaatccg 2820 gtcgcaacga tcgacaacgg aaatagcaaa ggaggggcat cagatgaagt ttcctgttcc 2880 gcacgacatc caggccaaga cgattccggg gaccgaaggc tgggagcgga tgtacccgta 2940 ccactaccag ttcgtcaccg acgatccgca gcgtaaccag tacgagaaag aaaccttctg 3000 gttttacgac ggattgcatt acccggagcc gctttatccg ttcgacacga tctgggacga 3060 ggcctggtat ctcgccctgt cgcaattcaa caatcgaatt ttccaggtgc cgccggtgcg 3120 cggcgtcgat caccggatca tcaacggtta cgtctatatc tcgccggttc cgatcaagga 3180 ccccgatgaa atcggcaagc gcgtgcccaa tttcatggag cgcgccggtt tctattacaa 3240 gaactgggac gagctcgagg cgaaatggaa agtgaagatg 
gaggcgacga tcgccgagct 3300 cgaagcgctc gaggttccgc gcctgcccga cgccgaagac atgtcggtgg tgaccgaagg 3360 agtcggtgaa tcgaaggcct accacctgct caagaattac gacgacctga tcaacctcgg 3420 catcaagtgc tggcaatacc acttcgaatt cctcaatctt ggctatgccg cctacgtttt 3480 cttcatggat ttcgcgcaga agctgtttcc gagcattccg ctccagcgcg tcacccagat 3540 ggtgtcgggg atcgacgtca tcatgtaccg cccggacgac gaactgaagg aactggcaaa 3600 gaaggccgtt tcactcgaag tcgatgaaat cgtcaccggc catcgggagt ggagcgacgt 3660 caaggcggcg ctttcggcac accgccacgg tgccgaatgg ctcgaagcat tcgagaaatc 3720 ccgctacccg tggttcaaca tttcgaccgg cacgggatgg ttccataccg accgcagctg 3780 gaacgacaac ctcaacattc cgctcgacgg catccagacc tatatcggca agcttcacgc 3840 cggcgtcgcc atcgagcggc cgatggaagc ggtccgtgcc gagcgcgacc ggatcaccgc 3900 cgagtaccgc gatctgatcg acagcgacga ggaccgcaag cagttcgacg aactgctcgg 3960 ctgcgcccgg acggtgttcc cctacgtcga gaaccatctg ttctacgtcg agcactggtt 4020 ccactcggtg ttctggaaca agatgcgcga agtcgctgcg atcatgaaag aacactgcat 4080 gatcgacgac attgaagaca tctggtatct gcgccgcgat gaaatcaagc aggcgctgtg 4140 ggatctggtc accgcctggg caaccggcgt cacccctcgc ggcaccgcca cctggccggc 4200 cgaaatcgaa tggcgcaagg gggtgatgca gaagttccgc gaatggagcc cgccgccggc 4260 catcggcatc gcaccggaag tgatccagga gcccttcacc atcgtgctct ggggggtcac 4320 caacagctcg ctctcggcct gggccgccgt ccaggaaatc gacgaccccg acagcatcac 4380 cgagctgaaa ggcttcgccg ccagcccggg cacggtcgaa ggcaaggcgc gcgtgtgccg 4440 cagcgccgaa gacatccgcg acctgaagga gggcgaaatt ctcgtcgccc cgaccacctc 4500 gccttcgtgg gcgccggcct tcgccaagat caaggcctgc gtcaccgatg tcggcggcgt 4560 catgagccat gccgcgatcg tatgccgcga atacggcatg ccggcggtgg tgggcaccgg 4620 gctatcgacc cgtgtggtcc gcaccggcat gacgctgcgg gtcgatggtt cgagcgggct 4680 gatcacgatc atcacggatt gagggagtga ctgacatggg aagtatcgtt tccaccgtag 4740 ccctgtccgc ggccaccgcc gacagcactt cgccgaaggt ctgcccgttc gaggcctgcg 4800 gcaaggactc ggtcccgctg gtgggcggca agtgcgcgtc cctgggcgaa ctgatcaacg 4860 ccggcgtacg ggtgccgccg ggctttgccc tgaccaccag cggctatgcc cagttcatgc 4920 gtgaagccgg catccaggcg gacatcggcg cgctgctcga aggcctcgac 
caccaggaca 4980 tggacaagct cgaggaagca tcgagggcga tccgcgaaat gatcgaatcg cgcccgatgc 5040 cgatcgagct cgaagacctg atcgccgagg cctaccgcaa gctgtcggtc cgctgctatc 5100 tgcccgcggc gccggtggcg gtgcgttcga gcgcgaccgc cgaggacctg cccggtgcga 5160 gctttgccgg ccagcaggat acctacctgt ggatccgcgg cgtcgatgac ctcatccacc 5220 acgtccggcg ctgcatctcc agcctctaca ccggccgggc gatcgcctac cggatgaaga 5280 tgggcttccc gcacgagcag gtcgcgatca gcgtcggcgt ccagatgatg gcgaacgcct 5340 acaccgcggg ggtgatgttc acgatccatc cgggcaccgg cgaccgctcg gtgatcgtca 5400 tcgattcgaa tttcggcttc ggtgaatccg tggtgtcggg cgaagtcacg ccggacaact 5460 tcgtcgtcaa caaggtcacc ctcgacatca tcgagcgcac gatttcgacg aaggagctgt 5520 gccacaccgt cgatctgaag acccagaaat cagtcgcact tccggtccct gccgagcgcc 5580 agaacatcca gtcgattacc gatgacgaaa tcagcgaact cgcctgggcc gccaagaaga 5640 tcgaaaagca ttacggccgc ccgatggaca tcgaatgggc gatcgacaag aacctgcccg 5700 cggacggaaa cattttcatc ctccaggccc ggcccgaaac gatctggagc aaccgccaga 5760 aagccagcgc gacgaccggc agcacgtcgg cgatggatta catcgtatcg agcctgatca 5820 cgggcaagcg gctcggctag gaggacgaaa aaatgatcgt acgcaactgg atgcagacca 5880 atccgatcgt gctcaccggg gacaccttgc tgtccgaagc gaagcggatc ttttccgaag 5940 ccaatatcca cgcattaccg gtcgtcgatg acggccgcct gcgcggactc atcacccgcg 6000 ccggctgcct gcgggccgcg catgccgcgc tgcggaccca ggacaccgac gagctcaact 6060 acttctcgaa ccgggtcaag gtcaaggaca tcatggtccg caacccggcc accatcgatg 6120 ccgacgacac gatggaacac tgcctgcagg tcggccagga acacggcgtc ggccaattgc 6180 cggtgatgga caaaggcaat gtcgtcggaa tcatttcggc aatcgaaatg ttctcgctgg 6240 cggcgcattt ccttggtgcc tgggaaaagc gcagcggcgt caccctggcc ccgatcgatc 6300 tcaagcaggg aaccatgggc cgcatcatcg acaccgtcga agccgccggc gccgaggtgc 6360 acgcgatcta cccgatctcg gcccatgaca gggagtccgc ctcggccagg cgggagcgga 6420 aagtgatcat ccgcttccac gccgcgaacg tcgcggcagt catcgaggcg ctcgcccacg 6480 ccggctacga agtcatcgag gccgttcaag ccgcagcgca ttgagcccag ccccacccat 6540 cctgcctcac cccggtttca cccatttctg ccaaggagcg acacccatgg acctgcgcta 6600 cttcatcaac cagtgtgccg aagcccacga actgaagaga atcaccaccg aggtcgattg 
6660 gaatctggag atttcccatg tttccaagct gaccgaagag aaaaaaggcc cggcgctgct 6720 gttcgaaagc atcaagggct acgacacgcc ggtgttcacc ggggccttcg cgaccaccaa 6780 gcgcctcgcc gtcatgctcg gcctgccgca caacctgtcg ctgtgcgaat ccgcccagca 6840 atggatgaag aaaacgatca cctccgaagg gctgatcaag gcgaaggaag tgaaggacgg 6900 cccggtgctg gaaaacgtgc tcagcggcga caaggtcgat ctcaacatgt tcccggtgcc 6960 gaagttcttc cccctcgacg gcgggcgcta catcggcacg atggtatcgg tggtgctgcg 7020 tgatccggag acgggcgagg tcaacctcgg cacctaccgc atgcagatgc tcgacgacaa 7080 gcgctgcggg gtgcagatcc tgcccgggaa gcgcggcgaa cggatcatga aaaagtacgc 7140 caagatgggc aaaaagatgc ccgccgcggc gatcatcggc tgcgatccgc tgatcttcat 7200 gtccggcacg ctgatgcaca agggcgccag cgacttcgac attaccggca ccgtgcgcgg 7260 ccagcaggcc gagttcctga tggcgccgct gaccgggctg ccggtgccgg ccggggccga 7320 gatcgtgctc gaaggcgaga tcgatccgaa cgccttcctg cccgaaggcc cgttcgccga 7380 atacaccggc tactacaccg acgaactgca caagccgatc ccgaaaccgg tgctcgaagt 7440 gcagcagatc ctgcaccgca acagcccgat cctgtgggcc accggccagg gccgcccggt 7500 gaccgacgtc catatgctgc tcgccttcac ccggaccgcg accttgtgga ccgagctcga 7560 gcagatgcgc attcccggca tccagtcggt gtgcgtgatg ccggaatcga ccgggcgctt 7620 ctggtcggtg gtgtcggtca agcaggccta cccggggcac tcgcgccagg tggccgacgc 7680 ggtgatcgcc agcaacaccg gctcgtacgg catgaagggt gtgatcacgg tcgatgagga 7740 catccaggcc gacgatctgc agcgcgtgtt ctgggcgctg tcgtgccgct acgacccggc 7800 gcgcggcacc gagctgatca agcgcggccg ctcgacgccg ctcgatccgg cgctcgaccc 7860 gaacggcgac aagctcacca cgtcgcggat cctgatggac gcctgcatcc cctacgagtg 7920 gaagcagaag ccggtcgaag cgcgcatgga cgaagagatg ctggcgaaga tccgcgcccg 7980 ctggcacgag tacggcatcg actgagccct tagccgcatg acaaaccacg gccgccgatg 8040 gggcggccgt cactggagga catggagaca tggaacaggc gaagaacatc aagctggtga 8100 tcctcgacgt cgatggcgtg atgaccgacg ggcgcatcgt gatcaatgac gaaggcatcg 8160 agtcgcgcaa cttcgacatc aaggacggca tgggcgtgat cgtgctgcaa ctgtgcggcg 8220 tcgaggtcgc gatcatcacc tcgaagaaat ccggcgcggt gcgccatcgc gccgaggagc 8280 tgaagatcaa gcgcttccac gagggcatca agaagaagac cgagccctac gcgcagatgc 8340 
tcgaggagat gaacatctcc gatgccgaag tctgctacgt cggcgacgac ctcgtcgatc 8400 tgtcgatgat gaagcgcgtc ggcctggccg tggcggtcgg tgacgccgtg gccgacgtca 8460 aggaagtggc cgcttatgtg acgactgcgc gcggcgggca cggcgcggtg cgcgaagtcg 8520 cggagctgat cctgaaagcg cagggcaagt gggacgcgat gctctcgaag atccattgat 8580 tcatccgcat gacatccatc gacaaggaga tcgacatggg aaagatttca gcaccgaaaa 8640 acaaccgtga attcatcgag gcatgcgtca agtccggcga tgcggtccgg atcagacagg 8700 aagtggactg ggacaacgag gccggcgcca tcgtgcgccg cgcctgcgag ctcgccgaag 8760 ccgccccgtt catggagaac atcaaggact accccggctt cagctacttc ggcgcgccgc 8820 tgtcgaccta ccgccgcatg gcgatctcgc tcggcatgga cccggcatcg accttgccgc 8880 agatcggcgc cgagtacctc aaacgtacca acagcgagcc cgtggcgccg gtgatcgtcg 8940 acaaacggga cgccccgtgc aaggagaaca tcctgctcgg cgccgacgtc gatctgacca 9000 agctgccggt accgctggtc catgacggcg acggcggccg ctacgtcggc acctggcacg 9060 cggtgatcac caagcacccg gtgcgcggcg acgtgaactg gggcatgtac cggcagatga 9120 tgtgggacgg ccgcacgatg tcgggcgccg tgttcccgtt ctcggatctg ggcaaggcgc 9180 tcaccgagta ctacctgccg cgcggcgagg gctgcccgtt cgcgaccgcg atcggcctgt 9240 cgccgctcgc cgcgatggcc gcctgcgcgc cctctccgat ccccgagccc gagctcaccg 9300 gcatgctcgc cggcgagccg gtgcgcctgg tgaagtgcga gaccaacgac ctcgaagtcc 9360 cggccgatgc cgagatcatc atcgagggcg tgatcctgcc cgactacaag gtcgaggaag 9420 gcccgttcgg cgaatacacc ggctaccgca ccagcccgcg cgacttccgc gtcaccttcc 9480 gcgtcgatgc gatcacctat cgcaacaacg cgacgatgac gatctcgaac atgggcgtgc 9540 cgcaggacga gggccagctg ctgcgctcgt tctcgctcgg gctcgaactc gagaagctgc 9600 tgaagagcca gggtatcccg gtgaccggcg tgtacatgca cccgcgctcg acccaccaca 9660 tgatgatcgt cggcgtgaag ccgacctacg ccggcatcgc gatgcagatc gcgcagctcg 9720 cgttcggctc caagctcggg ccgtggttcc acatggtgat ggtggtcgac gaccagaccg 9780 acatcttcaa ctgggacgag gtctatcacg cgttctgcac gcgctgcaat ccggagcgcg 9840 gcatccacgt gttcaagaac accaccggca ccgccctcta tccgcacgcc accccgcacg 9900 accgcaagta ctcgatcggc tcgcaggtgc tgttcgattg cctgtggccg gtcgattggg 9960 acaagaccaa cgacgtgccg acgctcgtca gcttcaagaa cgtctatccg aaggacatcc 10020 aggaaaaggt 
cacgaacaac tggaccgact acggcttcaa gccggtgaaa taaggagacg 10080 caacatgaac cagtgggaag tattcgtcat ggacccggcg gaactgccgg aaggcaagca 10140 gctcgagctg agcgtgcgca ccctcaaccc cgggctgaag aaatacacct atcagcgcgt 10200 cagggctgaa gtgtcacccg cgctcgacaa gttccccgac cagctccagg tccggctcgg 10260 gcgcggccag ctgagccccc agcgcttctc gatccgcatc atcgagaccg tccagcgcat 10320 gccggccaag tacctgtagt gacggcggac ggcgccgggc aactgcctct gcccggcgcc 10380 ggaagcgtga ccgccgcctt ttgtccgccc gcggcagcgc cgcggccggc actcaacccg 10440 ctaaagcatt gggggaacga tggcctattc cgatctgcgt gccttcctcg ccgacctcgg 10500 tgacgacttg ctgcgcatcc gcgatgagtt cgacccgcgc ttcgaagcgg cagccttgct 10560 ccgcaccctc cccgccgaag ggccggccgt gctgttcgag aacgtccgcg cctaccccgg 10620 cgcacgcatc gccggcaacc tgatcgccag ccgcagccgc ctggcgcgcg cactcggcac 10680 caccgccgac gcgctgccgc ggacctggct ggagcgcaag gagcacggca ttgcaccgat 10740 ccaggcgcgg gacgcggccc cggtcaaagg aagtgatcca ccgccatccg gacgatctgc 10800 tgtcgctgct gccgatcctg acccaccacg aaaaggatgc ggcccccttc atcaccaccg 10860 gcgtggtgtt gtgcaccgac cccgagaccg gccggcgcgg catgggcatc caccgcatga 10920 tggtcaaggg cgggcgccgg ctcggcatcc tgctcgccaa tccgccgatt ccgcatttcc 10980 tcgccaaggc cgaagcggcc ggcaagccgc tcgatgtcgc catcgcgctc ggtctcgaac 11040 ccgccaccct gctgtcgtcg gtggtcaagg tcggcccgcg ggtgcccgac aagatggccg 11100 ctgccggcgc cctgcgtggc gaaccggtcg agctggtgcg cgccgaaacg gtggatgtgg 11160 acatcccggc gcgcgccgaa atcgtcatcg aaggccggat tctgccgggc gtgcgcgaac 11220 tcgagggccc gttcggggag aacaccgggc actatttttc caacgtcagc ccggtcatcg 11280 agatcagcgc cgtcacccat cgcgacaact tcatctaccc gggcctgtgc ccatggtcgc 11340 ccgaggtcga tgcgctgctg tcgctggcgg ccggtgccga attgctcggc cagttgcagg 11400 ggctgatcga cggcgtcgtc gatctggaga tggccggcgg caccagcggc ttttccgtgg 11460 ttgtcgcagt ccatcggacc actgcggccg acgtcagacg gctggtcatg ctcgcgctca 11520 atctcgaccg ccgcctgaag acgatcaccg tcgtcgacga cgacgtcgac atccgcgacc 11580 cgcgcgaagt cgcctgggcc atggctaccc gctaccagcc cgcccgggac acggtcgtga 11640 tccacggctg cgaagcctat gtcatcgatc cttcggcgac cggggacggc acatcgaaag 
11700 tcgggttcat cgccacccgt gccagcggcg cggactcgga ccgcatcacc ctgccgccgg 11760 cagcgctcgc gaaggcgcgc gccatcatcg ccagactgca ttgaacaggg agcaagccat 11820 gagaatcgtc gtcggaatgt ccggtgccag cggtgcgatc tacggcatcc ggatcctcga 11880 ggcactacag cgcatcggtg tcgaaaccga cctggtgatg tcggattcgg ccaagcggac 11940 catcgcatac gaaacggact attcgatcag cgacttgaag ggactcgcga cctgcgtcca 12000 tgacatcaat gatgtcgggg cgtcgatcgc cagcggctcg ttccgccatg ccggcatgat 12060 catcgcgccc tgttcgatca agaccctgtc cgcagtcgcc aactcgttca acacgaatct 12120 gttgatccgc gccgccgacg tcgcgttgaa ggagcggcgc aagctcgtgc tgatgctgcg 12180 cgagacgccg ctgcacctgg gccacctgcg cctgatgacc caggccacgg agaacggcgc 12240 ggttctcctc cctcccctgc ccgcgttcta ccaccgcccc aagacgctcg acgacatcat 12300 caaccagtcg gtgacgaaag tgctcgacca gttcgatctc gacgtcgatc tcttcgggcg 12360 gtggacgggc aacgaagaac gcgaactggc gaaatcccga taggacgctt ccgatgccac 12420 cgatcgccct tcccctgtca ctcgaaggcg tcgtctgcac gggactcggt gcaggcgcgc 12480 agttcaccac cctcgactgg gtcgtcgatg aatgccggga aaagctcggc ttcatcccct 12540 ggcccggcac cttcaacgtg aggacgcagg gcgcgcttgc gggcgtggac cgcacccgcc 12600 tcctgcgctc gggatacagc atccgcatcc ggccggcgcc cggctactgt gccgcggaat 12660 gcctcgtggt caacatcgcg gggcggatct ccggcgcggt gctattccca gaggtgcccg 12720 gctacccgga cggccagctc gaaatcatcg ctccggtgcc ggtacgaaga accctcggcc 12780 tcaatgacgg cgaccgggtc aacctctcca tcggcatcag cacctccctt ttctgccggg 12840 cctgaacagt cgggagccgg caaacgtcag caaggagatt cacatggcac cgaagttctg 12900 cccgcaatgc ggcaccgccc tggtcctggc gacgatccat gggcgcgaac gtgaaacctg 12960 tccggcctgt ggcgaaacct ttttccacaa gcccgcgccc gtcgtgctgg cggtgatcga 13020 gcacgccggg caactcgtgc tgatccgccg caagctcgat ccgctcgccg gctactgggc 13080 accgccgggc ggctacgtcg aacgcggcga atcgctcgag gaggcggtcg tacgcgaggc 13140 gcgcgaggaa agcggactcg aggtcgccgt cgatgaactg atcggcgtgt attcgcaggc 13200 cgacgtgcgc gcggtgatcc tcgcctaccg cgcgcactcg atcggcggcg aaccggtcgc 13260 cggcgacgac gccggcgaga tctgcctcgt cgccccgggc cagctgccgg tgcagcgccc 13320 gccgcagagc ggcataccga tcgaacactg gtttttcagc 
gtagtggagg aagtcaccga 13380 tccatggaag tgggggcgcc gcaacagcgc caagaaaatg atgaggagat agaacgtgaa 13440 tatcatcgat acacccccga tcacccccga gatgccgcca aacctgctgg attacctgcg 13500 cggcggcgga cctgccctgc tgctgacgac gggcaccgac ggatacccga gctcggccta 13560 cacatgggca atcgccctcg acggcacgca cctgcgcttc ggcgcggacg agggcggctc 13620 cggctacgcc aacctggagc gcaccggaca ggccgcgata cacatcatcg gcccgaatga 13680 cctcgccttc ctcgtcaagg gaacggcacg tcttctcaag gcgcacatcg acactgcctc 13740 gcccgcgcgc atggcgctgt acgaactcga agtgatcgga gcccgcgatc agtccttccc 13800 cggcgtcacg gccaagccct tcacctatga atggccggcg gcgcagcgcg cggcgctgac 13860 gaagatggaa cagtcggtgt ttaccgaaat gcgcgaattc gcccagtgac aaaggccgca 13920 cgctcctgga ccccccattc aaaccttcag gaattttctc atgtcgtatt tcgaccagac 13980 caccgaaacc cttccccgcg aacgcctggc cgccctgcag ttcgacaagc tgcaggcgat 14040 gatgaacgag ctgtggggca ggaaccgctt ctacaccaac aagtggaaag ccgccggcgt 14100 cgaaccgggt gacatccgga cgctcgacga tctgcgcacc aactacgaag tcggcaacac 14160 ccaggccgtg ctcgacggcg acctcgacga cttcatcgcg gcaagcctga agcagggcgt 14220 ctgatccgct ggcgccgccc ctgcaggcgg gcggcgaatc ggttccgccg gc 14272 24 42 PRT Thauera aromatica 24 Gly Lys Ile Ser Ala Pro Lys Asn Asn Arg Glu Phe Ile Glu Ala Ser 1 5 10 15 Val Lys Ser Gly Asp Ala Val Arg Ile Arg Gln Glu Val Asp Trp Asp 20 25 30 Asn Glu Ala Gly Ala Ile Val Arg Arg Ala 35 40 25 26 PRT Thauera aromatica 25 Met Gly Lys Ile Ser Ala Pro Lys Asn Asn Arg Glu Phe Ile Glu Ala 1 5 10 15 Cys Val Lys Ser Gly Asp Ala Val Arg Ile 20 25 26 38 PRT Thauera aromatica UNSURE (10) Xaa = unknown 26 Met Asp Leu Arg Tyr Phe Ile Asn Gln Xaa Ala Glu Ala His Glu Leu 1 5 10 15 Lys Arg Ile Thr Thr Glu Val Asp Trp Asn Leu Glu Ile Ser His Val 20 25 30 Ser Lys Leu Xaa Xaa Glu 35 27 38 PRT Thauera aromatica 27 Met Asp Leu Arg Tyr Phe Ile Asn Gln Cys Ala Glu Ala His Glu Leu 1 5 10 15 Lys Arg Ile Thr Thr Glu Val Asp Trp Asn Leu Glu Ile Ser His Val 20 25 30 Ser Lys Leu Thr Glu Glu 35 28 33 PRT Thauera aromatica UNSURE (26)..(28) Xaa = unknown 28 Met Lys Phe Pro Val Pro 
His Asp Ile Gln Ala Lys Thr Ile Pro Gly Thr Glu Gly Trp Glu Arg Met Tyr Pro Xaa Xaa Xaa Ala Phe Val Xaa Asp

| SEQ ID NO | Length | Type | Source | Feature | Sequence |
|---|---|---|---|---|---|
| 29 | 33 | PRT | Thauera aromatica | | Met Lys Phe Pro Val Pro His Asp Ile Gln Ala Lys Thr Ile Pro Gly Thr Glu Gly Trp Glu Arg Met Tyr Pro Tyr His Tyr Gln Phe Val Thr Asp |
| 30 | 7 | PRT | Thauera aromatica | | Met Gln Met Leu Asp Asp Lys |
| 31 | 28 | PRT | Thauera aromatica | UNSURE (10)..(14), Xaa = unknown | Gly Gln Gln Ala Glu Phe Leu Met Ala Xaa Xaa Xaa Xaa Xaa Pro Val Xaa Ala Gly Ala Glu Ile Val Leu Glu Xaa Gly Ile |
| 32 | 21 | DNA | Primer | | atggayctsc gstacttcat c |
| 33 | 20 | DNA | Primer | | ttrtcrtcsa gcatctgcat |
| 34 | 21 | DNA | Primer | | catsaggaay tcsgcctgct g |
| 35 | 22 | DNA | Primer | | cgggatatca ctcagcataa tg |
| 36 | 20 | DNA | Primer | | aattaaccct cactaaaggg |
| 37 | 17 | DNA | Primer | | gacaacttcg tcgtcaa |
| 38 | 20 | DNA | Primer | | gtggatattg gcttcggaaa |
| 39 | 18 | DNA | Primer | | tcgccggcga cgacgccg |
| 40 | 18 | DNA | Primer | | ccgcgcgctg cgccgccg |
| 41 | 10 | PRT | Thauera aromatica | | Met Glu Gln Ala Lys Asn Ile Lys Leu Val |
| 42 | 10 | PRT | Thauera aromatica | | Met Glu Gln Ala Lys Asn Ile Lys Leu Val |
| 43 | 10 | PRT | Thauera aromatica | UNSURE (8), Xaa = unknown | Met Arg Ile Val Val Gly Met Xaa Gly Ala |
| 44 | 10 | PRT | Thauera aromatica | | Met Arg Ile Val Val Gly Met Ser Gly Ala |
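The primers of SEQ ID NO:32 to SEQ ID NO:34 contain IUPAC ambiguity codes (y, s, r), so each stands for a small pool of concrete oligonucleotides. The following is a minimal sketch, not taken from the application, of how such a pool can be enumerated and checked against the peptide sequences in the listing. The function names (`expand`, `reverse_complement`, `translate`) are illustrative assumptions, and the standard genetic code is assumed.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, lowercase as in the sequence listing.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}
# Complement of every code; ambiguity codes complement to ambiguity codes.
COMPLEMENT = str.maketrans("acgtryswkmbdhvn", "tgcayrswmkvhdbn")

# Standard genetic code in the conventional t, c, a, g ordering (assumed).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(primer):
    """Return every concrete sequence a degenerate primer stands for."""
    return ["".join(p) for p in product(*(IUPAC[ch] for ch in primer))]


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


SEQ32 = "atggayctscgstacttcatc"  # SEQ ID NO:32, spaces removed
SEQ33 = "ttrtcrtcsagcatctgcat"   # SEQ ID NO:33, spaces removed

# Peptides encoded by every member of each primer pool.
forward = {translate(s) for s in expand(SEQ32)}
reverse = {translate(reverse_complement(s)) for s in expand(SEQ33)}

print(len(expand(SEQ32)), forward)  # 8 {'MDLRYFI'}
print(len(expand(SEQ33)), reverse)  # 8 {'MQMLDD'}
```

Under these assumptions, all eight variants of SEQ ID NO:32 encode Met Asp Leu Arg Tyr Phe Ile, the first seven residues of SEQ ID NO:27. The reverse complement of SEQ ID NO:33 encodes Met Gln Met Leu Asp Asp, followed by a partial codon that is consistent with the final Lys of SEQ ID NO:30.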
Claims (16)
1. A polypeptide encoded by DNA selected from the group consisting of:
(a) DNA having the nucleotide sequence shown in SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6;
(b) a degenerate nucleotide sequence of the DNA of (a); and
(c) DNA that hybridizes with the complement of the nucleotide sequence of (a) or analog thereof under hybridization conditions of 6× SSC (1 M NaCl), 40 to 45% formamide, 1% SDS at 37° C., and a wash in 0.5× to 1× SSC at 55 to 60° C., wherein the polypeptide is further characterized by phosphorylase activity on phenol substrates.
2. The polypeptide of claim 1 having the amino acid sequence of SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5.
3. An isolated nucleic acid fragment encoding the polypeptide of claim 1, the nucleic acid fragment selected from the group consisting of:
(a) an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence of SEQ ID NO:2; SEQ ID NO:4; or SEQ ID NO:6;
(b) an isolated nucleic acid fragment that is substantially similar to an isolated nucleic acid fragment encoding all or a substantial portion of the amino acid sequence SEQ ID NO:2; SEQ ID NO:4; or SEQ ID NO:6;
(c) an isolated nucleic acid molecule that hybridizes with the nucleic acid fragment of (a) under hybridization conditions of 6× SSC (1 M NaCl), 40 to 45% formamide, 1% SDS at 37° C., and a wash in 0.5× to 1× SSC at 55 to 60° C.; and
(d) an isolated nucleic acid fragment that is complementary to (a), (b), or (c),
wherein the isolated nucleic acid is further characterized by phosphorylase activity on phenol substrates.
4. The DNA fragment of claim 3, wherein the DNA fragment is isolated from Thauera aromatica.
5. An expression cassette comprising the DNA fragment of claim 3 operably linked to suitable signal sequences for the expression of the DNA fragment in a host microorganism.
6. An expression vector comprising the expression cassette of claim 5 and regulatory sequences ensuring the stable maintenance of said expression vector.
7. A microorganism stably transformed with the DNA fragment of claim 3.
8. A transformed microorganism comprising the expression vector of claim 6.
9. A transformed microorganism comprising the expression cassette of claim 5, wherein the signal sequences of the expression cassette are a ribosome binding site and a promoter sequence located upstream of the DNA fragment.
10. The transformed microorganism of claim 9, wherein the promoter is at least one of CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI, AOX1, lac, trp, λPL, λPR, T7, tac, and trc or at least one strong promoter of Corynebacterium, Comamonas, Rhodococcus or Pseudomonas.
11. The transformed microorganism of claim 9, wherein the ribosome binding site is selected from the group consisting of ribosome binding sites from the genomes of E. coli, P. pastoris, Comamonas, Pseudomonas, Rhodococcus, and Corynebacterium.
12. The transformed microorganism of claim 11, wherein the host microorganism is selected from the group consisting of Comamonas sp., Corynebacterium sp., Brevibacterium sp., Rhodococcus sp., Azotobacter sp., Citrobacter sp., Enterobacter sp., Clostridium sp., Klebsiella sp., Salmonella sp., Lactobacillus sp., Aspergillus sp., Saccharomyces sp., Zygosaccharomyces sp., Pichia sp., Kluyveromyces sp., Candida sp., Hansenula sp., Dunaliella sp., Debaryomyces sp., Mucor sp., Torulopsis sp., Methylobacteria sp., Bacillus sp., Escherichia sp., Pseudomonas sp., Rhizobium sp., and Streptomyces sp.
13. An isolated and purified DNA fragment having a nucleotide sequence SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6.
14. An isolated and purified 14.27 kb DNA fragment as shown in FIG. 11.
15. A microorganism stably transformed with chimeric genes having at least one copy of one or more of nucleotide sequences selected from the group consisting of SEQ ID NOs:6, 12, 14, 4, 8, 2, 16, 10, 18, and 20.
16. A microorganism stably transformed with a chimeric gene having at least one copy of the nucleic acid sequence of SEQ ID NO:23.
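The hybridization conditions recited in claims 1 and 3 (about 1 M NaCl, 40 to 45% formamide, 37° C.) can be related to duplex stability with the widely used empirical estimate for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 0.61 (% formamide) - 500/L. The sketch below is illustrative only and is not part of the claims. The 65% GC content and the 500 bp probe length are hypothetical inputs chosen for the example, and the function name `estimate_tm` is an assumption.

```python
import math


def estimate_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Empirical melting temperature (deg C) of a long DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


# Claimed hybridization buffer: about 1 M NaCl, 40 to 45% formamide, 37 deg C.
# GC content (65%) and probe length (500 bp) are hypothetical example values.
HYB_TEMP_C = 37.0
for formamide in (40.0, 45.0):
    tm = estimate_tm(na_molar=1.0, gc_percent=65.0,
                     formamide_percent=formamide, length_bp=500)
    print(f"{formamide:.0f}% formamide: Tm ~ {tm:.2f} C, "
          f"hybridization runs {tm - HYB_TEMP_C:.2f} C below Tm")
```

For these example inputs the estimate falls between roughly 80 and 83° C., so hybridization at 37° C. would sit far below the Tm of a perfect duplex. Most of the discrimination against mismatched hybrids would then come from the 55 to 60° C. wash in lower-salt buffer. Different GC content or probe length shifts these numbers.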
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/870,162 US20020042118A1 (en) | 1999-03-05 | 2001-05-30 | Phenol-induced proteins of Thauera aromatica |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12295299P | 1999-03-05 | 1999-03-05 | |
| US09/516,914 US6333401B1 (en) | 1999-03-05 | 2000-03-01 | Phenol-induced proteins of Thauera aromatica |
| US09/870,162 US20020042118A1 (en) | 1999-03-05 | 2001-05-30 | Phenol-induced proteins of Thauera aromatica |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/516,914 Division US6333401B1 (en) | 1999-03-05 | 2000-03-01 | Phenol-induced proteins of Thauera aromatica |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20020042118A1 (en) | 2002-04-11 |
Family
ID=22405870
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/516,914 Expired - Fee Related US6333401B1 (en) | 1999-03-05 | 2000-03-01 | Phenol-induced proteins of Thauera aromatica |
| US09/870,162 Abandoned US20020042118A1 (en) | 1999-03-05 | 2001-05-30 | Phenol-induced proteins of Thauera aromatica |
Family Applications Before (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/516,914 Expired - Fee Related US6333401B1 (en) | 1999-03-05 | 2000-03-01 | Phenol-induced proteins of Thauera aromatica |
Country Status (5)
| Country | Link |
|---|---|
| US (2) | US6333401B1 (en) |
| EP (1) | EP1157116A1 (en) |
| AU (1) | AU3248900A (en) |
| CA (1) | CA2360935A1 (en) |
| WO (1) | WO2000052170A1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107828692B (en) * | 2017-11-28 | 2020-01-21 | 广东省生态环境技术研究所 | Terres tarum and preparation and application of microbial agent thereof |
2000
- 2000-03-01 US US09/516,914 patent/US6333401B1/en not_active Expired - Fee Related
- 2000-03-03 EP EP00910389A patent/EP1157116A1/en not_active Withdrawn
- 2000-03-03 WO PCT/US2000/005460 patent/WO2000052170A1/en not_active Ceased
- 2000-03-03 CA CA002360935A patent/CA2360935A1/en not_active Abandoned
- 2000-03-03 AU AU32489/00A patent/AU3248900A/en not_active Abandoned
2001
- 2001-05-30 US US09/870,162 patent/US20020042118A1/en not_active Abandoned
Also Published As
| Publication number | Publication date |
|---|---|
| EP1157116A1 (en) | 2001-11-28 |
| CA2360935A1 (en) | 2000-09-08 |
| US6333401B1 (en) | 2001-12-25 |
| AU3248900A (en) | 2000-09-21 |
| WO2000052170A1 (en) | 2000-09-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| EP1706457B1 (en) | Production of 3-hydroxypropionic acid using beta-alanine/pyruvate aminotransferase | |
| US6660507B2 (en) | Genes involved in isoprenoid compound production | |
| AU2015266785B2 (en) | Increasing lipid production and optimizing lipid composition | |
| KR20150108367A (en) | Recombinant cell, and method for producing 1,4-butanediol | |
| CN101720356A (en) | Enzyme for preparing methylmalonyl-CoA or ethylmalonyl-CoA and use thereof | |
| US20070031951A1 (en) | Method for the production of resveratrol in a recombinant bacterial host cell | |
| US6908992B2 (en) | Methanotrophic carbon metabolism pathway genes and enzymes | |
| WO2001055342A9 (en) | Synthetic genes for enhanced expression | |
| US6951751B2 (en) | DNA and amino acid sequences of a tyrosine-inducible tyrosine ammonia lyase enzyme from the yeast Trichosporon cutaneum | |
| EP1358333B1 (en) | Gene encoding the gumd polypeptide from methylomonas sp. involved in exopolysaccharide production | |
| US6333401B1 (en) | Phenol-induced proteins of Thauera aromatica | |
| US7057030B2 (en) | Rhodococcus gene encoding aldoxime dehydratase | |
| US20030170653A1 (en) | Biological method for the production of tuliposide A and its intermediates | |
| JP2006501819A (en) | DNA and amino acid sequences of tyrosine ammonia lyase enzyme from the bacterium Rhodobacter sphaeroides | |
| WO2000052170A9 (en) | PHENOL-INDUCED PROTEINS OF THAUERA AROMATICA | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |