US20220243237A1 - Sialyltransferases and uses thereof
- Publication number
- US20220243237A1 (application US17/688,900)
- Authority
- US
- United States
- Prior art keywords
- seq
- δ20bstc
- lactose
- enzyme
- bacterium
- Prior art date
- Legal status (an assumption, not a legal conclusion)
- Abandoned
Links
- 108090000141 Sialyltransferases Proteins 0.000 title claims description 116
- 102000003838 Sialyltransferases Human genes 0.000 title claims description 115
- 241000894006 Bacteria Species 0.000 claims abstract description 148
- 238000000034 method Methods 0.000 claims abstract description 47
- 229920001542 oligosaccharide Polymers 0.000 claims abstract description 47
- 150000002482 oligosaccharides Chemical class 0.000 claims abstract description 44
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 33
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 27
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 27
- 108090000790 Enzymes Proteins 0.000 claims description 98
- 102000004190 Enzymes Human genes 0.000 claims description 97
- 239000008101 lactose Substances 0.000 claims description 87
- OIZGSVFYNBZVIK-FHHHURIISA-N 3'-sialyllactose Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)O[C@@H]1[C@@H](O)[C@H](O[C@H]([C@H](O)CO)[C@H](O)[C@@H](O)C=O)O[C@H](CO)[C@@H]1O OIZGSVFYNBZVIK-FHHHURIISA-N 0.000 claims description 85
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 79
- 241000588724 Escherichia coli Species 0.000 claims description 72
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 claims description 70
- 235000001014 amino acid Nutrition 0.000 claims description 69
- 230000035772 mutation Effects 0.000 claims description 64
- 150000001413 amino acids Chemical class 0.000 claims description 59
- 230000000694 effects Effects 0.000 claims description 56
- 108010005774 beta-Galactosidase Proteins 0.000 claims description 53
- 238000006467 substitution reaction Methods 0.000 claims description 45
- 238000004519 manufacturing process Methods 0.000 claims description 40
- 230000001580 bacterial effect Effects 0.000 claims description 32
- 230000003834 intracellular effect Effects 0.000 claims description 29
- 230000002829 reductive effect Effects 0.000 claims description 27
- TYALNJQZQRNQNQ-JLYOMPFMSA-N alpha-Neup5Ac-(2->6)-beta-D-Galp-(1->4)-beta-D-Glcp Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)OC[C@@H]1[C@H](O)[C@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O)[C@H](O)O[C@@H]2CO)O)O1 TYALNJQZQRNQNQ-JLYOMPFMSA-N 0.000 claims description 26
- DVGKRPYUFRZAQW-UHFFFAOYSA-N 3 prime Natural products CC(=O)NC1OC(CC(O)C1C(O)C(O)CO)(OC2C(O)C(CO)OC(OC3C(O)C(O)C(O)OC3CO)C2O)C(=O)O DVGKRPYUFRZAQW-UHFFFAOYSA-N 0.000 claims description 24
- 230000037430 deletion Effects 0.000 claims description 23
- 238000012217 deletion Methods 0.000 claims description 23
- 230000014509 gene expression Effects 0.000 claims description 23
- 101150066555 lacZ gene Proteins 0.000 claims description 15
- 108010060845 lactose permease Proteins 0.000 claims description 15
- 239000013612 plasmid Substances 0.000 claims description 15
- 101150072314 thyA gene Proteins 0.000 claims description 13
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 11
- 238000003780 insertion Methods 0.000 claims description 11
- 230000037431 insertion Effects 0.000 claims description 11
- 241000589875 Campylobacter jejuni Species 0.000 claims description 10
- TYALNJQZQRNQNQ-UHFFFAOYSA-N α2,6-sialyllactose Natural products O1C(C(O)C(O)CO)C(NC(=O)C)C(O)CC1(C(O)=O)OCC1C(O)C(O)C(O)C(OC2C(C(O)C(O)OC2CO)O)O1 TYALNJQZQRNQNQ-UHFFFAOYSA-N 0.000 claims description 9
- 108010035265 N-acetylneuraminate synthase Proteins 0.000 claims description 9
- 102100029954 Sialic acid synthase Human genes 0.000 claims description 9
- 101710091363 UDP-N-acetylglucosamine 2-epimerase Proteins 0.000 claims description 9
- 230000002255 enzymatic effect Effects 0.000 claims description 9
- 241000193830 Bacillus <bacterium> Species 0.000 claims description 7
- SQVRNKJHWKZAKO-LUWBGTNYSA-N N-acetylneuraminic acid Chemical compound CC(=O)N[C@@H]1[C@@H](O)CC(O)(C(O)=O)O[C@H]1[C@H](O)[C@H](O)CO SQVRNKJHWKZAKO-LUWBGTNYSA-N 0.000 claims description 7
- 239000012228 culture supernatant Substances 0.000 claims description 7
- CILYIEBUXJIHCO-UHFFFAOYSA-N 102778-91-6 Natural products O1C(C(O)C(O)CO)C(NC(=O)C)C(O)CC1(C(O)=O)OC1C(O)C(OC2C(C(O)C(O)OC2CO)O)OC(CO)C1O CILYIEBUXJIHCO-UHFFFAOYSA-N 0.000 claims description 6
- 241000194033 Enterococcus Species 0.000 claims description 6
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 claims description 6
- 241000186660 Lactobacillus Species 0.000 claims description 6
- CILYIEBUXJIHCO-UITFWXMXSA-N N-acetyl-alpha-neuraminyl-(2->3)-beta-D-galactosyl-(1->4)-beta-D-glucose Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)O[C@@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O)[C@H](O)O[C@@H]2CO)O)O[C@H](CO)[C@@H]1O CILYIEBUXJIHCO-UITFWXMXSA-N 0.000 claims description 6
- OIZGSVFYNBZVIK-UHFFFAOYSA-N N-acetylneuraminosyl-D-lactose Natural products O1C(C(O)C(O)CO)C(NC(=O)C)C(O)CC1(C(O)=O)OC1C(O)C(OC(C(O)CO)C(O)C(O)C=O)OC(CO)C1O OIZGSVFYNBZVIK-UHFFFAOYSA-N 0.000 claims description 6
- 241000588912 Pantoea agglomerans Species 0.000 claims description 6
- 235000004279 alanine Nutrition 0.000 claims description 6
- FCIROHDMPFOSFG-LAVSNGQLSA-N disialyllacto-N-tetraose Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)OC[C@@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O[C@]3(O[C@H]([C@H](NC(C)=O)[C@@H](O)C3)[C@H](O)[C@H](O)CO)C(O)=O)[C@@H](O)[C@@H](CO)O2)O)[C@@H](NC(C)=O)[C@H](O[C@@H]2[C@H]([C@H](O[C@H]3[C@@H]([C@@H](O)C(O)O[C@@H]3CO)O)O[C@H](CO)[C@@H]2O)O)O1 FCIROHDMPFOSFG-LAVSNGQLSA-N 0.000 claims description 6
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 claims description 6
- 229940039696 lactobacillus Drugs 0.000 claims description 6
- 241000194017 Streptococcus Species 0.000 claims description 5
- 101150086432 lacA gene Proteins 0.000 claims description 5
- 102000030902 Galactosyltransferase Human genes 0.000 claims description 4
- 108060003306 Galactosyltransferase Proteins 0.000 claims description 4
- 244000199866 Lactobacillus casei Species 0.000 claims description 4
- 235000013958 Lactobacillus casei Nutrition 0.000 claims description 4
- 229940017800 lactobacillus casei Drugs 0.000 claims description 4
- 239000003550 marker Substances 0.000 claims description 4
- 241000193755 Bacillus cereus Species 0.000 claims description 3
- 241000193752 Bacillus circulans Species 0.000 claims description 3
- 241000193749 Bacillus coagulans Species 0.000 claims description 3
- 241000193422 Bacillus lentus Species 0.000 claims description 3
- 241000194108 Bacillus licheniformis Species 0.000 claims description 3
- 241000194107 Bacillus megaterium Species 0.000 claims description 3
- 241000194106 Bacillus mycoides Species 0.000 claims description 3
- 244000063299 Bacillus subtilis Species 0.000 claims description 3
- 235000014469 Bacillus subtilis Nutrition 0.000 claims description 3
- 241000186000 Bifidobacterium Species 0.000 claims description 3
- 241000186016 Bifidobacterium bifidum Species 0.000 claims description 3
- 241001608472 Bifidobacterium longum Species 0.000 claims description 3
- 241000186015 Bifidobacterium longum subsp. infantis Species 0.000 claims description 3
- 241000193417 Brevibacillus laterosporus Species 0.000 claims description 3
- 241000588919 Citrobacter freundii Species 0.000 claims description 3
- 241000194031 Enterococcus faecium Species 0.000 claims description 3
- 240000001046 Lactobacillus acidophilus Species 0.000 claims description 3
- 235000013956 Lactobacillus acidophilus Nutrition 0.000 claims description 3
- 244000199885 Lactobacillus bulgaricus Species 0.000 claims description 3
- 235000013960 Lactobacillus bulgaricus Nutrition 0.000 claims description 3
- 241000218492 Lactobacillus crispatus Species 0.000 claims description 3
- 241000186606 Lactobacillus gasseri Species 0.000 claims description 3
- 240000002605 Lactobacillus helveticus Species 0.000 claims description 3
- 235000013967 Lactobacillus helveticus Nutrition 0.000 claims description 3
- 241001561398 Lactobacillus jensenii Species 0.000 claims description 3
- 240000006024 Lactobacillus plantarum Species 0.000 claims description 3
- 235000013965 Lactobacillus plantarum Nutrition 0.000 claims description 3
- 241000186604 Lactobacillus reuteri Species 0.000 claims description 3
- 241000186869 Lactobacillus salivarius Species 0.000 claims description 3
- 241000194036 Lactococcus Species 0.000 claims description 3
- 241000192041 Micrococcus Species 0.000 claims description 3
- 241000588701 Pectobacterium carotovorum Species 0.000 claims description 3
- 241000589516 Pseudomonas Species 0.000 claims description 3
- 241000589517 Pseudomonas aeruginosa Species 0.000 claims description 3
- 241000589540 Pseudomonas fluorescens Species 0.000 claims description 3
- 241000316848 Rhodococcus <scale insect> Species 0.000 claims description 3
- 241000204117 Sporolactobacillus Species 0.000 claims description 3
- 244000057717 Streptococcus lactis Species 0.000 claims description 3
- 235000014897 Streptococcus lactis Nutrition 0.000 claims description 3
- 241000520244 Tatumella citrea Species 0.000 claims description 3
- 241000589636 Xanthomonas campestris Species 0.000 claims description 3
- 101150114167 ampC gene Proteins 0.000 claims description 3
- 229940054340 bacillus coagulans Drugs 0.000 claims description 3
- 229940002008 bifidobacterium bifidum Drugs 0.000 claims description 3
- 229940004120 bifidobacterium infantis Drugs 0.000 claims description 3
- 229940009291 bifidobacterium longum Drugs 0.000 claims description 3
- 101150109249 lacI gene Proteins 0.000 claims description 3
- 229940039695 lactobacillus acidophilus Drugs 0.000 claims description 3
- 229940004208 lactobacillus bulgaricus Drugs 0.000 claims description 3
- 229940054346 lactobacillus helveticus Drugs 0.000 claims description 3
- 229940072205 lactobacillus plantarum Drugs 0.000 claims description 3
- 229940001882 lactobacillus reuteri Drugs 0.000 claims description 3
- 231100000241 scar Toxicity 0.000 claims description 3
- FNCPZGGSTQEGGK-DRSOAOOLSA-N 3'-Sialyl-3-fucosyllactose Chemical compound O[C@H]1[C@H](O)[C@H](O)[C@H](C)O[C@H]1O[C@H]([C@@H](O)C=O)[C@@H]([C@H](O)CO)O[C@H]1[C@H](O)[C@@H](O[C@]2(O[C@H]([C@H](NC(C)=O)[C@@H](O)C2)[C@H](O)[C@H](O)CO)C(O)=O)[C@@H](O)[C@@H](CO)O1 FNCPZGGSTQEGGK-DRSOAOOLSA-N 0.000 claims description 2
- 101710149180 Alpha-(1,4)-fucosyltransferase Proteins 0.000 claims description 2
- 241000194103 Bacillus pumilus Species 0.000 claims description 2
- 101000819503 Homo sapiens 4-galactosyl-N-acetylglucosaminide 3-alpha-L-fucosyltransferase 9 Proteins 0.000 claims description 2
- 101001022183 Homo sapiens 4-galactosyl-N-acetylglucosaminide 3-alpha-L-fucosyltransferase FUT5 Proteins 0.000 claims description 2
- 101001022175 Homo sapiens 4-galactosyl-N-acetylglucosaminide 3-alpha-L-fucosyltransferase FUT6 Proteins 0.000 claims description 2
- 101000862213 Homo sapiens Alpha-(1,3)-fucosyltransferase 11 Proteins 0.000 claims description 2
- 101000819497 Homo sapiens Alpha-(1,3)-fucosyltransferase 7 Proteins 0.000 claims description 2
- 241000186673 Lactobacillus delbrueckii Species 0.000 claims description 2
- 241000218588 Lactobacillus rhamnosus Species 0.000 claims description 2
- 241000520272 Pantoea Species 0.000 claims description 2
- QUOQJNYANJQSDA-MHQSSNGYSA-N Sialyllacto-N-tetraose a Chemical compound O1C([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)O[C@@H]1[C@@H](O)[C@H](OC2[C@H]([C@H](OC3[C@H]([C@H](O[C@H]([C@H](O)CO)[C@H](O)[C@@H](O)C=O)O[C@H](CO)[C@@H]3O)O)O[C@H](CO)[C@H]2O)NC(C)=O)O[C@H](CO)[C@@H]1O QUOQJNYANJQSDA-MHQSSNGYSA-N 0.000 claims description 2
- SFMRPVLZMVJKGZ-JRZQLMJNSA-N Sialyllacto-N-tetraose b Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)OC[C@@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O)[C@@H](O)[C@@H](CO)O2)O)[C@@H](NC(C)=O)[C@H](O[C@@H]2[C@H]([C@H](O[C@H]([C@H](O)CO)[C@H](O)[C@@H](O)C=O)O[C@H](CO)[C@@H]2O)O)O1 SFMRPVLZMVJKGZ-JRZQLMJNSA-N 0.000 claims description 2
- SXMGGNXBTZBGLU-UHFFFAOYSA-N sialyllacto-n-tetraose c Chemical compound OCC1OC(OC2C(C(OC(C(O)CO)C(O)C(O)C=O)OC(CO)C2O)O)C(NC(=O)C)C(O)C1OC(C(C(O)C1O)O)OC1COC1(C(O)=O)CC(O)C(NC(C)=O)C(C(O)C(O)CO)O1 SXMGGNXBTZBGLU-UHFFFAOYSA-N 0.000 claims description 2
- 230000002103 transcriptional effect Effects 0.000 claims description 2
- 102000005936 beta-Galactosidase Human genes 0.000 claims 6
- 108090000765 processed proteins & peptides Proteins 0.000 abstract description 37
- 102000004196 processed proteins & peptides Human genes 0.000 abstract description 36
- 229920001184 polypeptide Polymers 0.000 abstract description 35
- 108090000623 proteins and genes Proteins 0.000 description 149
- 235000018102 proteins Nutrition 0.000 description 54
- 102000004169 proteins and genes Human genes 0.000 description 54
- 229940024606 amino acid Drugs 0.000 description 52
- SQVRNKJHWKZAKO-UHFFFAOYSA-N beta-N-Acetyl-D-neuraminic acid Natural products CC(=O)NC1C(O)CC(O)(C(O)=O)OC1C(O)C(O)CO SQVRNKJHWKZAKO-UHFFFAOYSA-N 0.000 description 37
- 239000000047 product Substances 0.000 description 33
- 210000004027 cell Anatomy 0.000 description 32
- 230000015572 biosynthetic process Effects 0.000 description 31
- LFTYTUAZOPRMMI-CFRASDGPSA-N UDP-N-acetyl-alpha-D-glucosamine Chemical compound O1[C@H](CO)[C@@H](O)[C@H](O)[C@@H](NC(=O)C)[C@H]1OP(O)(=O)OP(O)(=O)OC[C@@H]1[C@@H](O)[C@@H](O)[C@H](N2C(NC(=O)C=C2)=O)O1 LFTYTUAZOPRMMI-CFRASDGPSA-N 0.000 description 30
- LFTYTUAZOPRMMI-UHFFFAOYSA-N UNPD164450 Natural products O1C(CO)C(O)C(O)C(NC(=O)C)C1OP(O)(=O)OP(O)(=O)OCC1C(O)C(O)C(N2C(NC(=O)C=C2)=O)O1 LFTYTUAZOPRMMI-UHFFFAOYSA-N 0.000 description 30
- 239000000203 mixture Substances 0.000 description 30
- SQVRNKJHWKZAKO-OQPLDHBCSA-N sialic acid Chemical compound CC(=O)N[C@@H]1[C@@H](O)C[C@@](O)(C(O)=O)OC1[C@H](O)[C@H](O)CO SQVRNKJHWKZAKO-OQPLDHBCSA-N 0.000 description 27
- 238000003786 synthesis reaction Methods 0.000 description 22
- 102100026189 Beta-galactosidase Human genes 0.000 description 21
- 239000012634 fragment Substances 0.000 description 21
- 235000020256 human milk Nutrition 0.000 description 16
- 210000004251 human milk Anatomy 0.000 description 16
- 101150006297 nagC gene Proteins 0.000 description 16
- 235000000346 sugar Nutrition 0.000 description 15
- 239000013604 expression vector Substances 0.000 description 13
- 101150117187 glmS gene Proteins 0.000 description 13
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 12
- SQVRNKJHWKZAKO-PFQGKNLYSA-N N-acetyl-beta-neuraminic acid Chemical compound CC(=O)N[C@@H]1[C@@H](O)C[C@@](O)(C(O)=O)O[C@H]1[C@H](O)[C@H](O)CO SQVRNKJHWKZAKO-PFQGKNLYSA-N 0.000 description 12
- 239000011159 matrix material Substances 0.000 description 12
- 239000002773 nucleotide Substances 0.000 description 12
- 230000037361 pathway Effects 0.000 description 12
- 125000003729 nucleotide group Chemical group 0.000 description 11
- 102100033341 N-acetylmannosamine kinase Human genes 0.000 description 10
- 108010076504 Protein Sorting Signals Proteins 0.000 description 10
- 239000013613 expression plasmid Substances 0.000 description 10
- 239000000758 substrate Substances 0.000 description 10
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 9
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 9
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 9
- 241000124008 Mammalia Species 0.000 description 9
- 241001465754 Metazoa Species 0.000 description 9
- 238000005481 NMR spectroscopy Methods 0.000 description 9
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 9
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 9
- 230000006652 catabolic pathway Effects 0.000 description 9
- 150000001875 compounds Chemical class 0.000 description 9
- 238000004128 high performance liquid chromatography Methods 0.000 description 9
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 9
- 241000894007 species Species 0.000 description 9
- 238000004809 thin layer chromatography Methods 0.000 description 9
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 8
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 8
- 108010029147 N-acylmannosamine kinase Proteins 0.000 description 8
- 102000005348 Neuraminidase Human genes 0.000 description 8
- 108010006232 Neuraminidase Proteins 0.000 description 8
- 108091028043 Nucleic acid sequence Proteins 0.000 description 8
- 238000007792 addition Methods 0.000 description 8
- 125000000539 amino acid group Chemical group 0.000 description 8
- 238000013459 approach Methods 0.000 description 8
- 230000015556 catabolic process Effects 0.000 description 8
- 238000000855 fermentation Methods 0.000 description 8
- 230000004151 fermentation Effects 0.000 description 8
- 230000000813 microbial effect Effects 0.000 description 8
- 235000013336 milk Nutrition 0.000 description 8
- 210000004080 milk Anatomy 0.000 description 8
- 101150074166 nanA gene Proteins 0.000 description 8
- 230000002018 overexpression Effects 0.000 description 8
- 102000040430 polynucleotide Human genes 0.000 description 8
- 108091033319 polynucleotide Proteins 0.000 description 8
- 239000002157 polynucleotide Substances 0.000 description 8
- 238000012360 testing method Methods 0.000 description 8
- LWIHDJKSTIGBAC-UHFFFAOYSA-K tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 8
- 101100427060 Bacillus spizizenii (strain ATCC 23059 / NRRL B-14472 / W23) thyA1 gene Proteins 0.000 description 7
- 101100132713 Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC) nanA1 gene Proteins 0.000 description 7
- 101100153154 Escherichia phage T5 thy gene Proteins 0.000 description 7
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 7
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 7
- 101100313751 Rickettsia conorii (strain ATCC VR-613 / Malish 7) thyX gene Proteins 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 230000008901 benefit Effects 0.000 description 7
- 238000005119 centrifugation Methods 0.000 description 7
- 230000001086 cytosolic effect Effects 0.000 description 7
- 239000000284 extract Substances 0.000 description 7
- 238000000338 in vitro Methods 0.000 description 7
- 239000000523 sample Substances 0.000 description 7
- 239000000126 substance Substances 0.000 description 7
- 108020004414 DNA Proteins 0.000 description 6
- 102000053602 DNA Human genes 0.000 description 6
- 108700023372 Glycosyltransferases Proteins 0.000 description 6
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 6
- 241000288906 Primates Species 0.000 description 6
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 6
- 239000002253 acid Substances 0.000 description 6
- 239000006227 byproduct Substances 0.000 description 6
- 210000000805 cytoplasm Anatomy 0.000 description 6
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 6
- 239000002158 endotoxin Substances 0.000 description 6
- 239000008103 glucose Substances 0.000 description 6
- 230000002779 inactivation Effects 0.000 description 6
- 101150001899 lacY gene Proteins 0.000 description 6
- 239000008267 milk Substances 0.000 description 6
- 239000002243 precursor Substances 0.000 description 6
- 230000001105 regulatory effect Effects 0.000 description 6
- 239000000243 solution Substances 0.000 description 6
- 230000032258 transport Effects 0.000 description 6
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 6
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 5
- KYQCXUMVJGMDNG-UHFFFAOYSA-N 4,5,6,7,8-pentahydroxy-2-oxooctanoic acid Chemical compound OCC(O)C(O)C(O)C(O)CC(=O)C(O)=O KYQCXUMVJGMDNG-UHFFFAOYSA-N 0.000 description 5
- GSXOAOHZAIYLCY-UHFFFAOYSA-N D-F6P Natural products OCC(=O)C(O)C(O)C(O)COP(O)(O)=O GSXOAOHZAIYLCY-UHFFFAOYSA-N 0.000 description 5
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 5
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 5
- OVRNDRQMDRJTHS-UHFFFAOYSA-N N-acelyl-D-glucosamine Natural products CC(=O)NC1C(O)OC(CO)C(O)C1O OVRNDRQMDRJTHS-UHFFFAOYSA-N 0.000 description 5
- MBLBDJOUHNCFQT-LXGUWJNJSA-N N-acetylglucosamine Natural products CC(=O)N[C@@H](C=O)[C@@H](O)[C@H](O)[C@H](O)CO MBLBDJOUHNCFQT-LXGUWJNJSA-N 0.000 description 5
- 101710119106 N-acetylneuraminate transporter Proteins 0.000 description 5
- 230000002950 deficient Effects 0.000 description 5
- 238000006731 degradation reaction Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 239000001963 growth medium Substances 0.000 description 5
- 230000000670 limiting effect Effects 0.000 description 5
- 229920006008 lipopolysaccharide Polymers 0.000 description 5
- 238000000425 proton nuclear magnetic resonance spectrum Methods 0.000 description 5
- 238000000746 purification Methods 0.000 description 5
- 238000001228 spectrum Methods 0.000 description 5
- 238000013519 translation Methods 0.000 description 5
- ZKHQWZAMYRWXGA-KQYNXXCUSA-N Adenosine triphosphate Chemical class C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-N 0.000 description 4
- 241001136161 Avibacterium Species 0.000 description 4
- 241000283690 Bos taurus Species 0.000 description 4
- YWWJKULNWGRYAS-XKKDATLGSA-N CMP-3-deoxy-alpha-D-manno-octulosonic acid Chemical class O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)O[C@]2(O[C@@H]([C@H](O)[C@H](O)C2)[C@H](O)CO)C(O)=O)O1 YWWJKULNWGRYAS-XKKDATLGSA-N 0.000 description 4
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 4
- 241000282472 Canis lupus familiaris Species 0.000 description 4
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 4
- 108091026890 Coding region Proteins 0.000 description 4
- 241000282326 Felis catus Species 0.000 description 4
- 241000590000 Helicobacter acinonychis Species 0.000 description 4
- 241000282412 Homo Species 0.000 description 4
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 4
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 4
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 4
- 229910019142 PO4 Inorganic materials 0.000 description 4
- 241000607606 Photobacterium sp. Species 0.000 description 4
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 4
- 230000002378 acidificating effect Effects 0.000 description 4
- 150000007513 acids Chemical class 0.000 description 4
- MSWZFWKMSRAUBD-UHFFFAOYSA-N beta-D-galactosamine Natural products NC1C(O)OC(CO)C(O)C1O MSWZFWKMSRAUBD-UHFFFAOYSA-N 0.000 description 4
- 239000001110 calcium chloride Substances 0.000 description 4
- 229910001628 calcium chloride Inorganic materials 0.000 description 4
- 229910052799 carbon Inorganic materials 0.000 description 4
- 230000003197 catalytic effect Effects 0.000 description 4
- 238000012512 characterization method Methods 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 230000007423 decrease Effects 0.000 description 4
- 235000015872 dietary supplement Nutrition 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 238000009472 formulation Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- XHMJOUIAFHJHBW-VFUOTHLCSA-N glucosamine 6-phosphate Chemical compound N[C@H]1[C@H](O)O[C@H](COP(O)(O)=O)[C@H](O)[C@@H]1O XHMJOUIAFHJHBW-VFUOTHLCSA-N 0.000 description 4
- 102000045442 glycosyltransferase activity proteins Human genes 0.000 description 4
- 108700014210 glycosyltransferase activity proteins Proteins 0.000 description 4
- 208000015181 infectious disease Diseases 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 229930182817 methionine Natural products 0.000 description 4
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 4
- 238000002703 mutagenesis Methods 0.000 description 4
- 231100000350 mutagenesis Toxicity 0.000 description 4
- -1 nanE Proteins 0.000 description 4
- 230000001717 pathogenic effect Effects 0.000 description 4
- 239000000546 pharmaceutical excipient Substances 0.000 description 4
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 4
- 239000010452 phosphate Substances 0.000 description 4
- 229920001467 poly(styrenesulfonates) Polymers 0.000 description 4
- 229910000160 potassium phosphate Inorganic materials 0.000 description 4
- 235000011009 potassium phosphates Nutrition 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- 239000004474 valine Substances 0.000 description 4
- 239000013598 vector Substances 0.000 description 4
- 241000701474 Alistipes Species 0.000 description 3
- GUBGYTABKSRVRQ-XLOQQCSPSA-N Alpha-Lactose Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)O[C@H](O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-XLOQQCSPSA-N 0.000 description 3
- 241000283073 Equus caballus Species 0.000 description 3
- 101100186924 Escherichia coli neuC gene Proteins 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 3
- 241000606790 Haemophilus Species 0.000 description 3
- 241000990166 Helicobacter cetorum Species 0.000 description 3
- 241000590002 Helicobacter pylori Species 0.000 description 3
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 3
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 3
- 229930182816 L-glutamine Natural products 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 3
- BRGMHAYQAZFZDJ-PVFLNQBWSA-N N-Acetylglucosamine 6-phosphate Chemical compound CC(=O)N[C@H]1[C@@H](O)O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O BRGMHAYQAZFZDJ-PVFLNQBWSA-N 0.000 description 3
- OVRNDRQMDRJTHS-FMDGEEDCSA-N N-acetyl-beta-D-glucosamine Chemical compound CC(=O)N[C@H]1[C@H](O)O[C@H](CO)[C@@H](O)[C@@H]1O OVRNDRQMDRJTHS-FMDGEEDCSA-N 0.000 description 3
- 102000048245 N-acetylneuraminate lyases Human genes 0.000 description 3
- 108700023220 N-acetylneuraminate lyases Proteins 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- 241000607568 Photobacterium Species 0.000 description 3
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 3
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 3
- 239000004473 Threonine Substances 0.000 description 3
- 108090000340 Transaminases Proteins 0.000 description 3
- 230000002411 adverse Effects 0.000 description 3
- 238000005273 aeration Methods 0.000 description 3
- BGWGXPAPYGQALX-ARQDHWQXSA-N beta-D-fructofuranose 6-phosphate Chemical compound OC[C@@]1(O)O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O BGWGXPAPYGQALX-ARQDHWQXSA-N 0.000 description 3
- 235000013361 beverage Nutrition 0.000 description 3
- 230000004071 biological effect Effects 0.000 description 3
- 150000001720 carbohydrates Chemical class 0.000 description 3
- 239000013592 cell lysate Substances 0.000 description 3
- 239000000356 contaminant Substances 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- 210000001035 gastrointestinal tract Anatomy 0.000 description 3
- 238000010438 heat treatment Methods 0.000 description 3
- 229940037467 helicobacter pylori Drugs 0.000 description 3
- 238000005570 heteronuclear single quantum coherence Methods 0.000 description 3
- 239000012535 impurity Substances 0.000 description 3
- 239000000543 intermediate Substances 0.000 description 3
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 3
- 229960000310 isoleucine Drugs 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 229950006780 n-acetylglucosamine Drugs 0.000 description 3
- 101150008111 nagA gene Proteins 0.000 description 3
- 235000016709 nutrition Nutrition 0.000 description 3
- 244000052769 pathogen Species 0.000 description 3
- 239000000843 powder Substances 0.000 description 3
- 230000009467 reduction Effects 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 150000008163 sugars Chemical class 0.000 description 3
- 208000024891 symptom Diseases 0.000 description 3
- 102000014898 transaminase activity proteins Human genes 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- 238000011282 treatment Methods 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- RDKDZVKTUNMUMT-BHVWUGLYSA-N 1-[(3R,4R,5S,6R)-2,3,4-trihydroxy-6-(hydroxymethyl)-5-[(2S,3R,4S,5R,6R)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxyoxan-2-yl]ethanone Chemical compound C(C)(=O)C1(O)[C@H](O)[C@@H](O)[C@H](O[C@H]2[C@H](O)[C@@H](O)[C@@H](O)[C@H](O2)CO)[C@H](O1)CO RDKDZVKTUNMUMT-BHVWUGLYSA-N 0.000 description 2
- MSWZFWKMSRAUBD-IVMDWMLBSA-N 2-amino-2-deoxy-D-glucopyranose Chemical compound N[C@H]1C(O)O[C@H](CO)[C@@H](O)[C@@H]1O MSWZFWKMSRAUBD-IVMDWMLBSA-N 0.000 description 2
- 238000005084 2D-nuclear magnetic resonance Methods 0.000 description 2
- 241000606750 Actinobacillus Species 0.000 description 2
- 241000606791 Actinobacillus ureae Species 0.000 description 2
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 2
- 101100325906 Bacillus subtilis (strain 168) ganA gene Proteins 0.000 description 2
- 241000770536 Bacillus thermophilus Species 0.000 description 2
- 241001553294 Bibersteinia Species 0.000 description 2
- 101100245749 Campylobacter jejuni subsp. jejuni serotype O:23/36 (strain 81-176) pseF gene Proteins 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 108010078791 Carrier Proteins Proteins 0.000 description 2
- 101100450592 Dictyostelium discoideum hexa1 gene Proteins 0.000 description 2
- 108090000204 Dipeptidase 1 Proteins 0.000 description 2
- 241000701959 Escherichia virus Lambda Species 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- 102000051366 Glycosyltransferases Human genes 0.000 description 2
- 241000589989 Helicobacter Species 0.000 description 2
- 101000588377 Homo sapiens N-acylneuraminate cytidylyltransferase Proteins 0.000 description 2
- 239000007836 KH2PO4 Substances 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- 101100186921 Legionella pneumophila subsp. pneumophila (strain Philadelphia 1 / ATCC 33152 / DSM 7513) neuB gene Proteins 0.000 description 2
- 101150043276 Lon gene Proteins 0.000 description 2
- 239000004472 Lysine Substances 0.000 description 2
- CSNNHWWHGAXBCP-UHFFFAOYSA-L Magnesium sulfate Chemical compound [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 2
- 102000003939 Membrane transport proteins Human genes 0.000 description 2
- 108090000301 Membrane transport proteins Proteins 0.000 description 2
- LRHPLDYGYMQRHN-UHFFFAOYSA-N N-Butanol Chemical compound CCCCO LRHPLDYGYMQRHN-UHFFFAOYSA-N 0.000 description 2
- OVRNDRQMDRJTHS-RTRLPJTCSA-N N-acetyl-D-glucosamine Chemical compound CC(=O)N[C@H]1C(O)O[C@H](CO)[C@@H](O)[C@@H]1O OVRNDRQMDRJTHS-RTRLPJTCSA-N 0.000 description 2
- 101710179749 N-acetylmannosamine kinase Proteins 0.000 description 2
- 108010010750 N-acetylmannosamine-6-phosphate epimerase Proteins 0.000 description 2
- 102100031349 N-acylneuraminate cytidylyltransferase Human genes 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- 108010013639 Peptidoglycan Proteins 0.000 description 2
- 101710178100 Probable UDP-N-acetylglucosamine 2-epimerase Proteins 0.000 description 2
- 239000004365 Protease Substances 0.000 description 2
- 101710086464 Putative UDP-N-acetylglucosamine 2-epimerase Proteins 0.000 description 2
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- 241000863430 Shewanella Species 0.000 description 2
- 241000625311 Shewanella piezotolerans Species 0.000 description 2
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 101100111413 Thermoanaerobacter pseudethanolicus (strain ATCC 33223 / 39E) lacZ gene Proteins 0.000 description 2
- 108010022394 Threonine synthase Proteins 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- 102000005497 Thymidylate Synthase Human genes 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 229960001456 adenosine triphosphate Drugs 0.000 description 2
- 238000013019 agitation Methods 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 235000009582 asparagine Nutrition 0.000 description 2
- 229960001230 asparagine Drugs 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- WQZGKKKJIJFFOK-FPRJBGLDSA-N beta-D-galactose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@H]1O WQZGKKKJIJFFOK-FPRJBGLDSA-N 0.000 description 2
- 102000006635 beta-lactamase Human genes 0.000 description 2
- 230000001851 biosynthetic effect Effects 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 229920001429 chelating resin Polymers 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 230000002759 chromosomal effect Effects 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000012258 culturing Methods 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- 210000000172 cytosol Anatomy 0.000 description 2
- 235000013365 dairy product Nutrition 0.000 description 2
- 238000011033 desalting Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 208000035475 disorder Diseases 0.000 description 2
- 230000008030 elimination Effects 0.000 description 2
- 238000003379 elimination reaction Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 235000013305 food Nutrition 0.000 description 2
- 235000013350 formula milk Nutrition 0.000 description 2
- 108020001507 fusion proteins Proteins 0.000 description 2
- 229930182830 galactose Natural products 0.000 description 2
- 238000012239 gene modification Methods 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 230000005017 genetic modification Effects 0.000 description 2
- 235000013617 genetically modified food Nutrition 0.000 description 2
- 229960002442 glucosamine Drugs 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 244000005709 gut microbiome Species 0.000 description 2
- 230000008676 import Effects 0.000 description 2
- 230000000415 inactivating effect Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 235000015141 kefir Nutrition 0.000 description 2
- GSXOAOHZAIYLCY-HSUXUTPPSA-N keto-D-fructose 6-phosphate Chemical compound OCC(=O)[C@@H](O)[C@H](O)[C@H](O)COP(O)(O)=O GSXOAOHZAIYLCY-HSUXUTPPSA-N 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 244000144972 livestock Species 0.000 description 2
- WRUGWIBCXHJTDG-UHFFFAOYSA-L magnesium sulfate heptahydrate Chemical compound O.O.O.O.O.O.O.[Mg+2].[O-]S([O-])(=O)=O WRUGWIBCXHJTDG-UHFFFAOYSA-L 0.000 description 2
- 238000012423 maintenance Methods 0.000 description 2
- 230000004060 metabolic process Effects 0.000 description 2
- 229910052751 metal Inorganic materials 0.000 description 2
- 239000002184 metal Substances 0.000 description 2
- 150000002739 metals Chemical class 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 102000035118 modified proteins Human genes 0.000 description 2
- 108091005573 modified proteins Proteins 0.000 description 2
- 229910000402 monopotassium phosphate Inorganic materials 0.000 description 2
- 101150070589 nagB gene Proteins 0.000 description 2
- 101150019075 neuA gene Proteins 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000001301 oxygen Substances 0.000 description 2
- 239000002245 particle Substances 0.000 description 2
- 239000008194 pharmaceutical composition Substances 0.000 description 2
- GNSKLFRGEWLPPA-UHFFFAOYSA-M potassium dihydrogen phosphate Chemical compound [K+].OP(O)([O-])=O GNSKLFRGEWLPPA-UHFFFAOYSA-M 0.000 description 2
- 235000008476 powdered milk Nutrition 0.000 description 2
- 230000002265 prevention Effects 0.000 description 2
- 101150002764 purA gene Proteins 0.000 description 2
- 238000010845 search algorithm Methods 0.000 description 2
- 230000003248 secreting effect Effects 0.000 description 2
- 238000011218 seed culture Methods 0.000 description 2
- 238000002864 sequence alignment Methods 0.000 description 2
- 238000007086 side reaction Methods 0.000 description 2
- 239000000377 silicon dioxide Substances 0.000 description 2
- 239000002904 solvent Substances 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 238000010361 transduction Methods 0.000 description 2
- 230000026683 transduction Effects 0.000 description 2
- 238000000825 ultraviolet detection Methods 0.000 description 2
- 108700026220 vif Genes Proteins 0.000 description 2
- 235000013618 yogurt Nutrition 0.000 description 2
- TYALNJQZQRNQNQ-PVURBZDVSA-N (2R,4S,5R,6R)-5-Acetamido-4-hydroxy-6-[(1R,2R)-1,2,3-trihydroxypropyl]-2-[[(2R,3R,4S,5R,6S)-3,4,5-trihydroxy-6-[(2R,3S,4R,5R)-4,5,6-trihydroxy-2-(hydroxymethyl)oxan-3-yl]oxyoxan-2-yl]methoxy]oxane-2-carboxylic acid Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@@]1(C(O)=O)OC[C@@H]1[C@H](O)[C@H](O)[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O)C(O)O[C@@H]2CO)O)O1 TYALNJQZQRNQNQ-PVURBZDVSA-N 0.000 description 1
- KJCVRFUGPWSIIH-UHFFFAOYSA-N 1-naphthol Chemical compound C1=CC=C2C(O)=CC=CC2=C1 KJCVRFUGPWSIIH-UHFFFAOYSA-N 0.000 description 1
- NNLZBVFSCVTSLA-HXUQBWEZSA-N 3-deoxy-alpha-D-manno-oct-2-ulopyranosonic acid Chemical compound OC[C@@H](O)[C@H]1O[C@@](O)(C(O)=O)C[C@@H](O)[C@H]1O NNLZBVFSCVTSLA-HXUQBWEZSA-N 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 102100031317 Alpha-N-acetylgalactosaminidase Human genes 0.000 description 1
- NLXLAEXVIDQMFP-UHFFFAOYSA-N Ammonia chloride Chemical compound [NH4+].[Cl-] NLXLAEXVIDQMFP-UHFFFAOYSA-N 0.000 description 1
- VHUUQVKOLVNVRT-UHFFFAOYSA-N Ammonium hydroxide Chemical compound [NH4+].[OH-] VHUUQVKOLVNVRT-UHFFFAOYSA-N 0.000 description 1
- 244000303258 Annona diversifolia Species 0.000 description 1
- 235000002198 Annona diversifolia Nutrition 0.000 description 1
- 241001148624 Areae Species 0.000 description 1
- 241000238421 Arthropoda Species 0.000 description 1
- 208000023275 Autoimmune disease Diseases 0.000 description 1
- 241000606767 Avibacterium paragallinarum Species 0.000 description 1
- 108010077805 Bacterial Proteins Proteins 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 102100029945 Beta-galactoside alpha-2,6-sialyltransferase 1 Human genes 0.000 description 1
- 101710136191 Beta-galactoside alpha-2,6-sialyltransferase 1 Proteins 0.000 description 1
- 102100029963 Beta-galactoside alpha-2,6-sialyltransferase 2 Human genes 0.000 description 1
- 101710136188 Beta-galactoside alpha-2,6-sialyltransferase 2 Proteins 0.000 description 1
- BVKZGUZCCUSVTD-UHFFFAOYSA-M Bicarbonate Chemical compound OC([O-])=O BVKZGUZCCUSVTD-UHFFFAOYSA-M 0.000 description 1
- TXCIAUNLDRJGJZ-UHFFFAOYSA-N CMP-N-acetyl neuraminic acid Natural products O1C(C(O)C(O)CO)C(NC(=O)C)C(O)CC1(C(O)=O)OP(O)(=O)OCC1C(O)C(O)C(N2C(N=C(N)C=C2)=O)O1 TXCIAUNLDRJGJZ-UHFFFAOYSA-N 0.000 description 1
- TXCIAUNLDRJGJZ-BILDWYJOSA-N CMP-N-acetyl-beta-neuraminic acid Chemical compound O1[C@@H]([C@H](O)[C@H](O)CO)[C@H](NC(=O)C)[C@@H](O)C[C@]1(C(O)=O)OP(O)(=O)OC[C@@H]1[C@@H](O)[C@@H](O)[C@H](N2C(N=C(N)C=C2)=O)O1 TXCIAUNLDRJGJZ-BILDWYJOSA-N 0.000 description 1
- 102100029962 CMP-N-acetylneuraminate-beta-1,4-galactoside alpha-2,3-sialyltransferase Human genes 0.000 description 1
- 101710136075 CMP-N-acetylneuraminate-beta-1,4-galactoside alpha-2,3-sialyltransferase Proteins 0.000 description 1
- 241000282836 Camelus dromedarius Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 208000027244 Dysbiosis Diseases 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 241001646716 Escherichia coli K-12 Species 0.000 description 1
- 241001618099 Escherichia coli S88 Species 0.000 description 1
- 241000320863 Flavobacterium limnosediminis Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 102100041034 Glucosamine-6-phosphate isomerase 1 Human genes 0.000 description 1
- 241000282575 Gorilla Species 0.000 description 1
- 101000918657 Homo sapiens L-xylulose reductase Proteins 0.000 description 1
- 241000282620 Hylobates sp. Species 0.000 description 1
- 206010061598 Immunodeficiency Diseases 0.000 description 1
- 241001138401 Kluyveromyces lactis Species 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- 102100029137 L-xylulose reductase Human genes 0.000 description 1
- 101001010029 Lactobacillus helveticus Putative phosphotransferase enzyme IIA component Proteins 0.000 description 1
- 101710090149 Lactose operon repressor Proteins 0.000 description 1
- 102100030928 Lactosylceramide alpha-2,3-sialyltransferase Human genes 0.000 description 1
- 101710165105 Lactosylceramide alpha-2,3-sialyltransferase Proteins 0.000 description 1
- MSFSPUZXLOGKHJ-UHFFFAOYSA-N Muraminsaeure Natural products OC(=O)C(C)OC1C(N)C(O)OC(CO)C1O MSFSPUZXLOGKHJ-UHFFFAOYSA-N 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- BRGMHAYQAZFZDJ-ZTVVOAFPSA-N N-acetyl-D-mannosamine 6-phosphate Chemical compound CC(=O)N[C@@H]1C(O)O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O BRGMHAYQAZFZDJ-ZTVVOAFPSA-N 0.000 description 1
- FZLJPEPAYPUMMR-FMDGEEDCSA-N N-acetyl-alpha-D-glucosamine 1-phosphate Chemical compound CC(=O)N[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1OP(O)(O)=O FZLJPEPAYPUMMR-FMDGEEDCSA-N 0.000 description 1
- 108010069483 N-acetylglucosamine-6-phosphate deacetylase Proteins 0.000 description 1
- 108010081778 N-acylneuraminate cytidylyltransferase Proteins 0.000 description 1
- 229910004616 Na2MoO4·2H2O Inorganic materials 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 241000369774 Norwalk-like virus Species 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 241000282577 Pan troglodytes Species 0.000 description 1
- 241000606594 Pasteurella dagmatis Species 0.000 description 1
- 241000606856 Pasteurella multocida Species 0.000 description 1
- 241000282405 Pongo abelii Species 0.000 description 1
- 241000677647 Proba Species 0.000 description 1
- 108090001066 Racemases and epimerases Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-N Sulfuric acid Chemical compound OS(O)(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-N 0.000 description 1
- 241000282898 Sus scrofa Species 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- 101710195626 Transcriptional activator protein Proteins 0.000 description 1
- 102100027107 Type 2 lactosamine alpha-2,3-sialyltransferase Human genes 0.000 description 1
- 101710204134 Type 2 lactosamine alpha-2,3-sialyltransferase Proteins 0.000 description 1
- 108010061048 UDPacetylglucosamine pyrophosphorylase Proteins 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241000681719 Vibrio brasiliensis Species 0.000 description 1
- 241000606834 [Haemophilus] ducreyi Species 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 208000026935 allergic disease Diseases 0.000 description 1
- YMJBYRVFGYXULK-QZABAPFNSA-N alpha-D-glucosamine 1-phosphate Chemical compound N[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1OP(O)(O)=O YMJBYRVFGYXULK-QZABAPFNSA-N 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000002518 antifoaming agent Substances 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000002869 basic local alignment search tool Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 230000001588 bifunctional effect Effects 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- KGBXLFKZBHKPEV-UHFFFAOYSA-N boric acid Chemical compound OB(O)O KGBXLFKZBHKPEV-UHFFFAOYSA-N 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 239000002775 capsule Substances 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 150000001768 cations Chemical class 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 210000000170 cell membrane Anatomy 0.000 description 1
- 230000003833 cell viability Effects 0.000 description 1
- 210000002421 cell wall Anatomy 0.000 description 1
- 235000013351 cheese Nutrition 0.000 description 1
- 239000012707 chemical precursor Substances 0.000 description 1
- 238000002512 chemotherapy Methods 0.000 description 1
- 235000013330 chicken meat Nutrition 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 210000004081 cilia Anatomy 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 230000001447 compensatory effect Effects 0.000 description 1
- 230000001010 compromised effect Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- MPTQRFCYZCXJFQ-UHFFFAOYSA-L copper(II) chloride dihydrate Chemical compound O.O.[Cl-].[Cl-].[Cu+2] MPTQRFCYZCXJFQ-UHFFFAOYSA-L 0.000 description 1
- 239000006071 cream Substances 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 230000003413 degradative effect Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- MNNHAPBLZZVQHP-UHFFFAOYSA-N diammonium hydrogen phosphate Chemical compound [NH4+].[NH4+].OP([O-])([O-])=O MNNHAPBLZZVQHP-UHFFFAOYSA-N 0.000 description 1
- 229910000388 diammonium phosphate Inorganic materials 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 230000023011 digestive tract development Effects 0.000 description 1
- 230000006806 disease prevention Effects 0.000 description 1
- BNIILDVGGAEEIG-UHFFFAOYSA-L disodium hydrogen phosphate Chemical compound [Na+].[Na+].OP([O-])([O-])=O BNIILDVGGAEEIG-UHFFFAOYSA-L 0.000 description 1
- 229910000397 disodium phosphate Inorganic materials 0.000 description 1
- 230000007140 dysbiosis Effects 0.000 description 1
- 230000037149 energy metabolism Effects 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 239000003797 essential amino acid Substances 0.000 description 1
- 235000020776 essential amino acid Nutrition 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 239000012527 feed solution Substances 0.000 description 1
- 235000012631 food intake Nutrition 0.000 description 1
- 239000012458 free base Substances 0.000 description 1
- 235000011389 fruit/vegetable juice Nutrition 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 150000008195 galactosides Chemical class 0.000 description 1
- 108091006104 gene-regulatory proteins Proteins 0.000 description 1
- 102000034356 gene-regulatory proteins Human genes 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- 101150073660 glmM gene Proteins 0.000 description 1
- 101150111330 glmU gene Proteins 0.000 description 1
- 108010084034 glucosamine-1-phosphate acetyltransferase Proteins 0.000 description 1
- 108010022717 glucosamine-6-phosphate isomerase Proteins 0.000 description 1
- 150000004676 glycans Chemical group 0.000 description 1
- 125000003147 glycosyl group Chemical group 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 238000003919 heteronuclear multiple bond coherence Methods 0.000 description 1
- 238000001052 heteronuclear multiple bond coherence spectrum Methods 0.000 description 1
- 210000005260 human cell Anatomy 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000036737 immune function Effects 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 238000002650 immunosuppressive therapy Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 210000003000 inclusion body Anatomy 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 239000012678 infectious agent Substances 0.000 description 1
- 230000002458 infectious effect Effects 0.000 description 1
- 230000036512 infertility Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000011081 inoculation Methods 0.000 description 1
- 208000028774 intestinal disease Diseases 0.000 description 1
- SURQXAFEQWPFPV-UHFFFAOYSA-L iron(2+) sulfate heptahydrate Chemical compound O.O.O.O.O.O.O.[Fe+2].[O-]S([O-])(=O)=O SURQXAFEQWPFPV-UHFFFAOYSA-L 0.000 description 1
- 230000007794 irritation Effects 0.000 description 1
- IEQCXFNWPAHHQR-UHFFFAOYSA-N lacto-N-neotetraose Natural products OCC1OC(OC2C(C(OC3C(OC(O)C(O)C3O)CO)OC(CO)C2O)O)C(NC(=O)C)C(O)C1OC1OC(CO)C(O)C(O)C1O IEQCXFNWPAHHQR-UHFFFAOYSA-N 0.000 description 1
- 229940062780 lacto-n-neotetraose Drugs 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- GZQKNULLWNGMCW-PWQABINMSA-N lipid A (E. coli) Chemical compound O1[C@H](CO)[C@@H](OP(O)(O)=O)[C@H](OC(=O)C[C@@H](CCCCCCCCCCC)OC(=O)CCCCCCCCCCCCC)[C@@H](NC(=O)C[C@@H](CCCCCCCCCCC)OC(=O)CCCCCCCCCCC)[C@@H]1OC[C@@H]1[C@@H](O)[C@H](OC(=O)C[C@H](O)CCCCCCCCCCC)[C@@H](NC(=O)C[C@H](O)CCCCCCCCCCC)[C@@H](OP(O)(O)=O)O1 GZQKNULLWNGMCW-PWQABINMSA-N 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 230000002934 lysing effect Effects 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 238000009629 microbiological culture Methods 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 239000007003 mineral medium Substances 0.000 description 1
- 231100001228 moderately toxic Toxicity 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 101150063315 nanE gene Proteins 0.000 description 1
- 101150076570 nanK gene Proteins 0.000 description 1
- 101150054323 nanR gene Proteins 0.000 description 1
- 101150048598 nanT gene Proteins 0.000 description 1
- RBMYDHMFFAVMMM-PLQWBNBWSA-N neolactotetraose Chemical compound O([C@H]1[C@H](O)[C@H]([C@@H](O[C@@H]1CO)O[C@@H]1[C@H]([C@H](O[C@H]([C@H](O)CO)[C@H](O)[C@@H](O)C=O)O[C@H](CO)[C@@H]1O)O)NC(=O)C)[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O RBMYDHMFFAVMMM-PLQWBNBWSA-N 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- MGFYIUFZLHCRTH-UHFFFAOYSA-N nitrilotriacetic acid Chemical compound OC(=O)CN(CC(O)=O)CC(O)=O MGFYIUFZLHCRTH-UHFFFAOYSA-N 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 210000004940 nucleus Anatomy 0.000 description 1
- 230000035764 nutrition Effects 0.000 description 1
- 208000015380 nutritional deficiency disease Diseases 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 229940051027 pasteurella multocida Drugs 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 108010032867 phosphoglucosamine mutase Proteins 0.000 description 1
- 230000027086 plasmid maintenance Effects 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 244000144977 poultry Species 0.000 description 1
- 235000013594 poultry meat Nutrition 0.000 description 1
- 235000013406 prebiotics Nutrition 0.000 description 1
- 239000006041 probiotic Substances 0.000 description 1
- 235000018291 probiotics Nutrition 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 101150117960 rcsA gene Proteins 0.000 description 1
- 101150031509 rcsB gene Proteins 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000001172 regenerating effect Effects 0.000 description 1
- 238000005067 remediation Methods 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000000754 repressing effect Effects 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 208000023504 respiratory system disease Diseases 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- 229920002477 rna polymer Polymers 0.000 description 1
- 238000001896 rotating frame Overhauser effect spectroscopy Methods 0.000 description 1
- 102220240214 rs1310676971 Human genes 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000019491 signal transduction Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 229910000030 sodium bicarbonate Inorganic materials 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- FDEIWTXVNPKYDL-UHFFFAOYSA-N sodium molybdate dihydrate Chemical compound O.O.[Na+].[Na+].[O-][Mo]([O-])(=O)=O FDEIWTXVNPKYDL-UHFFFAOYSA-N 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 238000004611 spectroscopical analysis Methods 0.000 description 1
- 238000005507 spraying Methods 0.000 description 1
- 239000011550 stock solution Substances 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 239000000375 suspending agent Substances 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 108060008226 thioredoxin Proteins 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 231100000167 toxic agent Toxicity 0.000 description 1
- 239000003440 toxic substance Substances 0.000 description 1
- 230000005026 transcription initiation Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 238000002054 transplantation Methods 0.000 description 1
- 239000003981 vehicle Substances 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
- NWONKYPBYAMBJT-UHFFFAOYSA-L zinc sulfate Chemical compound [Zn+2].[O-]S([O-])(=O)=O NWONKYPBYAMBJT-UHFFFAOYSA-L 0.000 description 1
- 229910000368 zinc sulfate Inorganic materials 0.000 description 1
- 239000011686 zinc sulphate Substances 0.000 description 1
- 235000009529 zinc sulphate Nutrition 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/02—Monosaccharides
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/24—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
- C07K14/245—Escherichia (G)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1048—Glycosyltransferases (2.4)
- C12N9/1051—Hexosyltransferases (2.4.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1048—Glycosyltransferases (2.4)
- C12N9/1081—Glycosyltransferases (2.4) transferring other glycosyl groups (2.4.99)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1085—Transferases (2.) transferring alkyl or aryl groups other than methyl groups (2.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
- C12N9/2402—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
- C12N9/2468—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1) acting on beta-galactose-glycoside bonds, e.g. carrageenases (3.2.1.83; 3.2.1.157); beta-agarase (3.2.1.81)
- C12N9/2471—Beta-galactosidase (3.2.1.23), i.e. exo-(1-->4)-beta-D-galactanase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y204/00—Glycosyltransferases (2.4)
- C12Y204/01—Hexosyltransferases (2.4.1)
- C12Y204/01065—3-Galactosyl-N-acetylglucosaminide 4-alpha-L-fucosyltransferase (2.4.1.65), i.e. alpha-1-3 fucosyltransferase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y204/00—Glycosyltransferases (2.4)
- C12Y204/01—Hexosyltransferases (2.4.1)
- C12Y204/01149—N-Acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase (2.4.1.149)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y204/00—Glycosyltransferases (2.4)
- C12Y204/99—Glycosyltransferases (2.4) transferring other glycosyl groups (2.4.99)
- C12Y204/99007—Alpha-N-acetylneuraminyl-2,3-beta-galactosyl-1,3-N-acetylgalactosaminide 6-alpha-sialyltransferase (2.4.99.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y205/00—Transferases transferring alkyl or aryl groups, other than methyl groups (2.5)
- C12Y205/01—Transferases transferring alkyl or aryl groups, other than methyl groups (2.5) transferring alkyl or aryl groups, other than methyl groups (2.5.1)
- C12Y205/01056—N-acetylneuraminate synthase (2.5.1.56)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y207/00—Transferases transferring phosphorus-containing groups (2.7)
- C12Y207/07—Nucleotidyltransferases (2.7.7)
- C12Y207/07043—N-Acylneuraminate cytidylyltransferase (2.7.7.43)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y302/00—Hydrolases acting on glycosyl compounds, i.e. glycosylases (3.2)
- C12Y302/01—Glycosidases, i.e. enzymes hydrolysing O- and S-glycosyl compounds (3.2.1)
- C12Y302/01023—Beta-galactosidase (3.2.1.23), i.e. exo-(1-->4)-beta-D-galactanase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y302/00—Hydrolases acting on glycosyl compounds, i.e. glycosylases (3.2)
- C12Y302/01—Glycosidases, i.e. enzymes hydrolysing O- and S-glycosyl compounds (3.2.1)
- C12Y302/01183—UDP-N-acetylglucosamine 2-epimerase (hydrolysing) (3.2.1.183)
Definitions
- Lactose is the major nutritional carbohydrate of all mammalian milks; however, human milk also contains a diverse and abundant set of more complex neutral and acidic sugars, collectively known as the human milk oligosaccharides (hMOS) (Kunz, C., et al. (2000). Annu Rev Nutr 20, 699-722; Bode, L., and Jantscher-Krenn, E. (2012). Adv Nutr 3, 383S-391S). Hundreds of different hMOS species have been identified, and their rich structural diversity and overall abundance are unique to humans.
- New methods are needed for producing purified human milk oligosaccharides.
- a method for producing a sialylated oligosaccharide in a bacterium is provided.
- the bacterium includes an exogenous lactose-utilizing sialyltransferase enzyme, e.g., an α(2,3) sialyltransferase or an α(2,6) sialyltransferase.
- the enzyme has an amino acid sequence that is from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids.
- the enzyme has an amino acid sequence that is from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
- an isolated bacterium comprising an exogenous lactose-utilizing sialyltransferase enzyme.
- the enzyme has an amino acid sequence that is from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids.
- the enzyme has an amino acid sequence that is from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
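The identity ranges stated above (for example, 5% to 30% identical to Pst6-224 over a stretch of at least 250 amino acids) can be illustrated with a short calculation. The following is a minimal sketch and not the method used in this disclosure. It assumes the two sequences have already been aligned by an external tool and are supplied as equal-length gapped strings, and it defines percent identity as identical residues divided by aligned columns, which is only one of several conventions in use. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings come from the same alignment and use '-' for gaps.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_over_stretch(aligned_a: str, aligned_b: str,
                          start: int, end: int, min_stretch: int = 250) -> float:
    """Percent identity over alignment columns [start, end).

    Raises ValueError if the window holds fewer than min_stretch residues
    of the first sequence (hypothetical reading of "a stretch of at least
    250 amino acids").
    """
    window_a, window_b = aligned_a[start:end], aligned_b[start:end]
    residues = sum(1 for a in window_a if a != "-")
    if residues < min_stretch:
        raise ValueError("stretch is shorter than min_stretch residues")
    return percent_identity(window_a, window_b)


# Toy example: 60 identical positions out of 300 aligned columns.
toy_a = "A" * 300
toy_b = "A" * 60 + "G" * 240
identity = identity_over_stretch(toy_a, toy_b, 0, 300)
in_claimed_range = 5.0 <= identity <= 30.0
```

In practice the alignment itself would come from a dedicated alignment program, and the reported identity depends on that program's scoring parameters and on how gaps are counted.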
- the enzyme has an amino acid sequence that is from 5% to 100% identical to the amino acid sequence of one or more of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC* (SEQ ID NO: 15), Δ20BstC (SEQ ID NO: 18), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- the amino acid sequence of the enzyme is less than 100% identical to the amino acid sequence of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- the enzyme has no deletions or insertions compared to BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- the difference between the amino acid sequence of the enzyme and the amino acid sequence of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10) consists of one or more conservative amino acid substitutions.
- the difference between the amino acid sequence of the enzyme and the amino acid sequence of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10) consists of one or more conservative amino acid substitutions.
- the enzyme has an amino acid sequence that is from 5% to 100%, 10% to 90%, 20% to 80%, 30% to 70%, 40% to 60%, 5% to 75%, 5% to 50%, 5% to 25%, 10% to 75%, 10% to 50%, 15% to 25%, 15% to 75%, 15% to 50%, 15% to 25%, 25% to 50%, 50% to 75%, or 75% to 100% identical to a naturally occurring enzyme.
- the enzyme has an amino acid sequence that is at least about 5%, 10%, 15%, or 20% but less than about 30%, 35%, 40%, or 45% identical to a naturally occurring enzyme.
- the enzyme has an amino acid sequence that is at least about 45%, 50%, or 55% but less than about 65%, 70%, or 75% identical to a naturally occurring enzyme.
- the naturally occurring enzyme is a bacterial GT80 family sialyltransferase.
- the GT80 family is described in Audry, M., et al. (2011). Glycobiology 21, 716-726, the entire content of which is incorporated herein by reference.
- the bacterial GT80 family sialyltransferase has the GT-B structural fold.
- the GT-B structural fold is described in Audry, M., et al. (2011). Glycobiology 21, 716-726, the entire content of which is incorporated herein by reference.
- the naturally occurring enzyme is produced by a microbial organism, e.g., in nature.
- the microbial organism is a bacterium that is naturally present in the gastrointestinal tract of a mammal.
- the microbial organism is a bacterium within the genus Photobacterium, Avibacterium, Shewanella, Bibersteinia, Haemophilus, Alistipes, Actinobacillus, or Helicobacter.
- the enzyme has a mutation (e.g., 1, 2, 3, 4, 5, or more mutations, such as substitution mutations) compared to a naturally occurring α(2,3) sialyltransferase.
- when the amino acid sequences of the enzyme and BstE* are aligned, the enzyme has a mutation at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16). Sequence alignments are run using a variety of publicly available software programs, including but not limited to CLC Main Workbench, version 8.0.
- the enzyme has a non-conservative mutation at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16). In various embodiments, the enzyme has a histidine or an alanine at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- when the amino acid sequences of the enzyme and BstE* are aligned, the enzyme comprises a mutation at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- the enzyme has a non-conservative mutation at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16). In certain embodiments, the enzyme has a histidine or an alanine at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- the enzyme has a non-conservative mutation at the position that aligns with position 122 of the amino acid sequence of Δ20BstC (SEQ ID NO: 18). In certain embodiments, the enzyme has an alanine, valine, leucine, methionine, or phenylalanine at the position that aligns with position 122 of the amino acid sequence of Δ20BstC (SEQ ID NO: 18).
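The phrase "the position that aligns with position N" of a reference sequence can be made concrete with a small helper. This is a minimal sketch under stated assumptions: the alignment is produced elsewhere and passed in as two equal-length gapped strings, positions are 1-based, and the function name and toy sequences are hypothetical (they are not BstE* or Δ20BstC).

```python
def aligned_residue(aligned_ref: str, aligned_query: str, ref_position: int) -> str:
    """Return the query residue in the alignment column that holds residue
    number ref_position (1-based, gaps not counted) of the reference.

    Returns '-' if the query has a gap in that column.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if ref_position < 1:
        raise ValueError("ref_position is 1-based")
    seen = 0  # number of reference residues passed so far
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            seen += 1
            if seen == ref_position:
                return query_char
    raise IndexError("reference has fewer residues than ref_position")


# Toy alignment: the reference has a gap in the third column.
toy_ref = "MK-TAY"
toy_query = "MKLSAH"
residue_at_3 = aligned_residue(toy_ref, toy_query, 3)  # column of reference 'T'
residue_at_5 = aligned_residue(toy_ref, toy_query, 5)  # column of reference 'Y'
```

A check such as "the enzyme has a histidine or an alanine at the aligned position" then reduces to testing whether the returned residue is `"H"` or `"A"`.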
- the mutation renders the enzyme more α(2,6)-selective than the naturally occurring α(2,3) sialyltransferase.
- the enzyme is an α(2,6) sialyltransferase.
- the enzyme comprises an amino acid sequence of Δ20BstC* (SEQ ID NO: 15), Δ20BstC*2 (SEQ ID NO: 27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), or Δ20BstC*2 (SEQ ID NO: 30).
- the Cα root-mean-square deviation (RMSD) between the backbone of the enzyme and a naturally occurring sialyltransferase is less than 3 Å.
- the naturally occurring sialyltransferase is Pst6-224 (SEQ ID NO: 1).
- the structure of Pst6-224 (SEQ ID NO: 1) has been solved, see, e.g., Crystal Structure of Vibrionaceae Photobacterium sp. JT-ISH-224 2,6-sialyltransferase in a Ternary Complex with Donor Product CMP and Accepter Substrate Lactose, Kakuta et al. (2008) Glycobiology 18 66-73, the entire content of which is incorporated herein by reference.
- the naturally occurring sialyltransferase is BstC, BstD, BstE, BstH, BstI, BstJ, BstM, or BstN, or a homologue thereof.
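The Cα RMSD criterion mentioned above can be sketched numerically. The example below is illustrative only and makes two loud assumptions: the two structures have already been superposed by an external structural-alignment tool, and the Cα atoms are supplied as equal-length lists of paired (x, y, z) coordinates in ångströms. No superposition step (for example, a Kabsch fit) is performed here, and the function name and coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired C-alpha coordinates.

    Assumption: coords_a[i] and coords_b[i] are equivalent residues and the
    structures are already superposed; units are angstroms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates: every atom is displaced by exactly 1 angstrom along z.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
rmsd = ca_rmsd(structure_a, structure_b)
below_threshold = rmsd < 3.0
```

The value obtained for real structures depends on which residues are paired and on how the superposition was done, so those choices should be reported alongside any RMSD figure.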
- the bacterium is in a culture medium. In certain embodiments, the bacterium is on a culture plate or in a flask. In various embodiments, the bacterium is cultured in a biofermentor.
- the methods of producing sialylated oligosaccharides disclosed herein may further include retrieving the sialylated oligosaccharide (e.g., sialyllactose) from the bacterium (e.g., from the cytoplasm of the bacterium by lysing the bacterium) or from a culture supernatant of the bacterium.
- the sialylated oligosaccharide includes any one of, or any combination of 2, 3, 4, 5, 6, 7, or 8 of 3′-sialyllactose (3′-SL), 6′-sialyllactose (6′-SL), 3′-sialyl-3-fucosyllactose (3′-S3FL), sialyllacto-N-tetraose a (SLNT a), sialyllacto-N-tetraose b (SLNT b), disialyllacto-N-tetraose (DSLNT), sialyllacto-N-fucopentaose II (SLNFP II), and sialyllacto-N-tetraose c (SLNT c).
- the bacterium comprises an exogenous or endogenous lactose-utilizing α(1,3) fucosyltransferase enzyme, an exogenous or endogenous lactose-utilizing α(1,4) fucosyltransferase enzyme, an exogenous or endogenous β(1,3) galactosyltransferase enzyme, an exogenous or endogenous β(1,4) galactosyltransferase enzyme, an exogenous or endogenous β-1,3-N-acetylglucosaminyltransferase, or any combination thereof.
- the bacterium comprises an elevated level of cytoplasmic lactose, uridine diphosphate N-acetylglucosamine (UDP-GlcNAc), and/or cytidine-5′-monophosphosialic acid (CMP-Neu5Ac) compared to a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose).
- the level of lactose, UDP-GlcNAc, and/or CMP-Neu5Ac is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 75%, 100%, 200%, 300%, 400%, or 500% greater in the cytoplasm of the bacterium than in a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose).
- Various implementations comprise providing a bacterium that comprises an exogenous lactose-utilizing sialyltransferase gene, a deficient sialic acid catabolic pathway, a sialic acid synthetic capability, and a functional lactose permease gene; and culturing the bacterium in the presence of lactose.
- the sialylated oligosaccharide is then retrieved from the bacterium or from a culture supernatant of the bacterium.
- a sialic acid synthetic capability comprises expressing exogenous CMP-Neu5Ac synthetase, an exogenous sialic acid synthase, and an exogenous UDP-GlcNAc-2-epimerase, or a functional variant or fragment thereof.
- the bacterium may further comprise the capability for increased UDP-GlcNAc production.
- By "increased production capability" is meant that the host bacterium produces greater than 10%, 20%, 50%, 100%, 2-fold, 5-fold, 10-fold, or more of a product than the native, endogenous bacterium.
- the bacterium over-expresses a positive endogenous regulator of UDP-GlcNAc synthesis.
- the bacterium overexpresses the nagC gene of E. coli .
- the bacterium over-expresses the E. coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene, or carries mutations in the glmS gene that result in a GlmS enzyme not subject to feedback inhibition by its glucosamine-6-phosphate product (see, e.g., Deng, M. D., Grund, A. D., Wassink, S. L., Peng, S. S., Nielsen, K. L., Huckins, B. D., and Burlingame, R. P. (2006). Directed evolution and characterization of Escherichia coli glucosamine synthase. Biochimie 88, 419-429, the entire content of which is incorporated herein by reference).
- the bacterium over-expresses the E. coli glmY gene (a positive translational regulator of glmS). In some embodiments, the bacterium over-expresses the E. coli glmZ gene (another positive translational regulator of glmS; glmY and glmZ are described in Reichenbach et al., Nucleic Acids Res 36, 2570-80 (2008)). In certain embodiments, the bacterium over-expresses any combination of these genes. In various embodiments, the bacterium over-expresses nagC and glmS. In some embodiments, the bacterium over-expresses nagC and glmY.
- the bacterium over-expresses nagC and glmZ.
- the gene transcript or encoded gene product is expressed or produced 10%, 20%, 50%, 2-fold, 5-fold, 10-fold, or more than the level expressed or produced by the corresponding native, naturally-occurring, or endogenous gene.
- corresponding methods and bacteria in which any homologue or functional variant or fragment of nagC, glmS, glmY or glmZ (or any combination thereof) is overexpressed.
- E. coli nagC, glmS, glmY or glmZ (or any combination thereof) is exogenously expressed in a bacterium other than E. coli.
- Components of UDP-GlcNAc metabolism include: (GlcNAc-1-P) N-acetylglucosamine-1-phosphate; (GlcN-1-P) glucosamine-1-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; and (Fruc-6-P) fructose-6-phosphate.
- bacteria comprising the characteristics described herein are cultured in the presence of lactose, and lacto-N-neotetraose is retrieved, either from the bacterium itself (i.e., by lysis) or from a culture supernatant of the bacterium.
- the bacterium contains a deficient sialic acid catabolic pathway.
- By "sialic acid catabolic pathway" is meant a sequence of reactions, usually controlled and catalyzed by enzymes, which results in the degradation of sialic acid.
- An exemplary sialic acid catabolic pathway in E. coli is described herein.
- sialic acid (Neu5Ac; N-acetylneuraminic acid) is degraded by the enzymes NanA (N-acetylneuraminic acid lyase) and NanK (N-acetylmannosamine kinase) and NanE (N-acetylmannosamine-6-phosphate epimerase), all encoded in the nanATEK-yhcH operon, and repressed by NanR (ecocyc.org/ E. COLI ).
- a deficient sialic acid catabolic pathway is engineered in E.
- nanA N-acetylneuraminate lyase
- nanK N-acetylmannosamine kinase
- GenBank Accession Number (amino acid) BAE77265.1 GI:85676015
- nanE N-acetylmannosamine-6-phosphate epimerase
- the nanT (N-acetylneuraminate transporter) gene is also inactivated or mutated.
- Other intermediates of sialic acid metabolism include: (ManNAc-6-P) N-acetylmannosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; (GlcN-6-P) Glucosamine-6-phosphate; and (Fruc-6-P) Fructose-6-phosphate.
- nanA is mutated.
- nanA and nanK are mutated, while nanE remains functional.
- nanA and nanE are mutated, while nanK has not been mutated, inactivated or deleted.
- a mutation is one or more changes in the nucleic acid sequence coding the gene product of nanA, nanK, nanE, and/or nanT.
- the mutation may be 1, 2, 5, 10, 25, 50 or 100 changes in the nucleic acid sequence.
- the nanA, nanK, nanE, and/or nanT is mutated by a null mutation.
- Null mutations as described herein encompass amino acid substitutions, additions, deletions, or insertions that either cause a loss of function of the enzyme (i.e., reduced or no activity) or loss of the enzyme (i.e., no gene product).
- By "deleted" is meant that the coding region is removed in whole or in part such that no gene product is produced.
- an inactivated gene is one whose coding sequence has been altered such that the resulting gene product is functionally inactive, or that encodes a gene product with less than 100%, 80%, 50%, or 20% of the activity of the native, naturally-occurring, endogenous gene product.
- the bacterium also comprises a sialic acid synthetic capability.
- the bacterium is an E. coli bacterium.
- the bacterium comprises a sialic acid synthetic capability through provision of an exogenous UDP-GlcNAc 2-epimerase (e.g., neuC of Campylobacter jejuni, GenBank AAK91727.1; GI:15193223, incorporated herein by reference) or equivalent (e.g. E. coli S88 neuC GenBank YP_002392936.1; GI: 218560023), a Neu5Ac synthase (e.g., neuB of C.
- the bacterium comprises an exogenous or endogenous N-acetylneuraminate synthase, an exogenous or endogenous UDP-N-acetylglucosamine 2-epimerase, an exogenous or endogenous N-acetylneuraminate cytidylyltransferase, or any combination thereof.
- the bacterium includes an exogenous N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase from Campylobacter jejuni.
- the bacterium includes a reduced level of β-galactosidase activity compared to a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose).
- the reduced level of β-galactosidase activity includes reduced expression of a β-galactosidase gene or reduced β-galactosidase enzymatic activity.
- the reduced level is less than 10% the level of the corresponding wild-type bacterium when the bacterium is cultured in the presence of lactose.
- the bacterium includes a deleted or inactivated endogenous β-galactosidase gene. In certain embodiments, the bacterium includes a deleted or inactivated endogenous lacZ gene and/or a deleted or inactivated endogenous lacI gene.
- the bacterium includes an endogenous β-galactosidase gene, wherein at least a portion of a promoter of the endogenous β-galactosidase gene has been deleted.
- the bacterium includes an exogenous β-galactosidase enzyme with reduced enzymatic activity compared to an endogenous β-galactosidase enzyme in a corresponding wild-type bacterium.
- the exogenous β-galactosidase gene is expressed at a lower level than an endogenous β-galactosidase gene in a corresponding wild-type bacterium.
- the bacterium has less than 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 75, 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 units of β-galactosidase activity when cultured in the presence of lactose.
- the bacterium comprises at least about 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, or 2.5 units of β-galactosidase activity, but less than about 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 75, 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, or 3 units of β-galactosidase activity, when the bacterium is cultured in the presence of lactose.
- the bacterium has a lactose permease gene.
- the lactose permease gene comprises a lacY gene.
- the bacterium has an inactivated adenosine-5′-triphosphate (ATP)-dependent intracellular protease.
- the inactivated ATP-dependent intracellular protease has a null mutation in an ATP-dependent intracellular protease gene.
- the null mutation is a deletion of an endogenous lon gene.
- the bacterium further includes an exogenous E. coli rcsA or E. coli rcsB gene.
- the bacterium further includes a mutation in a thyA gene.
- the bacterium does not express a β-galactoside transacetylase.
- a β-galactoside transacetylase gene has been inactivated (e.g., deleted) in the bacterium.
- the bacterium has a lacA mutation.
- the bacterium accumulates intracellular lactose in the presence of exogenous lactose.
- the bacterium is a member of the Bacillus, Pantoea, Lactobacillus, Lactococcus, Streptococcus, Propionibacterium, Enterococcus, Bifidobacterium, Sporolactobacillus, Micromonospora, Micrococcus, Rhodococcus, or Pseudomonas genus.
- the bacterium is a Bacillus licheniformis, Bacillus subtilis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans, Erwinia herbicola ( Pantoea agglomerans ), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, Xanthomonas campestris, Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus
- the E. coli bacterium is a GI724 strain bacterium.
- the bacterium has a lacIq promoter mutation. In certain embodiments, the bacterium has a lacPL8 promoter mutation.
- the bacterium has a nucleic acid construct including an isolated nucleic acid encoding the lactose-utilizing sialyltransferase enzyme.
- a chromosome of the bacterium has a nucleic acid construct having an isolated nucleic acid encoding the lactose-utilizing sialyltransferase enzyme.
- the nucleic acid is operably linked to a heterologous control sequence that directs the production of the enzyme in the bacterium.
- the heterologous control sequence comprises a bacterial promoter, a bacterial operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, or a plasmid selectable marker.
- the bacterium has the genotype:
- In aspects, provided herein are nucleic acids encoding a mutant enzyme.
- the mutant enzyme has amino acids in the sequence set forth as SEQ ID NO: 15, 16, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
- lactose-utilizing sialyltransferase enzyme having amino acids in the sequence set forth as SEQ ID NO: 15, 16, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
- sialyltransferases described herein have significant advantages over other enzymes of this class.
- Preferred sialyltransferases (e.g., BstM and BstN) are lactose-utilizing and produce superior amounts of sialyllactose in production strains of bacteria, e.g., engineered E. coli.
- Not all enzymes in the sialyltransferase class utilize lactose.
- BstD and BstJ were found not to utilize lactose.
- lactose-utilizing sialyltransferase enzymes are rare among enzymes in the sialyltransferase class.
- KDO-lactose side-product: KDO is a component of E. coli lipopolysaccharide (LPS, endotoxin), and LPS is a molecule that elicits a strong and often dangerous immune response in some mammals, and humans in particular. KDO is part of the core structure of LPS. KDO-lactose is made from a CMP-KDO nucleotide sugar precursor that is found naturally in all strains of E. coli.
- Certain enzymes of the present invention (e.g., BstM, BstN, Δ20BstC*) produce less of the unwanted KDO-lactose by-product as compared to others, e.g., Pst6-224.
- the methods described herein that include a heterologous gene (in the engineered E. coli production strain) that expresses these preferred enzymes lead to a reduced or negligible amount of KDO-lactose. Such a reduced amount facilitates purification of the final desired product, sialyllactose, and is associated with a better safety profile for human use.
- a composition comprising sialylated oligosaccharides and less than 5%, e.g., less than 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or less than 0.1%, KDO-lactose.
- the composition is substantially pure.
- the composition comprises sialyllactose.
- the sialyllactose produced by Δ20BstC* was found to be comprised of 6′-SL and 3′-SL. Production of both of these human milk oligosaccharides in the course of a single biofermentation represents a significant advantage in terms of time and cost of production over two separate fermentations. In some situations, such as striving to develop infant formulae that better emulate human milk, producing mixtures of human milk oligosaccharides in a single production fermentation is advantageous from a cost perspective.
- a composition comprising a sialyllactose produced using the methods, constructs, or production strains described herein contains at least 10%, 25%, 50%, 2-fold, 5-fold, 10-fold or less KDO-lactose compared to compositions produced by other methods, e.g., produced using constructs encoding Pst6-224 or α-(2→6)-sialyltransferase encoded by the gene from the Photobacterium sp. JT-ISH-224.
- the invention also encompasses methods and a composition comprising substantially pure sialyllactose with minimal or minor levels of KDO-lactose.
- the composition contains less than 5%, 4%, 3%, 2%, 1%, or 0.5% (or less) KDO-lactose of the total mass of SL.
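- As an illustration of the mass-based limit stated above, the following minimal sketch expresses KDO-lactose as a percentage of the total SL mass and compares it with a chosen limit. The function names and the example masses are hypothetical and are not taken from the disclosure; the only assumption is that both masses are measured in the same units.

```python
def kdo_lactose_percent(kdo_lactose_mass: float, sl_mass: float) -> float:
    """Return KDO-lactose as a percentage of the total sialyllactose (SL) mass.

    Both masses must be in the same (arbitrary) units.
    """
    if sl_mass <= 0:
        raise ValueError("sl_mass must be positive")
    if kdo_lactose_mass < 0:
        raise ValueError("kdo_lactose_mass must not be negative")
    return 100.0 * kdo_lactose_mass / sl_mass


def below_limit(kdo_lactose_mass: float, sl_mass: float,
                limit_percent: float = 5.0) -> bool:
    """True if KDO-lactose is strictly below `limit_percent` of the SL mass."""
    return kdo_lactose_percent(kdo_lactose_mass, sl_mass) < limit_percent


if __name__ == "__main__":
    # Hypothetical masses, for illustration only.
    print(kdo_lactose_percent(1.0, 50.0))        # 2.0
    print(below_limit(1.0, 50.0, limit_percent=5.0))
    print(below_limit(1.0, 50.0, limit_percent=1.0))
```

- The limit is passed as a parameter so that any of the recited thresholds (5%, 4%, 3%, 2%, 1%, or 0.5%) can be checked with the same function.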
- a mutation, e.g., a Δ (deletion) mutation in a Bst gene, e.g., Δ20BstC*, leads to a reduction in KDO-lactose.
- FIG. 1 is a schematic outlining the structures of the major sialylated oligosaccharide species of human milk, how they are related to each other, and the steps necessary for their enzymatic synthesis from lactose.
- FIG. 2 is a table presenting pairwise percent amino acid sequence identity comparison between the two α(2,6) sialyllactose (SL) probe sequences and the 8 identified ST candidates.
- FIG. 3 is a map of an expression vector carrying one of the candidate ST genes, bstN (plasmid pG543, SEQ ID NO: 11).
- FIG. 4 is a diagram outlining the scheme for SL biosynthesis in engineered E. coli.
- FIG. 5 is an image of a thin layer chromatography result. Prominent spots corresponding to the intracellular lactose pool are seen in the control strain (E1406, which does not contain a bst+neuBCA expression plasmid) and also in all bst candidate cultures.
- FIGS. 6A, 6B, and 6C are images showing UV traces from HPLC runs for the various heat extracts (E1406 control, Δ16Pst6-224, HAC1268, Δ20BstC, Δ20BstC*, BstE, BstH, BstI, BstM, and BstN).
- FIG. 7 is an image of thin layer chromatography of fractions from the Dowex 1×4 column. Typically, fraction 3 was the purest fraction and, after desalting, was suitable for NMR analysis.
- FIG. 8 is a 1D ¹H NMR spectrum of SL samples produced by BstM (BstM-SL) which showed three anomeric signals: δ 5.22 (A) and δ 4.66 (B), both attributed to a reducing-end Glcp, and δ 4.42 (C), assigned to a β-Galp residue.
- FIG. 9 is a 1D ¹H NMR spectrum of SL samples produced by BstN (BstN-SL) which showed three anomeric signals: δ 5.22 (A) and δ 4.66 (B), both attributed to a reducing-end Glcp, and δ 4.42 (C), assigned to a β-Galp residue.
- FIG. 10 is an image showing a sequence alignment of wild type PdST, Δ20BstC and BstE α(2,3) sialyltransferases.
- FIG. 11 is an image of thin layer chromatography showing that SL synthesized by BstE*-producing cells was efficiently converted to lactose by both sialidase S and sialidase C. This result indicated that BstE* still possessed exclusively α(2,3)-selective activity, and that the introduced mutations did not alter regioselectivity of the enzyme as was predicted.
- FIG. 12 is a 1D ¹H NMR spectrum of SL produced by Δ20BstC*. Characteristic features of the spectrum were 4 distinct anomeric peaks and the up-field signals of axial and equatorial H-3 of sialic acid.
- FIG. 13 is an image of overlaid HSQC and HMBC NMR spectra of sialyllactose synthesized by ΔBstC*-producing cells. NMR analysis showed that the larger signals belonged to 6′-sialyllactose, whereas the smaller one was part of contaminating 3′-sialyllactose.
- FIG. 14 is an image of the BLOSUM62 matrix.
- FIG. 15 is a table showing chemical shift assignments of the two major components of Δ20BstC* synthesized sialyllactose. Orange lines indicate inter-residue correlations seen in both ROESY and HMBC experiments; blue lines indicate inter-residue correlations seen in HMBC only.
- FIG. 16 is an image showing UV traces from HPLC runs for the various cell extracts (Δ20BstC*, Δ20BstC*2, Δ20BstC*3, Δ20BstC*4, Δ20BstC*5).
- the acidic oligosaccharides of human milk include a prominent sialyllactose (SL) fraction, comprising 3′-sialyllactose and 6′-sialyllactose (Bode, L., and Jantscher-Krenn, E. (2012). Adv Nutr 3, 383S-391S).
- SL sialyllactose
- 3′-sialyllactose consists of an N-acetylneuraminic acid (Neu5Ac) moiety joined through an α(2,3) linkage to the galactose portion of lactose (α(2,3)Neu5Ac Gal(β1-4)Glc), while 6′-sialyllactose (6′-SL) consists of a Neu5Ac moiety joined through an α(2,6) linkage to the galactose portion of lactose (α(2,6)Neu5Ac Gal(β1-4)Glc).
- 3′-SL and 6′-SL are two of the most abundant sialylated oligosaccharides present in human milk, together present at concentrations of up to ~0.5 (Bao, Y., Zhu, L., and Newburg, D. S. (2007). Anal Biochem 370, 206-214).
- lactose-utilizing sialyltransferase enzymes include the amino acid sequences of the lactose-utilizing sialyltransferase enzyme, as well as variants and fragments thereof that exhibit sialyltransferase activity.
- hMOS acidic human milk oligosaccharides
- 3′-SL and 6′-SL inexpensively at large scale
- Purification of sialylated oligosaccharides from natural sources such as mammalian milks is not an economically viable approach, and production of hMOS through chemical synthesis is currently limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost.
- bacteria can be metabolically engineered to produce hMOS.
- This approach involves the construction of microbial strains that overexpress heterologous glycosyltransferases and membrane transporters for the import of precursor sugars into the bacterial cytosol, and that possess enhanced pools of regenerating nucleotide sugars for use as biosynthetic precursors (e.g., as described by Dumon, C., et al. (2004). Biotechnol Prog 20, 412-19; Ruffing, A., and Chen, R. R. (2006). Microb Cell Fact 5, 25; Mao, Z., et al. (2006). Biotechnol Prog 22, 369-374).
- a key aspect of this approach is the identification and use of a heterologous glycosyltransferase selected for overexpression in the microbial host.
- the choice of glycosyltransferase can significantly affect the final yield of the desired synthesized oligosaccharide, given that enzymes can vary greatly in terms of their kinetics, donor and acceptor substrate specificity, side reaction products, and enzyme stability and solubility.
- a few glycosyltransferases derived from different bacterial species have been identified and characterized in terms of their ability to catalyze the biosynthesis of hMOS in E. coli host strains (Dumon, C., et al. (2006). Chembiochem 7, 359-365; Dumon, C., et al. (2004). Biotechnol Prog 20, 412-19; Li, M., et al. (2008). Biochemistry 47, 378-387; Li, M., et al. (2008). Biochemistry 47, 11590-97).
- a lactose-utilizing sialyltransferase enzyme comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 1-10, 1-15, 1-20, 5-15, 5-20, 10-25, 10-50, 20-50, 25-75, 25-100 or more mutations compared to a naturally occurring protein while retaining at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5%, or about 100% of the activity (e.g., enzymatic activity) of the naturally occurring protein.
- Mutations include but are not limited to substitutions (such as conservative and non-conservative substitutions), insertions, and deletions.
- Non-limiting examples of lactose-utilizing sialyltransferase enzymes may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 1-10, 1-15, 1-20, 5-15, 5-20, 10-25, 10-50, 20-50, 25-75, 25-100, or more substitution mutations compared to a naturally occurring protein while retaining at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5%, or about 100% of the activity (e.g., enzymatic activity) of the naturally occurring protein.
- the lactose-utilizing sialyltransferase enzyme is not a mutant (or the sequence altered) compared to a corresponding wild type sequence.
- a lactose-utilizing sialyltransferase enzyme may comprise a stretch of amino acids (e.g., the entire length of the lactose-utilizing sialyltransferase enzyme or a portion comprising at least about 50, 100, 200, 250, 300, 350, or 400 amino acids) in a sequence that is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, or 99.5% identical to an amino acid sequence of a naturally occurring protein.
- the mutations are conservative, and the present subject matter includes many lactose-utilizing sialyltransferase enzymes in which the only mutations are substitution mutations.
- a lactose-utilizing sialyltransferase enzyme has no deletions or insertions compared to a naturally occurring protein (e.g., a naturally occurring counterpart).
- the lactose-utilizing sialyltransferase enzyme does not comprise a deletion or insertion compared to a naturally occurring lactose-utilizing sialyltransferase enzyme.
- a lactose-utilizing sialyltransferase enzyme may have (i) less than about 5, 4, 3, 2, or 1 inserted amino acids, and/or (ii) less than about 5, 4, 3, 2, or 1 deleted amino acids compared to a naturally occurring protein.
- a naturally occurring protein to which a lactose-utilizing sialyltransferase enzyme is compared or has been derived is a microbial protein, e.g., a prokaryotic lactose-utilizing sialyltransferase enzyme such as a bacterial lactose-utilizing sialyltransferase enzyme.
- the prokaryotic lactose-utilizing sialyltransferase enzyme is a mutant or variant of a natural (i.e., wild-type) bacterial protein.
- the microbial protein is produced by a Gram-positive bacterium or a Gram-negative bacterium.
- the lactose-utilizing sialyltransferase enzyme does not comprise a signal peptide.
- the signal peptide e.g., that is present in a naturally occurring counterpart
- signal peptide refers to a short stretch of amino acids (e.g., 5-20 or 10-50 amino acids long) at the N-terminus of a protein that directs the transport of the protein.
- the signal peptide is cleaved off during the post-translational modification of a protein by a cell.
- the signal peptide may optionally be considered to be, e.g., the first 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 amino acids from the N-terminus of the translated protein (compared to a protein that has not had the signal peptide removed, e.g., compared to a naturally occurring protein).
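- The convention above, in which the signal peptide is treated as the first N residues of the translated protein, can be sketched as a simple N-terminal truncation. This is an illustrative sketch only: the function name and the toy sequence are hypothetical, and the 5 to 50 residue bounds are taken from the range recited above rather than from any prediction tool.

```python
def remove_signal_peptide(protein: str, length: int) -> str:
    """Return `protein` without its first `length` N-terminal residues.

    `protein` is a one-letter amino acid string. `length` is limited to the
    5-50 residue range that the text treats as a possible signal peptide.
    """
    if not 5 <= length <= 50:
        raise ValueError("signal peptide length must be between 5 and 50")
    if length >= len(protein):
        raise ValueError("signal peptide cannot span the whole protein")
    return protein[length:]


if __name__ == "__main__":
    # Toy sequence (not a real enzyme): 20 leading residues plus "KTLE".
    toy = "M" + "A" * 19 + "KTLE"
    print(remove_signal_peptide(toy, 20))  # KTLE
```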
- a lactose-utilizing sialyltransferase enzyme may comprise an amino acid sequence which is at least 60%, 65%, 70%, 75%, 76%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs.
- the lactose-utilizing sialyltransferase enzyme comprises an amino acid sequence that is 100% identical to the reference SEQ ID NO.
- a lactose-utilizing sialyltransferase enzyme may comprise an amino acid sequence which is less than 75%, 70%, 65%, 60%, 59%, 58%, 57%, 56%, 55%, 54%, 53%, 52%, 51%, 50%, 49%, 48%, 47%, 46%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, or 30% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs.
- a polypeptide comprises amino acids in a sequence that is preferably at least about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45% and less than about 75%, 70%, 65%, 60%, 55%, 50%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, or 30% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs.
- a polypeptide comprises amino acids in a sequence that is between about 5% and about 75%, about 6% and about 75%, about 7% and about 75%, about 8% and about 75%, about 9% and about 75%, about 10% and about 75%, 11% and about 75%, 12% and about 75%, 13% and about 75%, 14% and about 75%, 15% and about 75%, 16% and about 75%, 17% and about 75%, 18% and about 75%, 19% and about 75%, 20% and about 75%, 21% and about 75%, 22% and about 75%, 23% and about 75%, 24% and about 75%, 25% and about 75%, 26% and about 75%, 27% and about 75%, 28% and about 75%, 29% and about 75%, 30% and about 75%, about 5% and about 100%, about 5% and about 95%, about 5% and about 85%, about 5% and about 75%, about 5% and about 70%, about 5% and about 65%, 60%, about 5% and
- Non-limiting examples of reference lactose-utilizing sialyltransferase enzymes and amino acid sequences disclosed herein include:
- the lactose-utilizing sialyltransferase enzyme comprises an amino acid sequence with at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 60, 70, 80, 90, or 100% identity to 1, 2, 3, 4, 5, 9, 10 or more lactose-utilizing sialyltransferase enzymes disclosed herein.
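- The percent identity values recited above presuppose a pairwise comparison of aligned sequences. The sketch below shows one common way to compute such a value from two pre-aligned sequences. It is an assumption-laden illustration: the disclosure does not state which alignment program or which denominator is used, and here identity is counted over all alignment columns that contain at least one residue. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue; gap-to-residue columns count as
    mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned sequences, not taken from any SEQ ID NO.
    print(percent_identity("MKTAYIAK", "MKTAYLAK"))  # 87.5
```

- Other definitions (for example, dividing by the length of the shorter sequence) give different numbers for the same pair, so the definition in use should be stated alongside any reported value.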
- the amino acid sequence of a protein comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 mutations compared to its naturally occurring counterpart.
- less than 50, 45, 40, 35, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, or 2 of the mutations is a deletion or insertion of 1, 2, 3, 4, or 5 or no more than 1, 2, 4, or 5 amino acids.
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more of the mutations is a substitution mutation.
- every mutation to a protein compared to its naturally occurring counterpart is a substitution mutation.
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more or all of the mutations to a protein compared to its naturally occurring counterpart is a conservative substitution mutation.
- a polypeptide does not have any insertion or deletion compared to its natural counterpart, other than (optionally) the removal of the signal peptide and/or the fusion of compounds such as another polypeptide at the N-terminus or C-terminus thereof.
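- The mutation counts discussed above distinguish substitutions from insertions and deletions. As a hedged sketch, the following tallies each type from a pre-aligned natural/variant sequence pair. It counts inserted or deleted residues individually (not contiguous events), which is one possible reading of the text; the function names and toy sequences are hypothetical.

```python
def count_mutations(aligned_natural: str, aligned_variant: str,
                    gap: str = "-") -> dict:
    """Count substituted, inserted and deleted residues in a variant.

    Both inputs are pre-aligned, equal-length, one-letter sequences.
    A gap in the natural sequence marks an inserted residue; a gap in the
    variant marks a deleted residue.
    """
    if len(aligned_natural) != len(aligned_variant):
        raise ValueError("aligned sequences must have the same length")
    counts = {"substitution": 0, "insertion": 0, "deletion": 0}
    for nat, var in zip(aligned_natural, aligned_variant):
        if nat == var:
            continue  # identical residue (or gap in both): no mutation
        if nat == gap:
            counts["insertion"] += 1
        elif var == gap:
            counts["deletion"] += 1
        else:
            counts["substitution"] += 1
    return counts


def only_substitutions(aligned_natural: str, aligned_variant: str) -> bool:
    """True if the variant has no inserted or deleted residues."""
    counts = count_mutations(aligned_natural, aligned_variant)
    return counts["insertion"] == 0 and counts["deletion"] == 0


if __name__ == "__main__":
    # Toy alignment: T->S substitution, I deleted, G inserted at the end.
    print(count_mutations("MKTAYIAK-", "MKSAY-AKG"))
```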
- a fragment of a protein is characterized by a length (number of amino acids) that is less than the length of the full length mature form of the protein.
- a fragment in the case of these sequences and all others provided herein, may be a part of the whole that is less than the whole.
- a fragment ranges in size from a single nucleotide or amino acid within a polynucleotide or polypeptide sequence to one fewer nucleotide or amino acid than the entire polynucleotide or polypeptide sequence.
- a fragment is defined as any portion of a complete polynucleotide or polypeptide sequence that is intermediate between the extremes defined above.
- fragments of any of the proteins or enzymes disclosed herein or encoded by any of the genes disclosed herein can be 10 to 20 amino acids, 10 to 30 amino acids, 10 to 40 amino acids, 10 to 50 amino acids, 10 to 60 amino acids, 10 to 70 amino acids, 10 to 80 amino acids, 10 to 90 amino acids, 10 to 100 amino acids, 50 to 100 amino acids, 75 to 125 amino acids, 100 to 150 amino acids, 150 to 200 amino acids, 200 to 250 amino acids, 250 to 300 amino acids, 300 to 350, 350 to 400 amino acids, or 400 to 425 amino acids.
- the fragments encompassed in the present subject matter comprise functional fragments. As such, the fragments preferably retain the domains that are required or are important for sialyltransferase activity.
- Fragments can be determined or generated and tested for sialyltransferase activity using standard methods known in the art.
- the encoded protein can be expressed by any recombinant technology known in the art and the sialyltransferase activity of the protein can be determined.
- a “biologically active” fragment is a portion of a polypeptide which maintains one or more activities of a full-length reference polypeptide.
- Biologically active fragments as used herein exclude the full-length polypeptide.
- Biologically active fragments can be any size as long as they maintain the defined activity.
- the biologically active fragment maintains at least 10%, at least 50%, at least 75% or at least 90%, of the activity (such as sialyltransferase activity) of the full length protein.
- Amino acid sequence variants/mutants of the polypeptides defined herein can be prepared by introducing appropriate nucleotide changes into a nucleic acid defined herein, or by in vitro synthesis of the desired polypeptide.
- Such variants/mutants include, for example, deletions, insertions or substitutions of residues within the amino acid sequence. A combination of deletion, insertion and substitution can be made to arrive at the final construct, provided that the final peptide product possesses the desired activity and/or specificity.
- Mutant (altered) peptides can be prepared using any technique known in the art.
- a polynucleotide defined herein can be subjected to in vitro mutagenesis or DNA shuffling techniques.
- Products derived from mutated/altered DNA can readily be screened using techniques described herein to determine if they possess, for example, sialyltransferase activity.
- Amino acid sequence deletions generally range from about 1 to 15 residues, e.g. about 1 to 10 residues and often about 1 to 5 contiguous residues.
- a mutated or modified protein does not comprise any deletions or insertions.
- a mutated or modified protein has less than about 10, 9, 8, 7, 5, 4, 3, or 2 deleted or inserted amino acids.
- Substitution mutants have at least one amino acid residue in the polypeptide molecule removed and a different residue inserted in its place. Sites may be substituted in a relatively conservative manner in order to maintain activity and/or specificity. Such conservative substitutions are shown in the table below under the heading of “exemplary substitutions.”
- a mutant/variant polypeptide has only, or not more than, one or two or three or four conservative amino acid changes when compared to a naturally occurring polypeptide. Details of conservative amino acid changes are provided in the table below. As the skilled person would be aware, such minor changes can reasonably be predicted not to alter the activity of the polypeptide when expressed in a recombinant cell.
- Mutations can be introduced into a nucleic acid sequence such that the encoded amino acid sequence is altered by, e.g., standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
- conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues.
- a “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. Certain amino acids have side chains with more than one classifiable characteristic.
- amino acids with basic side chains e.g., lysine, arginine, histidine
- acidic side chains e.g., aspartic acid, glutamic acid
- uncharged polar side chains e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, tryptophan, cysteine
- nonpolar side chains e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tyrosine, tryptophan
- beta-branched side chains e.g., threonine, valine, isoleucine
- aromatic side chains e.g., tyrosine, phenylalanine, tryptophan, histidine
- a predicted nonessential amino acid residue in a given polypeptide is replaced with another amino acid residue from the same side chain family.
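- The side-chain families listed above can be turned into a simple lookup for deciding whether a replacement stays within a family. The sketch below encodes exactly the families as listed (including the residues that appear in more than one family); the dictionary and function names are hypothetical, and treating "shares at least one family" as the test for a conservative substitution is an interpretive assumption.

```python
# Side-chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "acidic": set("DE"),                 # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYWC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Trp, Cys
    "nonpolar": set("AVLIPFMYW"),        # Ala, Val, Leu, Ile, Pro, Phe, Met, Tyr, Trp
    "beta-branched": set("TVI"),         # threonine, valine, isoleucine
    "aromatic": set("YFWH"),             # Tyr, Phe, Trp, His
}


def shared_families(original: str, replacement: str) -> list:
    """Return the names of the families that contain both residues."""
    return [name for name, members in SIDE_CHAIN_FAMILIES.items()
            if original in members and replacement in members]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and share at least one family."""
    return original != replacement and bool(shared_families(original, replacement))


if __name__ == "__main__":
    print(is_conservative("K", "R"))   # lysine -> arginine, both basic
    print(is_conservative("D", "K"))   # aspartic acid -> lysine, no shared family
    print(shared_families("T", "V"))   # ['beta-branched']
```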
- mutations can be introduced randomly along all or part of a given coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for given polypeptide biological activity to identify mutants that retain activity.
- the invention also provides for variants with mutations that enhance or increase the endogenous biological activity. Following mutagenesis of the nucleic acid sequence, the encoded protein can be expressed by any recombinant technology known in the art and the activity/specificity of the protein can be determined. An increase, decrease, or elimination of a given biological activity of the variants disclosed herein can be readily measured by the ordinary person skilled in the art, i.e., by measuring the capability for binding a ligand and/or signal transduction.
- substitutions with natural amino acids are characterized using a BLOcks SUbstitution Matrix (a BLOSUM matrix).
- a BLOSUM matrix is the BLOSUM62 matrix, which is described in Styczynski et al. (2008) “BLOSUM62 miscalculations improve search performance” Nat Biotech 26 (3): 274-275, the entire content of which is incorporated herein by reference.
- the BLOSUM62 matrix is shown in FIG. 14.
- Substitutions scoring at least 4 on the BLOSUM62 matrix are referred to herein as “Class I substitutions”; substitutions scoring 3 on the BLOSUM62 matrix are referred to herein as “Class II substitutions”; substitutions scoring 2 or 1 on the BLOSUM62 matrix are referred to herein as “Class III substitutions”; substitutions scoring 0 or −1 on the BLOSUM62 matrix are referred to herein as “Class IV substitutions”; substitutions scoring −2, −3, or −4 on the BLOSUM62 matrix are referred to herein as “Class V substitutions.”
- lactose-utilizing sialyltransferase enzymes having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25 or more Class I, II, III, IV, or V substitutions compared to a naturally occurring lactose-utilizing sialyltransferase enzyme (such as a lactose-utilizing sialyltransferase enzyme mentioned herein), or any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more of any combination of Class I, II, III, IV, and/or V substitutions compared to a naturally occurring lactose-utilizing sialyltransferase enzyme such as a lactose-utilizing sialyltransferase enzyme exemplified herein.
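- The class definitions above amount to a mapping from a BLOSUM62 score to one of five classes (at least 4: Class I; 3: Class II; 2 or 1: Class III; 0 or −1: Class IV; −2, −3, or −4: Class V). A minimal sketch of that mapping follows. The function name is hypothetical, and the score itself is assumed to have been looked up separately in the matrix of FIG. 14.

```python
def substitution_class(blosum62_score: int) -> str:
    """Map a BLOSUM62 substitution score to Class I-V as defined in the text."""
    if blosum62_score >= 4:
        return "I"
    if blosum62_score == 3:
        return "II"
    if blosum62_score in (1, 2):
        return "III"
    if blosum62_score in (0, -1):
        return "IV"
    if blosum62_score in (-2, -3, -4):
        return "V"
    # Scores below -4 are not assigned to any class in the text.
    raise ValueError(f"score {blosum62_score} is outside the defined classes")


if __name__ == "__main__":
    for score in (4, 3, 2, 0, -3):
        print(score, substitution_class(score))
```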
- a “conservative amino acid substitution” may refer to a mutation or to a difference between two sequences.
- a mutant comprises a conservative amino acid substitution compared to a naturally occurring protein, wherein the substitution was introduced into the mutant intentionally (e.g., by human-directed genetic modification) to produce a protein that is derived from the naturally occurring protein.
- one naturally occurring protein comprises a conservative amino acid substitution compared to another naturally occurring protein, in which case the “substitution” is a conservative difference between the two sequences at a given position when the sequences of each protein are aligned.
- the lactose-utilizing sialyltransferase enzyme of the present disclosure is more α(2,6)-selective than the naturally occurring α(2,3) sialyltransferase.
- an “α(2,6)-selective” enzyme effects transfer of sialic acid at a ratio of α(2,6):α(2,3) of at least 1:1, such as from about 1.2:1 to about 100:1, e.g., 1.2:1 to 50:1, 2:1 to 50:1, 3:1 to 50:1, 4:1 to 50:1, 1.2:1 to 40:1, 1.2:1 to 30:1, 1.2:1 to 20:1, 1.2:1 to 10:1, 2:1 to 10:1, 1. 3:1 to 10:1, or about 5:1 to about 10:1.
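- The selectivity criterion above is a ratio of α(2,6) to α(2,3) transfer of at least 1:1. The sketch below computes that ratio and applies the threshold. It assumes, for illustration only, that measured amounts of 6′-SL and 3′-SL (in the same units) are used as a proxy for the two transfer activities; the function names and example amounts are hypothetical.

```python
def alpha26_to_alpha23_ratio(amount_6sl: float, amount_3sl: float) -> float:
    """Return the 6'-SL : 3'-SL ratio as a single number (x in "x:1")."""
    if amount_6sl < 0 or amount_3sl < 0:
        raise ValueError("amounts must not be negative")
    if amount_6sl == 0 and amount_3sl == 0:
        raise ValueError("no sialyllactose product to compare")
    if amount_3sl == 0:
        return float("inf")  # exclusively alpha(2,6) product
    return amount_6sl / amount_3sl


def is_alpha26_selective(amount_6sl: float, amount_3sl: float,
                         min_ratio: float = 1.0) -> bool:
    """True if the ratio is at least `min_ratio`:1 (default 1:1, per the text)."""
    return alpha26_to_alpha23_ratio(amount_6sl, amount_3sl) >= min_ratio


if __name__ == "__main__":
    # Hypothetical product amounts, for illustration only.
    print(alpha26_to_alpha23_ratio(8.0, 2.0))  # 4.0
    print(is_alpha26_selective(8.0, 2.0))
    print(is_alpha26_selective(1.0, 3.0))
```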
- a variety of bacterial species may be used in the oligosaccharide biosynthesis methods provided herein, e.g., E. coli, Erwinia herbicola ( Pantoea agglomerans ), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris.
- Bacteria of the genus Bacillus may also be used, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans.
- bacteria of the genera Lactobacillus and Lactococcus may be modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis.
- Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein. Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa).
- bacteria comprising the characteristics described herein are cultured in the presence of lactose, and a sialylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium.
- the sialylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products.
- a suitable production host bacterial strain is one that is not the same bacterial strain as the source bacterial strain from which the lactose-utilizing sialyltransferase enzyme-encoding nucleic acid sequence was identified.
- the bacterium utilized in the production methods described herein is genetically engineered to increase the efficiency and yield of sialylated oligosaccharide products.
- the host production bacterium is characterized as having a reduced level of β-galactosidase activity, an ability to produce more UDP-GlcNAc or UDP-GlcNAc at a faster rate compared to a corresponding wild-type bacterium, an ability to produce more CMP-Neu5Ac or CMP-Neu5Ac at a faster rate compared to a corresponding wild-type bacterium, a defective or reduced sialic acid degradation pathway, an inactivated β-galactoside transacetylase gene, a lactose permease gene, or a combination thereof.
- the bacterium comprises an ability to produce more UDP-GlcNAc or UDP-GlcNAc at a faster rate compared to a corresponding wild-type bacterium.
- the nucleotide sugar uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) is a key metabolic intermediate in bacteria, where it is involved in the synthesis and maintenance of the cell envelope.
- UDP-GlcNAc is used to make peptidoglycan (murein), a polymer comprising the bacterial cell wall whose structural integrity is absolutely essential for growth and survival.
- Gram-negative bacteria use UDP-GlcNAc for the synthesis of lipid A, an important component of the outer cell membrane. Thus, for bacteria, the ability to maintain an adequate intracellular pool of UDP-GlcNAc is critical.
- the UDP-GlcNAc pool in E. coli is produced through the combined action of three glm genes, glmS (L-glutamine:D-fructose-6-phosphate aminotransferase), glmM (phosphoglucosamine mutase), and the bifunctional glmU (fused N-acetyl glucosamine-1-phosphate uridyltransferase and glucosamine-1-phosphate acetyl transferase) ( FIG. 2 ). These three genes direct a steady flow of carbon to UDP-GlcNAc, a flow that originates with fructose-6-phosphate (an abundant molecule of central energy metabolism).
- glm genes are under positive control by the transcriptional activator protein, NagC.
- When E. coli encounters glucosamine or N-acetyl-glucosamine in its environment, these molecules are each transported into the cell via specific membrane transport proteins and are used either to supplement the flow of carbon to the UDP-GlcNAc pool or, alternatively, are consumed to generate energy under the action of nag operon gene products (i.e., nagA [N-acetylglucosamine-6-phosphate deacetylase] and nagB [glucosamine-6-phosphate deaminase]).
- nagA and nagB are under negative transcriptional control, but by the same regulatory protein as the glm genes, i.e. NagC.
- NagC is thus bi-functional, able to activate GlcNAc synthesis, while at the same time repressing the degradation of glucosamine-6-phosphate and N-acetylglucosamine-6-phosphate.
- the binding of NagC to specific regulatory DNA sequences (operators), whether such binding results in gene activation or repression, is sensitive to fluctuations in the cytoplasmic level of the small-molecule inducer and metabolite, GlcNAc-6-phosphate.
- Intracellular concentrations of GlcNAc-6-phosphate increase when N-acetylglucosamine is available as a carbon source in the environment, and thus under these conditions the expression of the glm genes (essential to maintain the vital UDP-GlcNAc pool) would decrease, unless a compensatory mechanism is brought into play.
- E. coli maintains a baseline level of UDP-GlcNAc synthesis through continuous expression of nagC directed by two constitutive promoters, located within the upstream nagA gene. This constitutive level of nagC expression is supplemented approximately threefold under conditions where the degradative nag operon is induced, and by this means E. coli ensures an adequate level of glm gene expression under all conditions, even when N-acetylglucosamine is being utilized as a carbon source.
- Many hMOS incorporate GlcNAc into their structures directly, and many also incorporate sialic acid, a sugar whose synthesis involves consumption of UDP-GlcNAc.
- One way to address this problem during engineered synthesis of GlcNAc- or sialic acid-containing hMOS is to boost the UDP-GlcNAc pool through simultaneous over-expression of nagC, or preferably by simultaneous over-expression of both nagC and glmS.
- the bacterium preferably comprises increased production of UDP-GlcNAc.
- an exemplary means to achieve this is by over-expression of a positive endogenous regulator of UDP-GlcNAc synthesis, for example, over-expression of the nagC gene of E. coli.
- this nagC over-expression is achieved by providing additional copies of the nagC gene on a plasmid vector or by integrating additional nagC gene copies into the host cell chromosome.
- over-expression is achieved by modulating the strength of the ribosome binding sequence directing nagC translation or by modulating the strength of the promoter directing nagC transcription.
- the intracellular UDP-GlcNAc pool may be enhanced by other means, for example by over-expressing the E. coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene, or alternatively by over-expressing the E. coli glmY gene (a positive translational regulator of glmS), or alternatively by over-expressing the E. coli glmZ gene (another positive translational regulator of glmS), or alternatively by simultaneously using a combination of approaches.
- the nagC (GenBank Protein Accession BAA35319.1, incorporated herein by reference) and glmS (GenBank Protein Accession NP_418185.1, incorporated herein by reference) genes which encode the sequences provided herein are overexpressed simultaneously in the same host cell in order to increase the intracellular pool of UDP-GlcNAc.
- the ability to produce more CMP-Neu5Ac or CMP-Neu5Ac at a faster rate compared to a corresponding wild-type bacterium comprises the expression of any one of, or any combination of, or all three of an N-acetylneuraminate synthase, a UDP-N-acetylglucosamine 2-epimerase, and a N-acetylneuraminate cytidylyltransferase.
- Non-limiting examples of these enzymes include NeuB, NeuC, and NeuA from Campylobacter jejuni (such as Campylobacter jejuni ATCC43484).
- neuBCA genes are co-expressed in an operon.
- the defective or reduced sialic acid degradation pathway comprises the inactivation or deletion of any one of, any combination of, or each of a nanR gene, a nanA gene, a nanT gene, a nanE gene, or a nanK gene.
- the nanA, nanT, and nanE genes are inactivated or deleted in the bacterium.
- an “inactivated” or “inactivation of a” gene, encoded gene product (i.e., polypeptide), or pathway refers to reducing or eliminating the expression (i.e., transcription or translation), protein level (i.e., translation, rate of degradation), or enzymatic activity of the gene, gene product, or pathway.
- a pathway is inactivated, preferably one enzyme or polypeptide in the pathway exhibits reduced or negligible activity.
- the enzyme in the pathway is altered, deleted or mutated such that the product of the pathway is produced at low levels compared to a wild-type bacterium or an intact pathway. In certain embodiments, the product of the pathway is not produced.
- the level of a compound that is utilized (e.g., used as a substrate, altered, catalyzed, or otherwise reduced or consumed) by the pathway is increased.
- inactivation of a gene is achieved by deletion or mutation of the gene or regulatory elements of the gene such that the gene is no longer transcribed or translated.
- inactivation of a polypeptide can be achieved by deletion or mutation of the gene that encodes the gene product or mutation of the polypeptide to disrupt its activity. inactivating mutations include additions, deletions or substitutions of one or more nucleotides or amino acids of a nucleic acid or amino acid sequence that results in the reduction or elimination of the expression or activity of the gene or polypeptide.
- inactivation of a polypeptide is achieved through the addition of exogenous sequences (e.g., tags) to the N or C-terminus of the polypeptide such that the activity of the polypeptide is reduced or eliminated (e.g., by steric hindrance).
- a host bacterium suitable for the production systems described herein exhibits an enhanced or increased cytoplasmic or intracellular pool of lactose and/or UDP-GlcNAc and/or CMP-Neu5Ac.
- the bacterium is E. coli and endogenous E. coli metabolic pathways and genes are manipulated in ways that result in the generation of increased cytoplasmic concentrations of lactose and/or UDP-GlcNAc and/or CMP-Neu5Ac, as compared to levels found in wild type E. coli .
- the bacterium accumulates an increased intracellular lactose pool and an increased intracellular UDP-GlcNAc and/or CMP-Neu5Ac pool.
- the bacteria contain at least 10%, 20%, 50%, or 2×, 5×, 10× or more of the levels of intracellular lactose and/or intracellular UDP-GlcNAc and/or CMP-Neu5Ac compared to a corresponding wild type bacterium that lacks the genetic modifications described herein.
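As an illustration of how a comparison against a wild-type bacterium might be quantified, the following Python sketch computes a percent increase and a fold change from two measured pool sizes. The function names, the measurement units, and the reading of "at least 10%" as a relative increase over wild type are assumptions made for illustration only.

```python
def percent_increase(engineered: float, wild_type: float) -> float:
    """Relative increase of the engineered pool over wild type, in percent.

    Assumption: both values are measured in the same units under the same
    culture conditions.
    """
    if wild_type <= 0:
        raise ValueError("wild-type level must be positive")
    return (engineered - wild_type) / wild_type * 100.0


def fold_change(engineered: float, wild_type: float) -> float:
    """Ratio of engineered to wild-type level (2.0 corresponds to '2x')."""
    if wild_type <= 0:
        raise ValueError("wild-type level must be positive")
    return engineered / wild_type
```

For example, `fold_change(20.0, 10.0)` gives a twofold level, and `percent_increase(15.0, 10.0)` gives a 50% increase.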
- increased intracellular concentration of lactose in the host bacterium compared to wild-type bacterium is achieved by manipulation of genes and pathways involved in lactose import, export and catabolism.
- described herein are methods of increasing intracellular lactose levels in E. coli genetically engineered to produce a human milk oligosaccharide by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI).
- the lacIq promoter is placed immediately upstream of (contiguous with) the lactose permease gene, lacY, i.e., the sequence of the lacIq promoter is directly upstream and adjacent to the start of the sequence encoding the lacY gene, such that the lacY gene is under transcriptional regulation by the lacIq promoter.
- the modified strain maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type chromosomal copy of the lacZ (encoding β-galactosidase) gene responsible for lactose catabolism.
- an intracellular lactose pool is created when the modified strain is cultured in the presence of exogenous lactose.
- increasing the intracellular concentration of lactose in E. coli involves inactivation of a β-galactoside transacetylase gene such as the lacA gene.
- an inactivating mutation, null mutation, or deletion of lacA prevents the formation of intracellular acetyl-lactose, which not only removes this molecule as a contaminant from subsequent purifications, but also eliminates E. coli's ability to export excess lactose from its cytoplasm (Danchin A. Cells need safety valves. Bioessays 2009, July; 31(7):769-73.), thus greatly facilitating purposeful manipulations of the E. coli intracellular lactose pool.
- a functional lactose permease gene is present in the bacterium.
- the lactose permease gene is an endogenous lactose permease gene or an exogenous lactose permease gene.
- the lactose permease gene may comprise an E. coli lacY gene (e.g., GenBank Accession Number V00295 (GI:41897), incorporated herein by reference).
- Many bacteria possess the inherent ability to transport lactose from the growth medium into the cell, by utilizing a transport protein that is either a homolog of the E. coli lactose permease (e.g., as found in Bacillus licheniformis), or a transporter that is a member of the ubiquitous PTS sugar transport family (e.g., as found in Lactobacillus casei and Lactobacillus rhamnosus).
- E. coli lacY an exogenous lactose transporter gene
- the host bacterium preferably has a reduced level of β-galactosidase activity.
- an exogenous β-galactosidase gene may be introduced to the bacterium.
- a plasmid expressing an exogenous β-galactosidase gene may be introduced to the bacterium, or recombined or integrated into the host genome.
- the exogenous β-galactosidase gene may be inserted into a gene that is inactivated in the host bacterium, such as the lon gene.
- the exogenous β-galactosidase gene is a functional β-galactosidase gene characterized by a reduced or low level of β-galactosidase activity compared to β-galactosidase activity in wild-type bacteria lacking any genetic manipulation.
- Exemplary β-galactosidase genes include E. coli lacZ and β-galactosidase genes from any of a number of other organisms (e.g., the lac4 gene of Kluyveromyces lactis, GenBank Accession Number M84410 (GI:173304), incorporated herein by reference) that catalyze the hydrolysis of galactosides into monosaccharides.
- the level of β-galactosidase activity in wild-type E. coli bacteria is, for example, 1,000 units (e.g., when the bacterium is cultured in the presence of lactose).
- the reduced β-galactosidase activity level encompassed by engineered host bacterium of the present invention includes less than 1,000 units, less than 900 units, less than 800 units, less than 700 units, less than 600 units, less than 500 units, less than 400 units, less than 300 units, less than 200 units, less than 100 units, or less than 50 units (e.g., when the bacterium is cultured in the presence of lactose).
- low, functional levels of β-galactosidase include β-galactosidase activity levels of between 0.05 and 1,000 units, e.g., between 0.05 and 750 units, between 0.05 and 500 units, between 0.05 and 400 units, between 0.05 and 300 units, between 0.05 and 200 units, between 0.05 and 100 units, between 0.05 and 50 units, between 0.05 and 10 units, between 0.05 and 5 units, between 0.05 and 4 units, between 0.05 and 3 units, or between 0.05 and 2 units of β-galactosidase activity (e.g., when the bacterium is cultured in the presence of lactose).
- low, functional levels of β-galactosidase include β-galactosidase activity levels of between 1 and 1,000 units, e.g., between 1 and 750 units, between 1 and 500 units, between 1 and 400 units, between 1 and 300 units, between 1 and 200 units, between 1 and 100 units, between 1 and 50 units, between 1 and 10 units, between 1 and 5 units, between 1 and 4 units, between 1 and 3 units, or between 1 and 2 units of β-galactosidase activity (e.g., when the bacterium is cultured in the presence of lactose).
- the bacterium has an inactivated thyA gene.
- a mutation in a thyA gene in the host bacterium allows for the maintenance of plasmids that carry thyA as a selectable marker gene.
- exemplary alternative selectable markers include antibiotic resistance genes such as BLA (beta-lactamase), or proBA genes (to complement a proAB host strain proline auxotrophy) or purA (to complement a purA host strain adenine auxotrophy).
- purified oligosaccharide e.g., 3′-SL, 6′-SLNT, 3′-S3FL, SLNT a, SLNT b, DSLNT, SLNFP II, or SLNT c is one that is at least 85%, 90%, 95%, 98%, 99%, or 100% (w/w) of the desired oligosaccharide by weight. Purity may be assessed by any known method, e.g., thin layer chromatography or other chromatographic techniques known in the art.
- a method of purifying a sialylated oligosaccharide produced by a genetically engineered bacterium described herein comprises separating the desired sialylated oligosaccharide from contaminants in a bacterial cell lysate or bacterial cell culture supernatant of the bacterium.
- a sialylated oligosaccharide may be added to a food or beverage composition to increase the level of the sialylated oligosaccharide in the composition.
- the sialylated oligosaccharide is added to dried or powder milk or milk product, e.g., infant formula. In some embodiments, it is added to a liquid milk. In other embodiments, it is added to a non-milk dairy product, e.g. yogurt or kefir.
- a composition provided herein is not milk. In certain embodiments, a composition provided herein does not comprise milk.
- sialylated oligosaccharides are purified and used in a number of products for consumption by humans as well as animals, such as companion animals (dogs, cats) as well as livestock (bovine, equine, ovine, caprine, or porcine animals, as well as poultry).
- a food, beverage, dietary supplement, or pharmaceutical composition may comprise a purified 3′-SL, 6′-SL, 3′-S3FL, SLNT a, SLNT b, DSLNT, SLNFP II, or SLNT c.
- the composition comprises an excipient that is suitable for oral administration.
- a method of producing a pharmaceutical composition comprising a purified human milk oligosaccharide (HMOS) (such as a sialylated oligosaccharide present in human milk) may be carried out by culturing a bacterium described herein, purifying the HMOS produced by the bacterium, and combining the HMOS with an excipient or carrier to yield a dietary supplement for oral administration.
- these compositions are useful in methods of preventing or treating enteric and/or respiratory diseases in infants and adults. Accordingly, the compositions are administered to a subject suffering from or at risk of developing such a disease.
- HMOS binds to a pathogen and wherein the subject is infected with or at risk of infection with the pathogen.
- the infection is caused by a Norwalk-like virus or Campylobacter jejuni.
- the subject is a mammal.
- the mammal is, e.g., any mammal, e.g., a human, a primate, a mouse, a rat, a dog, a cat, a cow, a horse, or a pig. In some embodiments, the mammal is a human.
- the compositions are formulated into animal feed (e.g., pellets, kibble, mash) or animal food supplements for companion animals, e.g., dogs or cats, as well as livestock or animals grown for food consumption, e.g., cattle, sheep, pigs, chickens, and goats.
- the purified HMOS is formulated into a powder (e.g., infant formula powder or adult nutritional supplement powder, each of which is mixed with a liquid such as water or juice prior to consumption) or in the form of tablets, capsules or pastes or is incorporated as a component in dairy products such as milk, cream, cheese, yogurt or kefir, or as a component in any beverage, or combined in a preparation containing live microbial cultures intended to serve as probiotics, or in prebiotic preparations to enhance the growth of beneficial microorganisms either in vitro or in vivo.
- nucleic acid construct or an expression vector comprising a nucleic acid encoding at least one lactose-utilizing sialyltransferase enzyme or a variant or fragment thereof, as described herein.
- the vector can further include one or more regulatory elements, e.g., a heterologous promoter.
- heterologous is meant that the control sequence and protein-encoding sequence originate from different sources. For example, the sources may be different bacterial strains or species.
- the regulatory elements can be operably linked to a gene encoding a protein, a gene construct encoding a fusion protein gene, or a series of genes linked in an operon in order to express the fusion protein.
- an isolated recombinant cell e.g., a bacterial cell containing an aforementioned nucleic acid molecule or vector.
- the nucleic acid is optionally integrated into the genome of the host bacterium.
- the nucleic acid construct also further comprises one or more enzymes that are not lactose-utilizing sialyltransferase enzymes.
- an “expression vector” is a DNA or RNA vector that is capable of effecting expression of one or more polynucleotides.
- the expression vector is also capable of replicating within the host cell.
- Expression vectors can be either prokaryotic or eukaryotic, and typically include plasmids.
- Expression vectors of the present invention include any vectors that function (i.e., direct gene expression) in host cells of the present invention, including in one of the prokaryotic or eukaryotic cells described herein, e.g., gram-positive, gram-negative, pathogenic, non-pathogenic, commensal, cocci, bacillus, or spiral-shaped bacterial cells; archaeal cells; or protozoan, algal, fungi, yeast, plant, animal, vertebrate, invertebrate, arthropod, mammalian, rodent, primate, or human cells.
- Expression vectors of the present invention contain regulatory sequences such as transcription control sequences, translation control sequences, origins of replication, and other regulatory sequences that are compatible with the host cell and that control the expression of a polynucleotide.
- expression vectors of the present invention include transcription control sequences.
- Transcription control sequences are sequences which control the initiation, elongation, and termination of transcription.
- Particularly important transcription control sequences are those which control transcription initiation such as promoter, enhancer, operator and repressor sequences.
- Suitable transcription control sequences include any transcription control sequence that can function in at least one of the cells of the present invention. A variety of such transcription control sequences are known to those skilled in the art.
- a “heterologous promoter” is a promoter which is different from the promoter to which a gene or nucleic acid sequence is operably linked in nature.
- overexpress or “overexpression” refers to a situation in which more factor is expressed by a genetically-altered cell than would be, under the same conditions, by a wild-type cell. Similarly, if an unaltered cell does not express a factor that it is genetically altered to produce, the term “express” (as distinguished from “overexpress”) is used indicating the wild type cell did not express the factor at all prior to genetic manipulation.
- a polypeptide or class of polypeptides may be defined by the extent of identity (% identity) of its amino acid sequence to a reference amino acid sequence, or by having a greater % identity to one reference amino acid sequence than to another.
- a variant of any of genes or gene products disclosed herein may have, e.g., 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the nucleic acid or amino acid sequences described herein.
- % identity in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. For example, % identity is relative to the entire length of the coding regions of the sequences being compared, or the length of a particular fragment or functional domain thereof.
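The percent identity definition above can be sketched in Python for two sequences that have already been aligned. This is a minimal illustration, not the comparison algorithm used in the specification: the function name, the use of "-" as the gap character, and the choice to exclude gapped columns from the denominator are all assumptions, and different tools use different denominators.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the ungapped aligned columns.

    Assumption: the inputs are the two rows of a pairwise alignment, of equal
    length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gapped columns are skipped (illustrative choice)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Restricting the inputs to the aligned rows of a single functional domain would give the identity "relative to a particular fragment or functional domain" mentioned above.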
- Variants as disclosed herein also include homologs, orthologs, or paralogs of the genes or gene products described herein.
- variants may demonstrate a percentage of homology or identity, for example, at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity in conserved domains important for biological function, e.g., in a functional domain, e.g., a catalytic domain.
- For sequence comparison, one sequence acts as a reference sequence, to which test sequences are compared.
- test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated.
- sequence comparison algorithm calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters. Percent identity is determined using
- phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features.
- the term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features.
- the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.”
- a similar interpretation is also intended for lists including three or more items.
- the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.”
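The interpretation given for lists of two or three items amounts to enumerating every non-empty combination of the listed elements. A short Python sketch (the function name is hypothetical) makes the enumeration explicit:

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def nonempty_combinations(items: Sequence[str]) -> List[Tuple[str, ...]]:
    """All non-empty combinations of the listed items, smallest groups first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]
```

For the list A, B, C this yields seven combinations, in the same order as the recited phrase: each item alone, each pair, and all three together.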
- use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible
- 0.2-5 mg is a disclosure of 0.2 mg, 0.3 mg, 0.4 mg, 0.5 mg, 0.6 mg etc. up to and including 5.0 mg.
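The range convention can be illustrated with a small Python sketch that lists the values a range is taken to disclose. The step of 0.1 mg is inferred from the listed values (0.2 mg, 0.3 mg, 0.4 mg) and is an assumption, as is the function name; integer tenths are used to avoid floating-point drift.

```python
from typing import List


def disclosed_values(low_tenths: int, high_tenths: int) -> List[float]:
    """Values from low to high inclusive, in steps of 0.1.

    Bounds are given in tenths (0.2 mg -> 2, 5.0 mg -> 50) so that the
    stepping is exact integer arithmetic.
    """
    if low_tenths > high_tenths:
        raise ValueError("lower bound exceeds upper bound")
    return [tenths / 10 for tenths in range(low_tenths, high_tenths + 1)]
```

Calling `disclosed_values(2, 50)` lists 49 values running from 0.2 up to and including 5.0.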
- an “isolated” or “purified” nucleic acid molecule, polynucleotide, polypeptide, or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.
- Purified compounds are at least 60% by weight (dry weight) the compound of interest.
- the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight the compound of interest.
- a purified compound is one that is at least 90%, 91%, 92%, 93%, 94%, 95%, 98%, 99%, or 100% (w/w) of the desired compound by weight.
- Purity is measured by any appropriate standard method, for example, by column chromatography, thin layer chromatography, or high-performance liquid chromatography (HPLC) analysis.
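The weight-based purity figures above can be expressed as a simple calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the mass of the compound of interest and the total dry mass of the preparation have already been determined (for example from a chromatographic analysis).

```python
def purity_percent(compound_mass: float, total_mass: float) -> float:
    """Purity as percent (w/w) of the compound of interest in the preparation."""
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    if not 0 <= compound_mass <= total_mass:
        raise ValueError("compound mass must lie between 0 and total mass")
    return 100.0 * compound_mass / total_mass


def meets_purity(compound_mass: float, total_mass: float, threshold: float) -> bool:
    """True if the preparation is at least `threshold` percent pure (w/w)."""
    return purity_percent(compound_mass, total_mass) >= threshold
```

For instance, 9.0 g of compound in a 10.0 g preparation is 90% pure, which satisfies a 60% threshold but not a 99% threshold.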
- a purified or isolated polynucleotide ribonucleic acid (RNA) or deoxyribonucleic acid (DNA)
- Purified also defines a degree of sterility that is safe for administration to a human subject, e.g., lacking infectious or toxic agents.
- nucleotide or polypeptide means one that has been separated from the components that naturally accompany it.
- nucleotides and polypeptides are substantially pure when they are at least 60%, 70%, 80%, 90%, 95%, or even 99%, by weight, free from the proteins and naturally-occurring organic molecules with which they are naturally associated.
- the term “substantially pure” or “substantially free” with respect to a particular composition means that the composition comprising the sialylated oligosaccharide contains less than 50%, less than 40%, less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, or less than 1% by weight of other substances.
- “substantially pure” or “substantially free of” refers to a substance free of other substances, including impurities. Impurities may, for example, include by-products, contaminants, degradation products, water, and solvents.
- transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps.
- the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim.
- the transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.
- Subject refers to any organism to which a sialylated oligosaccharide may be administered.
- the subject may be a human or a non-human animal.
- the subject may be a mammal.
- the mammal may be a primate or a non-primate.
- the mammal can be a primate such as a human; a non-primate such as, for example, dog, cat, horse, cow, pig, mouse, rat, camel, llama, goat, rabbit, sheep, hamster, and guinea pig; or non-human primate such as, for example, monkey, chimpanzee, gorilla, orangutan, and gibbon.
- the subject may be of any age or stage of development, such as, for example, an adult, an adolescent, or an infant.
- the subject is a human individual less than 2 years of age, an elderly subject (e.g., 65 or more years of age), an immunocompromised subject (e.g., suffering from an autoimmune disorder, undergoing immunosuppressive therapy associated with transplantation, or a subject diagnosed with cancer and undergoing chemotherapy), a malnourished individual, an individual recovering from a dysbiosis (for example of the gut microbiota following treatment with antibiotics), or any individual that would benefit from establishment or re-establishment of a healthy gut microbiota.
- treating and “treatment” as used herein refer to the administration of an agent or formulation to a clinically symptomatic individual afflicted with an adverse condition, disorder, or disease, so as to effect a reduction in severity and/or frequency of symptoms, eliminate the symptoms and/or their underlying cause, and/or facilitate improvement or remediation of damage.
- preventing and “prevention” refer to the administration of an agent or composition to a clinically asymptomatic individual who is susceptible to a particular adverse condition, disorder, or disease, and thus relates to the prevention of the occurrence of symptoms and/or their underlying cause.
- an effective amount and “therapeutically effective amount” of a formulation or formulation component is meant a nontoxic but sufficient amount of the formulation or component to provide the desired effect.
- As used herein, the singular forms “a,” “an,” and “the” include the plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a disease,” “an oligonucleotide,” or “a nucleic acid” is a reference to one or more such embodiments, and includes equivalents thereof known to those skilled in the art and so forth.
- “pharmaceutically acceptable” carrier or excipient refers to a carrier or excipient that is suitable for use with humans and/or animals without undue adverse side effects (such as toxicity, irritation, and allergic response) commensurate with a reasonable benefit/risk ratio. It can be, e.g., a pharmaceutically acceptable solvent, suspending agent or vehicle, for delivering the instant compounds to the subject.
- polypeptide and “protein” are used interchangeably.
- Sialyltransferases identified from both prokaryotic and eukaryotic organisms are categorized into 5 distinct sequence families (GT29, GT38, GT42, GT52 and GT80) and possess at least two structural folds (GT-A and GT-B), (Audry, M., et al (2011). Glycobiology 21, 716-726).
- Eukaryotic sialyltransferases (the GT29 family and GT-A fold) are transmembrane molecules found in the secretory pathway, and as such they present a heterologous expression problem for their use within the cytoplasm of engineered microbes as described herein.
- the PSI-BLAST program, using a given query protein sequence, generates a list of closely related protein sequences based on a homology search of a database. These protein homolog hits are then used by the program to generate a profile reflecting their sequence similarities to the original query. The profile is then used by the algorithm to identify an expanded group of homolog proteins, and the process is iterated several times until the number of additional new candidates obtained after each iteration decreases (Altschul, S. F., et al. (1990) J. Mol. Biol 215, 403-410; Altschul, S. F., et al. (1997) Nucleic Acids Res 25, 3389-3402).
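The iterate-until-fewer-new-hits logic described above can be sketched in Python. This is not the PSI-BLAST implementation: the search step is abstracted as a caller-supplied function, and `TOY_NEIGHBORS`, `toy_search`, and `iterative_homolog_search` are hypothetical names used only to make the control flow concrete.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical toy "database": each sequence name maps to the names it detects.
TOY_NEIGHBORS: Dict[str, List[str]] = {
    "Q": ["A", "B"], "A": ["C"], "B": ["C", "D"],
    "C": ["E"], "D": [], "E": [],
}


def toy_search(profile: Iterable[str]) -> Set[str]:
    """Stand-in for a profile search: union of hits over all profile members."""
    return {hit for member in profile for hit in TOY_NEIGHBORS.get(member, [])}


def iterative_homolog_search(
    query: str,
    search_fn: Callable[[Iterable[str]], Set[str]],
    max_iterations: int = 6,
) -> Tuple[Set[str], List[int]]:
    """Repeat the search with a growing profile.

    Stops when an iteration adds no new hits, when it adds fewer new hits
    than the previous iteration, or after max_iterations rounds.
    """
    hits: Set[str] = set()
    new_counts: List[int] = []
    for _ in range(max_iterations):
        profile = [query] + sorted(hits)          # profile = query plus hits so far
        found = set(search_fn(profile)) - {query}
        new_hits = found - hits
        hits |= new_hits
        new_counts.append(len(new_hits))
        declined = len(new_counts) > 1 and new_counts[-1] < new_counts[-2]
        if not new_hits or declined:
            break
    return hits, new_counts
```

With the toy data, `iterative_homolog_search("Q", toy_search)` stops after the third round, once the number of newly found candidates drops.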
- Pst6-224 amino acid sequence was used as a query for 6 iterations of the PSI-BLAST search algorithm. This approach yielded a group of 433 unique candidates with varying degrees of similarity to Pst6-224, many of which (117) were highly related to Pst6-224 (shared amino acid identity in the range of 50-90%) as well as a group that was more distantly related (shared amino acid identity less than 50%).
- Pst6-224 produced sub-optimal yields of 6′-SL, with a tendency to produce undesirable side products when used in a metabolically engineered E. coli production strain (Drouillard et al., 2010). In addition, elevated production of Pst6-224 appeared to be moderately toxic in certain E.
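Splitting search hits into a closely related group and a more distantly related group, as described above, is a simple partition on percent identity to the query. In this Python sketch the 50% cutoff follows the text, while the function name and the candidate names used with it are hypothetical.

```python
from typing import Dict, Tuple


def partition_by_identity(
    identities: Dict[str, float], cutoff: float = 50.0
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split candidates into (highly related, distantly related) by % identity."""
    related = {name: pct for name, pct in identities.items() if pct >= cutoff}
    distant = {name: pct for name, pct in identities.items() if pct < cutoff}
    return related, distant
```

For example, hypothetical candidates at 72.5% and 50.0% identity fall into the first group and a candidate at 31.0% into the second.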
- This group of candidates shared certain similarities primarily within the catalytic domain region of the respective proteins as inferred from the observation that they all belong to the same Pfam protein family, but not necessarily similarities in their protein domain organization. It must be noted that the presence of a “sialyltransferase” Pfam domain ensures nothing obvious about the actual catalytic ability of the protein in term of specific activity, catalytic rate, substrate specificity and/or product specificity, and that substantial experimentation is required to verify candidate genes for their desired properties.
- This group of candidates may include similar, better or distinct α(2,6) ST activities relative to Pst6-224, while being different enough at the amino acid level to avoid the cryptic toxicity and other functional shortcomings (e.g., poorer specificity) observed with Pst6-224 expressed in production strains.
- candidate STs were further screened to identify those candidate STs arising from bacterial species that may or are known to incorporate sialic acid into their cell surface glycan structures.
- Candidate STs from these types of organisms are more likely to utilize CMP-N-acetylneuraminic acid (CMP-Neu5Ac) as a sugar nucleotide donor substrate, given the presence of sialic acid in their surface carbohydrate structures.
- Candidate STs from commensals or pathogens were also identified. Such organisms sometimes display carbohydrate structures on their cell-surface that contain sialic acid.
- candidate STs from these types of organisms are believed to be more likely to utilize CMP-Neu5Ac as a donor substrate and also to catalyze the linkage of sialic acid to useful acceptor oligosaccharides.
- a second sequence database screen was conducted using a second lactose-utilizing α(2,6) ST as the search probe (HAC1268 from Helicobacter acinonychis; Schur, M. J., et al. (2012). Glycobiology 22, 997-1006; SEQ ID NO: 8).
- HAC1268 is a member of the GT42 sialyltransferase family, possessing a predicted structural fold (the GT-A fold) distinct from that of the Pst6-224 ST sequence (which was used as the probe in the first database screen, described above).
- FIG. 2 presents a pairwise % amino acid sequence identity comparison between the two α(2,6) ST probe sequences and the 8 identified ST candidates.
- Synthetic bst genes for these candidates were designed and codon-optimized in silico for E. coli expression using standard bioinformatic algorithms known to the art, and engineered with modified ribosomal binding sites to tune translation to appropriate levels in E. coli.
- the expression vector utilized to express the candidate bst genes, and to test for their ability to make sialyllactose, is a p15A origin-based plasmid carrying the strong bacteriophage λ pL promoter to drive expression of heterologous genes.
- the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains using ampicillin selection (for convenience in the laboratory), and additionally it carries a native E. coli thyA (thymidylate synthase) gene as an alternative means of selection in thyA minus hosts.
- the plasmid also carries, downstream of the pL promoter and in an operon configuration downstream of the candidate bst gene, three heterologous biosynthetic genes from Campylobacter jejuni (neuB, neuC, and neuA; encoding N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase respectively). These enzymes confer on E. coli the ability to convert UDP-GlcNAc into CMP-Neu5Ac.
- FIG. 3 is a map of this expression vector carrying one of the candidate ST genes, bstN (plasmid pG543, SEQ ID NO: 11).
- the candidate sialyltransferase gene expression plasmids were transformed into a host strain useful for the production of sialyllactose (SL). Biosynthesis of SL requires the generation of an enhanced cellular pool of both lactose and CMP-Neu5Ac ( FIG. 4 outlines the scheme for SL biosynthesis in engineered E. coli ).
- the wild-type Escherichia coli K12 prototrophic strain W3110 was selected as the starting point for engineering a host background to test the ability of the candidates to catalyze sialyllactose production (Bachmann, B. J. (1972). Bacteriol Rev 36, 525-557).
- the particular W3110 derivative employed was one that previously had been modified by the introduction (at the ampC locus) of a tryptophan-inducible P trpB cI+ repressor cassette, generating an E. coli strain known as GI724 (LaVallie et al., 200).
- Other features of GI724 include lacIq and lacPL8 promoter mutations.
- E. coli strain GI724 affords economical production of recombinant proteins from the phage λ PL promoter following induction with low levels of exogenous tryptophan (LaVallie, E. R., et al. (1993). Biotechnology (NY) 11, 187-193; Mieschendahl, Petri, and Hänggi (1986). Bio/Technology 4, 802-08). Additional genetic alterations were made to this strain to promote the biosynthesis of SL. This was achieved in strain GI724 through several manipulations of the chromosome using λ Red recombineering (Court, D. L., et al. (2002). Annu Rev Genet 36, 361-388) and generalized P1 phage transduction (Li, X. T., et al. (2013). Nucleic Acids Res 41, e204).
- the ability of the E. coli host strain to accumulate intracellular lactose was engineered by deletion of the endogenous β-galactosidase gene (lacZ).
- the strain thus modified maintains its ability to transport lactose from the culture medium (via LacY, the lactose permease), but is deleted for the wild-type copy of the lacZ gene responsible for lactose catabolism.
- An intracellular lactose pool is therefore created when the modified strain is cultured in the presence of exogenous lactose.
- the lacA gene was deleted in order to eliminate production of acetyl-lactose from the enhanced pool of intracellular lactose.
- the lacZ and lacI genes were simultaneously deleted such that the enhanced constitutive lacIq promoter was placed immediately upstream of the lactose permease gene lacY.
- a pool of the sugar nucleotide donor CMP-Neu5Ac was generated in the cytosol of the cell by co-expression of three genes from Campylobacter jejuni ATCC43484 (detailed above) encoding i) N-acetylneuraminate synthase (NeuB), ii) UDP-N-acetylglucosamine 2-epimerase (NeuC), and iii) N-acetylneuraminate cytidylyltransferase (NeuA).
- the neuBCA gene products function together in the enzymatic conversion of endogenous UDP-GlcNAc to CMP-Neu5Ac.
- the neuBCA genes are co-expressed in an operon, downstream from the bst gene on the plasmid expression vector and driven from the pL promoter.
- endogenous host cell genes encoding enzymes involved in sialic acid degradation were specifically deleted using λ red recombineering.
- the sialic acid catabolic pathway in E. coli is encoded by the nan operon, consisting of the nanRATEK genes (Hopkins, A. P., et al. (2013). FEMS Microbiol Lett 347, 14-22). Specifically, the nanATE genes were deleted to stabilize CMP-Neu5Ac pools within the cell.
- a thyA (thymidylate synthase) mutation was introduced into the strain by almost entirely deleting the thyA gene and replacing it with an inserted functional, wild-type but promoter-less E. coli lacZ+ gene carrying a weak ribosome binding site (ΔthyA::0.8RBS lacZ+).
- This chromosomal modification was constructed utilizing λ red recombineering.
- Without an exogenous source of thymidine or thymine, thyA strains are unable to make DNA and die.
- This defect can be complemented in trans by supplying a wild-type thyA gene on a multi-copy plasmid (Belfort, M., et al. (1983). Proc Natl Acad Sci USA 80, 1858-861). This complementation scheme was used as a means of plasmid maintenance.
- the inserted 0.8RBS lacZ+ cassette not only knocks out thyA, but also converts the lacZ⁻ host back to both a lacZ+ genotype and phenotype.
- the modified strain produced a minimal (albeit still readily detectable) level of β-galactosidase activity (0.3 units), which has very little impact on sialyllactose production during bioreactor production runs, but which is useful in removing residual lactose at the end of runs, and as an easily scorable phenotypic marker for moving the thyA region into other lacZ⁻ E. coli strains by P1 phage transduction.
- Transformants of this strain harboring the different ST (bst) candidate expression plasmids were evaluated for their ability to synthesize sialyllactose in 20×150 mm test tubes, containing 6 mL of IMC medium (“Induction Medium Casamino acids”) (LaVallie, E. R., DiBlasio, E. A., Kovacic, S., Grant, K. L., Schendel, P. F., and McCoy, J. M. (1993). Biotechnology (NY) 11, 187-193).
- the glucose and/or casamino acids concentrations are varied in the 0.05-1% range.
- Tubes were inoculated to 0.1 OD600/mL with strains comprising E1406 transformed with individual candidate bst+neuBCA expression plasmids, and were then incubated at 30° C. for 120 minutes with continuous aeration on a roller drum. Tryptophan was then added to the cultures to a concentration of 200 μg/mL to induce bst gene and neuBCA operon expression, along with the addition of lactose as the acceptor sugar to a concentration of 1% w/v. The culture was left at 30° C. with roller drum aeration for a further 22 h.
- OD600 of cells from each culture were pelleted by centrifugation (14,000×g, 1 min), re-suspended in 200 μl of water and heated to 98° C. for 10 min to release cytoplasmic sugars. After clearing the suspension by centrifugation, 2 μl aliquots were applied to 10×20 cm aluminum-backed silica thin layer chromatography plates (Macherey-Nagel #818163). Chromatograms were developed in n-butanol/acetic acid/water (2:1:1), and visualized by heating after spraying with 3% w/v α-naphthol in 12% H2SO4/80% ethanol/8% water. FIG. 5 shows the result.
- Prominent spots corresponding to the intracellular lactose pool were seen in the control strain (E1406, that does not contain a bst+neuBCA expression plasmid) and also in all bst candidate cultures.
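The induction additions above can be checked with simple arithmetic. The sketch below is illustrative only and assumes the 6 mL tube culture volume stated earlier, treats 1% w/v as 10 mg/mL, and ignores the small volume change caused by the additions themselves; the function and variable names are hypothetical.

```python
def mass_for_final_concentration(conc_mg_per_ml: float, volume_ml: float) -> float:
    """Mass (mg) of solute needed to reach a final concentration in a given volume."""
    if conc_mg_per_ml < 0 or volume_ml <= 0:
        raise ValueError("concentration must be >= 0 and volume must be > 0")
    return conc_mg_per_ml * volume_ml


CULTURE_ML = 6.0           # test-tube culture volume (assumed from the text)
TRP_MG_PER_ML = 0.2        # 200 ug/mL tryptophan
LACTOSE_MG_PER_ML = 10.0   # 1% w/v = 1 g per 100 mL = 10 mg/mL

trp_mg = mass_for_final_concentration(TRP_MG_PER_ML, CULTURE_ML)
lactose_mg = mass_for_final_concentration(LACTOSE_MG_PER_ML, CULTURE_ML)

print(f"tryptophan per tube: {trp_mg:.1f} mg")
print(f"lactose per tube:    {lactose_mg:.0f} mg")
```

In practice these amounts would normally be delivered from concentrated stock solutions, whose concentrations are not given in the text.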
- the E1406 control showed no spot corresponding to sialyllactose, whereas all other cultures displayed a spot co-migrating with a sialyllactose standard that comprised a mixture of 6′-SL and 3′-SL (these species do not resolve from each other in this TLC system).
- Cultures expressing candidate genes bstD and bstJ were exceptions: neither of these produced any detectable sialyllactose, and thus these genes most probably represent “false positive hits” in the database screen.
- Also visible in FIG. 5 is a spot running above sialyllactose in several of the candidates.
- This spot corresponds to KDO-lactose, and results from a linkage of the E. coli lipopolysaccharide precursor, 2-keto-3-deoxyoctulosonic acid (KDO) with lactose, as a result of relaxed substrate specificity exhibited by individual bst enzymes that utilize the endogenous E. coli pool of CMP-KDO as an alternative to the engineered pool of CMP-Neu5Ac as described herein.
- Carbohydr Res 3-15, 1394-99 generated the unwanted KDO-lactose product.
- several of the bst candidates produced little if any KDO-lactose under the same culture conditions (e.g. BstE, BstM, BstN), highlighting the utility of these enzymes for the production of purer preparations of sialyl-oligosaccharides.
- FIGS. 6A, 6B, and 6C show UV traces from HPLC runs for the various heat extracts. In this system 3′-SL eluted at ~8.8 minutes, whereas 6′-SL eluted at ~10.1 minutes. Data is presented in Table 3.
- Composition of Ferm 4a Media has the following (Per Liter)
- this seed culture was then inoculated into a 2 L bioreactor containing 900 mL of the same medium (but containing an additional 0.75 g/L MgSO4·7H2O, 1 mL of DF204 antifoam, and 10 mL of trace metals solution).
- the optical density of cells in the fermenter vessel after inoculation was 0.006 at 600 nm (OD600).
- Strains were grown in the fermenter in batch mode at 30° C. with pH control to pH 6.8 (adjusted automatically with additions of 7.4 M NH4OH) for approximately 16 h, at which point glucose exhaustion occurred as indicated by an increase in dissolved oxygen levels and a decrease in agitation speed.
- a fed-batch continuous glucose feeding regimen was then initiated (9.1 g of a 50% w/v glucose feed solution/h) such that the culture was maintained under carbon-limitation. After 2 h a bolus of 45.5 g of an 11.4% w/v lactose solution was added, and a continuous lactose feed of 2.2 g/h of the same solution was initiated.
- FIG. 7 shows a typical thin layer chromatogram of fractions from the Dowex 1×4 column. Typically fraction 3 was the purest fraction and, after desalting, was suitable for NMR analysis.
- double mutations of P7H and M117A in the PdST sequence had the effect of converting PdST from an α(2,3)-selective ST to an α(2,6)-selective ST in vitro (Schmölzer, et al. (2013). Glycobiology 23, 1293-1304).
- PdST* SEQ ID NO: 14
- Δ20BstC* SEQ ID NO: 15
- BstE* SEQ ID NO: 16
- amino acid substitutions Y7H and G122A were introduced into the Δ20BstC sequence to generate Δ20BstC*, while Y13H and E128A were introduced into the BstE sequence to generate BstE*.
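The substitution notation used here (wild-type residue, 1-based position, new residue, e.g. Y7H) can be applied to a sequence programmatically. The following is a minimal sketch under stated assumptions: the helper name `apply_substitutions` is hypothetical, and the 10-residue sequence is a made-up toy string, not any sequence from the sequence listing. The check against the expected wild-type residue guards against off-by-one numbering errors.

```python
import re


def apply_substitutions(seq: str, substitutions: list[str]) -> str:
    """Apply point substitutions written as <wild-type><1-based position><new>, e.g. 'Y7H'."""
    residues = list(seq)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy sequence (hypothetical): Y at position 7, G at position 10.
toy_parent = "MKTAAAYLLG"
toy_variant = apply_substitutions(toy_parent, ["Y7H", "G10A"])
print(toy_variant)
```

Because substitutions are applied in order, a second-round change at an already mutated position must name the current residue as its wild-type letter.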
- Δ20bstC* (pG544, SEQ ID NO: 17) and bstE* expression plasmids were transformed into the engineered E. coli production host.
- Strains were grown in IMC media to early exponential phase at 30° C. before tryptophan (200 mg/mL) and lactose (1%) were simultaneously added to initiate SL biosynthesis.
- equivalent OD600 units of each strain were harvested, and cell lysates were prepared by heating for 10 minutes at 98° C. and centrifugation to release intracellular SL.
- Lysates containing synthesized SL were then treated with sialidase S (specific for α(2,3) linked Neu5Ac) or sialidase C (acts on both α(2,3) and α(2,6) linked Neu5Ac) to analyze whether engineered Δ20BstC* or BstE* were capable of catalyzing synthesis of 6′-SL rather than 3′-SL.
- FIG. 12 shows the 1D-proton NMR spectrum of SL produced by Δ20BstC*. Characteristic features of the spectrum were 4 distinct anomeric peaks and the up-field signals of axial and equatorial H-3 of sialic acid. The latter consisted of two pairs of distinct signals in a ratio of about 5:1. Extensive 2-D NMR analysis (FIG. 13) showed that the larger signals belong to 6′-sialyllactose, whereas the smaller ones belong to contaminating 3′-sialyllactose.
- the engineered Δ20BstC* mutant protein generates much less KDO-lactose when used to produce sialyllactose in E. coli than does its wild-type parent, Δ20BstC (see FIG. 5).
- the active site mutations Y7H and G122A introduced into Δ20BstC to generate Δ20BstC* result not only in a switch of regiospecificity from α(2,3) to α(2,6), but also reduce the ability of the enzyme to utilize CMP-KDO as a substrate, thus leading to a purer sialyllactose product profile.
- Structurally equivalent amino acid substitutions at position 122 of the amino acid sequence of Δ20BstC* would improve the enzyme's α(2,6)-regioselectivity.
- the amino acid substitutions A122V, A122L, A122M and A122F were introduced into Δ20BstC* to generate Δ20BstC*2 (SEQ ID NO: 27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29) and Δ20BstC*5 (SEQ ID NO: 30), respectively.
- Δ20BstC*2, Δ20BstC*3, Δ20BstC*4 and Δ20BstC*5 expression plasmids were transformed into the engineered E. coli production host. Strains were grown in Ferm 4a media to early exponential phase at 30° C. before tryptophan (200 mg/mL) and lactose (1%) were simultaneously added to initiate SL biosynthesis. At the end of the synthesis period (24 h), equivalent OD600 units of each strain were harvested, and cell lysates were prepared by heating for 10 minutes at 98° C. and centrifugation to release intracellular SL.
- the various mutant Δ20BstC* strains were harvested and extracted using 5 mM potassium phosphate (pH 4.0) in 70% acetonitrile and analyzed utilizing an HPLC system capable of resolving 6′-SL from 3′-SL.
- the extracted samples (described above) were applied to a TSKgel Amide-80 column (5 μm particle size, 4.6×250 mm) and eluted under isocratic conditions of 5 mM potassium phosphate (pH 4.0) in 70% acetonitrile, 1 mL/min, at room temperature with UV detection at 210 nm.
- FIG. 16 shows exemplary HPLC traces for the various extracts.
- 3′-SL eluted at about 15.5 minutes, and 6′-SL eluted at about 18.3 minutes.
- Data is presented in Table 5.
- the analysis revealed that the mutations A122F, A122M, A122L, and A122V resulted in about 2%, 4%, 6% and 8% increase, respectively, in α(2,6)-regioselectivity compared to Δ20BstC*.
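Regioselectivity figures of this kind are typically derived from the integrated areas of the two resolved HPLC peaks. The sketch below shows one way to express that calculation; it is an illustration rather than the analysis actually performed, and it assumes that 3′-SL and 6′-SL have equal detector response at 210 nm. The function name and the peak areas are hypothetical and do not come from Table 5.

```python
def regioselectivity(area_3sl: float, area_6sl: float) -> tuple[float, float]:
    """Return (% 6'-SL, % 3'-SL) from integrated peak areas of the two isomers."""
    if area_3sl < 0 or area_6sl < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_3sl + area_6sl
    if total == 0:
        raise ValueError("no sialyllactose signal to quantify")
    return 100.0 * area_6sl / total, 100.0 * area_3sl / total


# Hypothetical peak areas (arbitrary units), chosen only to illustrate the arithmetic.
pct_6sl, pct_3sl = regioselectivity(area_3sl=15.0, area_6sl=85.0)
print(f"6'-SL:3'-SL = {pct_6sl:.0f}:{pct_3sl:.0f}")
```

If the two isomers respond differently at the detection wavelength, each area would first need to be converted to an amount using a calibration curve for that isomer.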
- Table 5 shows HPLC analysis of regioselectivity of Δ20BstC* mutants.
- wild-type Δ20BstC is a lactose-utilizing α(2,3) sialyltransferase that produced 3′-SL in the engineered E. coli strain described herein.
- This enzyme was engineered by introducing two specific active site mutations each, to generate new enzyme variants with altered regiospecificity: Δ20BstC*, Δ20BstC*2, Δ20BstC*3, Δ20BstC*4 and Δ20BstC*5, that synthesize an 85:15, 94:6, 92:8, 90:10, and 89:9 mixture of 6′-SL:3′-SL, respectively.
Abstract
Provided herein, inter alia, are methods, bacteria, nucleic acids, and polypeptides for producing sialylated oligosaccharides.
Description
- This application claims priority to U.S. Provisional Application No. 62/599,481, filed Dec. 15, 2017, which is incorporated herein in its entirety for all purposes.
- The content of the text file named “037847-522001US_SequenceListing_ST25.txt”, which was created on Dec. 11, 2018, and is 124,706 bytes in size, is hereby incorporated by reference in its entirety.
- Lactose is the major nutritional carbohydrate of all mammalian milks; however, human milk also contains a diverse and abundant set of more complex neutral and acidic sugars, collectively known as the human milk oligosaccharides (hMOS) (Kunz, C., et al. (2000). Annu Rev Nutr 20, 699-722; Bode, L., and Jantscher-Krenn, E. (2012). Adv Nutr 3, 383S-391S). Hundreds of different hMOS species have been identified, and their rich structural diversity and overall abundance are unique to humans. These molecules are not absorbed well by the human gut and are not utilized by infants for direct nutrition, but they have been shown to serve critical roles in the establishment of a healthy gut microbiome, in gut development, in disease prevention, and in immune function (Newburg, D. S., and Walker, W. A. (2007). Pediatr Res 61, 2-8).
- New methods are needed for producing purified human milk oligosaccharides.
- Provided herein are, inter alia, methods, enzymes, compositions, and genetically modified bacteria for producing sialylated oligosaccharides. The enzymes provided herein are able to sialylate lactose, generating either α(2,3) glycosidic linkages, α(2,6) linkages, or mixtures of α(2,3) and α(2,6) linkages to lactose, and as such are especially advantageous in producing oligosaccharide molecules identical to the lactose-based molecules of human milk. In an aspect, a method for producing a sialylated oligosaccharide in a bacterium is provided. In some embodiments, the bacterium includes an exogenous lactose-utilizing sialyltransferase enzyme, e.g., an α(2,3) sialyltransferase or an α(2,6) sialyltransferase. In various embodiments, the enzyme has an amino acid sequence that is from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids. In certain embodiments, the enzyme has an amino acid sequence that is from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
- In an aspect, included herein is an isolated bacterium comprising an exogenous lactose-utilizing sialyltransferase enzyme. In some embodiments, the enzyme has an amino acid sequence that is from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids. In certain embodiments, the enzyme has an amino acid sequence that is from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
- In various embodiments, the enzyme has an amino acid sequence that is from 5% to 100% identical to the amino acid sequence of one or more of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC* (SEQ ID NO: 15), Δ20BstC (SEQ ID NO: 18), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- In some embodiments, the amino acid sequence of the enzyme is less than 100% identical to the amino acid sequence of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- In certain embodiments, the enzyme has no deletions or insertions compared to BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10).
- In various embodiments, the difference between the amino acid sequence of the enzyme and the amino acid sequence of BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), BstM (SEQ ID NO: 9), or BstN (SEQ ID NO: 10) consists of one or more conservative amino acid substitutions.
- In some embodiments, the enzyme has an amino acid sequence that is from 5% to 100%, 10% to 90%, 20% to 80%, 30% to 70%, 40% to 60%, 5% to 75%, 5% to 50%, 5% to 25%, 10% to 75%, 10% to 50%, 15% to 25%, 15% to 75%, 15% to 50%, 15% to 25%, 25% to 50%, 50% to 75%, or 75% to 100% identical to a naturally occurring enzyme. In certain embodiments, the enzyme has an amino acid sequence that is at least about 5%, 10%, 15%, or 20% but less than about 30%, 35%, 40%, or 45% identical to a naturally occurring enzyme. In various embodiments, the enzyme has an amino acid sequence that is at least about 45%, 50%, or 55% but less than about 65%, 70%, or 75% identical to a naturally occurring enzyme.
- In some embodiments, the naturally occurring enzyme is a bacterial GT80 family sialyltransferase. The GT80 family is described in Audry, M., et al. (2011). Glycobiology 21, 716-726, the entire content of which is incorporated herein by reference.
- In certain embodiments, the bacterial GT80 family sialyltransferase has the GT-B structural fold. The GT-B structural fold is described in Audry, M., et al. (2011). Glycobiology 21, 716-726, the entire content of which is incorporated herein by reference.
- In various embodiments, the naturally occurring enzyme is produced by a microbial organism, e.g., in nature. In some embodiments, the microbial organism is a bacterium that is naturally present in the gastrointestinal tract of a mammal. In certain embodiments, the microbial organism is a bacterium within the genus Photobacterium, Avibacterium, Shewanella, Bibersteinia, Haemophilus, Alistipes, Actinobacillus, or Helicobacter.
- In various embodiments, the enzyme has a mutation (e.g., 1, 2, 3, 4, 5, or more mutations, such as substitution mutations) compared to a naturally occurring α(2,3) sialyltransferase.
- In some embodiments, when the amino acid sequences of the enzyme and BstE* are aligned, then the enzyme has a mutation at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16). Sequence alignments are run using a variety of publicly available software programs, including but not limited to CLC Main Workbench, version 8.0.
- In certain embodiments, the enzyme has a non-conservative mutation at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16). In various embodiments, the enzyme has a histidine or an alanine at the position that aligns with position 13 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- In various embodiments, when the amino acid sequences of the enzyme and BstE* are aligned, then the enzyme comprises a mutation at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- In some embodiments, the enzyme has a non-conservative mutation at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16). In certain embodiments, the enzyme has a histidine or an alanine at the position that aligns with position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16).
- In some embodiments, the enzyme has a non-conservative mutation at the position that aligns with position 122 of the amino acid sequence of Δ20BstC (SEQ ID NO: 18). In certain embodiments, the enzyme has an alanine, valine, leucine, methionine, or phenylalanine at the position that aligns with position 122 of the amino acid sequence of Δ20BstC (SEQ ID NO: 18).
- In various embodiments, the mutation renders the enzyme more α(2,6)-selective than the naturally occurring α(2,3) sialyltransferase.
- In some embodiments, the enzyme is an α(2,6) sialyltransferase.
- In some embodiments, the enzyme comprises an amino acid sequence of Δ20BstC* (SEQ ID NO: 15), Δ20BstC*2 (SEQ ID NO: 27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), or Δ20BstC*5 (SEQ ID NO: 30).
- In certain embodiments, the Cα root-mean-square deviation (RMSD) between the backbone of the enzyme and a naturally occurring sialyltransferase is less than 3 Å. In some embodiments, the naturally occurring sialyltransferase is Pst6-224 (SEQ ID NO: 1). The structure of Pst6-224 (SEQ ID NO: 1) has been solved, see, e.g., Crystal Structure of Vibrionaceae Photobacterium sp. JT-ISH-224 2,6-sialyltransferase in a Ternary Complex with Donor Product CMP and Acceptor Substrate Lactose, Kakuta et al. (2008) Glycobiology 18, 66-73, the entire content of which is incorporated herein by reference.
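For orientation, the Cα RMSD criterion can be illustrated with a short calculation. The sketch below is a simplified example under explicit assumptions: the two structures have already been superimposed, their Cα atoms have already been paired residue by residue, and the coordinates are made-up toy values in Å. It does not perform the structural alignment itself, and the function name is hypothetical.

```python
import math

Coordinate = tuple[float, float, float]


def ca_rmsd(coords_a: list[Coordinate], coords_b: list[Coordinate]) -> float:
    """RMSD (same units as input) between paired, pre-superimposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates (hypothetical): the second trace is shifted 1 A along z.
trace_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

rmsd = ca_rmsd(trace_a, trace_b)
print(f"C-alpha RMSD: {rmsd:.2f} A; below 3 A threshold: {rmsd < 3.0}")
```

For real structures, the reported value depends on how residues are paired and how the superposition is performed, so those choices should accompany any RMSD figure.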
- In various embodiments, the naturally occurring sialyltransferase is BstC, BstD, BstE, BstH, BstI, BstJ, BstM, or BstN, or a homologue thereof.
- In some embodiments, the bacterium is in a culture medium. In certain embodiments, the bacterium is on a culture plate or in a flask. In various embodiments, the bacterium is cultured in a biofermentor.
- The methods of producing sialylated oligosaccharides disclosed herein may further include retrieving the sialylated oligosaccharide (e.g., sialyllactose) from the bacterium (e.g., from the cytoplasm of the bacterium by lysing the bacterium) or from a culture supernatant of the bacterium.
- In certain embodiments, the sialylated oligosaccharide includes any one of, or any combination of 2, 3, 4, 5, 6, 7, or 8 of 3′-sialyllactose (3′-SL), 6′-sialyllactose (6′-SL), 3′-sialyl-3-fucosyllactose (3′-S3FL), sialyllacto-N-tetraose a (SLNT a), sialyllacto-N-tetraose b (SLNT b), disialyllacto-N-tetraose (DSLNT), sialyllacto-N-fucopentaose II (SLNFP II), and sialyllacto-N-tetraose c (SLNT c).
- In various embodiments, the bacterium comprises an exogenous or endogenous lactose-utilizing α(1,3) fucosyltransferase enzyme, an exogenous or endogenous lactose-utilizing α(1,4) fucosyltransferase enzyme, an exogenous or endogenous β(1,3) galactosyltransferase enzyme, an exogenous or endogenous β(1,4) galactosyltransferase enzyme, an exogenous or endogenous β-1,3-N-acetylglucosaminyltransferase, or any combination thereof.
- In certain embodiments, the bacterium comprises an elevated level of cytoplasmic lactose, uridine diphosphate N-acetylglucosamine (UDP-GlcNAc), and/or cytidine-5′-monophosphosialic acid (CMP-Neu5Ac) compared to a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose). In non-limiting examples, the level of lactose, UDP-GlcNAc, and/or CMP-Neu5Ac is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 75%, 100%, 200%, 300%, 400%, or 500% greater in the cytoplasm of the bacterium than in a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose).
- Various implementations comprise providing a bacterium that comprises an exogenous lactose-utilizing sialyltransferase gene, a deficient sialic acid catabolic pathway, a sialic acid synthetic capability, and a functional lactose permease gene; and culturing the bacterium in the presence of lactose. The sialylated oligosaccharide is then retrieved from the bacterium or from a culture supernatant of the bacterium. Specifically, a sialic acid synthetic capability comprises expressing exogenous CMP-Neu5Ac synthetase, an exogenous sialic acid synthase, and an exogenous UDP-GlcNAc-2-epimerase, or a functional variant or fragment thereof.
- In some embodiments relating to methods for producing sialylated oligosaccharides, the bacterium may further comprise the capability for increased UDP-GlcNAc production. By “increased production capability” is meant that the host bacterium produces greater than 10%, 20%, 50%, 100%, 2-fold, 5-fold, 10-fold, or more of a product than the native, endogenous bacterium. Preferably, the bacterium over-expresses a positive endogenous regulator of UDP-GlcNAc synthesis. In some embodiments, the bacterium over-expresses the nagC gene of E. coli. In certain embodiments, the bacterium over-expresses the E. coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene or mutations in the glmS gene that result in a GlmS enzyme not subject to feedback inhibition by its glucosamine-6-phosphate product (see, e.g., Deng, M. D., Grund, A. D., Wassink, S. L., Peng, S. S., Nielsen, K. L., Huckins, B. D., and Burlingame, R. P. (2006). Directed evolution and characterization of Escherichia coli glucosamine synthase. Biochimie 88, 419-429, the entire content of which is incorporated herein by reference). In various embodiments, the bacterium over-expresses the E. coli glmY gene (a positive translational regulator of glmS). In some embodiments, the bacterium over-expresses the E. coli glmZ gene (another positive translational regulator of glmS; glmY and glmZ are described in Reichenbach et al. Nucleic Acids Res 36, 2570-80 (2008)). In certain embodiments, the bacterium over-expresses any combination of these genes. In various embodiments, the bacterium over-expresses nagC and glmS. In some embodiments, the bacterium over-expresses nagC and glmY. In certain embodiments, the bacterium over-expresses nagC and glmZ. In some embodiments, the gene transcript or encoded gene product is expressed or produced 10%, 20%, 50%, 2-fold, 5-fold, 10-fold, or more than the level expressed or produced by the corresponding native, naturally-occurring, or endogenous gene.
Also provided herein are corresponding methods and bacteria in which any homologue or functional variant or fragment of nagC, glmS, glmY or glmZ (or any combination thereof) is overexpressed. In various embodiments, E. coli nagC, glmS, glmY or glmZ (or any combination thereof) is exogenously expressed in a bacterium other than E. coli.
- Other components of UDP-GlcNAc metabolism include: (GlcNAc-1-P) N-acetylglucosamine-1-phosphate; (GlcN-1-P) glucosamine-1-phosphate; (GlcN-6-P) glucosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; and (Fruc-6-P) Fructose-6-phosphate. In certain embodiments, bacteria comprising the characteristics described herein are cultured in the presence of lactose, and lacto-N-neotetraose is retrieved, either from the bacterium itself (i.e., by lysis) or from a culture supernatant of the bacterium.
- In various embodiments, the bacterium contains a deficient sialic acid catabolic pathway. By “sialic acid catabolic pathway” is meant a sequence of reactions, usually controlled and catalyzed by enzymes, which results in the degradation of sialic acid. An exemplary sialic acid catabolic pathway in E. coli is described herein. In the sialic acid catabolic pathway described herein, sialic acid (Neu5Ac; N-acetylneuraminic acid) is degraded by the enzymes NanA (N-acetylneuraminic acid lyase) and NanK (N-acetylmannosamine kinase) and NanE (N-acetylmannosamine-6-phosphate epimerase), all encoded in the nanATEK-yhcH operon, and repressed by NanR (ecocyc.org/E. COLI). In some embodiments, a deficient sialic acid catabolic pathway is engineered in E. coli by way of a mutation in endogenous nanA (N-acetylneuraminate lyase) (e.g., GenBank Accession Number D00067.1 (GI:216588), incorporated herein by reference) and/or nanK (N-acetylmannosamine kinase) genes (e.g., GenBank Accession Number (amino acid) BAE77265.1 (GI:85676015), incorporated herein by reference), and/or nanE (N-acetylmannosamine-6-phosphate epimerase, 947745, incorporated herein by reference). In certain embodiments, the nanT (N-acetylneuraminate transporter) gene is also inactivated or mutated. Other intermediates of sialic acid metabolism include: (ManNAc-6-P) N-acetylmannosamine-6-phosphate; (GlcNAc-6-P) N-acetylglucosamine-6-phosphate; (GlcN-6-P) Glucosamine-6-phosphate; and (Fruc-6-P) Fructose-6-phosphate. In some embodiments, nanA is mutated. In various embodiments, nanA and nanK are mutated, while nanE remains functional. In some embodiments, nanA and nanE are mutated, while nanK has not been mutated, inactivated or deleted. In various embodiments, a mutation is one or more changes in the nucleic acid sequence coding the gene product of nanA, nanK, nanE, and/or nanT. For example, the mutation may be 1, 2, 5, 10, 25, 50 or 100 changes in the nucleic acid sequence.
For example, the nanA, nanK, nanE, and/or nanT is mutated by a null mutation.
- Null mutations as described herein encompass amino acid substitutions, additions, deletions, or insertions that either cause a loss of function of the enzyme (i.e., reduced or no activity) or loss of the enzyme (i.e., no gene product). By deleted is meant that the coding region is removed in whole or in part such that no gene product is produced. In various embodiments, a gene has been inactivated such that the coding sequence thereof has been altered such that the resulting gene product is functionally inactive or encodes a gene product with less than 100%, 80%, 50%, or 20% of the activity of the native, naturally-occurring, endogenous gene product.
- In various embodiments, the bacterium also comprises a sialic acid synthetic capability. In some embodiments, the bacterium is an E. coli bacterium. For example, the bacterium comprises a sialic acid synthetic capability through provision of an exogenous UDP-GlcNAc 2-epimerase (e.g., neuC of Campylobacter jejuni, GenBank AAK91727.1; GI:15193223, incorporated herein by reference) or equivalent (e.g., E. coli S88 neuC, GenBank YP_002392936.1; GI:218560023), a Neu5Ac synthase (e.g., neuB of C. jejuni, GenBank AAK91726.1; GI:15193222, incorporated herein by reference) or equivalent (e.g., Flavobacterium limnosediminis sialic acid synthase, GenBank GI:559220424), and/or a CMP-Neu5Ac synthetase (e.g., neuA of C. jejuni, GenBank AAK91728.1; GI:15193224, incorporated herein by reference) or equivalent (e.g., Vibrio brasiliensis CMP-sialic acid synthase, GenBank GI:493937153). Functional variants and fragments are also disclosed herein.
- In some embodiments, the bacterium comprises an exogenous or endogenous N-acetylneuraminate synthase, an exogenous or endogenous UDP-N-acetylglucosamine 2-epimerase, an exogenous or endogenous N-acetylneuraminate cytidylyltransferase, or any combination thereof.
- In certain embodiments, the bacterium includes an exogenous N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase from Campylobacter jejuni.
- In various embodiments, the bacterium includes a reduced level of β-galactosidase activity compared to a corresponding wild-type bacterium (e.g., when the bacterium is cultured in the presence of lactose). In aspects, the reduced level of β-galactosidase activity includes reduced expression of a β-galactosidase gene or reduced β-galactosidase enzymatic activity. In aspects, the reduced level is less than 10% the level of the corresponding wild-type bacterium when the bacterium is cultured in the presence of lactose.
- In some embodiments, the bacterium includes a deleted or inactivated endogenous β-galactosidase gene. In certain embodiments, the bacterium includes a deleted or inactivated endogenous lacZ gene and/or a deleted or inactivated endogenous lacI gene.
- In various embodiments, the bacterium includes an endogenous β-galactosidase gene, wherein at least a portion of a promoter of the endogenous β-galactosidase gene has been deleted.
- In some embodiments, the bacterium includes an exogenous β-galactosidase enzyme with reduced enzymatic activity compared to an endogenous β-galactosidase enzyme in a corresponding wild-type bacterium. In certain embodiments, the exogenous β-galactosidase gene is expressed at a lower level than an endogenous β-galactosidase gene in a corresponding wild-type bacterium.
- In various embodiments, the bacterium has less than 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 75, 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 units of β-galactosidase activity when cultured in the presence of lactose. In some embodiments, the bacterium comprises at least about 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, or 2.5 units of β-galactosidase activity, but less than about 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 75, 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, or 3 units of β-galactosidase activity, when the bacterium is cultured in the presence of lactose.
- In some embodiments, the bacterium has a lactose permease gene. In certain embodiments, the lactose permease gene comprises a lacY gene.
- In an aspect, the bacterium has an inactivated adenosine-5′-triphosphate (ATP)-dependent intracellular protease. In aspects, the inactivated ATP-dependent intracellular protease has a null mutation in an ATP-dependent intracellular protease gene. In aspects, the null mutation is a deletion of an endogenous lon gene.
- In aspects, the bacterium further includes an exogenous E. coli rcsA or E. coli rcsB gene.
- In certain embodiments, the bacterium further includes a mutation in a thyA gene.
- In various embodiments, the bacterium does not express a β-galactoside transacetylase. In some embodiments, a β-galactoside transacetylase gene has been inactivated (e.g., deleted) in the bacterium.
- In certain embodiments, the bacterium has a lacA mutation.
- In various embodiments, the bacterium accumulates intracellular lactose in the presence of exogenous lactose.
- In some embodiments, the bacterium is a member of the Bacillus, Pantoea, Lactobacillus, Lactococcus, Streptococcus, Propionibacterium, Enterococcus, Bifidobacterium, Sporolactobacillus, Micromonospora, Micrococcus, Rhodococcus, or Pseudomonas genus.
- In certain embodiments, the bacterium is a Bacillus licheniformis, Bacillus subtilis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, Bacillus circulans, Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, Xanthomonas campestris, Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, Lactococcus lactis, Streptococcus thermophilus, Propionibacterium freudenreichii, Enterococcus faecium, Enterococcus thermophiles, Bifidobacterium longum, Bifidobacterium infantis, Bifidobacterium bifidum, Pseudomonas fluorescens, or Pseudomonas aeruginosa bacterium. In aspects, the bacterium is an Escherichia coli (E. coli) bacterium.
- In various embodiments, the E. coli bacterium is a GI724 strain bacterium.
- In some embodiments, the bacterium has a lacIq promoter mutation. In certain embodiments, the bacterium has a lacPL8 promoter mutation.
- In various embodiments, the bacterium has a nucleic acid construct including an isolated nucleic acid encoding the lactose-utilizing sialyltransferase enzyme.
- In some embodiments, a chromosome of the bacterium has a nucleic acid construct having an isolated nucleic acid encoding the lactose-utilizing sialyltransferase enzyme.
- In certain embodiments, the nucleic acid is operably linked to a heterologous control sequence that directs the production of the enzyme in the bacterium. In various embodiments, the heterologous control sequence comprises a bacterial promoter, a bacterial operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, or a plasmid selectable marker.
- In various embodiments, the bacterium has the genotype:
- PlacIq-lacY, Δ(lacI-lacZ), ΔlacA, ΔthyA::(0.8RBS lacZ+), ampC::(Ptrp M13g8 RBS-λcI+, CAT), ΔnanATE::scar.
- In aspects, provided herein are nucleic acids encoding a mutant enzyme. In some embodiments, the mutant enzyme has amino acids in the sequence set forth as SEQ ID NO: 15, 16, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
- Also provided herein is a lactose-utilizing sialyltransferase enzyme having amino acids in the sequence set forth as SEQ ID NO: 15, 16, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
- Certain sialyltransferases described herein have significant advantages over other enzymes of this class. Preferred sialyltransferases, e.g., BstM and BstN, are lactose-utilizing and produce superior amounts of sialyllactose in production strains of bacteria, e.g., engineered E. coli. Not all enzymes in the sialyltransferase class utilize lactose. For example, BstD and BstJ were found not to utilize lactose. Thus, lactose-utilizing sialyltransferase enzymes are rare among enzymes in the sialyltransferase class.
- Another advantage of preferred sialyltransferases described is that they have fewer side activities, i.e., produce fewer undesirable by-products. An example of such an undesirable by-product is the KDO-lactose side-product. KDO is a component of E. coli lipopolysaccharide (LPS, endotoxin), and LPS is a molecule that elicits a strong and often dangerous immune response in some mammals, and humans in particular. KDO is part of the core structure of LPS. KDO-lactose is made from a CMP-KDO nucleotide sugar precursor that is found naturally in all strains of E. coli. Due to a similarity of KDO to sialic acid, some sialyltransferases, e.g., Pst6-224, utilize CMP-KDO as a substrate and produce unacceptable levels of KDO-lactose as an undesired side reaction. Certain enzymes of the present invention (e.g., BstM, BstN, Δ20BstC*) produce less of this unwanted by-product as compared to others, e.g., Pst6-224. Thus, the methods described herein that include a heterologous gene (in the engineered E. coli production strain) that expresses these preferred enzymes lead to a reduced or negligible amount of KDO-lactose. Such a reduced amount facilitates purification of the final desired product, sialyllactose, and is associated with a better safety profile for human use.
- In an aspect, provided herein is a composition comprising sialylated oligosaccharides and less than 5%, e.g., less than 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or less than 0.1%, KDO-lactose. In some embodiments, the composition is substantially pure. In some embodiments, the composition comprises sialyllactose.
- The sialyllactose produced by Δ20BstC* was found to be comprised of 6′-SL and 3′-SL. Production of both of these human milk oligosaccharides in the course of a single biofermentation represents a significant advantage in terms of time and cost of production over two separate fermentations. In some situations, such as striving to develop infant formulae that better emulate human milk, producing mixtures of human milk oligosaccharides in a single production fermentation is advantageous from a cost perspective.
- Thus, the production runs using constructs expressing the preferred enzymes and the final purified end product(s) produced from such runs are characterized by increased safety, increased purity (and ease of purification) as well as reduced cost compared to earlier-described approaches. A composition comprising a sialyllactose produced using the methods, constructs, or production strains described herein contains at least 10%, 25%, 50%, 2-fold, 5-fold, or 10-fold less KDO-lactose compared to compositions produced by other methods, e.g., produced using constructs encoding Pst6-224 or α-(2→6)-sialyltransferase encoded by the gene from the Photobacterium sp. JT-ISH-224. The invention also encompasses methods and a composition comprising substantially pure sialyllactose with minimal or minor levels of KDO-lactose. For example, the composition contains less than 5%, 4%, 3%, 2%, 1%, or 0.5% (or less) KDO-lactose of the total mass of SL. For example, a mutation, e.g., a Δ (deletion) mutation in a Bst gene, e.g., Δ20BstC*, leads to a reduction in KDO-lactose.
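- As a minimal sketch of how the purity thresholds above could be checked, the following Python computes KDO-lactose as a percentage of the total mass of SL and reports the tightest listed threshold that a preparation satisfies. The function names and the example masses are hypothetical and are not data from this disclosure.

```python
from typing import Optional

# Thresholds listed in the text, in percent of the total mass of SL.
THRESHOLDS = (5.0, 4.0, 3.0, 2.0, 1.0, 0.5)


def kdo_lactose_percent(kdo_lactose_mass: float, sl_mass: float) -> float:
    """Return KDO-lactose as a percentage of the total mass of sialyllactose."""
    if sl_mass <= 0:
        raise ValueError("sl_mass must be positive")
    return 100.0 * kdo_lactose_mass / sl_mass


def tightest_threshold(percent: float) -> Optional[float]:
    """Return the smallest listed threshold that `percent` is strictly below."""
    satisfied = [t for t in THRESHOLDS if percent < t]
    return min(satisfied) if satisfied else None


# Hypothetical masses in grams, for illustration only.
pct = kdo_lactose_percent(kdo_lactose_mass=0.04, sl_mass=10.0)
print(pct, tightest_threshold(pct))
```

Both masses must be expressed in the same unit; a result of `None` means the preparation exceeds even the loosest (5%) threshold.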
- Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof, and from the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below.
- The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
-
FIG. 1 is a schematic outlining the structures of the major sialylated oligosaccharide species of human milk, how they are related to each other, and the steps necessary for their enzymatic synthesis from lactose. -
FIG. 2 is a table presenting pairwise percent amino acid sequence identity comparison between the two α(2,6) sialyllactose (SL) probe sequences and the 8 identified ST candidates. -
FIG. 3 is a map of an expression vector carrying one of the candidate ST genes, bstN (plasmid pG543, SEQ ID NO: 11). -
FIG. 4 is a diagram outlining the scheme for SL biosynthesis in engineered E. coli. -
FIG. 5 is an image of a thin layer chromatography result. Prominent spots corresponding to the intracellular lactose pool are seen in the control strain (E1406, which does not contain any bst+neuBCA expression plasmid) and also in all bst candidate cultures. -
FIGS. 6A, 6B, and 6C are images showing UV traces from HPLC runs for the various heat extracts (E1406 control, Δ16Pst6-224, HAC1268, Δ20BstC, Δ20BstC*, BstE, BstH, BstI, BstM, and BstN). -
FIG. 7 is an image of thin layer chromatography of fractions from the Dowex 1×4 column. Typically, fraction 3 was the purest fraction and, after desalting, was suitable for NMR analysis. -
FIG. 8 is a 1D 1H NMR spectrum of SL samples produced by BstM (BstM-SL) which showed three anomeric signals: δ 5.22 (A), δ 4.66 (B), both attributed to a reducing-end Glcp, and δ 4.42 (C) assigned to β-Galp residue. -
FIG. 9 is a 1D 1H NMR spectrum of SL samples produced by BstN (BstN-SL) which showed three anomeric signals: δ 5.22 (A), δ 4.66 (B), both attributed to a reducing-end Glcp, and δ 4.42 (C) assigned to β-Galp residue. -
FIG. 10 is an image showing a sequence alignment of wild type PdST, Δ20BstC and BstE α(2,3) sialyltransferases. -
FIG. 11 is an image of thin layer chromatography showing that SL synthesized by BstE*-producing cells was efficiently converted to lactose by both sialidase S and sialidase C. This result indicated that BstE* still possessed exclusively α(2,3)-selective activity, and that the introduced mutations did not alter regioselectivity of the enzyme as was predicted. -
FIG. 12 is a 1D 1H NMR spectrum of SL produced by Δ20BstC*. Characteristic features of the spectrum were 4 distinct anomeric peaks and the up-field signals of axial and equatorial H-3 of sialic acid. -
FIG. 13 is an image of overlaid HSQC and HMBC NMR spectra of sialyllactose synthesized by ΔBstC*-producing cells. NMR analysis showed that the larger signals belonged to 6′-sialyllactose, whereas the smaller one was part of contaminating 3′-sialyllactose. -
FIG. 14 is an image of the BLOSUM62 matrix. -
FIG. 15 is a table showing chemical shift assignments of the two major components of Δ20BstC* synthesized sialyllactose. Orange lines indicate inter-residue correlations seen in both ROESY and HMBC experiments; blue lines indicate inter-residue correlations seen in HMBC only. -
FIG. 16 is an image showing UV traces from HPLC runs for the various cell extracts (Δ20BstC*, Δ20BstC*2, Δ20BstC*3, Δ20BstC*4, Δ20BstC*5).
- The acidic oligosaccharides of human milk include a prominent sialyllactose (SL) fraction, comprising 3′-sialyllactose and 6′-sialyllactose (Bode, L., and Jantscher-Krenn, E. (2012). Adv Nutr 3, 383S-391S). Structurally, 3′-sialyllactose (3′-SL) consists of an N-acetylneuraminic acid (Neu5Ac) moiety joined through an α(2,3) linkage to the galactose portion of lactose (α(2,3)Neu5Ac Gal(β1-4)Glc), while 6′-sialyllactose (6′-SL) consists of a Neu5Ac moiety joined through an α(2,6) linkage to the galactose portion of lactose (α(2,6)Neu5Ac Gal(β1-4)Glc). 3′-SL and 6′-SL are two of the most abundant sialylated oligosaccharides present in human milk, together present at concentrations of up to ˜0.5 (Bao, Y., Zhu, L., and Newburg, D. S. (2007). Anal Biochem 370, 206-214).
- The invention provides efficient and economical methods, cells, enzymes, and nucleic acids for producing sialylated oligosaccharides. The “lactose-utilizing sialyltransferase enzymes” disclosed herein include the amino acid sequences of the lactose-utilizing sialyltransferase enzyme, as well as variants and fragments thereof that exhibit sialyltransferase activity.
- Prior to the methods described herein, the ability to produce purified acidic human milk oligosaccharides (hMOS) such as 3′-SL and 6′-SL inexpensively at large scale was problematic and inefficient. Purification of sialylated oligosaccharides from natural sources such as mammalian milks is not an economically viable approach, and production of hMOS through chemical synthesis is currently limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost. As an alternative to chemical synthesis, bacteria can be metabolically engineered to produce hMOS. This approach involves the construction of microbial strains overexpressing heterologous glycosyltransferases, membrane transporters for the import of precursor sugars into the bacterial cytosol, and possessing enhanced pools of regenerating nucleotide sugars for use as biosynthetic precursors (e.g., as described by Dumon, C., et al. (2004). Biotechnol Prog 20, 412-19; Ruffing, A., and Chen, R. R. (2006). Microb Cell Fact 5, 25; Mao, Z., et al. (2006). Biotechnol Prog 22, 369-374).
- A key aspect of this approach is the identification and use of a heterologous glycosyltransferase selected for overexpression in the microbial host. The choice of glycosyltransferase can significantly affect the final yield of the desired synthesized oligosaccharide, given that enzymes can vary greatly in terms of their kinetics, donor and acceptor substrate specificity, side reaction products, and enzyme stability and solubility. A few glycosyltransferases derived from different bacterial species have been identified and characterized in terms of their ability to catalyze the biosynthesis of hMOS in E. coli host strains (Dumon, C., et al. (2006). Chembiochem 7, 359-365; Dumon, C., et al. (2004). Biotechnol Prog 20, 412-19; Li, M., et al. (2008). Biochemistry 47, 378-387; Li, M., et al. (2008). Biochemistry 47, 11590-97).
- However, there exists a growing need to identify and characterize additional glycosyltransferases that will be useful for the synthesis of hMOS in metabolically engineered bacterial hosts. The identification of additional glycosyltransferases with faster kinetics, greater affinity for nucleotide sugar donors and/or acceptor structures, or greater stability within the bacterial host has the potential to significantly improve the yields of therapeutically useful hMOS. To this end, a candidate gene screening approach was undertaken to identify new α(2,3) and α(2,6) sialyltransferase genes encoding more efficient enzymes.
- Lactose-Utilizing Sialyltransferase Enzymes
- In some embodiments, a lactose-utilizing sialyltransferase enzyme comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 1-10, 1-15, 1-20, 5-15, 5-20, 10-25, 10-50, 20-50, 25-75, 25-100 or more mutations compared to a naturally occurring protein while retaining at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5%, or about 100% of the activity (e.g., enzymatic activity) of the naturally occurring protein.
- Mutations include but are not limited to substitutions (such as conservative and non-conservative substitutions), insertions, and deletions. Non-limiting examples of lactose-utilizing sialyltransferase enzymes may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 1-10, 1-15, 1-20, 5-15, 5-20, 10-25, 10-50, 20-50, 25-75, 25-100, or more substitution mutations compared to a naturally occurring protein while retaining at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5%, or about 100% of the activity (e.g., enzymatic activity) of the naturally occurring protein.
- Alternatively, the lactose-utilizing sialyltransferase enzyme is not a mutant (i.e., its sequence is not altered) compared to a corresponding wild type sequence.
- In various embodiments, a lactose-utilizing sialyltransferase enzyme may comprise a stretch of amino acids (e.g., the entire length of the lactose-utilizing sialyltransferase enzyme or a portion comprising at least about 50, 100, 200, 250, 300, 350, or 400 amino acids) in a sequence that is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, or 99.5% identical to an amino acid sequence of a naturally occurring protein.
- In some embodiments, the mutations are conservative, and the present subject matter includes many lactose-utilizing sialyltransferase enzymes in which the only mutations are substitution mutations. In non-limiting examples, a lactose-utilizing sialyltransferase enzyme has no deletions or insertions compared to a naturally occurring protein (e.g., a naturally occurring counterpart).
- In certain embodiments, the lactose-utilizing sialyltransferase enzyme does not comprise a deletion or insertion compared to a naturally occurring lactose-utilizing sialyltransferase enzyme. Alternatively, a lactose-utilizing sialyltransferase enzyme may have (i) less than about 5, 4, 3, 2, or 1 inserted amino acids, and/or (ii) less than about 5, 4, 3, 2, or 1 deleted amino acids compared to a naturally occurring protein.
- In various embodiments, a naturally occurring protein to which a lactose-utilizing sialyltransferase enzyme is compared or has been derived (e.g., by mutation, fusion, or other modification) is a microbial protein, e.g., a prokaryotic lactose-utilizing sialyltransferase enzyme such as a bacterial lactose-utilizing sialyltransferase enzyme. For example, the prokaryotic lactose-utilizing sialyltransferase enzyme is a mutant or variant of a natural (i.e., wild-type) bacterial protein.
- In some embodiments, the microbial protein is produced by a Gram-positive bacterium or a Gram-negative bacterium.
- In some embodiments, the lactose-utilizing sialyltransferase enzyme does not comprise a signal peptide. For example, the signal peptide (e.g., that is present in a naturally occurring counterpart) may be replaced with a methionine.
- As used herein the term “signal peptide” refers to a short stretch of amino acids (e.g., 5-20 or 10-50 amino acids long) at the N-terminus of a protein that directs the transport of the protein. In various embodiments, the signal peptide is cleaved off during the post-translational modification of a protein by a cell. In instances where a signal peptide is not defined for a protein discussed herein, the signal peptide may optionally be considered to be, e.g., the first 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 amino acids from the N-terminus of the translated protein (compared to a protein that has not had the signal peptide removed, e.g., compared to a naturally occurring protein).
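- The replacement of a signal peptide with a methionine, as described above, can be sketched in a few lines of Python. This is an illustration only: the function name, the example sequence, and the chosen length of 20 residues are hypothetical, and the sequence is not taken from any SEQ ID NO disclosed herein.

```python
def replace_signal_peptide(protein: str, signal_length: int) -> str:
    """Remove the first `signal_length` residues and prepend a methionine."""
    if not 0 < signal_length < len(protein):
        raise ValueError("signal_length must be between 1 and len(protein) - 1")
    mature = protein[signal_length:]
    # The removed N-terminal stretch is replaced with a single initiator Met.
    return "M" + mature


# Hypothetical 30-residue sequence, for illustration only.
example = "MKKILTVLSIFILSACNSDNTSLKETVSSN"
truncated = replace_signal_peptide(example, 20)
print(truncated)  # prints "MTSLKETVSSN"
```

Where the signal peptide of a protein is not defined, the same function can be called with any of the candidate lengths recited above (e.g., 5 to 50 residues).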
- With regard to a defined polypeptide, % identity values higher or lower than those provided herein will encompass various embodiments. Thus, where applicable, in light of a minimum % identity value, a lactose-utilizing sialyltransferase enzyme may comprise an amino acid sequence which is at least 60%, 65%, 70%, 75%, 76%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, or 99.9% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs. In embodiments, the lactose-utilizing sialyltransferase enzyme comprises an amino acid sequence that is 100% identical to the reference SEQ ID NO. Where applicable, in light of a maximum % identity to a reference sequence, a lactose-utilizing sialyltransferase enzyme may comprise an amino acid sequence which is less than 75%, 70%, 65%, 60%, 59%, 58%, 57%, 56%, 55%, 54%, 53%, 52%, 51%, 50%, 49%, 48%, 47%, 46%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, or 30% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs. In certain embodiments, a polypeptide comprises amino acids in a sequence that is preferably at least about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45% and less than about 75%, 70%, 65%, 60%, 55%, 50%, 45%, 44%, 43%, 42%, 41%, 40%, 39%, 38%, 37%, 36%, 35%, 34%, 33%, 32%, 31%, or 30% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs.
In certain embodiments, a polypeptide comprises amino acids in a sequence that is between about 5% and about 75%, about 6% and about 75%, about 7% and about 75%, about 8% and about 75%, about 9% and about 75%, about 10% and about 75%, 11% and about 75%, 12% and about 75%, 13% and about 75%, 14% and about 75%, 15% and about 75%, 16% and about 75%, 17% and about 75%, 18% and about 75%, 19% and about 75%, 20% and about 75%, 21% and about 75%, 22% and about 75%, 23% and about 75%, 24% and about 75%, 25% and about 75%, 26% and about 75%, 27% and about 75%, 28% and about 75%, 29% and about 75%, 30% and about 75%, about 5% and about 100%, about 5% and about 95%, about 5% and about 85%, about 5% and about 75%, about 5% and about 70%, about 5% and about 65%, about 5% and about 60%, about 5% and about 55%, about 5% and about 50%, about 5% and about 45%, about 5% and about 44%, about 5% and about 43%, about 5% and about 42%, about 5% and about 41%, about 5% and about 40%, about 5% and about 39%, about 5% and about 38%, about 5% and about 37%, about 5% and about 36%, about 5% and about 35%, about 5% and about 34%, about 5% and about 33%, about 5% and about 32%, about 5% and about 31%, or about 5% and about 30% identical to the reference SEQ ID NO or to each of the reference SEQ ID NOs.
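- As a hedged illustration of how a % identity value and an identity window of the kind recited above might be evaluated, the following Python sketch operates on two sequences that have already been aligned (gaps written as "-"). It assumes identity is counted over all aligned columns, including gapped ones; other conventions (e.g., dividing by the length of the shorter sequence) give different values. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def within_identity_window(identity: float, minimum: float, maximum: float) -> bool:
    """True when identity is at least `minimum` and less than `maximum` (percent)."""
    return minimum <= identity < maximum


# Hypothetical pre-aligned fragments, for illustration only.
identity = percent_identity("MKT-LLVA", "MKSALLIA")
print(identity, within_identity_window(identity, 30.0, 75.0))
```

In this example five of eight columns match, so the pair would fall inside a window of "at least about 30% and less than about 75%".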
- Non-limiting examples of reference lactose-utilizing sialyltransferase enzymes and amino acid sequences disclosed herein include:
-
- (i) a lactose-utilizing sialyltransferase enzyme from Photobacterium sp. JT-ISH-224 referred to herein as “Pst6-224” (GenBank Accession No. BAF92026.1; SEQ ID NO: 1);
- (ii) a lactose-utilizing sialyltransferase enzyme from Avibacterium paragallinarum referred to herein as “BstC” [National Center for Biotechnology Information (NCBI) Reference Sequence: WP_021724759.1; SEQ ID NO: 2];
- (iii) a lactose-utilizing sialyltransferase enzyme from Actinobacillus ureae referred to herein as “BstD” (NCBI Reference Sequence: WP_005625206.1; SEQ ID NO: 3);
- (iv) a lactose-utilizing sialyltransferase enzyme from Haemophilus ducreyi referred to herein as “BstE” (GenBank Accession No. AAP95068.1; SEQ ID NO: 4);
- (v) a lactose-utilizing sialyltransferase enzyme from Alistipes (multispecies) referred to herein as “BstH” (NCBI Reference Sequence: WP_018695526.1; SEQ ID NO: 5);
- (vi) a lactose-utilizing sialyltransferase enzyme from Bibersteinia trehalosi referred to herein as “BstI” (GenBank Accession No. AGH37861.1; SEQ ID NO: 6);
- (vii) a lactose-utilizing sialyltransferase enzyme from Shewanella piezotolerans referred to herein as “BstJ” (NCBI Reference Sequence Nos: YP_002314261.1 and WP_020915003.1; SEQ ID NO: 7);
- (viii) a lactose-utilizing sialyltransferase enzyme from Helicobacter acinonychis referred to herein as “HAC” (GenBank Accession No. CAK00018.1; SEQ ID NO: 8);
- (ix) a lactose-utilizing sialyltransferase enzyme from Helicobacter pylori referred to herein as “BstM” (NCBI Reference Sequence: WP_000743106.1; SEQ ID NO: 9); and
- (x) a lactose-utilizing sialyltransferase enzyme from Helicobacter cetorum referred to herein as “BstN” (NCBI Reference Sequence: WP_014661583.1; SEQ ID NO: 10).
- In some embodiments, the lactose-utilizing sialyltransferase enzyme comprises an amino acid sequence with at least 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 60, 70, 80, 90, or 100% identity to 1, 2, 3, 4, 5, 9, 10 or more lactose-utilizing sialyltransferase enzymes disclosed herein.
- In embodiments, the amino acid sequence of a protein comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 mutations compared to its naturally occurring counterpart. In some embodiments, less than 50, 45, 40, 35, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, or 2 of the mutations is a deletion or insertion of 1, 2, 3, 4, or 5 or no more than 1, 2, 4, or 5 amino acids. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more of the mutations is a substitution mutation. In certain embodiments, every mutation to a protein compared to its naturally occurring counterpart is a substitution mutation. In various embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more or all of the mutations to a protein compared to its naturally occurring counterpart is a conservative substitution mutation.
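- The mutation counts discussed above can be tallied from a pairwise alignment of a variant against its naturally occurring counterpart. The following Python sketch is one simple way to do so, under the assumption that the two sequences are already aligned with "-" marking gaps and that insertions and deletions are counted per residue rather than per contiguous event. The function name and sequences are hypothetical.

```python
from collections import Counter


def classify_mutations(aligned_ref: str, aligned_var: str) -> Counter:
    """Count substituted, inserted, and deleted residues in the variant."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter(substitution=0, insertion=0, deletion=0)
    for ref, var in zip(aligned_ref, aligned_var):
        if ref == var:
            continue  # identical residue (or a gap in both): not a mutation
        if ref == "-":
            counts["insertion"] += 1  # residue present only in the variant
        elif var == "-":
            counts["deletion"] += 1  # residue present only in the reference
        else:
            counts["substitution"] += 1
    return counts


# Hypothetical pre-aligned fragments, for illustration only.
counts = classify_mutations("MKT-LLVA", "MKSALL-A")
only_substitutions = counts["insertion"] == 0 and counts["deletion"] == 0
print(dict(counts), only_substitutions)
```

A variant for which `only_substitutions` is true corresponds to the embodiments in which every mutation is a substitution mutation; deciding whether a substitution is conservative would additionally require a substitution table.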
- In various embodiments, a polypeptide does not have any insertion or deletion compared to its natural counterpart, other than (optionally) the removal of the signal peptide and/or the fusion of compounds such as another polypeptide at the N-terminus or C-terminus thereof.
- In various embodiments, the Cα root-mean-square deviation (RMSD) between the backbone of the lactose-utilizing sialyltransferase enzyme and Pst6-224 (SEQ ID NO: 1), BstC (SEQ ID NO: 2), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 1), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), HAC1268 (SEQ ID NO: 8), BstM (SEQ ID NO: 9), BstN (SEQ ID NO: 10), or PdST (SEQ ID NO: 13) is, e.g., between about 0-3 Å, 0-1 Å, 0-1.5 Å, 0-2 Å, 0.1-3 Å, 0.5-1 Å, 0.5-1.5 Å, or 0.5-2 Å, or less than about 0.1 Å, 0.2 Å, 0.3 Å, 0.4 Å, 0.5 Å, 0.6 Å, 0.7 Å, 0.8 Å, 0.9 Å, 1.0 Å, 1.5 Å, 1.6 Å, 1.7 Å, 1.8 Å, 1.9 Å, 2.0 Å, 2.5 Å, or 3 Å. Non-limiting considerations relating to the sequence and structural differences between homologous proteins are discussed in Chothia and Lesk (1986) The EMBO Journal, 5(4):823-826, the entire content of which is incorporated herein by reference.
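- For orientation, a Cα RMSD of the kind referred to above can be computed as the square root of the mean squared distance between paired Cα atoms. The Python sketch below assumes that the two structures have already been optimally superimposed and that the coordinate lists pair equivalent residues in the same order; it does not perform the superposition itself. The function name and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """RMSD in angstroms between paired, pre-superimposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (angstroms): the second trace is shifted by 1 A along z.
trace_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
rmsd = ca_rmsd(trace_a, trace_b)
print(rmsd, rmsd < 1.5)
```

The resulting value can then be compared against any of the recited bounds, e.g., "less than about 1.5 Å".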
- Also provided are functional fragments of the genes or gene products described herein. A fragment of a protein is characterized by a length (number of amino acids) that is less than the length of the full length mature form of the protein. A fragment, in the case of these sequences and all others provided herein, may be a part of the whole that is less than the whole. Moreover, a fragment ranges in size from a single nucleotide or amino acid within a polynucleotide or polypeptide sequence to one fewer nucleotide or amino acid than the entire polynucleotide or polypeptide sequence. Finally, a fragment is defined as any portion of a complete polynucleotide or polypeptide sequence that is intermediate between the extremes defined above.
- For example, fragments of any of the proteins or enzymes disclosed herein or encoded by any of the genes disclosed herein can be 10 to 20 amino acids, 10 to 30 amino acids, 10 to 40 amino acids, 10 to 50 amino acids, 10 to 60 amino acids, 10 to 70 amino acids, 10 to 80 amino acids, 10 to 90 amino acids, 10 to 100 amino acids, 50 to 100 amino acids, 75 to 125 amino acids, 100 to 150 amino acids, 150 to 200 amino acids, 200 to 250 amino acids, 250 to 300 amino acids, 300 to 350 amino acids, 350 to 400 amino acids, or 400 to 425 amino acids. The fragments encompassed in the present subject matter comprise fragments that retain function. As such, the fragments preferably retain the domains that are required or are important for sialyltransferase activity. Fragments can be determined or generated and tested for sialyltransferase activity using standard methods known in the art. For example, the encoded protein can be expressed by any recombinant technology known in the art and the sialyltransferase activity of the protein can be determined.
- As used herein a “biologically active” fragment is a portion of a polypeptide which maintains one or more activities of a full-length reference polypeptide. Biologically active fragments as used herein exclude the full-length polypeptide. Biologically active fragments can be any size as long as they maintain the defined activity. Preferably, the biologically active fragment maintains at least 10%, at least 50%, at least 75% or at least 90%, of the activity (such as sialyltransferase activity) of the full length protein,
- Amino acid sequence variants/mutants of the polypeptides defined herein can be prepared by introducing appropriate nucleotide changes into a nucleic acid defined herein, or by in vitro synthesis of the desired polypeptide. Such variants/mutants include, for example, deletions, insertions or substitutions of residues within the amino acid sequence. A combination of deletion, insertion and substitution can be made to arrive at the final construct, provided that the final peptide product possesses the desired activity and/or specificity.
- Mutant (altered) peptides (compared to a wild type counterpart) can be prepared using any technique known in the art. For example, a polynucleotide defined herein can be subjected to in vitro mutagenesis or DNA shuffling techniques. Products derived from mutated/altered DNA can readily be screened using techniques described herein to determine if they possess, for example, sialyltransferase activity.
- Amino acid sequence deletions generally range from about 1 to 15 residues, e.g. about 1 to 10 residues and often about 1 to 5 contiguous residues. In some embodiments, a mutated or modified protein does not comprise any deletions or insertions. In various embodiments, a mutated or modified protein has less than about 10, 9, 8, 7, 5, 4, 3, or 2 deleted or inserted amino acids.
- Substitution mutants have at least one amino acid residue in the polypeptide molecule removed and a different residue inserted in its place. Sites may be substituted in a relatively conservative manner in order to maintain activity and/or specificity. Such conservative substitutions are shown in the table below under the heading of “exemplary substitutions.”
- In certain embodiments, a mutant/variant polypeptide has only, or not more than, one or two or three or four conservative amino acid changes when compared to a naturally occurring polypeptide. Details of conservative amino acid changes are provided in the table below. As the skilled person would be aware, such minor changes can reasonably be predicted not to alter the activity of the polypeptide when expressed in a recombinant cell.
Original Residue        Example Substitutions
Alanine (Ala)           Val; Leu; Ile; Gly
Arginine (Arg)          Lys
Asparagine (Asn)        Gln; His
Cysteine (Cys)          Ser
Glutamine (Gln)         Asn; His
Glutamic Acid (Glu)     Asp
Glycine (Gly)           Pro; Ala
Histidine (His)         Asn; Gln
Isoleucine (Ile)        Leu; Val; Ala
Leucine (Leu)           Ile; Val; Met; Ala; Phe
Lysine (Lys)            Arg
Methionine (Met)        Leu; Phe
Phenylalanine (Phe)     Leu; Val; Ala
Proline (Pro)           Gly
Serine (Ser)            Thr
Threonine (Thr)         Ser
Tryptophan (Trp)        Tyr
Tyrosine (Tyr)          Trp; Phe
Valine (Val)            Ile; Leu; Met; Phe; Ala
- Mutations can be introduced into a nucleic acid sequence such that the encoded amino acid sequence is altered by, e.g., standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. In various embodiments, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. Certain amino acids have side chains with more than one classifiable characteristic. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, tryptophan, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tyrosine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a given polypeptide is replaced with another amino acid residue from the same side chain family.
In some embodiments, mutations can be introduced randomly along all or part of a given coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for given polypeptide biological activity to identify mutants that retain activity. Conversely, the invention also provides for variants with mutations that enhance or increase the endogenous biological activity. Following mutagenesis of the nucleic acid sequence, the encoded protein can be expressed by any recombinant technology known in the art and the activity/specificity of the protein can be determined. An increase, decrease, or elimination of a given biological activity of the variants disclosed herein can be readily measured by the ordinary person skilled in the art, i.e., by measuring the capability for binding a ligand and/or signal transduction.
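- As an illustration only, the table of example substitutions above can be encoded as a lookup so that a proposed amino acid change can be checked against it. The sketch below is not part of the disclosed methods; the names `EXEMPLARY_SUBSTITUTIONS` and `is_exemplary_substitution` are hypothetical, and residues for which the table gives no row are simply treated as having no listed substitutions.

```python
# Example substitutions transcribed from the table above (three-letter codes).
# Assumption: a residue with no row in the table has no listed substitutions.
EXEMPLARY_SUBSTITUTIONS = {
    "Ala": {"Val", "Leu", "Ile", "Gly"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn", "His"},
    "Glu": {"Asp"},
    "Gly": {"Pro", "Ala"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val", "Ala"},
    "Leu": {"Ile", "Val", "Met", "Ala", "Phe"},
    "Lys": {"Arg"},
    "Met": {"Leu", "Phe"},
    "Phe": {"Leu", "Val", "Ala"},
    "Pro": {"Gly"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu", "Met", "Phe", "Ala"},
}


def is_exemplary_substitution(original: str, replacement: str) -> bool:
    """Return True if original -> replacement appears in the table above."""
    return replacement in EXEMPLARY_SUBSTITUTIONS.get(original, set())


def count_listed_changes(changes):
    """Count how many (original, replacement) pairs are listed in the table."""
    return sum(1 for orig, repl in changes if is_exemplary_substitution(orig, repl))
```

- For instance, `is_exemplary_substitution("Lys", "Arg")` is true under this encoding, whereas a Lys to Glu change is not listed. Whether a listed change actually preserves sialyltransferase activity would still have to be determined experimentally, as described above.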
- In various embodiments, substitutions with natural amino acids are characterized using a BLOcks SUbstitution Matrix (a BLOSUM matrix). A non-limiting example of a BLOSUM matrix is the BLOSUM62 matrix, which is described in Styczynski et al. (2008) “BLOSUM62 miscalculations improve search performance” Nat Biotech 26 (3): 274-275, the entire content of which is incorporated herein by reference. The BLOSUM62 matrix is shown in FIG. 14.
- Substitutions scoring at least 4 on the BLOSUM62 matrix are referred to herein as “Class I substitutions”; substitutions scoring 3 on the BLOSUM62 matrix are referred to herein as “Class II substitutions”; substitutions scoring 2 or 1 on the BLOSUM62 matrix are referred to herein as “Class III substitutions”; substitutions scoring 0 or −1 on the BLOSUM62 matrix are referred to herein as “Class IV substitutions”; substitutions scoring −2, −3, or −4 on the BLOSUM62 matrix are referred to herein as “Class V substitutions.”
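- The class definitions above amount to a simple mapping from a BLOSUM62 score to a class label. A minimal sketch of that mapping follows; the function name `blosum62_substitution_class` is hypothetical, and the score itself is assumed to have been read from the BLOSUM62 matrix of FIG. 14 for the residue pair of interest.

```python
def blosum62_substitution_class(score: int) -> str:
    """Map a BLOSUM62 score to the substitution class defined in the text.

    The score is assumed to come from the BLOSUM62 matrix (FIG. 14) for the
    pair (original residue, substituted residue).
    """
    if score >= 4:
        return "Class I"
    if score == 3:
        return "Class II"
    if score in (1, 2):
        return "Class III"
    if score in (0, -1):
        return "Class IV"
    if score in (-2, -3, -4):
        return "Class V"
    # The text defines no class for scores below -4.
    raise ValueError(f"score {score} is outside the ranges defined in the text")
```

- A set of substitutions in a variant could then be tallied by class, for example to count how many Class I through Class V substitutions it carries relative to a naturally occurring enzyme.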
- Various embodiments of the subject application include lactose-utilizing sialyltransferase enzymes having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25 or more Class I, II, III, IV, or V substitutions compared to a naturally occurring lactose-utilizing sialyltransferase enzyme (such as a lactose-utilizing sialyltransferase enzyme mentioned herein), or any 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more of any combination of Class I, II, III, IV, and/or V substitutions compared to a naturally occurring lactose-utilizing sialyltransferase enzyme such as a lactose-utilizing sialyltransferase enzyme exemplified herein.
- Depending on context, a “conservative amino acid substitution” may refer to a mutation or to a difference between two sequences. For example, in some embodiments, a mutant comprises a conservative amino acid substitution compared to a naturally occurring protein, wherein the substitution was introduced into the mutant intentionally (e.g., by human-directed genetic modification) to produce a protein that is derived from the naturally occurring protein. In another example, one naturally occurring protein comprises a conservative amino acid substitution compared to another naturally occurring protein, in which case the “substitution” is a conservative difference between the two sequences at a given position when the sequences of each protein are aligned.
- In some embodiments, the lactose-utilizing sialyltransferase enzyme of the present disclosure is more α(2,6)-selective than the naturally occurring α(2,3) sialyltransferase. As used herein, an “α(2,6)-selective” enzyme effects transfer of sialic acid at a ratio of α(2,6):α(2,3) of at least 1:1, such as from about 1.2:1 to about 100:1, e.g., 1.2:1 to 50:1, 2:1 to 50:1, 3:1 to 50:1, 4:1 to 50:1, 1.2:1 to 40:1, 1.2:1 to 30:1, 1.2:1 to 20:1, 1.2:1 to 10:1, 2:1 to 10:1, 1. 3:1 to 10:1, or about 5:1 to about 10:1.
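- As a purely illustrative reading of the definition above, selectivity can be expressed as the ratio of α(2,6)-linked product to α(2,3)-linked product. The sketch below assumes that both products have been quantified in the same units (how they are measured is not specified here), and the function names are hypothetical.

```python
import math


def alpha26_to_alpha23_ratio(amount_26: float, amount_23: float) -> float:
    """Return the alpha(2,6):alpha(2,3) product ratio as a single number."""
    if amount_26 < 0 or amount_23 < 0:
        raise ValueError("product amounts must be non-negative")
    if amount_23 == 0:
        if amount_26 == 0:
            raise ValueError("no sialylated product was measured")
        return math.inf  # only alpha(2,6) product detected
    return amount_26 / amount_23


def is_alpha26_selective(amount_26: float, amount_23: float) -> bool:
    """Apply the 'at least 1:1' threshold from the definition above."""
    return alpha26_to_alpha23_ratio(amount_26, amount_23) >= 1.0
```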
- A variety of bacterial species may be used in the oligosaccharide biosynthesis methods provided herein, e.g., E. coli, Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris. Bacteria of the genus Bacillus may also be used, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans. Similarly, bacteria of the genera Lactobacillus and Lactococcus may be modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis. Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein. Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa). In various embodiments, bacteria comprising the characteristics described herein are cultured in the presence of lactose, and a sialylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. In some embodiments, the sialylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products.
In certain embodiments, a suitable production host bacterial strain is one that is not the same bacterial strain as the source bacterial strain from which the lactose-utilizing sialyltransferase enzyme-encoding nucleic acid sequence was identified.
- The bacterium utilized in the production methods described herein is genetically engineered to increase the efficiency and yield of sialylated oligosaccharide products. In various embodiments, the host production bacterium is characterized as having a reduced level of β-galactosidase activity, an ability to produce more UDP-GlcNAc or UDP-GlcNAc at a faster rate compared to a corresponding wild-type bacterium, an ability to produce more CMP-Neu5Ac or CMP-Neu5Ac at a faster rate compared to a corresponding wild-type bacterium, a defective or reduced sialic acid degradation pathway, an inactivated β-galactoside transacetylase gene, a lactose permease gene, or a combination thereof.
- In some embodiments, the bacterium comprises an ability to produce more UDP-GlcNAc or UDP-GlcNAc at a faster rate compared to a corresponding wild-type bacterium.
- The nucleotide sugar uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) is a key metabolic intermediate in bacteria, where it is involved in the synthesis and maintenance of the cell envelope. In all known bacterial classes, UDP-GlcNAc is used to make peptidoglycan (murein), a polymer comprising the bacterial cell wall whose structural integrity is absolutely essential for growth and survival. In addition, gram-negative bacteria use UDP-GlcNAc for the synthesis of lipid A, an important component of the outer cell membrane. Thus, for bacteria, the ability to maintain an adequate intracellular pool of UDP-GlcNAc is critical.
- The UDP-GlcNAc pool in E. coli is produced through the combined action of three glm genes, glmS (L-glutamine:D-fructose-6-phosphate aminotransferase), glmM (phosphoglucosamine mutase), and the bifunctional glmU (fused N-acetyl glucosamine-1-phosphate uridyltransferase and glucosamine-1-phosphate acetyl transferase) (FIG. 2). These three genes direct a steady flow of carbon to UDP-GlcNAc, a flow that originates with fructose-6-phosphate (an abundant molecule of central energy metabolism). Expression of the glm genes is under positive control by the transcriptional activator protein, NagC. When E. coli encounters glucosamine or N-acetyl-glucosamine in its environment, these molecules are each transported into the cell via specific membrane transport proteins and are used either to supplement the flow of carbon to the UDP-GlcNAc pool, or alternatively they are consumed to generate energy, under the action of nag operon gene products (i.e. nagA [N-acetylglucosamine-6-phosphate deacetylase] and nagB [glucosamine-6-phosphate deaminase]). In contrast to the glm genes, expression of nagA and nagB is under negative transcriptional control, but by the same regulatory protein as the glm genes, i.e. NagC. NagC is thus bi-functional, able to activate GlcNAc synthesis, while at the same time repressing the degradation of glucosamine-6-phosphate and N-acetylglucosamine-6-phosphate. The binding of NagC to specific regulatory DNA sequences (operators), whether such binding results in gene activation or repression, is sensitive to fluctuations in the cytoplasmic level of the small-molecule inducer and metabolite, GlcNAc-6-phosphate. Intracellular concentrations of GlcNAc-6-phosphate increase when N-acetylglucosamine is available as a carbon source in the environment, and thus under these conditions the expression of the glm genes (essential to maintain the vital UDP-GlcNAc pool) would decrease, unless a compensatory mechanism is brought into play. E. coli maintains a baseline level of UDP-GlcNAc synthesis through continuous expression of nagC directed by two constitutive promoters, located within the upstream nagA gene. This constitutive level of nagC expression is supplemented approximately threefold under conditions where the degradative nag operon is induced, and by this means E. coli ensures an adequate level of glm gene expression under all conditions, even when N-acetylglucosamine is being utilized as a carbon source. Many hMOS incorporate GlcNAc into their structures directly, and many also incorporate sialic acid, a sugar whose synthesis involves consumption of UDP-GlcNAc. Thus, synthesis of many types of hMOS in engineered E. coli carries the significant risk of reduced product yield and compromised cell viability resulting from depletion of the bacterium's UDP-GlcNAc pool. One way to address this problem during engineered synthesis of GlcNAc- or sialic acid-containing hMOS is to boost the UDP-GlcNAc pool through simultaneous over-expression of nagC, or preferably by simultaneous over-expression of both nagC and glmS.
- In some embodiments relating to E. coli or a bacterium other than E. coli, the bacterium preferably comprises increased production of UDP-GlcNAc. As noted hereinabove, an exemplary means to achieve this is by over-expression of a positive endogenous regulator of UDP-GlcNAc synthesis, for example, overexpression of the nagC gene of E. coli. In certain embodiments, this nagC over-expression is achieved by providing additional copies of the nagC gene on a plasmid vector or by integrating additional nagC gene copies into the host cell chromosome. In various embodiments, over-expression is achieved by modulating the strength of the ribosome binding sequence directing nagC translation or by modulating the strength of the promoter directing nagC transcription. In some embodiments, the intracellular UDP-GlcNAc pool may be enhanced by other means, for example by over-expressing the E. coli glmS (L-glutamine:D-fructose-6-phosphate aminotransferase) gene, or alternatively by over-expressing the E. coli glmY gene (a positive translational regulator of glmS), or alternatively by over-expressing the E. coli glmZ gene (another positive translational regulator of glmS), or alternatively by simultaneously using a combination of approaches. In various embodiments, for example, the nagC (GenBank Protein Accession BAA35319.1, incorporated herein by reference) and glmS (GenBank Protein Accession NP_418185.1, incorporated herein by reference) genes which encode the sequences provided herein are overexpressed simultaneously in the same host cell in order to increase the intracellular pool of UDP-GlcNAc.
- In certain embodiments, the ability to produce more CMP-Neu5Ac or CMP-Neu5Ac at a faster rate compared to a corresponding wild-type bacterium comprises the expression of any one of, or any combination of, or all three of an N-acetylneuraminate synthase, a UDP-N-acetylglucosamine 2-epimerase, and an N-acetylneuraminate cytidylyltransferase. Non-limiting examples of these enzymes include NeuB, NeuC, and NeuA from Campylobacter jejuni (such as Campylobacter jejuni ATCC43484). In some embodiments, neuBCA genes are co-expressed in an operon.
- In various embodiments, the defective or reduced sialic acid degradation pathway comprises the inactivation or deletion of any one of, any combination of, or each of a nanR gene, a nanA gene, a nanT gene, a nanE gene, or a nanK gene. In some embodiments the nanA, nanT, and nanE genes are inactivated or deleted in the bacterium.
- As used herein, an “inactivated” or “inactivation of a” gene, encoded gene product (i.e., polypeptide), or pathway refers to reducing or eliminating the expression (i.e., transcription or translation), protein level (i.e., translation, rate of degradation), or enzymatic activity of the gene, gene product, or pathway. In the instance where a pathway is inactivated, preferably one enzyme or polypeptide in the pathway exhibits reduced or negligible activity. In some embodiments, the enzyme in the pathway is altered, deleted or mutated such that the product of the pathway is produced at low levels compared to a wild-type bacterium or an intact pathway. In certain embodiments, the product of the pathway is not produced. In various embodiments, the level of a compound that is utilized (e.g., used as a substrate, altered, catalyzed, or otherwise reduced or consumed) by the pathway is increased. In some embodiments, inactivation of a gene is achieved by deletion or mutation of the gene or regulatory elements of the gene such that the gene is no longer transcribed or translated. In certain embodiments, inactivation of a polypeptide can be achieved by deletion or mutation of the gene that encodes the gene product or mutation of the polypeptide to disrupt its activity. Inactivating mutations include additions, deletions or substitutions of one or more nucleotides or amino acids of a nucleic acid or amino acid sequence that result in the reduction or elimination of the expression or activity of the gene or polypeptide. In various embodiments, inactivation of a polypeptide is achieved through the addition of exogenous sequences (e.g., tags) to the N or C-terminus of the polypeptide such that the activity of the polypeptide is reduced or eliminated (e.g., by steric hindrance).
- A host bacterium suitable for the production systems described herein exhibits an enhanced or increased cytoplasmic or intracellular pool of lactose and/or UDP-GlcNAc and/or CMP-Neu5Ac. In some embodiments, the bacterium is E. coli and endogenous E. coli metabolic pathways and genes are manipulated in ways that result in the generation of increased cytoplasmic concentrations of lactose and/or UDP-GlcNAc and/or CMP-Neu5Ac, as compared to levels found in wild type E. coli. Preferably, the bacterium accumulates an increased intracellular lactose pool and an increased intracellular UDP-GlcNAc and/or CMP-Neu5Ac pool. For example, the bacteria contain at least 10%, 20%, 50%, or 2×, 5×, 10× or more of the levels of intracellular lactose and/or intracellular UDP-GlcNAc and/or CMP-Neu5Ac compared to a corresponding wild type bacterium that lacks the genetic modifications described herein.
- In certain embodiments, increased intracellular concentration of lactose in the host bacterium compared to wild-type bacterium is achieved by manipulation of genes and pathways involved in lactose import, export and catabolism. In non-limiting examples, described herein are methods of increasing intracellular lactose levels in E. coli genetically engineered to produce a human milk oligosaccharide by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI). During construction of this deletion, the lacIq promoter is placed immediately upstream of (contiguous with) the lactose permease gene, lacY, i.e., the sequence of the lacIq promoter is directly upstream and adjacent to the start of the sequence encoding the lacY gene, such that the lacY gene is under transcriptional regulation by the lacIq promoter. The modified strain maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type chromosomal copy of the lacZ (encoding β-galactosidase) gene responsible for lactose catabolism. Thus, an intracellular lactose pool is created when the modified strain is cultured in the presence of exogenous lactose.
- In some embodiments, increasing the intracellular concentration of lactose in E. coli involves inactivation of a β-galactoside transacetylase gene such as the lacA gene. With respect to an E. coli bacterium, an inactivating mutation, null mutation, or deletion of lacA prevents the formation of intracellular acetyl-lactose, which not only removes this molecule as a contaminant from subsequent purifications, but also eliminates E. coli's ability to export excess lactose from its cytoplasm (Danchin A. Cells need safety valves. Bioessays 2009, July; 31(7):769-73.), thus greatly facilitating purposeful manipulations of the E. coli intracellular lactose pool.
- In certain embodiments, a functional lactose permease gene is present in the bacterium. In various embodiments, the lactose permease gene is an endogenous lactose permease gene or an exogenous lactose permease gene. For example, the lactose permease gene may comprise an E. coli lacY gene (e.g., GenBank Accession Number V00295 (GI:41897), incorporated herein by reference). Many bacteria possess the inherent ability to transport lactose from the growth medium into the cell, by utilizing a transport protein that is either a homolog of the E. coli lactose permease (e.g., as found in Bacillus licheniformis), or a transporter that is a member of the ubiquitous PTS sugar transport family (e.g., as found in Lactobacillus casei and Lactobacillus rhamnosus). For bacteria lacking an inherent ability to transport extracellular lactose into the cell cytoplasm, this ability may be conferred by an exogenous lactose transporter gene (e.g., E. coli lacY) provided on recombinant DNA constructs, and supplied either on a plasmid expression vector or as exogenous genes integrated into the host chromosome.
- As described herein, in some embodiments, the host bacterium preferably has a reduced level of β-galactosidase activity. In the embodiment in which the bacterium is characterized by the deletion of the endogenous β-galactosidase gene, an exogenous β-galactosidase gene may be introduced to the bacterium. For example, a plasmid expressing an exogenous β-galactosidase gene may be introduced to the bacterium, or recombined or integrated into the host genome. For example, the exogenous β-galactosidase gene may be inserted into a gene that is inactivated in the host bacterium, such as the lon gene.
- In some embodiments, the exogenous β-galactosidase gene is a functional β-galactosidase gene characterized by a reduced or low level of β-galactosidase activity compared to β-galactosidase activity in wild-type bacteria lacking any genetic manipulation. Exemplary β-galactosidase genes include E. coli lacZ and β-galactosidase genes from any of a number of other organisms (e.g., the lac4 gene of Kluyveromyces lactis GenBank Accession Number M84410 (GI:173304), incorporated herein by reference) that catalyzes the hydrolysis of galactosides into monosaccharides. The level of β-galactosidase activity in wild-type E. coli bacteria is, for example, 1,000 units (e.g., when the bacterium is cultured in the presence of lactose). Thus, the reduced β-galactosidase activity level encompassed by engineered host bacterium of the present invention includes less than 1,000 units, less than 900 units, less than 800 units, less than 700 units, less than 600 units, less than 500 units, less than 400 units, less than 300 units, less than 200 units, less than 100 units, or less than 50 units (e.g., when the bacterium is cultured in the presence of lactose). In some embodiments, low, functional levels of β-galactosidase include β-galactosidase activity levels of between 0.05 and 1,000 units, e.g., between 0.05 and 750 units, between 0.05 and 500 units, between 0.05 and 400 units, between 0.05 and 300 units, between 0.05 and 200 units, between 0.05 and 100 units, between 0.05 and 50 units, between 0.05 and 10 units, between 0.05 and 5 units, between 0.05 and 4 units, between 0.05 and 3 units, or between 0.05 and 2 units of β-galactosidase activity (e.g., when the bacterium is cultured in the presence of lactose).
In certain embodiments, low, functional levels of β-galactosidase include β-galactosidase activity levels of between 1 and 1,000 units, e.g., between 1 and 750 units, between 1 and 500 units, between 1 and 400 units, between 1 and 300 units, between 1 and 200 units, between 1 and 100 units, between 1 and 50 units, between 1 and 10 units, between 1 and 5 units, between 1 and 4 units, between 1 and 3 units, or between 1 and 2 units of β-galactosidase activity (e.g., when the bacterium is cultured in the presence of lactose). For unit definition and assays for determining β-galactosidase activity, see Miller J H, Laboratory CSH. Experiments in molecular genetics. Cold Spring Harbor Laboratory Cold Spring Harbor, N.Y.; 1972; (incorporated herein by reference). This low level of cytoplasmic β-galactosidase activity is not high enough to significantly diminish the intracellular lactose pool. The low level of β-galactosidase activity is very useful for the facile removal of undesired residual lactose at the end of fermentations.
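- For orientation, the unit calculation commonly attributed to the Miller assay can be sketched as follows. The formula used here, 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600) with t in minutes and v in mL, is the widely cited form of the calculation and is given as an assumption; the cited Miller reference remains the authoritative source for the unit definition. The function names and the inclusive default bounds of 0.05 and 1,000 units are likewise illustrative choices based on the broadest range recited above.

```python
def miller_units(od420: float, od550: float, od600: float,
                 minutes: float, volume_ml: float) -> float:
    """Compute beta-galactosidase activity in Miller units.

    Assumed formula (commonly cited form of the Miller assay):
        1000 * (OD420 - 1.75 * OD550) / (minutes * volume_ml * OD600)
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def is_low_functional_level(units: float,
                            lower: float = 0.05,
                            upper: float = 1000.0) -> bool:
    """Check whether an activity falls within a recited range of units.

    Treating the bounds as inclusive is an assumption of this sketch.
    """
    return lower <= units <= upper
```

- Narrower ranges recited above (for example, between 1 and 100 units) can be checked by passing different `lower` and `upper` values.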
- Optionally, the bacterium has an inactivated thyA gene. In various embodiments, a mutation in a thyA gene in the host bacterium allows for the maintenance of plasmids that carry thyA as a selectable marker gene. Exemplary alternative selectable markers include antibiotic resistance genes such as BLA (beta-lactamase), or proBA genes (to complement a proAB host strain proline auxotrophy) or purA (to complement a purA host strain adenine auxotrophy).
- In some embodiments, a purified oligosaccharide, e.g., 3′-SL, 6′-SLNT, 3′-S3FL, SLNT a, SLNT b, DSLNT, SLNFP II, or SLNT c, is one that is at least 85%, 90%, 95%, 98%, 99%, or 100% (w/w) of the desired oligosaccharide. Purity may be assessed by any known method, e.g., thin layer chromatography or other chromatographic techniques known in the art. Included herein is a method of purifying a sialylated oligosaccharide produced by a genetically engineered bacterium described herein, which method comprises separating the desired sialylated oligosaccharide from contaminants in a bacterial cell lysate or bacterial cell culture supernatant of the bacterium. In some embodiments, a sialylated oligosaccharide may be added to a food or beverage composition to increase the level of the sialylated oligosaccharide in the composition. In some examples, the sialylated oligosaccharide is added to dried or powder milk or milk product, e.g., infant formula. In some embodiments, it is added to a liquid milk. In other embodiments, it is added to a non-milk dairy product, e.g. yogurt or kefir. In various embodiments, a composition provided herein is not milk. In certain embodiments, a composition provided herein does not comprise milk.
- In various embodiments, sialylated oligosaccharides are purified and used in a number of products for consumption by humans as well as animals, such as companion animals (dogs, cats) as well as livestock (bovine, equine, ovine, caprine, or porcine animals, as well as poultry). For example, a food, beverage, dietary supplement, or pharmaceutical composition may comprise a purified 3′-SL, 6′-SL, 3′-S3FL, SLNT a, SLNT b, DSLNT, SLNFP II, or SLNT c. In some embodiments, the composition comprises an excipient that is suitable for oral administration.
- In certain embodiments, a method of producing a pharmaceutical composition comprising a purified human milk oligosaccharide (HMOS) (such as a sialylated oligosaccharide present in human milk) may be carried out by culturing a bacterium described herein, purifying the HMOS produced by the bacterium, and combining the HMOS with an excipient or carrier to yield a dietary supplement for oral administration. These compositions are useful in methods of preventing or treating enteric and/or respiratory diseases in infants and adults. Accordingly, the compositions are administered to a subject suffering from or at risk of developing such a disease.
- Included herein are methods of treating, preventing, or reducing the risk of infection in a subject comprising administering to said subject a composition comprising a purified recombinant human milk oligosaccharide, wherein the HMOS binds to a pathogen and wherein the subject is infected with or at risk of infection with the pathogen. In some embodiments, the infection is caused by a Norwalk-like virus or Campylobacter jejuni. In certain embodiments, the subject is a mammal. In various embodiments, the mammal is, e.g., any mammal, e.g., a human, a primate, a mouse, a rat, a dog, a cat, a cow, a horse, or a pig. In some embodiments, the mammal is a human. In certain embodiments, the compositions are formulated into animal feed (e.g., pellets, kibble, mash) or animal food supplements for companion animals, e.g., dogs or cats, as well as livestock or animals grown for food consumption, e.g., cattle, sheep, pigs, chickens, and goats. In various embodiments, the purified HMOS is formulated into a powder (e.g., infant formula powder or adult nutritional supplement powder, each of which is mixed with a liquid such as water or juice prior to consumption) or in the form of tablets, capsules or pastes or is incorporated as a component in dairy products such as milk, cream, cheese, yogurt or kefir, or as a component in any beverage, or combined in a preparation containing live microbial cultures intended to serve as probiotics, or in prebiotic preparations to enhance the growth of beneficial microorganisms either in vitro or in vivo.
- Included herein is a nucleic acid construct or an expression vector (such as a viral vector or a plasmid) comprising a nucleic acid encoding at least one lactose-utilizing sialyltransferase enzyme or a variant or fragment thereof, as described herein. The vector can further include one or more regulatory elements, e.g., a heterologous promoter. By “heterologous” is meant that the control sequence and protein-encoding sequence originate from different sources. For example, the sources may be different bacterial strains or species. The regulatory elements can be operably linked to a gene encoding a protein, a gene construct encoding a fusion protein gene, or a series of genes linked in an operon in order to express the fusion protein. Also provided herein is an isolated recombinant cell, e.g., a bacterial cell containing an aforementioned nucleic acid molecule or vector. The nucleic acid is optionally integrated into the genome of the host bacterium. In some embodiments, the nucleic acid construct further comprises one or more enzymes that are not lactose-utilizing sialyltransferase enzymes.
- As used herein, an “expression vector” is a DNA or RNA vector that is capable of effecting expression of one or more polynucleotides. Preferably, the expression vector is also capable of replicating within the host cell. Expression vectors can be either prokaryotic or eukaryotic, and typically include plasmids. Expression vectors of the present invention include any vectors that function (i.e., direct gene expression) in host cells of the present invention, including in one of the prokaryotic or eukaryotic cells described herein, e.g., gram-positive, gram-negative, pathogenic, non-pathogenic, commensal, cocci, bacillus, or spiral-shaped bacterial cells; archaeal cells; or protozoan, algal, fungi, yeast, plant, animal, vertebrate, invertebrate, arthropod, mammalian, rodent, primate, or human cells. Expression vectors of the present invention contain regulatory sequences such as transcription control sequences, translation control sequences, origins of replication, and other regulatory sequences that are compatible with the host cell and that control the expression of a polynucleotide. In particular, expression vectors of the present invention include transcription control sequences. Transcription control sequences are sequences which control the initiation, elongation, and termination of transcription. Particularly important transcription control sequences are those which control transcription initiation such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in at least one of the cells of the present invention. A variety of such transcription control sequences are known to those skilled in the art.
- A “heterologous promoter” is a promoter which is different from the promoter to which a gene or nucleic acid sequence is operably linked in nature.
- The term “overexpress” or “overexpression” refers to a situation in which more factor is expressed by a genetically-altered cell than would be, under the same conditions, by a wild-type cell. Similarly, if an unaltered cell does not express a factor that it is genetically altered to produce, the term “express” (as distinguished from “overexpress”) is used indicating the wild type cell did not express the factor at all prior to genetic manipulation.
- A polypeptide or class of polypeptides may be defined by the extent of identity (% identity) of its amino acid sequence to a reference amino acid sequence, or by having a greater % identity to one reference amino acid sequence than to another. A variant of any of the genes or gene products disclosed herein may have, e.g., 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the nucleic acid or amino acid sequences described herein. The term “% identity,” in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. For example, % identity is relative to the entire length of the coding regions of the sequences being compared, or the length of a particular fragment or functional domain thereof. Variants as disclosed herein also include homologs, orthologs, or paralogs of the genes or gene products described herein. In some embodiments, variants may demonstrate a percentage of homology or identity, for example, at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity, in conserved domains important for biological function, e.g., in a functional domain, e.g., a catalytic domain.
- For sequence comparison, one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters. Percent identity is determined using BLAST. For the BLAST searches, the following parameters are employed: (1) Expect threshold is 10; (2) Gap cost is Existence: 11 and Extension: 1; (3) The Matrix employed is BLOSUM62; (4) The filter for low complexity regions is “on.”
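- As a minimal illustration of the percent identity arithmetic described above, the following Python sketch counts identical residues across two sequences that have already been aligned (gaps written as “-”). It is not an implementation of BLAST and performs no alignment itself. The function name `percent_identity`, the dictionary `BLAST_PARAMS` and its keys, and the choice of the alignment length as the denominator are assumptions made for illustration only.

```python
# Search parameters as recited in the text; the key names are illustrative only
# and do not correspond to the option names of any particular BLAST program.
BLAST_PARAMS = {
    "expect_threshold": 10,
    "gap_existence": 11,
    "gap_extension": 1,
    "matrix": "BLOSUM62",
    "low_complexity_filter": True,
}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned and of equal length, with '-' marking
    gaps. Assumption: the denominator is the number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only when both rows carry the same residue
    # and that residue is not a gap character.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

- For example, `percent_identity("MEIYLDYASL", "MEIYLDHASL")` compares two ten-residue fragments that differ at a single position and returns 90.0.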
- As used herein, the term “about” in the context of a numerical value or range means ±10% of the numerical value or range recited or claimed, unless the context requires a more limited range.
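- The ±10% reading of “about” can be sketched as simple arithmetic. The Python helpers below are illustrative only; the names `about_range` and `is_about` are hypothetical, and the sketch does not model the case in which the context requires a more limited range.

```python
from typing import Tuple


def about_range(value: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Return the (low, high) bounds implied by 'about value' at +/-10%."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(candidate: float, value: float, tolerance: float = 0.10) -> bool:
    """True when candidate lies within the +/-10% window around value."""
    low, high = about_range(value, tolerance)
    return low <= candidate <= high
```

- Under these assumptions, `is_about(54, 50)` is true and `is_about(56, 50)` is false.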
- In the descriptions above and in the claims, phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features. The term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.” A similar interpretation is also intended for lists including three or more items. For example, the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.” In addition, use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible.
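- The interpretation of “at least one of” given above amounts to listing every non-empty combination of the recited elements. The following Python sketch enumerates those combinations; the function name `at_least_one_of` is hypothetical and is used only for illustration.

```python
from itertools import combinations


def at_least_one_of(elements):
    """Return every non-empty combination of the listed elements.

    Combinations are ordered by size (single elements first), matching the
    order in which the text spells out the alternatives.
    """
    items = list(elements)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]
```

- For a list of three elements A, B, and C, this yields seven combinations: each element alone, each pair, and all three together.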
- It is understood that where a parameter range is provided, all integers within that range, and tenths thereof, are also provided by the invention. For example, “0.2-5 mg” is a disclosure of 0.2 mg, 0.3 mg, 0.4 mg, 0.5 mg, 0.6 mg etc. up to and including 5.0 mg.
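- The enumeration of a parameter range in steps of one tenth can be sketched as follows. The Python function name `tenths_in_range` is hypothetical, and the sketch assumes that the lower bound is itself stated to the nearest tenth; `Decimal` is used so that the steps of 0.1 accumulate without binary floating-point error.

```python
from decimal import Decimal


def tenths_in_range(low: str, high: str):
    """List every value from low to high, inclusive, in steps of 0.1.

    Bounds are passed as strings (e.g. "0.2", "5") so that Decimal
    represents them exactly.
    """
    step = Decimal("0.1")
    start, stop = Decimal(low), Decimal(high)
    if start > stop:
        raise ValueError("low must not exceed high")
    values = []
    current = start
    while current <= stop:
        values.append(current)
        current += step
    return values
```

- Applied to the example in the text, `tenths_in_range("0.2", "5")` begins 0.2, 0.3, 0.4 and ends at 5.0, for 49 values in total.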
- As used herein, an “isolated” or “purified” nucleic acid molecule, polynucleotide, polypeptide, or protein, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. Purified compounds are at least 60% by weight (dry weight) the compound of interest. Preferably, the preparation is at least 75%, more preferably at least 90%, and most preferably at least 99%, by weight the compound of interest. For example, a purified compound is one that is at least 90%, 91%, 92%, 93%, 94%, 95%, 98%, 99%, or 100% (w/w) of the desired compound by weight. Purity is measured by any appropriate standard method, for example, by column chromatography, thin layer chromatography, or high-performance liquid chromatography (HPLC) analysis. A purified or isolated polynucleotide (ribonucleic acid (RNA) or deoxyribonucleic acid (DNA)) is free of the genes/nucleic acids or sequences/amino acids that flank it in its naturally-occurring state. Purified also defines a degree of sterility that is safe for administration to a human subject, e.g., lacking infectious or toxic agents.
- Similarly, by “substantially pure” when referring to a nucleotide or polypeptide is meant one that has been separated from the components that naturally accompany it. Typically, the nucleotides and polypeptides are substantially pure when they are at least 60%, 70%, 80%, 90%, 95%, or even 99%, by weight, free from the proteins and naturally-occurring organic molecules with which they are naturally associated.
- In some embodiments, the term “substantially pure” or “substantially free” with respect to a particular composition means that the composition comprising the sialylated oligosaccharide contains less than 50%, less than 40%, less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, or less than 1% by weight of other substances. In some embodiments, “substantially pure” or “substantially free of” refers to a substance free of other substances, including impurities. Impurities may, for example, include by-products, contaminants, degradation products, water, and solvents.
- The transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. By contrast, the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. The transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.
- “Subject” as used herein refers to any organism to which a sialylated oligosaccharide may be administered. The subject may be a human or a non-human animal. The subject may be a mammal. The mammal may be a primate or a non-primate. The mammal can be a primate such as a human; a non-primate such as, for example, dog, cat, horse, cow, pig, mouse, rat, camel, llama, goat, rabbit, sheep, hamster, and guinea pig; or a non-human primate such as, for example, monkey, chimpanzee, gorilla, orangutan, and gibbon. The subject may be of any age or stage of development, such as, for example, an adult, an adolescent, or an infant. In preferred embodiments, the subject is a human individual less than 2 years of age, an elderly subject (e.g., 65 or more years of age), an immunocompromised subject (e.g., suffering from an autoimmune disorder, undergoing immunosuppressive therapy associated with transplantation, or a subject diagnosed with cancer and undergoing chemotherapy), a malnourished individual, an individual recovering from a dysbiosis (for example of the gut microbiota following treatment with antibiotics), or any individual that would benefit from establishment or re-establishment of a healthy gut microbiota.
- The terms “treating” and “treatment” as used herein refer to the administration of an agent or formulation to a clinically symptomatic individual afflicted with an adverse condition, disorder, or disease, so as to effect a reduction in severity and/or frequency of symptoms, eliminate the symptoms and/or their underlying cause, and/or facilitate improvement or remediation of damage. The terms “preventing” and “prevention” refer to the administration of an agent or composition to a clinically asymptomatic individual who is susceptible to a particular adverse condition, disorder, or disease, and thus relates to the prevention of the occurrence of symptoms and/or their underlying cause.
- By the terms “effective amount” and “therapeutically effective amount” of a formulation or formulation component is meant a nontoxic but sufficient amount of the formulation or component to provide the desired effect.
- As used herein, the singular forms “a,” “an,” and “the” include the plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a disease,” “an oligonucleotide,” or “a nucleic acid” is a reference to one or more such embodiments, and includes equivalents thereof known to those skilled in the art and so forth.
- As used herein, “pharmaceutically acceptable” carrier or excipient refers to a carrier or excipient that is suitable for use with humans and/or animals without undue adverse side effects (such as toxicity, irritation, and allergic response) commensurate with a reasonable benefit/risk ratio. It can be, e.g., a pharmaceutically acceptable solvent, suspending agent or vehicle, for delivering the instant compounds to the subject.
- Unless required otherwise by context, the terms “polypeptide” and “protein” are used interchangeably.
- Exemplary Sequences Disclosed Herein include the Following:
-
(Pst6-224) SEQ ID NO: 1 MKNFLLLTLILLTACNNSEENTQSIIKNDINKTIIDEEYVNLEPINQSNISFTKHSWVQTCG TQQLLTEQNKESISLSVVAPRLDDDEKYCFDFNGVSNKGEKYITKVTLNVVAPSLEVYV DHASLPTLQQLMDIIKSEEENPTAQRYIAWGRIVPTDEQMKELNITSFALINNHTPADLV QEIVKQAQTKHRLNVKLSSNTAHSFDNLVPILKELNSFNNVTVTNIDLYDDGSAEYVNL YNWRDTLNKTDNLKIGKDYLEDVINGINEDTSNTGTSSVYNWQKLYPANYHFLRKDYL TLEPSLHELRDYIGDSLKQMQWDGFKKFNSKQQELFLSIVNFDKQKLQNEYNSSNLPNF VFTTGTTVWAGNHEREYYAKQQINVINNAINESSPHYLGNSYDLFFKGHPGGGIINTLIMQ NYPSMVDIPSKISFEVLMMTDMLPDAVAGIASSLYFTIPAEKIKFIVFTSTETITDRETALR SPLVQVMIKLGIVKEENVLFWADLPNCETGVCIAV (BstC) SEQ ID NO: 2 MRKIITFFSLFFSISAWCQKMEIYLDYASLPSLNMILNLVENKNNEKVERIIGFERFDFNKE ILNSFSKERIEFSKVSILDIKERSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLF HKINIEKLYLVDDGSGNYVDLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRY TWHKIFPTEYILLRPDYLDIDEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFD QNIFNELNIGTKNKEYKTFIFTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDI KIFFKGHPKGDDINDYIIRKTGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNI DKVVFLGSEKIKNENDAKSQTLSKLMLMLNVITPEQIFFEEMPNPINF (BstD) SEQ ID NO: 3 MFKIKSYGKNPQLQAVDIYIDFATIPSLSYFLHFLKHKHDHQRLRLFSLARFEMPQTVIEQ YEGIIIQFSRNVEHNVEPLLEQLQTILSQEGKQFELHLHLNLFHSFEMFLNLSPTYTKYKEK ISKIVLHLYDDGSEGVMKQYQLQKSSSLVQDLAATKASLVSLFENGEGSFSQIDLIRYVW NAVLETHYYLLSDHFLLDEKLQPLKAELGHYQLLNLSTYQYLSSEDLLWLKQILKIDAE LESLMQKLTAQPVYFFSGTTFLG (BstE) SEQ ID NO: 4 MLIQQNLEIYLDYATIPSLACFMHFIQHKDDVDSIRLFGLARFDIPQSIIDRYPANHLFYHN IDNRDLTAVLNQLADILAQENKRFQINLHLNLFHSIDLFFAIYPIYQQYQHKISTIQLQLYD DGSEGIVTQHSLCKIADLEQLILQHKNVLLELLTKGTANVPNPTLLRYLWNNIIDSQFHLI SDHFLQHPKLQPLKRLLKRYTILDFTCYPRFNAEQKQLLKEILHISNELENLLKLLKQHNT FLFTGTTAFNLDQEKLDLLTQLHILLLNEHQNPHSTHYIGNNYLLLIKGHANSPALNHTL ALHFPDAIFLPANIPFIFAMLGFTPNKMGGFASTSYINYPTENINHLFFLTSDQPSIRTKW LDYEKQFGLMYSLLAMQKINEDQAFMCTIHN (BstH) SEQ ID NO: 5 MKRLFRLFLCLALLSGTAACSDDEVSQNLIVINGGEHFLSLDGLARAGKISVLAPAPWR VTKAAGDTWFRLSATEGPAGYSEVELSLDENPGAARSAQLAFACGDAIVPFRLSQGALS AGYDSPDYYFYVTFGTMPTLYAGIHLLSHDKPGYVFYSRSKTFDPAEFPARAEVTTAAD RTADATQAEMEAMAREMKRRILEINSADPTAVFGLYVDDLRCRIGYDWFVAQGIDSAR 
VKVSMLSDGTGTYNNFYNYFGDAATAEQNWESYASEVEALDWNHGGRYPETRSLPEF ESYTWPYYLSTRPDYRLVVQDGSLLESSCPFITEKLGEMEIESIQPYEMLSALPESSRKRF YDMAGFDYDKFAALFDASPKKNLIIIGTSHADDASARLQRDYVARIMEQYGAQYDVFF KPHPADTTSAGYETEFPGLTLLPGQMPFEIFVWSLIDRVDMIGGYPSTVFLTVPVDKVRFI FAADAASLVRPLNILIFRDATDVEWMQ (BstI) SEQ ID NO: 6 MEFCKMATTQKICVYLDYATIPSLNYILHFAQHFEDQETIRLFGLSRFHIPESVIQRYPKG VVQFYIDNQEKDFSALLLALKNILIEVKQQQRKCEIELHLNLFHYQLLLLPFLSLYLDTQD YCHLTLKFYDDGSEAISALQELALAPDLAAQIQFEKQQFDELVVKKSFKLSLLSRYFWG KLFESEYIWFNQAILQKAELQILKQEISSSRQMDFAIYQQMSDEQKQLVLEILNIDLNKVA YLKQLMENQPSFLFLGTTLFNITQETKTWLMQMHVDLIQQYCLPSGQFFNNKAGYLCF YKGHPNEKEMNQMILSQFKNLIALPDDIPLEILLLLGVIPSKVGGFASSALFNTTPAQIENI IFFTPRYFEKDNRLHATQYRLMQGLIELGYLDAEKSVTHFEIMQLLTKE (BstJ) SEQ ID NO: 7 MLVNNQSHNPKLICWQRHPVNDEALLQGINAASFVSIASLCQHAATLLAGHPHSHITIYG NTYWSKDLARLIRYLTRISGVEIKKLELIDDGSSEYQKMFYWQRLSSEEQTRDLATGLK NLKSYLSGNDNKLLRLLTGHSNKLPRRLSSFMNWHQLFPTTYHMLRMDYLDKPELHQL KQYLGNNAQQIRWNYIADNLFDDEQQSLFYQLLGISLAEQKQLRAGRQQLHDFMFIGV DSSNASSKLQINVIADSRQESGIIPTITAKKMLFKGHPFANFNQTIVDAHQMGEMPAMIPF ETLIMTGNLPQKVGGMASSLYFSLPNNYHIEYIVFSGSKKDLEQHALLQIMLYLKVISPE RVYFSEQFKSC (HAC1268) SEQ ID NO: 8 MGTIKKPLIIAGNGPSIKDLDYALFPKDFDVFRCNQFYFEDKYYLGREIKGVFFNPCVLSS QMQTVQYLMDNGEYSIERFFCSVSTDRHDFDGDYQTILPVDGYLKAHYPFVCDTFSLFK GHEEILKHVKYHLKTYSKELSAGVLMLLSAVVLGYKEIYLVGIDFGASSWGHFYDESQS QHFSNHMADCHNIYYDMLTICLCQKYAKLYALAPNSPLSHLLTLNPQAKYPFELLDKPI GYTSDLIISSPLEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEE KLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLL EFKNIEEKLLEFKNIEEKLLASRLNNILRKIKRKILPFFWGGGVTPTLKVSFRWGAA (BstM) SEQ ID NO: 9 MKKPLIIAGNGPSIKDLDYSLFPKDFEVFRCNQFYFEDKYYLGREIKGVFFNPCVLSSQM QTAQYLMDNGEYSIERFFCSVSTDRHDFDGDYQTILPVEGYLKAHYPFVCDTFSLFKGH EEILRHVKYHLKTYSKELSAGVLMLLSAVVLGYKEIYLVGIDFGASSWGHFYDESQSQH FSNHMADCHNIYYDMFTICLCQKYAKLYALAPNSPLRHILALNPQAKYHFELLDKPIGY TSDLIVSLPLEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLVNRLKNILRKIKRKILPF WGGGGNTHLKVSFRWGVA (BstN) SEQ ID NO: 10 MSEKIFSQVDEKNQKKPLIIAGNGPSIKDLDYSLFPKDFDVFRCNQFYFEDKYYLGKEVK 
GVFFNPCVFHNQMNTAKHLIDNNEYYIEQFFCSVSKEQHDFNGDYQTILSVDEYLRANY PFVRDTFSLFGEHEELLNHVKYHLKTYSKELSAGVLMLLSAIVLGYKEIYLVGVDFGANS WGHFYDDNQSQHFINHMADCHNIYYDMLTIYLCQKYAKLYALVPNSPLNHLLPLNLQA NHVFELLDKPIGYTSDLIVSSPLEEKLLESKNIDERFSQNKSFKNYLQRLKDKFLQMIFRG GGVITIPRVIFKGKFA (pG543) >pEC23′-(T7)bstN-neuBCA-thyA_(pG543) SEQ ID NO: 11 TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGC TTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGG TGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCG GTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAACCTGTAT ATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTC GCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACAT CAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCA TATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGC AAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTT TGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCAC CTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTAC ACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTA TGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACGAGATCACTACGGTA CTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCG AACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAA ACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCC AGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCT GGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCG CGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGT TTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCT AATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAAT TCTTCGAGACGCCTTCCCGAAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCG ATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTA AGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTA CTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGAT 
TACTGCAGGTCGACTTATTTTTTCCATATCTGTTCAACCTTTTTTAAATCCTCCAAACAGTCAA TATCTAAACTTGAGCTTTCGTCCATTAAAAAATGCTTGGTTTTGCTTTGTAAAAAGCTAGGATT GTTTAAAAATTCTTTTATCTTTAAAATATAAATTGCACCATTGCTCATATAAGTTTTAGGCAAT TTTTGCCTTGGCATAAAAGGATATTCATCATTACAAATCCCTGCTAAATCGCCACAATCATTAC AAACAAAGGCTTTTAGAATTTTATTATCACATTCGCTTACGCTAATTAGGGCATTTGCATTGCT ATTTTTATAAAGATTAAAAGCTTCATTAATATGAATATTTGTTCTTAGCGGTGAAGTGGGTTGT AAAAAAACTACATCTTCATAATCTTTATAAAATTTTAGAGCATGTAACAGCACTTTATCGCTTG TGGTATCATCTTGTGCAAGGCTAATTGGGCGTTTTAAAATATCAACATTTTGACTTTTTGCATA ATTTAAAATTTCATCACTATCACTGCTTACAACAACTTTACTAATGCTTTTAGCATTTAGTGCA GCTTTGATCGTGTAGTAAATTAAAGGTTTATTGTTTAATAAAACCAAATTTTTATTTTTAATAC CCTTTGAGCCACCACGAGCAGGGATTATTGCTAAGCTCATTTTATATCCTTAAAAACTTTTTGT GTGCTGAGTTTAAAAAAATCTCCGCTTTGTAAATATTCAAAAAATAATTTTGAGCTATCTAAAA TCTCTAACTTAGCGCTAAATAAATCTTGTTTTTTATGAATAGTGTTAATAGCTTTTAGTATTTC ATCACTATTTGCATTAACTTTTAGTGTATTTTCATTGCCAAGTCTTCCATTTTGTCTTGAGCCA ACTAAAATCCCTGCTGTTTTTAAGTATAAGGCCTCTTTTAAAATACAACTTGAATTACCTATTA TAAAATCAGCATTTTTTAACAAAGTTATAAAATACTCAAATCTAAGCGATGGAAAAAGCTTAAA TCTAGGGTTATTTTTAAACTCTTCATAGCTTTGCAAGATTAATTCAAAACCTAAATCATTATTT GGATAAATAACAATATAATTTTTATTACTTTGTATCAGTGCTTTTACTAAATTGTCTGCTTGAT TTTTAATGCTAGTAATTTCAGTTGTAACAGGATGAAACATAAGCAAAGCGTAGTTTTCATAATT TATATCATAATATTTTTTTGCTTCGCTAAGTGAAATTTTATTATCGTTTAAAAGTTCTAAATCA GGCGAACCTATGATAAAAATAGATTTTTCATCTTCTCCAAGCTGCATTAAACGCCTTTTTGCAA ACTCATCATTTACTAAATGAATATGAGCTAGTTTTGATATAGCGTGGCGTAAGCTATCGTCAAT AGTTCCTGAAATCTCTCCGCCTTCAATATGCGCTACTAAGATATTATTTAATGCTCCAACAATA GCTGCTGCTAAAGGCTCAATTCTATCTCCATGTACTACGATTAAATCAGGTTTTAGCTCATTTG CATACCTTGAAAATCCATCAATTGTAGTAGCTAAAGCCTTATCAGTTTGATAATATTTATCATA ATTTATAAATTCATAAATATTTTTAAAGCCATTTTTATAAAGTTCTTTAACTGTATAGCCAAAA TTTTTACTTAAGTGCATTCCTGTTGCAAAGATGTAAAGTTCAAATTCGCTTGAGTTTTGCACCC TGTACATTAAAGATTTAATCTTAGAATAATCAGCCCTAGAGCCTGTTATAAAAAGGATTTTTTT CACGCAAAATCCTCATAGCTTAACTGAGCATCATTTTCTATATCTCTTAATGCTTTTTTGCCTA AAATATTTTCAAATTCAGCCGCACTAATTCCACCAAGTCCAGGTCTTTTAACCCAAATATTATC 
CATAGATAAAACTTCGCCTTTTTTAATATCTTTAATGCTAACTACACTTGCAAAGGCAAAATCA ATTGTAACTTGTTCTTGTTTAGCCGCTTTTTTACTTTCATTATTTCCTCTTATTATAGCCATTT GCTCACTTTGTATAATTAGCTCTTTTAAAGCCTTTGTATCCATAGAACAAACTATATCAGGGCC ACTTCTATGCATACTATCAGTAAAATGTCTTTCAAGCACACAAGCTCCAAGTACAACTGCACCT AAACACGCAAGATTATCTGTTGTGTGGTCGCTTAAGCCTACCATACAAGAAAATTCTTTTTTTA ACTCAAGCATAGCGTTTAATCTTACAAGATTATGCGGGGTTGGGTAAAGATTGGTCGTGTGCAT TAAAACAAAAGGAATTTCATTGTCTAATAAGATTTTTACAGTTGGTTTTATACTTTCAATACTA TTCATTCCTGTGCTAACTATCATAGGCTTTTTAAAGGCTGCTATGTGTTTAATAAGCGGATAAT TATTACACTCACCTGAACCAATCTTAAAAGCACTAACTCCCATATCTTCTAAGCGGTTCGCACC TGCACGAGAAAAAGGTGTGCTAAGATAAACAAGACCTAATTTTTCTGTGTATTCTTTAAGTGCT AGCTCATCTTTATAATCCAAAGCACATTTTTGCATAATCTCATAAATGCTTATTTTTGCATTAC CAGGAATTACTTTTTTAGCGGCCTTACTCATCTCATCTTCAACAATATGAGTTTGATGCTTTAT AATCTTAGCACCTGCGCTAAAGGCTGCATCTACCATAATTTTAGCTAGTTCTAAACTGCCATTA TGATTAATGCCTATTTCAGGTACGACTAAGGGTGCTTTTTCTTCACTTATGATTATATTTTGTA TTTTTATTTCTTTCATTTATTTTCCTCCTTAGTCGACGGTACCCTTAAGCGAATTTTCCTTTAA AGATCACGCGGGGAATTGTAATGACTCCACCCCCACGGAAGATCATTTGAAGAAACTTATCTTT AAGACGTTGAAGATAGTTTTTGAAGGACTTATTCTGAGAGAAGCGCTCGTCGATGTTCTTCGAC TCTAACAGTTTTTCTTCTAAAGGGGAGCTAACGATTAAATCCGACGTGTAGCCGATGGGCTTAT CAAGCAGCTCAAATACATGGTTTGCCTGTAAGTTCAACGGTAAAAGATGGTTCAGAGGACTGTT AGGTACTAAAGCATATAATTTGGCGTATTTTTGAGAAAGGTAAATAGTCAACATGTCATAATAA ATGTTATGGCAGTCAGCCATGTGGTTAATAAAGTGCTGACTCTGGTTGTCATCGTAAAAATGTC CCCAGCTATTTGCGCCAAAATCGACACCGACTAAGTAGATTTCCTTGTATCCTAAAACAATTGC GCTCAACAACATAAGGACCCCCGCAGATAATTCTTTTGAATATGTCTTCAGATGGTATTTGACA TGGTTTAAGATTTCCTCATGCTCCCCAAACAAGCTAAAGGTGTCACGTACAAACGGGTAGTTTG CACGAAGGTATTCGTCCACCGATAAGATGGTCTGGTAATCACCGTTAAAATCGTGTTGTTCTTT CGACACACTACAAAAGAACTGCTCGATGTAGTATTCGTTGTTGTCAATTAAATGCTTCGCGGTA TTCATTTGATTATGGAAGACGCACGGATTAAAGAATACACCTTTGACCTCTTTGCCCAAGTAAT ACTTATCTTCGAAATAGAATTGGTTACAGCGGAAAACGTCGAAATCTTTTGGGAACAACGAATA GTCAAGGTCTTTGATTGATGGTCCGTTGCCCGCGATAATCAAGGGCTTTTTTTGGTTCTTCTCG TCAACCTGGCTGAAGATTTTTTCCGACATATGTATATCTCCTTCTTGAATTCTAACAATTGATT 
GAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCT GGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCT CTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGC TTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCG GCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTG TAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGG CCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGG GCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCA CCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTAT GTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGC ATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTGCTAG CGGAGTGTATACTGGCTTACTATGTTGGCACTGATGAGGGTGTCAGTGAAGTGCTTCATGTGGC AGGAGAAAAAAGGCTGCACCGGTGCGTCAGCAGAATATGTGATACAGGATATATTCCGCTTCCT CGCTCACTGACTCGCTACGCTCGGTCGTTCGACTGCGGCGAGCGGAAATGGCTTACGAACGGGG CGGAGATTTCCTGGAAGATGCCAGGAAGATACTTAACAGGGAAGTGAGAGGGCCGCGGCAAAGC CGTTTTTCCATAGGCTCCGCCCCCCTGACAAGCATCACGAAATCTGACGCTCAAATCAGTGGTG GCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGCGGCTCCCTCGTGCGCTCT CCTGTTCCTGCCTTTCGGTTTACCGGTGTCATTCCGCTGTTATGGCCGCGTTTGTCTCATTCCA CGCCTGACACTCAGTTCCGGGTAGGCAGTTCGCTCCAAGCTGGACTGTATGCACGAACCCCCCG TTCAGTCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGAAAGACATGC AAAAGCACCACTGGCAGCAGCCACTGGTAATTGATTTAGAGGAGTTAGTCTTGAAGTCATGCGC CGGTTAAGGCTAAACTGAAAGGACAAGTTTTGGTGACTGCGCTCCTCCAAGCCAGTTACCTCGG TTCAAAGAGTTGGTAGCTCAGAGAACCTTCGAAAAACCGCCCTGCAAGGCGGTTTTTTCGTTTT CAGAGCAAGAGATTACGCGCAGACCAAAACGATCTCAAGAAGATCATCTTATTAATCAGATAAA ATATTTCTAGGCGGCCGCGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAA AGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATG AGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCT ATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTA CCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAG CAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCAT 
CCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAAC GTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCT CCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTC CTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCA GCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACG GGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGG CGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCA ACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAA TGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAA TATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGA AAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAAC CATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTC (pG549) pEC3′-(T7)bstM-neuBCAthyA SEQ ID NO: 12 TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGC TTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGG TGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCG GTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAACCTGTAT ATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTC GCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACAT CAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCA TATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGC AAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTT TGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCAC CTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTAC ACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTA TGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACCAGATCACTACGGTA CTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCG AACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAA ACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCC AGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCT 
GGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCG CGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGT TTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCT AATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAAT TCTTCGAGACGCCTTCCCGAAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCG ATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTA AGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTA CTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGAT TACTGCAGGTCGACTTATTTTTTCCATATCTGTTCAACCTTTTTTAAATCCTCCAAACAGTCAA TATCTAAACTTGAGCTTTCGTCCATTAAAAAATGCTTGGTTTTGCTTTGTAAAAAGCTAGGATT GTTTAAAAATTCTTTTATCTTTAAAATATAAATTGCACCATTGCTCATATAAGTTTTAGGCAAT TTTTGCCTTGGCATAAAAGGATATTCATCATTACAAATCCCTGCTAAATCGCCACAATCATTAC AAACAAAGGCTTTTAGAATTTTATTATCACATTCGCTTACGCTAATTAGGGCATTTGCATTGCT ATTTTTATAAAGATTAAAAGCTTCATTAATATGAATATTTGTTCTTAGCGGTGAAGTGGGTTGT AAAAAAACTACATCTTCATAATCTTTATAAAATTTTAGAGCATGTAACAGCACTTTATCGCTTG TGGTATCATCTTGTGCAAGGCTAATTGGGCGTTTTAAAATATCAACATTTTGACTTTTTGCATA ATTTAAAATTTCATCACTATCACTGCTTACAACAACTTTACTAATGCTTTTAGCATTTAGTGCA GCTTTGATCGTGTAGTAAATTAAAGGTTTATTGTTTAATAAAACCAAATTTTTATTTTTAATAC CCTTTGAGCCACCACGAGCAGGGATTATTGCTAAGCTCATTTTATATCCTTAAAAACTTTTTGT GTGCTGAGTTTAAAAAAATCTCCGCTTTGTAAATATTCAAAAAATAATTTTGAGCTATCTAAAA TCTCTAACTTAGCGCTAAATAAATCTTGTTTTTTATGAATAGTGTTAATAGCTTTTAGTATTTC ATCACTATTTGCATTAACTTTTAGTGTATTTTCATTGCCAAGTCTTCCATTTTGTCTTGAGCCA ACTAAAATCCCTGCTGTTTTTAAGTATAAGGCCTCTTTTAAAATACAACTTGAATTACCTATTA TAAAATCAGCATTTTTTAACAAAGTTATAAAATACTCAAATCTAAGCGATGGAAAAAGCTTAAA TCTAGGGTTATTTTTAAACTCTTCATAGCTTTGCAAGATTAATTCAAAACCTAAATCATTATTT GGATAAATAACAATATAATTTTTATTACTTTGTATGAGTGCTTTTACTAAATTGTCTGCTTGAT TTTTAATGCTAGTAATTTCAGTTGTAACAGGATGAAACATAAGCAAAGCGTAGTTTTCATAATT TATATCATAATATTTTTTTGCTTCGCTAAGTGAAATTTTATTATCGTTTAAAAGTTCTAAATCA GGCGAACCTATGATAAAAATAGATTTTTCATCTTCTCCAAGCTGCATTAAACGCCTTTTTGCAA ACTCATCATTTACTAAATGAATATGAGCTAGTTTTGATATAGCGTGGCGTAAGCTATCGTCAAT 
AGTTCCTGAAATCTCTCCGCCTTCAATATGCGCTACTAAGATATTATTTAATGCTCCAACAATA GCTGCTGCTAAAGGCTCAATTCTATCTCCATGTACTACGATTAAATCAGGTTTTAGCTCATTTG CATACCTTGAAAATCCATCAATTGTAGTAGCTAAAGCCTTATCAGTTTGATAATATTTATCATA ATTTATAAATTCATAAATATTTTTAAAGCCATTTTTATAAAGTTCTTTAACTGTATAGCCAAAA TTTTTACTTAAGTGCATTCCTGTTGCAAAGATGTAAAGTTCAAATTCGCTTGAGTTTTGCACCC TGTACATTAAAGATTTAATCTTAGAATAATCAGCCCTAGAGCCTGTTATAAAAAGGATTTTTTT CACGCAAAATCCTCATAGCTTAACTGAGCATCATTTTCTATATCTCTTAATGCTTTTTTGCCTA AAATATTTTCAAATTCAGCCGCACTAATTCCACCAAGTCCAGGTCTTTTAACCCAAATATTATC CATAGATAAAACTTCGCCTTTTTTAATATCTTTAATGCTAACTACACTTGCAAAGGCAAAATCA ATTGTAACTTGTTCTTGTTTAGCCGCTTTTTTACTTTCATTATTTCCTCTTATTATAGCCATTT GCTCACTTTGTATAATTAGCTCTTTTAAAGCCTTTGTATCCATAGAACAAACTATATCAGGGCC ACTTCTATGCATACTATGAGTAAAATGTCTTTCAAGCACACAAGCTCCAAGTACAACTGCACCT AAACACGCAAGATTATCTGTTGTGTGGTCGCTTAAGCCTACCATACAAGAAAATTCTTTTTTTA ACTCAAGCATAGCGTTTAATCTTACAAGATTATGCGGGGTTGGGTAAAGATTGGTCGTGTGCAT TAAAACAAAAGGAATTTCATTGTCTAATAAGATTTTTACAGTTGGTTTTATACTTTCAATACTA TTCATTCCTGTGCTAACTATCATAGGCTTTTTAAAGGCTGCTATGTGTTTAATAAGCGGATAAT TATTACACTCACCTGAACCAATCTTAAAAGCACTAACTCCCATATCTTCTAAGCGGTTCGCACC TGCACGAGAAAAAGGTGTGCTAAGATAAACAAGACCTAATTTTTCTGTGTATTCTTTAAGTGCT AGCTCATCTTTATAATCCAAAGCACATTTTTGCATAATCTCATAAATGCTTATTTTTGCATTAC CAGGAATTACTTTTTTAGCGGCCTTACTCATCTCATCTTCAACAATATGAGTTTGATGCTTTAT AATCTTAGCACCTGCGCTAAAGGCTGCATCTACCATAATTTTAGCTAGTTCTAAACTGCCATTA TGATTAATGCCTATTTCAGGTACGACTAAGGGTGCTTTTTCTTCACTTATGATTATATTTTGTA TTTTTATTTCTTTCATTTATTTTCCTCCTTAGTCGACGGTACCCTTAAGCCACCCCCCAGCGGA ACGAGACTTTAAGATGCGTATTGCCGCCACCCCCCCAAAACGGCAGGATCTTACGTTTGATCTT ACGCAGGATGTTCTTAAGACGATTCACAAGAAGCTTCTCTTCAATATTCTTGAACTCAAGCAAC TTTTCCTCGATATTTTTGAACTCTAAAAGTTTCTCCTCGATGTTTTTAAATTCCAGAAGCTTCT CCTCAAGGGGAAGCGATACAATCAGGTCACTTGTATAGCCGATCGGTTTATCAAGCAACTCGAA GTGGTATTTTGCTTGCGGGTTCAGTGCCAGGATGTGACGAAGCGGAGAGTTCGGTGCTAAGGCG TAAAGTTTTGCATACTTTTGACACAGGCAGATTGTGAACATGTCATAGTAAATGTTGTGGCAAT CGGCCATGTGATTGCTGAAGTGCTGGGACTGACTCTCATCGTAGAAGTGGCCCCAGCTTGACGC 
ACCAAAGTCAATCCCGACCAAGTAAATCTCCTTATACCCCAAAACCACGGCCGACAACAGCATT AAGACTCCGGCACTCAATTCTTTACTATAAGTTTTTAAGTGGTACTTCACATGGCGAAGGATTT CCTCATGGCCCTTAAAAAGGCTGAATGTGTCACAAACAAATGGGTAGTGGGCCTTCAAATAACC CTCCACCGGAAGGATCGTCTGATAATCGCCGTCGAAGTCATGGCGGTCTGTCGAGACACTGCAG AAGAAGCGTTCGATGGAATATTCACCGTTGTCCATCAGATATTGAGCTGTTTGCATTTGAGAAG ATAACACACAGGGATTGAAGAATACGCCTTTAATCTCACGTCCAAGGTAATACTTATCCTCGAA ATAAAACTGATTACAGCGAAAGACTTCGAAATCCTTGGGAAATAAACTATAGTCCAGGTCTTTG ATGGATGGCCCGTTCCCCGCAATAATTAAGGGTTTCTTCATATGTATATCTCCTTCTTGAATTC TAACAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCC TTTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGA AGGGCTTTACCTCTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAA CTGGTTTTGCGCTTACCCCAACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGC GCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGT GTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAA TCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTG CCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATG TGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGAT GGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGA GATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTC TTCCGCTGCTAGCGGAGTGTATACTGGCTTACTATGTTGGCACTGATGAGGGTGTCAGTGAAGT GCTTCATGTGGCAGGAGAAAAAAGGCTGCACCGGTGCGTCAGCAGAATATGTGATACAGGATAT ATTCCGCTTCCTCGCTCACTGACTCGCTACGCTCGGTCGTTCGACTGCGGCGAGCGGAAATGGC TTACGAACGGGGCGGAGATTTCCTGGAAGATGCCAGGAAGATACTTAACAGGGAAGTGAGAGGG CCGCGGCAAAGCCGTTTTTCCATAGGCTCCGCCCCCCTGACAAGCATCACGAAATCTGACGCTC AAATCAGTGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGCGGCTCC CTCGTGCGCTCTCCTGTTCCTGCCTTTCGGTTTACCGGTGTCATTCCGCTGTTATGGCCGCGTT TGTCTCATTCCACGCCTGACACTCAGTTCCGGGTAGGCAGTTCGCTCCAAGCTGGACTGTATGC ACGAACCCCCCGTTCAGTCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC GGAAAGACATGCAAAAGCACCACTGGCAGCAGCCACTGGTAATTGATTTAGAGGAGTTAGTCTT GAAGTCATGCGCCGGTTAAGGCTAAACTGAAAGGACAAGTTTTGGTGACTGCGCTCCTCCAAGC 
CAGTTACCTCGGTTCAAAGAGTTGGTAGCTCAGAGAACCTTCGAAAAACCGCCCTGCAAGGCGG TTTTTTCGTTTTCAGAGCAAGAGATTACGCGCAGACCAAAACGATCTCAAGAAGATCATCTTAT TAATCAGATAAAATATTTCTAGGCGGCCGCGAACGAAAACTCACGTTAAGGGATTTTGGTCATG AGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCT AAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTC AGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATA CGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTC CAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTT ATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAAT AGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGG CTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAA AGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTC ATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGA CTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCC GGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAA CGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCA CTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAAC AGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTC TTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTG AATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGA CGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTT CGTC (PdST) SEQ ID NO: 13 MTIYLDPASLPTLNQLMHFTKESEDKETARIFGFSRFKLPEKITEQYNNIHFVEIKNNRPT EDIFTILDQYPEKLELDLHLNIAHSIQLFHPILQYRFKHPDRISIKSLNLYDDGTMEYVDLE KEENKDIKSAIKKAEKQLSDYLLTGKINFDNPTLARYVWQSQYPVKYHFLSTEYFEKAE FLQPLKTYLAGKYQKMDWSAYEKLSPEQQTFYLKLVGFSDETKQLFHTEQTKFIFTGTT TWEGNTDIREYYAKQQLNLLKHFTHSEGDLFIGDQYKIYFKGHPRGGDINDYILKHAKD ITNIPANISFEILMMTGLLPDKVGGVASSLYFSLPKEKISHIIFTSNKKIKNKEDALNDPYV RVMLRLGMIDKSQIIFWDSLKQL (PdST*) SEQ ID NO: 14 MTIYLDhASLPTLNQLMHFTKESEDKETARIFGFSRFKLPEKITEQYNNIHFVEIKNNRPTE DIFTILDQYPEKLELDLHLNLAHSIQLFHPILQYRFKHPDRISIKSLNLYDDGTaEYVDLEKE ENKDIKSAIKKAEKQLSDYLLTGKINFDNPTLARYVWQSQYPVKYHFLSTEYFEKAEFL 
QPLKTYLAGKYQKMDWSAYEKLSPEQQTFYLKLVGFSDETKQLFHTEQTKFIFTGTTT WEGNTDIREYYAKQQLNLLKHFTHSEGDLFIGDQYKIYFKGHPRGGDINDYILKHAKDI TNIPANISFEILMMTGLLPDKVGGVASSLYFSLPKEKISHIIFTSNKKIKNKEDALNDPYVR VMLRLGMIDKSQIIFWDSLKQL (Δ20BstC*) SEQ ID NO: 15 MEIYLDhASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSaNYV DLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF (BstE*) SEQ ID NO: 16 MLIQQNLEIYLDHATIPSLACFMHFIQHKDDVDSIRLFGLARFDIPQSIIDRYPANHLFYHN IDNRDLTAVLNQLADILAQENKRFQINLHLNLFHSIDLFFAIYPIYQQYQHKISTIQLQLYD DGSAGIVTQHSLCKIADLEQLILQHKNVLLELLTKGTANVPNPTLLRYLWNNIIDSQFHLI SDHFLQHPKLQPLKRLLKRYTILDFTCYPRFNAEQKQLLKEILHISNELENLLKLLKQHNT FLFTGTTAFNLDQEKLDLLTQLHILLLNEHQNPHSTHYIGNNYLLLIKGHANSPALNHTL ALHFPDAIFLPANIPFEIFAMLGFTPNKMGGFASTSYINYPTENINHLFFLTSDQPSIRTKW LDYEKQFGLMYSLLAMQKINEDQAFMCTIHN (pG544) pEC3′-(T7)delta20bstC-neuBCA-thyA SEQ ID NO: 17 TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGC TTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGG TGTCGGGGCTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCG GTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAACCTGTAT ATTCGTAAACCACGCCCAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTC GCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACAT CAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCA TATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATGC AAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTT TGGTCATCAGATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCAC CTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTAC ACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAACGGCGACCTCGGGCCAGTGTA TGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGACCAGATCACTACGGTA 
CTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGGAACGTAGGCG AACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAA ACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCC AGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCT GGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCG CGAACCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGT TTCGAAGACTTTGAGATTGAAGGCTAGGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCT AATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAAT TCTTCGAGACGCCTTCCCGAAGGCGCCATTCGCCATTGAGGCTGCGCAACTGTTGGGAAGGGCG ATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTA AGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTA CTGCTCACAAGAAAAAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGAT TACTGCAGGTCGACTTATTTTTTCCATATCTGTTCAACCTTTTTTAAATCCTCCAAACAGTCAA TATCTAAACTTGAGCTTTCGTCCATTAAAAAATGCTTGGTTTTGCTTTGTAAAAAGCTAGGATT GTTTAAAAATTCTTTTATCTTTAAAATATAAATTGCACCATTGCTCATATAAGTTTTAGGCAAT TTTTGCCTTGGCATAAAAGGATATTCATCATTACAAATCCCTGCTAAATCGCCACAATCATTAC AAACAAAGGCTTTTAGAATTTTATTATCACATTCGCTTACGCTAATTAGGGCATTTGCATTGCT ATTTTTATAAAGATTAAAAGCTTCATTAATATGAATATTTGTTCTTAGCGGTGAAGTGGGTTGT AAAAAAACTACATCTTCATAATCTTTATAAAATTTTAGAGCATGTAACAGCACTTTATCGCTTG TGGTATCATCTTGTGCAAGGCTAATTGGGCGTTTTAAAATATCAACATTTTGACTTTTTGCATA ATTTAAAATTTCATCACTATCACTGCTTACAACAACTTTACTAATGCTTTTAGCATTTAGTGCA GCTTTGATCGTGTAGTAAATTAAAGGTTTATTGTTTAATAAAACCAAATTTTTATTTTTAATAC CCTTTGAGCCACCACGAGCAGGGATTATTGCTAAGCTCATTTTATATCCTTAAAAACTTTTTGT GTGCTGAGTTTAAAAAAATCTCCGCTTTGTAAATATTCAAAAAATAATTTTGAGCTATCTAAAA TCTCTAACTTAGCGCTAAATAAATCTTGTTTTTTATGAATAGTGTTAATAGCTTTTAGTATTTC ATCACTATTTGCATTAACTTTTAGTGTATTTTCATTGCCAAGTCTTCCATTTTGTCTTGAGCCA ACTAAAATCCCTGCTGTTTTTAAGTATAAGGCCTCTTTTAAAATACAACTTGAATTACCTATTA TAAAATCAGCATTTTTTAACAAAGTTATAAAATACTCAAATCTAAGCGATGGAAAAAGCTTAAA TCTAGGGTTATTTTTAAACTCTTCATAGCTTTGCAAGATTAATTCAAAACCTAAATCATTATTT GGATAAATAACAATATAATTTTTATTACTTTGTATCAGTGCTTTTAGTAAATTGTCTGCTTGAT 
TTTTAATGCTAGTAATTTCAGTTGTAACAGGATGAAACATAAGCAAAGCGTAGTTTTCATAATT TATATCATAATATTTTTTTGCTTCGCTAAGTGAAATTTTATTATCGTTTAAAAGTTCTAAATCA GGCGAACCTATGATAAAAATAGATTTTTCATCTTCTCCAAGCTGCATTAAACGCCTTTTTGCAA ACTCATCATTTACTAAATGAATATGAGCTAGTTTTGATATAGCGTGGCGTAAGCTATCGTCAAT AGTTCCTGAAATCTCTCCGCCTTCAATATGCGCTACTAAGATATTATTTAATGCTCCAACAATA GCTGCTGCTAAAGGCTCAATTCTATCTCCATGTACTACGATTAAATCAGGTTTTAGCTCATTTG CATACCTTGAAAATCCATCAATTGTAGTAGCTAAAGCCTTATCAGTTTGATAATATTTATCATA ATTTATAAATTCATAAATATTTTTAAAGCCATTTTTATAAAGTTCTTTAACTGTATAGCCAAAA TTTTTACTTAAGTGCATTCCTGTTGCAAAGATGTAAAGTTCAAATTCGCTTGAGTTTTGCACCC TGTACATTAAAGATTTAATCTTAGAATAATCAGCCCTAGAGCCTGTTATAAAAAGGATTTTTTT CACGCAAAATCCTCATAGCTTAACTGAGCATCATTTTCTATATCTCTTAATGCTTTTTTGCCTA AAATATTTTCAAATTCAGCCGCACTAATTCCACCAAGTCCAGGTCTTTTAACCCAAATATTATC CATAGATAAAACTTCGCCTTTTTTAATATCTTTAATGCTAACTAGACTTGCAAAGGCAAAATCA ATTGTAACTTGTTCTTGTTTAGCCGCTTTTTTACTTTCATTATTTCCTCTTATTATAGCCATTT GCTCACTTTGTATAATTAGCTCTTTTAAAGCCTTTGTATCCATAGAACAAACTATATCAGGGCC ACTTCTATGCATACTATCAGTAAAATGTCTTTCAAGCACACAAGCTCCAAGTACAACTGCACCT AAACACGCAAGATTATCTGTTGTGTGGTCGCTTAAGCCTACCATACAAGAAAATTCTTTTTTTA ACTCAAGCATAGCGTTTAATCTTACAAGATTATGCGGGGTTGGGTAAAGATTGGTCGTGTGCAT TAAAACAAAAGGAATTTCATTGTCTAATAAGATTTTTACAGTTGGTTTTATACTTTCAATACTA TTCATTCCTGTGCTAACTATCATAGGCTTTTTAAAGGCTGCTATGTGTTTAATAAGCGGATAAT TATTACACTCACCTGAACCAATCTTAAAAGCACTAACTCCCATATCTTCTAAGCGGTTCGCACC TGCACGAGAAAAAGGTGTGCTAAGATAAACAAGACCTAATTTTTCTGTGTATTCTTTAAGTGCT AGCTCATCTTTATAATCCAAAGCACATTTTTGCATAATCTCATAAATGCTTATTTTTGCATTAC CAGGAATTACTTTTTTAGCGGCCTTACTCATCTCATCTTCAACAATATGAGTTTGATGCTTTAT AATCTTAGCACCTGCGCTAAAGGCTGCATCTACCATAATTTTAGCTAGTTCTAAACTGCCATTA TGATTAATGCCTATTTCAGGTACGACTAAGGGTGCTTTTTCTTCACTTATGATTATATTTTGTA TTTTTATTTCTTTCATTTATTTTCCTCCTTAGTCGACGGTACACTTAAAAGTTGATCGGATTCG GCATTTCTTCAAAAAAAATCTGTTCCGGAGTAATAACGTTCAGCATCAGCATCAGTTTGCTCAG GGTCTGGGATTTGGCATCGTTTTCATTTTTGATTTTTTCGGAGCCCAGGAATACTACTTTATCG ATGTTTTTCGGTGGCAGGCTAAAGTACACGGTAGACATGATGCCACCTACATAGTCCGGCAGAG 
AGTTGGTCATCATCAGAACTTCGAACGGGATGTTGGCCGGGATTTTTTCCGCACCGGTTTTGCG GATAATATAGTCGTTGATATCGTCGCCTTTCGGGTGGCCTTTGAAGAAGATTTTAATGTCGTTA CCCAGATAGAATTTGCCGTTCGGTTTGATAAAGGATTCCAGGATTTCCGTCTGCAGTTTCGCGT TGTTCAGACGTTTTTTTTTATCTTTCTCCCAGGTGGTGGTACCGGTGAAGATGAAAGTTTTATA TTCTTTGTTTTTGGTACCAATGTTCAGTTCGTTGAAGATGTTCTGATCAAAGTGAGTAATTTTC AGGAACAGTTCTTTCTGGTTCTTAGAGAAGTGAGAAAAGCGGCTCAGATCCATGCTAACAATGG TGTCAGACAGGAAATGCTTCAGCGGCTGCATCTTTTCGTCGATATCCAGATAGTCCGGGCGCAG CAGAATGTATTCGGTCGGAAAAATCTTGTGCCAAGTGTAACGGGTCAGAGAATGCAGTTTGTCG GTATCAGTTTCACGGTTCTCCAGTGCGTCCTTCAGCTTTTTCTGTGCTTCGATCAGGATTGCGC TGATGTTTTCCTGACGATGCTGATACAGATCTACGTAGTTACCAGAGCCGTCGTCGTACAGATA CAGCTTTTCGATGTTGATCTTGTGGAACAGCGGGGACAGGGTTTTGAAAATAGACAGCAGAGAA CGAACAGAATGATCCAGGTTAGTGTGAATAATCAGGTCCACCGGGGTATCGCTTTTTTCGATGT TCAGGTACAGTTTGTCGCTGAACTCCTTAATGTCCAGAATGCTCACTTTGGAGAACTCGATGCG CTCTTTGGAGAAAGAGTTCAGAATTTCTTTGTTGAAATCGAAGCGTTCAAAACCGATGATACGT TCCACTTTCTCATTATTTTTGTTTTCAACCAGATTCAGGATCATGTTCAGGCTAGGCAGGGATG CGTAGTCCAGGTAAATTTCCATATGTATATCTCCTTCTTGAATTCTAACAATTGATTGAATGTA TGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGT GTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGC ATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCC AACCAACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGT GTTTGTGCATCCATCTGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCT GAACGAAAACCCCCCGCGATTGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGC TTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAAT TTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAG TGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGTTTTTT ATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATG AATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTGCTAGCGGAGTG TATACTGGCTTACTATGTTGGCACTGATGAGGGTGTCAGTGAAGTGCTTCATGTGGCAGGAGAA AAAAGGCTGCACCGGTGCGTCAGCAGAATATGTGATACAGGATATATTCCGCTTCCTCGCTCAC TGACTCGCTACGCTCGGTCGTTCGACTGCGGCGAGCGGAAATGGCTTACGAACGGGGCGGAGAT 
TTCCTGGAAGATGCCAGGAAGATACTTAACAGGGAAGTGAGAGGGCCGCGGCAAAGCCGTTTTT CCATAGGCTCCGCCCCCCTGACAAGCATCACGAAATCTGACGCTCAAATCAGTGGTGGCGAAAC CCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGCGGCTCCCTCGTGCGCTCTCCTGTTC CTGCCTTTCGGTTTACCGGTGTCATTCCGCTGTTATGGCCGCGTTTGTCTCATTCCACGCCTGA CACTCAGTTCCGGGTAGGCAGTTCGCTCCAAGCTGGACTGTATGCACGAACCCCCCGTTCAGTC CGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGAAAGACATGCAAAAGCA CCACTGGCAGCAGCCACTGGTAATTGATTTAGAGGAGTTAGTCTTGAAGTCATGCGCCGGTTAA GGCTAAACTGAAAGGACAAGTTTTGGTGACTGCGCTCCTCCAAGCCAGTTACCTCGGTTCAAAG AGTTGGTAGCTCAGAGAACCTTCGAAAAACCGCCCTGCAAGGCGGTTTTTTCGTTTTCAGAGCA AGAGATTACGCGCAGACCAAAACGATCTCAAGAAGATCATCTTATTAATCAGATAAAATATTTC TAGGCGGCCGCGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCT TCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAAC TTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGT TCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTG GCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAA CCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCT ATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTG CCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTC CCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGT CCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGC ATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAA GTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAAT ACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAAC TCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATC TTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCA AAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATT GAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAA ACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATT ATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTC (Δ20BstC) SEQ ID NO: 18 MEIYLDYASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK 
EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSGNYV DLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF (BstC*) SEQ ID NO: 19 MRKIITFFSLFFSISAWCQKMEIYLDYHSLPSLNMILNLVENKNNEKVERIIGFERFDFNKE ILNSFSKERIEFSKVSILDIKEFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLF HKINIEKLYLYDDGSANYVDLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRY TWHKIFPTEYILLRPDYLDIDEKMQPLKHFLSDTIYSMDLSRFSHFSKNQKELFLKITHFD QNIFNELNIGTKNKEYKTFIFTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDI KIFFKGHPKGDDINDYIIRKTGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNI DKVVFLGSEKIKNENDAKSQTLSKLMLMLNVITPEQIFFEEMPNPINF (BstD*) SEQ ID NO: 20 MFKIKSYGKNPQLQAVDIYIDHATIPSLSYFLHFLKHKHDHQRLRLFSLARFEMPQTVIE QYEGIIQFSRNVEHNVEPLLEQLQTILSQEGKQFELHLHLNLFHSFEMFLNLSPTYTKYKE KISKIVLHLYDDGSAGVMKQYQLQKSSSLVQDLAATKASLVSLFENGEGSFSQIDLIRYV WNAVLETHYYLLSDHFLLDEKLQPLKAELGHYQLLNLSTYQYLSSEDLLWLKQILKIDA ELESLMQKLTAQPVYFFSGTTFLG (BstE*) SEQ ID NO: 21 MLIQQNLEIYLDHATIPSLACFMHFIQHKDDVDSIRLFGLARFDIPQSIIDRYPANHLFYHN IDNRDLTAVLNQLADILAQENKRFQINLHLNLFHSIDLFFAIYPIYQQYQHKISTIQLQLYD DGSAGIVTQHSLCKIADLEQLILQHKNVLLELLTKGTANVPNPTLLRYLWNNIIDSQFHLI SDHFLQHPKLQPLKRLLKRYTILDFTCYPRFNAEQKQLLKEILHISNELENLLKLLKQHNT FLFTGTTAFNLDQEKLDLLTQLHILLLNEHQNPHSTHYIGNNYLLLIKGHANSPALNHTL ALHFPDAIFLPANIPFEIFAMLGFTPNKMGGFASTSYINYPTENINHLFFLTSDQPSIRTKW LDYEKQFGLMYSLLAMQKINEDQAFMCTIHN (BstH*) SEQ ID NO: 22 MKRLFRLFLCLALLSGTAACSDDEVSQNLIVINGGEHFLSLDGLARAGKISVLAPAPWR VTKAAGDTWFRLSATEGPAGYSEVELSLDENPGAARSAQLAFACGDAIVPFRLSQGALS AGYDSPDYYFYVTHGTMPTLYAGMLLSHDKPGYVFYSRSKTFDPAEFPARAEVTTAAD RTADATQAEMEAMAREMKRRILEINSADPTAVFGLYVDDLRCRIGYDWFVAQGIDSAR VKVSMLSDGTGTYNAFYNYFGDAATAEQNWESYASEVEALDWNHGGRYPETRSLPEF ESYTWPYYLSTRPDYRLVVQDGSLLESSCPFITEKLGEMEIESIQPYEMLSALPESSRKRF YDMAGFDYDKFAALFDASPKKNLIIIGTSHADDASARLQRDYVARIMEQYGAQYDVFF 
KPHPADTTSAGYETEFPGLTLLPGQMPFEIFVWSLIDRVDMIGGYPSTVFLTVPVDKVRFI FAADAASLVRPLNILFRDATDVEWMQ (BstI*) SEQ ID NO: 23 MEFCKMATTQKICVYLDHATIPSLNYILHFAQHFEDQETIRLFGLSRFHIPESVIQRYPKG VYQFYPNQEKDFSALLLALKNILIEVKQQQRKCEIELHLNLFHYQLLLLPFLSLYLDTQD YCHLTLKFYDDGSAAISALQELALAPDLAAQIQFEKQQFDELVVKKSFKLSLLSRYFWG KLFESEYIWFNQAILQKAELQILKQEISSSRQMDFAIYQQMSDEQKQLVLEILNIDLNKVA YLKQLMENQPSFLFLGTTLFNITQETKTWLMQMHVDLIQQYCLPSGQFFNNKAGYLCF YKGHPNEKEMNQMILSQFKNLIALPDDIPLEILLLLGVIPSKVGGFASSALFNFTPAQIENI IFFTPRYFEKDNRLHATQYRLMQGLIELGYLDAEKSVTHFEIMQLLTKE (BstJ*) SEQ ID NO: 24 MLVNNQSHNPKLICWQRHPVNDEALLQGINAASFVSIASLCQHAATLLAGHPHSHITIYG NTYWSKDLARLIRYLTRISGVEIKKLELIDDGSSEYQKMFYWQRLHSEEQTRDLATGLK NLKSYLSGNDNKLLRLLTGHSNKLPRRLSSFMNWHQLFPTTYHMLRMDYLDKPELHQL KQYLGNNAQQIRWNYIADNLFDDEQQSLFYQLLGISLAEQKQLRAGRQQLHDFMFIGV DSSNASSKLQINVIADSRQESGIIPTITAKKMLFKGHPFANFNQTIVDAHQMGEMPAMIPF ETLIMTGNLPQKVGGMASSLYFSLPNNYHIEYIVFSGSKKDLEQHALLQIMLYLKVISPE RVYFSEQFKSC (BstM*) SEQ ID NO: 25 MKKPLIIAGNGPSIKDLDYSLFPKDFEVFRCNQFYFEDKYYLGREIKGVFFNPCVLSSQM QTAQYLMDNGEYSIERFFCSVSTDRHDFDGDYQTILPVEGYLHAHYPFVCDTFSLFKGH EEILRHVKYHLKTYSKELSAGVLMLLSAVVLGYKEIYLVGIDFGASSWGHFYDESQSQH FSNHMADCHNIYYDMFTICLCQKYAKLYALAPNSPLRHILALNPAAKYHFELLDKPIGY TSDLIVSLPLEEKLLEFKNIEEKLLEFKNIEEKLLEFKNIEEKLLVNRLKNILRKIKRKILPF WGGGGNTHLKVSFRWGVA (BstN*) SEQ ID NO: 26 MSEKIFSQVDEKNQKKPLIIAGNGPSIKDLDYSLFPKDFDVFRCNQFYFEDKYYLGKEVK GVFFNPCVFHNQMNTAKHLIDNNEYYIEQFFCSVSKEQHDFNGDYQTILSVDEYLHANY PFVRDTFSLFGEHEEILNHVKYHLKTYSKELSAGVLMLLSAIVLGYKEIYLVGVDFGANS WGHFYDDNQSQHFINHMADCHNIYYDMLTIYLCQKYAKLYALVPNSPLNHLLPLNLAA NHVFELLDKPIGYTSDLIVSSPLEEKLLESKNIDERFSQNKSFKNYLQRLKDKFLQMIFRG GGVITIPRVIFKGKFA (Δ20BstC*2) SEQ ID NO: 27 MEIYLDHASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSVNYV DLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK 
TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF (Δ20BstC*3) SEQ ID NO: 28 MEIYLDHASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSLNYV DLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF (Δ20BstC*4) SEQ ID NO: 29 MEIYLDHASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSMNYV DLYQHRQENISAILIEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF (Δ20BstC*5) SEQ ID NO: 30 MEIYLDHASLPSLNMILNLVENKNNEKVERIIGFERFDFNKEILNSFSKERIEFSKVSILDIK EFSDKLYLNIEKSDTPVDLIIHTNLDHSVRSLLSIFKTLSPLFHKINIEKLYLYDDGSFNYV DLYQHRQENISAILEAQKKLKDALENRETDTDKLHSLTRYTWHKIFPTEYILLRPDYLDI DEKMQPLKHFLSDTIVSMDLSRFSHFSKNQKELFLKITHFDQNIFNELNIGTKNKEYKTFI FTGTTTWEKDKKKRLNNAKLQTEILESFIKPNGKFYLGNDIKIFFKGHPKGDDINDYIIRK TGAEKIPANIPFEVLMMTNSLPDYVGGIMSTVYFSLPPKNIDKVVFLGSEKIKNENDAKS QTLSKLMLMLNVITPEQIFFEEMPNPINF - Identification of new STs using Pst6-224 from Photobacterium .spp strain JT-ISH-224
- Sialyltransferases identified from both prokaryotic and eukaryotic organisms are categorized into 5 distinct sequence families (GT29, GT38, GT42, GT52 and GT80) and possess at least two structural folds, GT-A and GT-B (Audry, M., et al. (2011). Glycobiology 21, 716-726). Eukaryotic sialyltransferases (the GT29 family and GT-A fold) are transmembrane molecules found in the secretory pathway, and as such they present a heterologous expression problem for their use within the cytoplasm of engineered microbes as described herein. For this reason, new examples in this family were not pursued. Instead, new sialyltransferases (STs) of the bacterial GT80 family (and the GT-B fold) were identified that were useful for synthesis of sialyl-oligosaccharides in engineered bacterial hosts.
- To this end, sequential screens of DNA sequence databases were performed. First, the sequence of a single known lactose-accepting α(2,6) sialyltransferase, Pst6-224 from Photobacterium spp. strain JT-ISH-224 (Drouillard, S., et al. (2010). Carbohydr Res 345, 1394-99; SEQ ID NO: 1), was used to search public databases to find simple homologs that might represent additional lactose-accepting STs. The amino acid sequence of Pst6-224 was used as a query in the search algorithm PSI-BLAST (Position Specific Iterated Basic Local Alignment Search Tool) in order to identify sequence homologs. The PSI-BLAST program, using a given query protein sequence, generates a list of closely related protein sequences based on a homology search of a database. These protein homolog hits are then used by the program to generate a profile reflecting their sequence similarities to the original query. The profile is then used by the algorithm to identify an expanded group of homolog proteins, and the process is iterated several times until the number of additional new candidates obtained after each iteration decreases (Altschul, S. F., et al. (1990) J. Mol. Biol 215, 403-410; Altschul, S. F., et al. (1997) Nucleic Acids Res 25, 3389-3402).
- The Pst6-224 amino acid sequence was used as a query for 6 iterations of the PSI-BLAST search algorithm. This approach yielded a group of 433 unique candidates with varying degrees of similarity to Pst6-224, many of which (117) were highly related to Pst6-224 (shared amino acid identity in the range of 50-90%), as well as a group that was more distantly related (shared amino acid identity less than 50%). Of note, Pst6-224 produced sub-optimal yields of 6′-SL, with a tendency to produce undesirable side products when used in a metabolically engineered E. coli production strain (Drouillard et al., 2010). In addition, elevated production of Pst6-224 appeared to be moderately toxic in certain E. coli production strains, including the preferred strain for use herein. Therefore, candidates for further analysis were deliberately (and somewhat counterintuitively) targeted from the more distantly related group identified via the PSI-BLAST search (shared amino acid identity to Pst6-224 of less than 30% over greater than 250 residues) (Table 1).
TABLE 1 Candidates further analyzed with less than 30% sequence identity to Pst6-224
Gene name | Organism | Accession number | GT family | % identity to Pst6-224 | SEQ ID #
Pst6-224 | Photobacterium sp. JT-ISH-224 | BAF92026.1 | GT80 | 100 | 1
BstC | Avibacterium paragallinarum | WP_021724759.1 | putative GT80 | 26.1 | 2
BstD | Actinobacillus ureae | WP_005625206.1 | n/a | 8.9 | 3
BstE | Haemophilus ducreyi | AAP95068.1 | putative GT80 | 15.9 | 4
BstH | Alistipes (multispecies) | WP_018695526.1 | putative GT80 | 13.3 | 5
BstI | Bibersteinia trehalosi | AGH37861.1 | putative GT80 | 16.4 | 6
BstJ | Shewanella piezotolerans | YP_002314261.1 | n/a | 18.9 | 7
- This group of candidates shared certain similarities primarily within the catalytic domain region of the respective proteins, as inferred from the observation that they all belong to the same Pfam protein family, but not necessarily similarities in their protein domain organization. It must be noted that the presence of a “sialyltransferase” Pfam domain ensures nothing obvious about the actual catalytic ability of the protein in terms of specific activity, catalytic rate, substrate specificity and/or product specificity, and that substantial experimentation is required to verify candidate genes for their desired properties. This group of candidates may include similar, better or distinct α(2,6) ST activities relative to Pst6-224, while being different enough at the amino acid level to avoid the cryptic toxicity and other functional shortcomings (e.g. poorer specificity) observed with Pst6-224 expressed in production strains.
- These more distantly related (less than 30% sequence identity to Pst6-224) candidate STs were further screened to identify those candidate STs arising from bacterial species that are known to, or may, incorporate sialic acid into their cell surface glycan structures. Candidate STs from these types of organisms are more likely to utilize CMP-N-acetylneuraminic acid (CMP-Neu5Ac) as a sugar nucleotide donor substrate, given the presence of sialic acid in their surface carbohydrate structures. Candidate STs from commensals or pathogens were also identified. Such organisms sometimes display carbohydrate structures on their cell surface that contain sialic acid. Again, candidate STs from these types of organisms are believed to be more likely to utilize CMP-Neu5Ac as a donor substrate and also to catalyze the linkage of sialic acid to useful acceptor oligosaccharides.
- Candidate STs with identities to Pst6-224 ranging from 8.9 to 26.1% at the amino acid level were selected from PSI-BLAST screens based on these criteria (Table 1). These proteins were often annotated in databases as “hypothetical proteins” and had no assigned name. For ease of description, the genes encoding these proteins were named bst for bacterial sialyltransferase, followed by a letter identifying them uniquely.
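- By way of non-limiting illustration, the selection criterion described above (shared identity of less than 30% over more than 250 residues) can be expressed as a simple filter. The Python sketch below is not part of the screen as performed; the Hit record, its field names and the example hits are hypothetical, and only the two thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One homolog returned by a database search (all fields are illustrative)."""
    name: str
    percent_identity: float  # shared amino acid identity with the query, in %
    aligned_residues: int    # length of the aligned region, in residues


def select_distant_candidates(hits, max_identity=30.0, min_aligned=250):
    """Keep hits below the identity ceiling that still align over a long region."""
    return [
        h for h in hits
        if h.percent_identity < max_identity and h.aligned_residues > min_aligned
    ]


# Hypothetical hit list; only the thresholds come from the description above.
example_hits = [
    Hit("close_homolog", 72.0, 480),
    Hit("distant_long", 26.1, 310),
    Hit("distant_short", 15.0, 120),
]

selected = select_distant_candidates(example_hits)
print([h.name for h in selected])
```

A filter of this kind only narrows the list; as noted above, experimental testing is still required to establish whether a retained candidate has the desired activity.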
- Database Screen Using HAC1268 from Helicobacter acinonychis (a Lactose-Utilizing α(2,6) ST) as the Search Probe
- A second sequence database screen was conducted using a second lactose-utilizing α(2,6) ST as the search probe: HAC1268 from Helicobacter acinonychis (Schur, M. J., et al. (2012). Glycobiology 22, 997-1006; SEQ ID NO: 8). HAC1268 is a member of the GT42 sialyltransferase family, possessing a predicted structural fold (the GT-A fold) distinct from that of the Pst6-224 ST sequence (which was used as the probe in the first database screen, described above).
- Two candidate STs with identities to HAC1268 of 70.6% and 52.9% at the amino acid level (Table 2) were selected for further evaluation.
FIG. 2 presents a pairwise % amino acid sequence identity comparison between the two α(2,6) ST probe sequences and the 8 identified ST candidates. Synthetic bst genes for these candidates were designed and codon-optimized in silico for E. coli expression using standard bioinformatic algorithms known in the art, and engineered with modified ribosomal binding sites to tune translation to appropriate levels in E. coli.
TABLE 2 Candidates identified and analyzed for further evaluation
Gene name | Organism | Accession number | GT family | % identity to HAC1268 | SEQ ID #
HAC1268 | Helicobacter acinonychis | CAK00018.1 | GT42 | 100 | 8
BstM | Helicobacter pylori | WP_000743106.1 | putative GT42 | 70.6 | 9
BstN | Helicobacter cetorum | WP_014661583.1 | putative GT42 | 52.9 | 10
- Of note, the first 20 residues of the amino acid sequence encoded by bstC were predicted to harbor a signal sequence that would direct the protein to the secretory pathway in E. coli; therefore, a version of bstC lacking these residues (termed Δ20bstC) was designed and tested (SEQ ID NO: 18).
- Also of note, the first 16 residues of the amino acid sequence of Pst6-224 were also predicted to harbor a signal sequence; therefore, a version of the gene encoding Pst6-224 lacking these residues (termed Δ16Pst6-224) was designed and tested. Synthetic bst genes were synthesized in vitro by the Gibson Assembly method utilizing synthetic “gBlock” oligonucleotides (obtained from Integrated DNA Technologies), and cloned using standard molecular biological techniques into E. coli expression plasmids.
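- By way of non-limiting illustration, the N-terminal truncations described above (Δ20 and Δ16) amount to removing a fixed number of leading residues from the protein sequence. The Python sketch below is an assumption-laden example rather than the design procedure actually used: the function name is hypothetical, the re-addition of an initiator methionine when the truncated sequence lacks one is an assumption, and the example string is only the first 26 residues of SEQ ID NO: 19 from the sequence listing.

```python
def remove_n_terminal_residues(protein: str, n: int) -> str:
    """Return the protein sequence with its first n residues removed.

    Assumption: if the truncated sequence does not start with methionine (M),
    one is prepended so the construct still begins with a start residue.
    """
    if n < 0 or n >= len(protein):
        raise ValueError("n must be between 0 and len(protein) - 1")
    truncated = protein[n:]
    if not truncated.startswith("M"):
        truncated = "M" + truncated
    return truncated


# First 26 residues of SEQ ID NO: 19 (illustrative fragment only).
full_length_fragment = "MRKIITFFSLFFSISAWCQKMEIYLD"
print(remove_n_terminal_residues(full_length_fragment, 20))
```

In this fragment the residue following the 20-residue leader is already a methionine, so no residue needs to be added.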
- Expression Vector
- The expression vector utilized to express the candidate bst genes, and to test for their ability to make sialyllactose, is a p15A origin-based plasmid carrying the strong bacteriophage λ pL promoter to drive expression of heterologous genes. In addition, the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains using ampicillin selection (for convenience in the laboratory), and additionally it carries a native E. coli thyA (thymidylate synthase) gene as an alternative means of selection in thyA minus hosts. The plasmid also carries, downstream of the pL promoter and in an operon configuration downstream of the candidate bst gene, three heterologous biosynthetic genes from Campylobacter jejuni (neuB, neuC, and neuA; encoding N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase, respectively). These enzymes confer on E. coli the ability to convert UDP-GlcNAc into CMP-Neu5Ac. CMP-Neu5Ac is then available as a donor substrate for the candidate sialyltransferases to utilize in converting intracellular lactose to sialyllactose.
FIG. 3 is a map of this expression vector carrying one of the candidate ST genes, bstN (plasmid pG543, SEQ ID NO: 11).
- Development of Host Strain
- The candidate sialyltransferase gene expression plasmids were transformed into a host strain useful for the production of sialyllactose (SL). Biosynthesis of SL requires the generation of an enhanced cellular pool of both lactose and CMP-Neu5Ac (FIG. 4 outlines the scheme for SL biosynthesis in engineered E. coli). The wild-type Escherichia coli K12 prototrophic strain W3110 was selected as the starting point for engineering a host background to test the ability of the candidates to catalyze sialyllactose production (Bachmann, B. J. (1972). Bacteriol Rev 36, 525-557). The particular W3110 derivative employed was one that previously had been modified by the introduction (at the ampC locus) of a tryptophan-inducible PtrpB cI+ repressor cassette, generating an E. coli strain known as GI724 (LaVallie et al., 200).
- Other features of GI724 include lacIq and lacPL8 promoter mutations. E. coli strain GI724 affords economical production of recombinant proteins from the phage λ PL promoter following induction with low levels of exogenous tryptophan (LaVallie, E. R., et al. (1993). Biotechnology (NY) 11, 187-193; Mieschendahl, Petri, and Hänggi (1986). Bio/Technology 4, 802-08). Additional genetic alterations were made to this strain to promote the biosynthesis of SL. This was achieved in strain GI724 through several manipulations of the chromosome using λ Red recombineering (Court, D. L., et al. (2002). Annu Rev Genet 36, 361-388) and generalized P1 phage transduction (Li, X. T., et al. (2013). Nucleic Acids Res 41, e204).
- First: the ability of the E. coli host strain to accumulate intracellular lactose was engineered by deletion of the endogenous β-galactosidase gene (lacZ). The strain thus modified maintains its ability to transport lactose from the culture medium (via LacY, the lactose permease), but is deleted for the wild-type copy of the lacZ gene responsible for lactose catabolism. An intracellular lactose pool is therefore created when the modified strain is cultured in the presence of exogenous lactose. In addition, the lacA gene was deleted in order to eliminate production of acetyl-lactose from the enhanced pool of intracellular lactose. In a variation of this strain, the lacZ and lacI genes were simultaneously deleted such that the enhanced constitutive lacIq promoter was placed immediately upstream of the lactose permease gene lacY.
- Second: A pool of the sugar nucleotide donor CMP-Neu5Ac was generated in the cytosol of the cell by co-expression of three genes from Campylobacter jejuni ATCC43484 (detailed above) encoding i) N-acetylneuraminate synthase (NeuB), ii) UDP-N-acetylglucosamine 2-epimerase (NeuC), and iii) N-acetylneuraminate cytidylyltransferase (NeuA). The neuBCA gene products function together in the enzymatic conversion of endogenous UDP-GlcNAc to CMP-Neu5Ac. The neuBCA genes are co-expressed in an operon, downstream from the bst gene on the plasmid expression vector and driven from the pL promoter. In addition, to prevent degradation of the Neu5Ac utilized to produce CMP-Neu5Ac, endogenous host cell genes encoding enzymes involved in sialic acid degradation were specifically deleted using λ red recombineering. The sialic acid catabolic pathway in E. coli is encoded by the nan operon, consisting of the nanRATEK genes (Hopkins, A. P., et al. (2013). FEMS Microbiol Lett 347, 14-22). Specifically, the nanATE genes were deleted to stabilize CMP-Neu5Ac pools within the cell.
- In other embodiments of the SL production strain, a thyA (thymidylate synthase) mutation was introduced to the strain by almost entirely deleting the thyA gene and replacing it with an inserted functional, wild-type but promoter-less E. coli lacZ+ gene carrying a weak ribosome binding site (ΔthyA::0.8RBS lacZ+). This chromosomal modification was constructed utilizing λ red recombineering. In the absence of exogenous thymidine, thyA strains are unable to make DNA and die. This defect can be complemented in trans by supplying a wild-type thyA gene on a multi-copy plasmid (Belfort, M., et al. (1983). Proc Natl Acad Sci USA 80, 1858-861). This complementation scheme was used as a means of plasmid maintenance.
- Further, the inserted 0.8RBS lacZ+ cassette not only knocks out thyA, but also converts the lacZ− host back to both a lacZ+ genotype and phenotype. The modified strain produced a minimal (albeit still readily detectable) level of β-galactosidase activity (0.3 units), which has very little impact on sialyllactose production during bioreactor production runs, but which is useful in removing residual lactose at the end of runs, and as an easily scorable phenotypic marker for moving the thyA region into other lacZ− E. coli strains by P1 phage transduction.
- The final strain used to test the ST candidate genes (E1406) had the following genotype:
- PlacIq-lacY, Δ(lacI-lacZ), ΔlacA, ΔthyA::(0.8RBS lacZ+), ampC::(Ptrp M13g8 RBS-λcI+, CAT), ΔnanATE::scar
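- By way of non-limiting illustration, the logic of the engineered pathway described above (plasmid-borne NeuC, NeuB and NeuA acting on endogenous UDP-GlcNAc, followed by a sialyltransferase acting on CMP-Neu5Ac and lactose) can be sketched as a small reachability check. The Python sketch below is a simplified model and not a description of the strain itself: cofactors and by-products are omitted, ManNAc is included as the intermediate produced by the 2-epimerase step, and all names in the code are chosen for illustration.

```python
# Each step: (enzyme, substrates that must be present, products formed).
# Cofactors and by-products are deliberately left out of this sketch.
PATHWAY = [
    ("NeuC", {"UDP-GlcNAc"}, {"ManNAc"}),
    ("NeuB", {"ManNAc"}, {"Neu5Ac"}),
    ("NeuA", {"Neu5Ac"}, {"CMP-Neu5Ac"}),
    ("sialyltransferase", {"CMP-Neu5Ac", "lactose"}, {"sialyllactose"}),
]


def reachable_metabolites(starting_pool, expressed_enzymes):
    """Return every metabolite obtainable from the pool with the given enzymes."""
    pool = set(starting_pool)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, products in PATHWAY:
            if (enzyme in expressed_enzymes
                    and substrates <= pool
                    and not products <= pool):
                pool |= products
                changed = True
    return pool


# Intracellular pool assumed for a lactose-fed host: UDP-GlcNAc plus lactose.
host_pool = {"UDP-GlcNAc", "lactose"}
with_plasmid = reachable_metabolites(
    host_pool, {"NeuC", "NeuB", "NeuA", "sialyltransferase"})
without_plasmid = reachable_metabolites(host_pool, set())
print("sialyllactose" in with_plasmid, "sialyllactose" in without_plasmid)
```

In this simplified model the host without the expression plasmid retains its lactose pool but cannot form sialyllactose, which mirrors the role of the plasmid-free control strain.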
- Transformants of this strain harboring the different ST (bst) candidate expression plasmids were evaluated for their ability to synthesize sialyllactose in 20×150 mm test tubes, containing 6 mL of IMC medium (“Induction Medium Casamino acids”) (LaVallie, E. R., DiBlasio, E. A., Kovacic, S., Grant, K. L., Schendel, P. F., and McCoy, J. M. (1993). A thioredoxin gene fusion expression system that circumvents inclusion body formation in the E. coli cytoplasm. Biotechnology (NY) 11, 187-193, the entire content of which is incorporated herein by reference) of the following recipe:
- Na2HPO4=6 g/L
- KH2PO4=3 g/L
- NaCl=0.5 g/L
- NH4Cl=1 g/L
- 1 mM MgSO4
- 0.1 mM CaCl2
- 0.5% glucose w/v
- 0.4% casamino acids (Difco technical)
- In some embodiments, the glucose and/or casamino acids concentrations are varied in the 0.05-1% range.
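- By way of non-limiting illustration, the IMC recipe above can be converted into weighed amounts for an arbitrary batch volume. The Python sketch below is a convenience calculation only; it assumes that the magnesium and calcium salts are weighed as anhydrous MgSO4 (120.37 g/mol) and anhydrous CaCl2 (110.98 g/mol), which the recipe does not state, and the function and dictionary names are hypothetical.

```python
# Components given by mass in the recipe, in g/L.
GRAMS_PER_LITER = {"Na2HPO4": 6.0, "KH2PO4": 3.0, "NaCl": 0.5, "NH4Cl": 1.0}

# Components given as molar concentrations: (mmol/L, molar mass in g/mol).
# Molar masses are for the anhydrous salts, which is an assumption.
MILLIMOLAR = {"MgSO4": (1.0, 120.37), "CaCl2": (0.1, 110.98)}

# Components given as percent w/v; 1% w/v corresponds to 10 g/L.
PERCENT_WV = {"glucose": 0.5, "casamino acids": 0.4}


def grams_needed(volume_l):
    """Return grams of each component required for volume_l liters of medium."""
    amounts = {name: g_per_l * volume_l for name, g_per_l in GRAMS_PER_LITER.items()}
    for name, (mmol_per_l, molar_mass) in MILLIMOLAR.items():
        amounts[name] = mmol_per_l / 1000.0 * molar_mass * volume_l
    for name, percent in PERCENT_WV.items():
        amounts[name] = percent * 10.0 * volume_l
    return amounts


for component, grams in grams_needed(1.0).items():
    print(f"{component}: {grams:.4f} g")
```

In practice, heat-labile or precipitating components are often added from separately sterilized stock solutions rather than weighed directly; the calculation above only gives the target amounts.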
- Cell Growth Expression and Characterization
- Tubes were inoculated to 0.1 OD600/mL with strains comprising E1406 transformed with individual candidate bst+neuBCA expression plasmids, and were then incubated at 30° C. for 120 minutes with continuous aeration on a roller drum. Tryptophan was then added to the cultures to a concentration of 200 μg/mL to induce bst gene and neuBCA operon expression, along with the addition of lactose as the acceptor sugar to a concentration of 1% w/v. The culture was left at 30° C. with roller drum aeration for a further 22 h. At the end of this period, 20 OD600 of cells from each culture were pelleted by centrifugation (14,000×g, 1 min), re-suspended in 200 μl of water and heated to 98° C. for 10 min to release cytoplasmic sugars. After clearing the suspension by centrifugation, 2 μl aliquots were applied to 10×20 cm aluminum-backed silica thin layer chromatography plates (Macherey-Nagel #818163). Chromatograms were developed in n-butanol/acetic acid/water (2:1:1), and visualized by heating after spraying with 3% w/v α-naphthol in 12% H2SO4/80% ethanol/8% water. FIG. 5 shows the result.
- Prominent spots corresponding to the intracellular lactose pool were seen in the control strain (E1406, which does not contain a bst+neuBCA expression plasmid) and also in all bst candidate cultures. The E1406 control showed no spot corresponding to sialyllactose, whereas all other cultures displayed a spot co-migrating with a sialyllactose standard that comprised a mixture of 6′-SL and 3′-SL (these species do not resolve from each other in this TLC system). Not shown are cultures expressing candidate genes bstD and bstJ. Neither of these produced any detectable sialyllactose, and thus these genes most probably represent “false positive hits” in the database screen.
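- By way of non-limiting illustration, two routine calculations implied by the protocol above are the culture volume that contains a given number of OD600 units and the volume of an inducer stock needed to reach a target concentration. The Python sketch below assumes the common convention that 1 OD600 unit equals 1 mL of culture at an OD600 of 1.0; the culture density and the tryptophan stock concentration used in the example are hypothetical, and the function names are illustrative.

```python
def harvest_volume_ml(od600_units, culture_od600):
    """Volume (mL) of culture containing the requested number of OD600 units.

    Assumes 1 OD600 unit = 1 mL of culture at OD600 = 1.0.
    """
    if culture_od600 <= 0:
        raise ValueError("culture_od600 must be positive")
    return od600_units / culture_od600


def stock_volume_ul(final_ug_per_ml, culture_ml, stock_mg_per_ml):
    """Microliters of stock needed to reach the final concentration.

    Ignores the small change in culture volume caused by the addition.
    1 mg/mL equals 1 ug/uL, so ug divided by (mg/mL) gives uL.
    """
    if stock_mg_per_ml <= 0:
        raise ValueError("stock_mg_per_ml must be positive")
    return final_ug_per_ml * culture_ml / stock_mg_per_ml


# 20 OD600 units from a culture at a hypothetical OD600 of 5.0.
print(harvest_volume_ml(20, 5.0))
# 200 ug/mL tryptophan in a 6 mL culture from a hypothetical 10 mg/mL stock.
print(stock_volume_ul(200, 6, 10))
```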
- Of note in FIG. 5 is a spot running above sialyllactose in several of the candidates. This spot corresponds to KDO-lactose, and results from a linkage of the E. coli lipopolysaccharide precursor, 2-keto-3-deoxyoctulosonic acid (KDO), with lactose, as a result of relaxed substrate specificity exhibited by individual bst enzymes that utilize the endogenous E. coli pool of CMP-KDO as an alternative to the engineered pool of CMP-Neu5Ac as described herein. As can be seen in FIG. 5, Pst6-224 (as expected from the literature; Drouillard, S., et al. (2010). Carbohydr Res 345, 1394-99) generated the unwanted KDO-lactose product. However, several of the bst candidates produced little if any KDO-lactose under the same culture conditions (e.g. BstE, BstM, BstN), highlighting the utility of these enzymes for the production of purer preparations of sialyl-oligosaccharides.
- Identification of the Sialyl-Acceptor Sugar Bond Specificity
- Characterization and Identification via HPLC
- ST enzymes Pst6-224 and HAC1268, whose amino acid sequences were used as probes for the database screens, have been previously characterized biochemically and are known to be α(2,6) sialyltransferases (Drouillard, S., et al. (2010). Carbohydr Res 345, 1394-99; Schur, M. J., et al. (2012). Glycobiology 22, 997-1006). However, the sialyl-acceptor sugar bond specificity (i.e. α(2,3)- or α(2,6)-) of the candidate bst enzymes of the present invention was unknown. To discover their sialyl-acceptor sugar bond specificity, the same cytoplasmic extracts analyzed by TLC above (FIG. 5) were also analyzed utilizing an HPLC system capable of resolving 6′-SL from 3′-SL. The heat extract samples (described above) were made 15 mM in potassium phosphate (pH 4) and 60% in acetonitrile. They were then applied to a TSKgel Amide-80 column (5 μm particle size, 4.6×250 mm) and eluted under isocratic conditions of 67% acetonitrile/15 mM potassium phosphate, pH 4.0, 1 mL/min, 60° C., with UV detection at 210 nm. FIGS. 6A, 6B, and 6C show UV traces from HPLC runs for the various heat extracts. In this system 3′-SL eluted at ˜8.8 minutes, whereas 6′-SL eluted at ˜10.1 minutes. Data are presented in Table 3.
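- By way of non-limiting illustration, the approximate retention times quoted above (about 8.8 minutes for 3′-SL and about 10.1 minutes for 6′-SL) can be used to label peaks in a trace. The Python sketch below is not the analysis procedure actually used; the 0.3 minute tolerance window and the function name are assumptions chosen for the example, and in practice assignments would be confirmed against authentic standards run under the same conditions.

```python
# Approximate retention times (minutes) reported for this HPLC system.
REFERENCE_RT_MIN = {"3'-SL": 8.8, "6'-SL": 10.1}


def assign_peak(retention_min, tolerance_min=0.3):
    """Return the isomer whose reference time is nearest, or None if too far.

    The tolerance window is an assumed value, not one given in the text.
    """
    name, reference = min(
        REFERENCE_RT_MIN.items(),
        key=lambda item: abs(item[1] - retention_min),
    )
    if abs(reference - retention_min) <= tolerance_min:
        return name
    return None


# Hypothetical peak times from a trace.
for peak in (8.9, 10.0, 5.0):
    print(peak, assign_peak(peak))
```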
TABLE 3 Summary of the discovered sialyl-acceptor sugar bond specificity of the new bst enzymes
| Gene name | Organism | Accession number | GT family | Sialyltransferase activity | SEQ ID # |
|---|---|---|---|---|---|
| Pst6-224 | Photobacterium sp. JT-ISH-224 | BAF92026.1 | GT80 | α(2,6) sialyltransferase | 1 |
| BstC | Avibacterium paragallinarum | WP_021724759.1 | putative GT80 | α(2,3) sialyltransferase | 2 |
| BstC* | Avibacterium paragallinarum | WP_021724759.1 | putative GT80 | α(2,6) + α(2,3) sialyltransferase | 15 |
| BstD | Actinobacillus ureae | WP_005625206.1 | n/a | unknown/not an ST | 3 |
| BstE | Haemophilus ducreyi | AAP95068.1 | putative GT80 | α(2,3) sialyltransferase | 4 |
| BstH | Alistipes (multispecies) | WP_018695526.1 | putative GT80 | α(2,3) sialyltransferase | 5 |
| BstI | Bibersteinia trehalosi | AGH37861.1 | putative GT80 | α(2,3) sialyltransferase | 6 |
| BstJ | Shewanella piezotolerans | YP_002314261.1 | n/a | unknown/not an ST | 7 |
| HAC1268 | Helicobacter acinonychis | CAK00018.1 | GT42 | α(2,6) sialyltransferase | 8 |
| BstM | Helicobacter pylori | WP_000743106.1 | putative GT42 | α(2,6) sialyltransferase | 9 |
| BstN | Helicobacter cetorum | WP_014661583.1 | putative GT42 | α(2,6) sialyltransferase | 10 |
- Characterization and identification via NMR
- A secondary confirmation of the structure of the SL (6′-SL) produced utilizing the BstM and BstN enzymes was sought through NMR (nuclear magnetic resonance) spectroscopy.
- Large Scale Production of SL
- To this end, and to produce sufficient SL for the analyses, 2 L fermentation runs were performed on derivatives of strain E1406 harboring either BstM or BstN expression plasmids (i.e. pG549, SEQ ID NO: 12 or pG543, SEQ ID NO: 11, respectively). Strains were grown in Ferm 4a mineral medium to early exponential phase to produce a seed culture.
-
- 4 g (NH4)2HPO4
- 10 g KH2PO4
- 0.25 g MgSO4.7H2O
- 0.4 g NAM
- 17 g glucose
- (adjusted to pH6.8 with additional NaOH if required)
- A portion of this seed culture was then inoculated into a 2 L bioreactor containing 900 mL of the same medium (but containing an additional 0.75 g/L MgSO4.7H2O, 1 mL of DF204 antifoam, and 10 mL of trace metals solution).
-
- 13.4 g NTA (nitrilotriacetic acid)
- 5 g FeSO4.7H2O
- 0.85 g MnCl2.4H2O
- 0.9 g ZnSO4.7H2O
- 0.14 g CoCl2.6H2O
- 0.085 g CuCl2.2H2O
- 0.17 g H3BO3
- 0.09 g Na2MoO4.2H2O
- The optical density of cells in the fermenter vessel after inoculation was 0.006 at 600 nm (OD600).
- Strains were grown in the fermenter in batch mode at 30° C. with pH control to pH 6.8 (adjusted automatically with additions of 7.4 M NH4OH) for approximately 16 h, at which point glucose exhaustion occurred as indicated by an increase in dissolved oxygen levels and a decrease in agitation speed. A fed-batch continuous glucose feeding regimen was then initiated (9.1 g/h of a 50% w/v glucose feed solution) such that the culture was maintained under carbon limitation. After 2 h a bolus of 45.5 g of an 11.4% w/v lactose solution was added, and a continuous lactose feed of 2.2 g/h of the same solution was initiated. Simultaneously a bolus of 41.2 g of a 2% w/v tryptophan solution was added to initiate Bst expression. This bolus was repeated 2 more times at 24 h intervals during the ensuing fed-batch fermentation phase, which continued for a further 70 hours, during which 50% saturation of dissolved oxygen was maintained using an agitation to air enrichment cascade with an initial aeration of 0.18 standard liters per minute. Optical density was ~120 OD600 at the end of fermentation. At harvest, whole fermentation broth was adjusted to 80 mM CaCl2 by the addition of a 1 M CaCl2 stock solution, and after standing overnight at 4° C. was clarified by centrifugation at 4,000×g for 1 h.
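- The volume of 1 M CaCl2 stock needed to bring the broth to 80 mM follows from a simple mass balance, C_stock × V_stock = C_final × (V_broth + V_stock). The Python sketch below illustrates that arithmetic only. It assumes that volumes are additive and that the broth initially contains a negligible amount of CaCl2; the function name and the 1.0 L broth volume in the example are hypothetical and are not taken from the procedure above.

```python
def stock_volume_needed(broth_volume_l, final_conc_m, stock_conc_m):
    """Return litres of stock to add so that the mixture reaches final_conc_m.

    Assumes additive volumes and a broth that starts with no solute:
        stock_conc * v_stock = final_conc * (broth_volume + v_stock)
    """
    if not 0 < final_conc_m < stock_conc_m:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return broth_volume_l * final_conc_m / (stock_conc_m - final_conc_m)


# Hypothetical example: 1.0 L of broth adjusted to 80 mM with a 1 M stock.
v_stock = stock_volume_needed(1.0, 0.080, 1.0)
print(f"{v_stock * 1000:.0f} mL of 1 M stock per litre of broth")
```

Under these assumptions roughly 87 mL of stock is required per litre of broth, slightly more than the 80 mL a naive C1V1 = C2V2 estimate gives, because the added stock also increases the total volume.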
- NMR Analysis
- A portion of the clarified culture supernatant was then used for purification of sialyllactose samples for NMR analysis using the following protocol:
-
- 1. Cations were removed (and proteins precipitated) by addition of solid Amberlite IR120 [H+ form] to the clarified CaCl2-treated broth to reach pH 2.
- 2. The treated supernatant was clarified by centrifugation. Strong acids were subsequently removed by addition of Dowex 66 resin [free-base form] until pH 6 was reached. The mixture was clarified by centrifugation again.
- 3. The supernatant was loaded onto a Dowex 1×4, 200-400 mesh column [HCO3− form]. SL binds to this column.
- 4. The column was washed with water.
- 5. SL was eluted from the column with 0.1 M NaHCO3.
- 6. Na+ was removed from the sialyllactose eluate by adding Amberlite IR120 [H+ form] to reach pH 3.
- 7. The SL solution was adjusted to pH 6 with NaOH, rotary evaporated, then lyophilized to dryness.
- FIG. 7 shows a typical thin layer chromatogram of fractions from the Dowex 1×4 column. Typically fraction 3 was the purest fraction and, after desalting, was suitable for NMR analysis.
- The 1D 1H NMR spectra of SL samples produced by BstM (BstM-SL) and BstN (BstN-SL) (FIG. 8 and FIG. 9, respectively) showed three anomeric signals: δ 5.22 (A) and δ 4.66 (B), both attributed to a reducing-end Glcp, and δ 4.42 (C), assigned to a β-Galp residue (Table 4). In the heteronuclear multiple bond correlation (HMBC) spectrum, a cross peak observed at δH 4.42/δC 80.8 indicated that β-Galp (C) is linked to the 4-position of the reducing-end Glcp (A, B). In the heteronuclear single quantum coherence (HSQC) spectrum, a downfield shift observed for C-6 (δ 64.7) of β-Galp indicated that residue C is 6-substituted. In the HMBC spectrum, cross peaks observed at δH 3.59, 3.96/δC 101.5 (between H-6 of β-Gal and C-2 of α-Neu5NAc) indicated that the terminal α-Neu5NAc (D) is linked to the 6-position of β-Gal (C).
TABLE 4 Chemical shift assignments of 6′-sialyllactose and 6′-KDO-lactose
Glycosyl Residue Nuclei 1 2 3 4 5 6 7 8 9 5-NAc CH3COO
A 4-α-Glc 1H 5.22 (J = 3.7) 3.60 3.83 3.62 3.95 3.88/3.80
A 4-α-Glc 13C 93.0 72.2 72.8 80.8 71.2 61.2
B 4-β-Glc 1H 4.66 (J = 8) 3.29 3.64 3.63 3.60 3.95/3.77
B 4-β-Glc 13C 96.8 74.8 75.8 80.8 75.9 61.5
C 6-β-Gal 1H 4.42 (J = 8) 3.53 3.71 3.92 3.79 3.96/3.59
C 6-β-Gal 13C 104.3 72.1 73.6 69.8 74.9 64.7
D α-Neu5NAc 1H — — 2.71/1.74 3.66 3.84 3.66 3.55 3.88 3.87/3.63 2.02
D α-Neu5NAc 13C 174.6 101.5 41.3 69.5 54.8 73.4 69.6 73.1 63.8 23.2/176.0
E α-KDO 1H — — 2.05/1.78 4.19 4.03 3.37
E α-KDO 13C 176.4 101.5 35.2 66.9 67.4 63.8
- Taking into account the 2D NMR data, the major compound present in both samples was 6′-sialyllactose. Minor levels of KDO-lactose were also found in both samples.
- Enzyme Engineering to Alter the Regioselectivity of BstC and BstE from α(2,3)- to α(2,6)-selective
- Several of the bst candidates that were selected and tested from the screen were α(2,3)-selective rather than α(2,6)-selective, including enzymes BstC, BstE, BstH and BstI. Enzyme engineering strategies to alter the regioselectivity of BstC and BstE from α(2,3)- to α(2,6)-selective were explored (Schmölzer, K., et al. (2015). Chem Commun (Camb) 51, 3083-86; Schmölzer, K., et al. (2013). Glycobiology 23, 1293-1304). A sialyltransferase from Pasteurella dagmatis (PdST, accession # WP_005762792.1, SEQ ID NO: 13) was shown to exhibit α(2,3)-selective activity when purified and used in vitro to catalyze SL formation from lactose and CMP-Neu5Ac precursors (Schmölzer, K., et al. (2015). Chem Commun (Camb) 51, 3083-86). A subsequent study from the same group demonstrated that structure-guided substitution of specific amino acids within the acceptor binding site of PdST completely switched the enzyme's regioselectivity from α(2,3)-selective to α(2,6)-selective. Specifically, the double mutation P7H and M117A in the PdST sequence had the effect of converting PdST from an α(2,3)-selective ST to an α(2,6)-selective ST in vitro (Schmölzer, et al. (2013). Glycobiology 23, 1293-1304).
- Without being bound by any scientific theory, structurally equivalent mutations introduced into the acceptor binding site of the bst enzymes herein may produce a similar switch in regioselectivity. Two candidates, Δ20BstC and BstE, were selected to explore the approach. To this end, Δ20bstC and bstE synthetic genes incorporating the appropriate codon changes (hereafter referred to as Δ20bstC* and bstE*) were synthesized in vitro by the Gibson Assembly method from gBlock oligonucleotides, and cloned by standard molecular biological techniques into E. coli expression plasmids. FIG. 10 is an alignment of the wild type PdST, Δ20BstC and BstE α(2,3) sialyltransferases. Also shown in the alignment are mutant forms of the three enzymes, named PdST* (SEQ ID NO: 14, the published mutant known to be switched in regioselectivity from α(2,3) to α(2,6)), Δ20BstC* (SEQ ID NO: 15) and BstE* (SEQ ID NO: 16), mutants designed and tested herein. Mutated regions are indicated in the alignment by black stars and the mutated residues are shown in lower case. Specifically, the amino acid substitutions Y7H and G122A were introduced into the Δ20BstC sequence to generate Δ20BstC*, while Y13H and E128A were introduced into the BstE sequence to generate BstE*.
- Δ20bstC* (pG544, SEQ ID NO: 17) and bstE* expression plasmids were transformed into the engineered E. coli production host. Strains were grown in IMC media to early exponential phase at 30° C. before tryptophan (200 mg/mL) and lactose (1%) were simultaneously added to initiate SL biosynthesis. At the end of the synthesis period (24 h), equivalent OD600 units of each strain were harvested, and cell lysates were prepared by heating for 10 minutes at 98° C. and centrifugation to release intracellular SL. Lysates containing synthesized SL were then treated with sialidase S (specific for α(2,3)-linked Neu5Ac) or sialidase C (which acts on both α(2,3)- and α(2,6)-linked Neu5Ac) to analyze whether engineered Δ20BstC* or BstE* were capable of catalyzing synthesis of 6′-SL rather than 3′-SL.
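- The substitution notation used above (for example Y7H and G122A: wild-type residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The following Python sketch is illustrative only: the function name and the 12-residue toy sequence are hypothetical, and the toy sequence is not taken from any SEQ ID NO of this disclosure. Each substitution is checked against the residue actually present, so that a numbering mismatch between an alignment and a sequence fails loudly rather than silently mutating the wrong position.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions written as <wild-type><position><new>, e.g. 'Y7H'.

    Positions are 1-based. A ValueError is raised if the notation is malformed,
    the position is out of range, or the stated wild-type residue is not found.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognised substitution: {sub!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type}, found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence (hypothetical, NOT a sequence from this disclosure).
toy = "MKTLLAYGSDGE"
print(apply_substitutions(toy, ["Y7H", "G11A"]))
```

With a real sequence, the same call with the substitution list ["Y7H", "G122A"] would correspond to the Δ20BstC to Δ20BstC* change described above, provided the numbering of the input sequence matches that used in the text.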
- As shown in FIG. 11, SL synthesized by BstE*-producing cells was efficiently converted to lactose by both sialidase S and sialidase C. This result indicates that BstE* still possessed exclusively α(2,3)-selective activity, and that the introduced mutations did not alter the regioselectivity of the enzyme as had been predicted. In stark contrast, however, SL synthesized by Δ20BstC* remained susceptible to digestion with sialidase C but appeared largely resistant to treatment with sialidase S. This result demonstrates that the regioselectivity of Δ20BstC* had been successfully altered from α(2,3) to α(2,6), and that the engineered enzyme primarily catalyzed 6′-SL synthesis rather than 3′-SL synthesis in the production strain.
- SL synthesized by the Δ20BstC*-expressing strain was then purified and subjected to NMR spectroscopy to confirm its identity and purity. FIG. 12 shows the 1D-proton NMR spectrum of SL produced by Δ20BstC*. Characteristic features of the spectrum were 4 distinct anomeric peaks and the up-field signals of axial and equatorial H-3 of sialic acid. The latter consisted of two pairs of distinct signals in a ratio of about 5:1. Extensive 2-D NMR analysis (FIG. 13) showed that the larger signals belong to 6′-sialyllactose, whereas the smaller ones belong to contaminating 3′-sialyllactose. The chemical shift assignments of these two components are listed in Table 5. The analysis revealed that the SL synthesized by Δ20BstC* was a mixture of 84% 6′-SL and 16% 3′-SL. Therefore, introduction of the Y7H-G122A mutations into the Δ20BstC* acceptor binding site strongly biased the regioselectivity of the enzyme towards forming α(2,6) Neu5Ac linkages and enabled strains producing Δ20BstC* to synthesize primarily 6′-SL rather than 3′-SL.
- Surprisingly, the engineered Δ20BstC* mutant protein generates much less KDO-lactose when used to produce sialyllactose in E. coli than does its wild-type parent, Δ20BstC (see FIG. 5). The active site mutations Y7H and G122A introduced into Δ20BstC to generate Δ20BstC* result not only in a switch of regiospecificity from α(2,3) to α(2,6), but also reduce the ability of the enzyme to utilize CMP-KDO as a substrate, thus leading to a purer sialyllactose product profile.
- Enzyme Engineering to Further Improve the α(2,6)-Regioselectivity of Δ20BstC*
- To improve upon the regioselectivity of the new enzyme variant Δ20BstC*, further enzyme engineering strategies were explored (Guo, Y., et al. (2015) Enzyme and Microbial Technology 78, 54-62; McArthur, B. et al. (2017) Organic & Biomolecular Chemistry 15, 1700-1709). A double mutant P34H/M144L of a sialyltransferase from Pasteurella multocida (PmST1, accession # AAY89061) was found to increase the enzyme's regioselectivity from 3.9% to 98.7% α(2,6)-selective. Structurally equivalent amino acid substitutions at position 122 of the amino acid sequence of Δ20BstC* would improve the enzyme's α(2,6)-regioselectivity. Specifically, the amino acid substitutions A122V, A122L, A122M and A122F were introduced into Δ20BstC* to generate Δ20BstC*2 (SEQ ID NO: 27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29) and Δ20BstC*5 (SEQ ID NO: 30), respectively.
- Δ20BstC*2, Δ20BstC*3, Δ20BstC*4 and Δ20BstC*5 expression plasmids were transformed into the engineered E. coli production host. Strains were grown in Ferm 4a media to early exponential phase at 30° C. before tryptophan (200 mg/mL) and lactose (1%) were simultaneously added to initiate SL biosynthesis. At the end of the synthesis period (24 h), equivalent OD600 units of each strain were harvested, and cell lysates were prepared by heating for 10 minutes at 98° C. and centrifugation to release intracellular SL. TLC analysis of the heat extracts showed SL synthesis, and also showed similarly reduced or negligible amounts of KDO-lactose production as was seen for Δ20BstC*, in contrast to the level of KDO-lactose synthesis that had been observed for the native wild-type enzyme Δ20BstC (FIG. 5).
- To determine 6′-SL to 3′-SL ratios, the various mutant Δ20BstC* strains were harvested and extracted using 5 mM potassium phosphate (pH 4.0) in 70% acetonitrile and analyzed utilizing an HPLC system capable of resolving 6′-SL from 3′-SL. The extracted samples (described above) were applied to a TSKgel Amide-80 column (5 μm particle size, 4.6×250 mm) and eluted under isocratic conditions of 5 mM potassium phosphate (pH 4.0) in 70% acetonitrile, 1 mL/min, at room temperature with UV detection at 210 nm.
- FIG. 16 shows exemplary HPLC traces for the various extracts. In this system, 3′-SL eluted at about 15.5 minutes, whereas 6′-SL eluted at about 18.3 minutes. Data are presented in Table 5. The analysis revealed that the mutations A122F, A122M, A122L, and A122V resulted in increases of about 2%, 4%, 6% and 8%, respectively, in α(2,6)-regioselectivity compared to Δ20BstC*.
- Table 5 shows HPLC analysis of regioselectivity of Δ20BstC* mutants.
| Sample | Mutation | 3′-SL peak area (mAU·min) | 6′-SL peak area (mAU·min) | % 6′-SL |
|---|---|---|---|---|
| Δ20BstC* | none | 353.9 | 2260.8 | 86.5 |
| Δ20BstC*2 | A122V | 163.4 | 2608.6 | 94.1 |
| Δ20BstC*3 | A122L | 221.8 | 2585.9 | 92.1 |
| Δ20BstC*4 | A122M | 336.6 | 3150.6 | 90.3 |
| Δ20BstC*5 | A122F | 393.8 | 3096.3 | 88.7 |
- In summary, wild-type Δ20BstC is a lactose-utilizing α(2,3) sialyltransferase that produced 3′-SL in the engineered E. coli strain described herein. This enzyme was engineered by introducing two specific active site mutations each, to generate new enzyme variants with altered regiospecificity: Δ20BstC*, Δ20BstC*2, Δ20BstC*3, Δ20BstC*4 and Δ20BstC*5, which synthesize an 85:15, 94:6, 92:8, 90:10, and 89:11 mixture of 6′-SL:3′-SL, respectively. These enzyme variants enabled the production of two of the major sialylated hMOS from human milk (Bao, Y., Zhu, L, and Newburg, D. S. (2007) Anal Biochem 370, 206-214) in predictable ratios, while possessing an ability to generate reduced amounts of KDO-lactose. The ability to produce two sialyllactose species within the course of a single biofermentation may offer significant advantages in terms of time and cost of production over two separate fermentations.
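- The % 6′-SL values in Table 5 can be reproduced from the tabulated peak areas as the 6′-SL area divided by the sum of the 3′-SL and 6′-SL areas. The Python sketch below performs that calculation on the Table 5 values. It assumes that the two isomers have the same UV response at 210 nm, so that area fractions approximate molar fractions; that assumption, and the function and variable names, are illustrative and are not stated in the text above.

```python
# Peak areas (mAU·min) copied from Table 5, as (3'-SL area, 6'-SL area).
PEAK_AREAS = {
    "Δ20BstC*": (353.9, 2260.8),
    "Δ20BstC*2 (A122V)": (163.4, 2608.6),
    "Δ20BstC*3 (A122L)": (221.8, 2585.9),
    "Δ20BstC*4 (A122M)": (336.6, 3150.6),
    "Δ20BstC*5 (A122F)": (393.8, 3096.3),
}


def percent_6sl(area_3sl, area_6sl):
    """Percentage of total SL peak area contributed by 6'-SL."""
    return 100.0 * area_6sl / (area_3sl + area_6sl)


baseline = percent_6sl(*PEAK_AREAS["Δ20BstC*"])
for name, areas in PEAK_AREAS.items():
    pct = percent_6sl(*areas)
    # The difference is expressed in percentage points relative to Δ20BstC*.
    print(f"{name}: {pct:.1f}% 6'-SL ({pct - baseline:+.1f} vs Δ20BstC*)")
```

The differences relative to Δ20BstC* computed this way (about +2.2, +3.8, +5.6 and +7.6 percentage points for A122F, A122M, A122L and A122V) correspond to the approximate 2%, 4%, 6% and 8% increases reported above.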
Claims (29)
1. A method for producing a sialylated oligosaccharide in a bacterium comprising providing a bacterium comprising an exogenous lactose-utilizing sialyltransferase enzyme, wherein the enzyme comprises an amino acid sequence that is
(i) from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids; or
(ii) from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
2. The method of claim 1 , wherein the enzyme comprises an amino acid sequence that is from 5% to 100% identical to the amino acid sequence of one or more of BstN (SEQ ID NO:10), BstC (SEQ ID NO: 2), Δ20BstC*2 (SEQ ID NO:27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), Δ20BstC*5 (SEQ ID NO: 30), BstD (SEQ ID NO: 3), Δ20BstC* (SEQ ID NO: 15), Δ20BstC (SEQ ID NO: 18), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), or BstM (SEQ ID NO: 9).
3. The method of claim 1 , wherein the amino acid sequence of the enzyme
i) is less than 100% identical to the amino acid sequence of BstN (SEQ ID NO:10), BstC (SEQ ID NO: 2), Δ20BstC*2 (SEQ ID NO:27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), Δ20BstC*5 (SEQ ID NO: 30), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), or BstM (SEQ ID NO: 9);
ii) comprises no deletions or insertions compared to BstN (SEQ ID NO:10), BstC (SEQ ID NO: 2), Δ20BstC*2 (SEQ ID NO:27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), Δ20BstC*5 (SEQ ID NO: 30), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), or BstM (SEQ ID NO: 9); and/or
iii) comprises one or more conservative amino acid substitutions to the amino acid sequence of BstN (SEQ ID NO: 10), BstC (SEQ ID NO: 2), Δ20BstC*2 (SEQ ID NO:27), Δ20BstC*3 (SEQ ID NO: 28), Δ20BstC*4 (SEQ ID NO: 29), Δ20BstC*5 (SEQ ID NO: 30), BstD (SEQ ID NO: 3), Δ20BstC (SEQ ID NO: 18), Δ20BstC* (SEQ ID NO: 15), BstE (SEQ ID NO: 4), BstE* (SEQ ID NO: 16), BstH (SEQ ID NO: 5), BstI (SEQ ID NO: 6), BstJ (SEQ ID NO: 7), or BstM (SEQ ID NO: 9).
4.-11. (canceled)
12. The method of claim 1 , wherein the sialyltransferase
i) comprises an α(2,3) sialyltransferase or an α(2,6) sialyltransferase; and/or
ii) a mutation compared to a naturally occurring α(2,3) sialyltransferase.
13. (canceled)
14. The method of claim 13 , wherein when the amino acid sequences of the enzyme and BstE* are aligned, then the enzyme comprises:
i) a mutation;
ii) a non-conservative mutation; and/or
iii) a histidine or an alanine
at the position that aligns with position 13 and/or position 130 of the amino acid sequence of BstE* (SEQ ID NO: 16).
15.-21. (canceled)
22. The method of claim 1 , wherein the Cα root-mean-square deviation (RMSD) between the backbone of the enzyme and a naturally occurring sialyltransferase is less than 3 Å.
23.-25. (canceled)
26. The method of claim 1 , further comprising retrieving the sialylated oligosaccharide from the bacterium or from a culture supernatant of the bacterium.
27. The method of claim 1 , wherein the sialylated oligosaccharide comprises a sialyllactose.
28. The method of claim 1 , wherein the sialylated oligosaccharide comprises 3′-sialyllactose (3′-SL), 6′-sialyllactose (6′-SL), 3′-sialyl-3-fucosyllactose (3′-S3FL), sialyllacto-N-tetraose a (SLNT a), sialyllacto-N-tetraose b (SLNT b), disialyllacto-N-tetraose (DSLNT), sialyllacto-N-fucopentaose II (SLNFP II), or sialyllacto-N-tetraose c (SLNT c).
29. The method of claim 1 , wherein the bacterium further comprises
i) an exogenous or endogenous lactose-utilizing α(1,3) fucosyltransferase enzyme, an exogenous or endogenous lactose-utilizing α(1,4) fucosyltransferase enzyme, an exogenous or endogenous α(1,3) galactosyltransferase enzyme, an exogenous or endogenous α(1,4) galactosyltransferase enzyme, an exogenous or endogenous β-1,3-N-acetylglucosaminyltransferase, or any combination thereof;
ii) an exogenous or endogenous N-acetylneuraminate synthase, an exogenous or endogenous UDP-N-acetylglucosamine 2-epimerase, an exogenous or endogenous N-acetylneuraminate cytidylyltransferase, or any combination thereof;
iii) an exogenous N-acetylneuraminate synthase, UDP-N-acetylglucosamine 2-epimerase, and N-acetylneuraminate cytidylyltransferase from Campylobacter jejuni;
iv) a reduced level of β-galactosidase activity compared to a corresponding wild-type bacterium;
v) a deleted or inactivated endogenous β-galactosidase gene;
vi) a deleted or inactivated endogenous lacZ gene and/or a deleted or inactivated endogenous lacI gene;
vii) an endogenous β-galactosidase gene, wherein at least a portion of a promoter of the endogenous β-galactosidase gene has been deleted;
viii) an exogenous β-galactosidase enzyme with reduced enzymatic activity compared to an endogenous β-galactosidase enzyme in a corresponding wild-type bacterium;
ix) an exogenous β-galactosidase gene that is expressed at a lower level than an endogenous β-galactosidase gene in a corresponding wild-type bacterium;
x) a lactose permease gene;
xi) a mutation in a thyA gene and/or a lacA gene;
xii) a lacIq or lacPL8 promoter mutation; and/or
xiii) a nucleic acid construct comprising an isolated nucleic acid encoding the lactose-utilizing sialyltransferase enzyme.
30.-32. (canceled)
33. The method of claim 29 , wherein the reduced level of β-galactosidase activity in iv)
a) comprises reduced expression of a β-galactosidase gene or reduced β-galactosidase enzymatic activity; and/or
b) is less than 10% the level of the corresponding wild-type bacterium in the presence of lactose.
34.-39. (canceled)
40. The method of claim 29 , wherein the bacterium
i) comprises less than 50 units of β-galactosidase activity when cultured in the presence of lactose;
ii) does not express a β-galactoside transacetylase;
iii) accumulates intracellular lactose in the presence of exogenous lactose; and/or
iv) comprises the following genotype: PlacIq-lacY, Δ(lacI-lacZ), ΔlacA, ΔthyA::(0.8RBS lacZ+), ampC::(Ptrp M13g8 RBS-λcI+, CAT), ΔnanATE::scar.
41.-45. (canceled)
46. The method of claim 1 , wherein the bacterium
i) is an Escherichia coli (E. coli) bacterium;
ii) is a G1724 strain E. coli bacterium;
iii) is a member of the Bacillus, Pantoea, Lactobacillus, Lactococcus, Streptococcus, Propionibacterium, Enterococcus, Bifidobacterium, Sporolactobacillus, Micromonospora, Micrococcus, Rhodococcus, or Pseudomonas genus; and/or
iv) is a Bacillus licheniformis, Bacillus subtilis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, Bacillus circulans, Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, Xanthomonas campestris, Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, Lactococcus lactis, Streptococcus thermophilus, Propionibacterium freudenreichii, Enterococcus faecium, Enterococcus thermophilus, Bifidobacterium longum, Bifidobacterium infantis, Bifidobacterium bifidum, Pseudomonas fluorescens, or Pseudomonas aeruginosa bacterium.
47.-52. (canceled)
53. The method of claim 29 , wherein the nucleic acid in xiii) is operably linked to a heterologous control sequence that directs the production of the enzyme in the bacterium.
54. The method of claim 53 , wherein the heterologous control sequence comprises a bacterial promoter, a bacterial operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, or a plasmid selectable marker.
55. (canceled)
56. The method of claim 2 , wherein the enzyme comprises an amino acid sequence as set forth as SEQ ID NO: 15, 16, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
57. A nucleic acid encoding the lactose-utilizing sialyltransferase enzyme of claim 58 .
58. A lactose-utilizing sialyltransferase enzyme comprising amino acids in the sequence set forth as SEQ ID NO: 15, 16, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30.
59. An isolated bacterium comprising an exogenous lactose-utilizing sialyltransferase enzyme, wherein the enzyme comprises an amino acid sequence that is
(i) from 5% to 30% identical to the amino acid sequence of Pst6-224 (SEQ ID NO: 1) over a stretch of at least 250 amino acids; or
(ii) from 45% to 75% identical to the amino acid sequence of HAC1268 (SEQ ID NO: 8) over a stretch of at least 250 amino acids.
60. (canceled)
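The percent-identity ranges recited in the claims above (for example, from 45% to 75% identical over a stretch of at least 250 amino acids) presuppose a pairwise alignment and a convention for counting identical positions. The Python sketch below shows one such calculation on two sequences that have already been aligned, with gaps written as "-". The convention used here (columns containing a gap in either sequence are excluded from the denominator) is an assumption, as are the function names and the toy sequences; other conventions exist, and this sketch is not a statement of how identity is to be determined for the claimed sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped.
    Returns (percent_identity, number_of_compared_columns).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, compared


def within_range(aligned_a, aligned_b, low, high, min_columns):
    """True if identity lies in [low, high] over at least min_columns columns."""
    pct, compared = percent_identity(aligned_a, aligned_b)
    return compared >= min_columns and low <= pct <= high


# Toy alignment (hypothetical sequences, far shorter than 250 residues).
pct, n = percent_identity("MKT-LAYG", "MRTQLA-G")
print(f"{pct:.1f}% identity over {n} compared columns")
```

In this toy case five of the six compared columns match. A range check such as within_range(a, b, 45.0, 75.0, 250) would additionally require that at least 250 columns were compared.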
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/688,900 US20220243237A1 (en) | 2017-12-15 | 2022-03-08 | Sialyltransferases and uses thereof |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201762599481P | 2017-12-15 | 2017-12-15 | |
| US16/221,193 US11274325B2 (en) | 2017-12-15 | 2018-12-14 | Sialyltransferases and uses thereof |
| US17/688,900 US20220243237A1 (en) | 2017-12-15 | 2022-03-08 | Sialyltransferases and uses thereof |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/221,193 Continuation US11274325B2 (en) | 2017-12-15 | 2018-12-14 | Sialyltransferases and uses thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20220243237A1 true US20220243237A1 (en) | 2022-08-04 |
Family
ID=66819523
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/221,193 Active 2039-03-10 US11274325B2 (en) | 2017-12-15 | 2018-12-14 | Sialyltransferases and uses thereof |
| US17/688,900 Abandoned US20220243237A1 (en) | 2017-12-15 | 2022-03-08 | Sialyltransferases and uses thereof |
Family Applications Before (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/221,193 Active 2039-03-10 US11274325B2 (en) | 2017-12-15 | 2018-12-14 | Sialyltransferases and uses thereof |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US11274325B2 (en) |
| EP (1) | EP3724344A4 (en) |
| JP (2) | JP2021506337A (en) |
| AU (1) | AU2018386217A1 (en) |
| CA (1) | CA3085931A1 (en) |
| WO (1) | WO2019118829A2 (en) |
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019118829A2 (en) | 2017-12-15 | 2019-06-20 | Glycosyn LLC | Sialyltransferases and uses thereof |
| EP3789495A1 (en) * | 2019-09-03 | 2021-03-10 | Jennewein Biotechnologie GmbH | Production of sialylated oligosaccharides in bacillus cells |
| US20220380785A1 (en) * | 2019-11-01 | 2022-12-01 | The University Of British Columbia | Compositions and methods for sialylated mucin-type o-glycosylation of therapeutic proteins |
| US20220403431A1 (en) * | 2020-02-14 | 2022-12-22 | Inbiose N.V. | Glycominimized bacterial host cells |
| CA3171158A1 (en) | 2020-02-14 | 2021-08-19 | Inbiose N.V. | Kdo-free production hosts for oligosaccharide synthesis |
| CN111394292B (en) * | 2020-03-30 | 2022-08-09 | 江南大学 | Multi-way composite neuraminic acid-producing bacillus subtilis and application thereof |
| US11331329B2 (en) | 2020-05-13 | 2022-05-17 | Glycosyn LLC | Fucosylated oligosaccharides for prevention of coronavirus infection |
| CA3178736A1 (en) | 2020-05-13 | 2021-11-18 | Ardythe L. Morrow | 2'-fucosyllactose for the prevention and treatment of coronavirus-induced inflammation |
| CN113969256A (en) * | 2020-07-24 | 2022-01-25 | 上海交通大学 | Bacterial strain for producing N-acetylglucosamine and construction method and application thereof |
| WO2022135700A1 (en) * | 2020-12-22 | 2022-06-30 | Chr. Hansen HMO GmbH | Sialyltransferases for the production of 6'-sialyllactose |
| CN117222736A (en) * | 2021-04-16 | 2023-12-12 | 因比奥斯公司 | Cellular production of biological products |
| EP4486897A2 (en) | 2022-03-02 | 2025-01-08 | DSM IP Assets B.V. | New sialyltransferases for in vivo synthesis of 3'sl and 6'sl |
| DK202200591A1 (en) * | 2022-06-20 | 2024-02-15 | Dsm Ip Assets Bv | New sialyltransferases for in vivo synthesis of lst-c |
| CN119120332A (en) * | 2023-09-22 | 2024-12-13 | 虹摹生物科技(上海)有限公司 | Engineering bacteria producing 3'-sialyllactose and construction method and application thereof |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| ES2456292T3 (en) * | 2006-03-09 | 2014-04-21 | Centre National De La Recherche Scientifique (Cnrs) | Sialylated oligosaccharide production process |
| EP2116598A4 (en) | 2007-03-02 | 2010-06-09 | Japan Tobacco Inc | Novel beta-galactoside-alpha-2,6-sialyltransferase, gene encoding the same and method for enhancing enzymatic activity |
| US9029136B2 (en) * | 2012-07-25 | 2015-05-12 | Glycosyn LLC | Alpha (1,2) fucosyltransferases suitable for use in the production of fucosylated oligosaccharides |
| DE14769797T1 (en) * | 2013-03-14 | 2016-06-23 | Glycosyn LLC | Microorganisms and process for the preparation of sialylated and N-acetylglucosamine-containing oligosaccharides |
| KR101525230B1 (en) * | 2013-05-31 | 2015-06-01 | 주식회사 진켐 | Method of Preparing Sialyl Derivative |
| WO2019020707A1 (en) * | 2017-07-26 | 2019-01-31 | Jennewein Biotechnologie Gmbh | Sialyltransferases and their use in producing sialylated oligosaccharides |
| WO2019118829A2 (en) | 2017-12-15 | 2019-06-20 | Glycosyn LLC | Sialyltransferases and uses thereof |
2018
- 2018-12-14 WO PCT/US2018/065656 patent/WO2019118829A2/en not_active Ceased
- 2018-12-14 AU AU2018386217A patent/AU2018386217A1/en not_active Abandoned
- 2018-12-14 US US16/221,193 patent/US11274325B2/en active Active
- 2018-12-14 EP EP18887571.0A patent/EP3724344A4/en not_active Withdrawn
- 2018-12-14 JP JP2020551785A patent/JP2021506337A/en active Pending
- 2018-12-14 CA CA3085931A patent/CA3085931A1/en active Pending
2022
- 2022-03-08 US US17/688,900 patent/US20220243237A1/en not_active Abandoned
2023
- 2023-06-02 JP JP2023091542A patent/JP2023110032A/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| JP2023110032A (en) | 2023-08-08 |
| US20190218582A1 (en) | 2019-07-18 |
| EP3724344A4 (en) | 2021-11-24 |
| US11274325B2 (en) | 2022-03-15 |
| EP3724344A2 (en) | 2020-10-21 |
| CA3085931A1 (en) | 2019-06-20 |
| JP2021506337A (en) | 2021-02-22 |
| AU2018386217A1 (en) | 2020-07-02 |
| WO2019118829A3 (en) | 2019-07-25 |
| WO2019118829A2 (en) | 2019-06-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20220243237A1 (en) | Sialyltransferases and uses thereof | |
| US11643675B2 (en) | Alpha (1,2) fucosyltransferase syngenes for use in the production of fucosylated oligosaccharides | |
| US10487346B2 (en) | Biosynthesis of human milk oligosaccharides in engineered bacteria | |
| US11453900B2 (en) | Alpha (1,3) fucosyltransferases for use in the production of fucosylated oligosaccharides | |
| EP2877574B1 (en) | Alpha (1,2) fucosyltransferases suitable for use in the production of fucosylated oligosaccharides |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | AS | Assignment | Owner name: GLYCOSYN LLC, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HEIDTMAN, MATTHEW IAN;MALLIPEDDI, SRIKRISHNAN;MERIGHI, MASSIMO;AND OTHERS;SIGNING DATES FROM 20181210 TO 20181212;REEL/FRAME:065499/0511 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |