
CN113136396A - Bacterial light-operated gene expression system and method for regulating and controlling gene expression - Google Patents

Info

Publication number
CN113136396A
Authority
CN
China
Prior art keywords
ala
leu
arg
glu
gly
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010062612.9A
Other languages
Chinese (zh)
Other versions
CN113136396B (en)
Inventor
杨弋
李写
陈显军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China University of Science and Technology
Original Assignee
East China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China University of Science and Technology
Priority to CN202010062612.9A
Publication of CN113136396A
Application granted
Publication of CN113136396B
Active legal status
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/21 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Pseudomonadaceae (F)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/24 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
    • C07K14/245 Escherichia (G)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/34 Vector systems having a special element relevant for transcription being a transcription initiation element

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Microbiology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention provides a bacterial light-operated gene expression system, which comprises: a) a gene encoding a recombinant light-sensitive transcription factor, said recombinant light-sensitive transcription factor comprising a first polypeptide that is a DNA binding domain and a second polypeptide that is a light-sensitive domain; b) a target transcription unit comprising a promoter containing at least one response element recognized/bound by said first polypeptide, or a promoter-response element, or a response element-promoter, and a nucleic acid sequence to be transcribed. The invention also provides a bacterial expression vector containing the light-operated gene expression system, a method for regulating gene expression in bacteria using the light-operated gene expression system, and a kit containing each component of the light-operated gene expression system. The bacterial light-operated gene expression system offers rapid, efficient and strong induction; it is safer than other inducers, with low or no toxicity; it can control gene expression in both time and space; and it can be used to regulate the life activities of bacteria.

Description

Bacterial light-operated gene expression system and method for regulating and controlling gene expression
Technical Field
The invention relates to the field of genetic engineering or synthetic biology in biotechnology, in particular to the field of gene expression regulation, and more particularly relates to a light-dependent gene expression system based on photosensitive protein in bacterial cells, and a method for regulating gene expression in bacterial cells by adopting the expression system.
Background
In the field of genetic engineering, accurate control of gene expression is of great importance for studying gene function and the life activities of organisms. Bacterial gene expression systems are much simpler than the complex gene expression systems of eukaryotes. Taking the most widely used bacterium, E. coli, as an example, the first step in gene expression is the transcription of DNA into RNA by RNA polymerase. The RNA polymerase of E. coli consists of five subunits, has a molecular weight of about 480 kDa, and contains four different polypeptides, α, β, β' and σ, of which α is present in two copies, so that the holoenzyme has the composition α2ββ'σ. The α subunit is involved in the assembly of the tetrameric core of RNA polymerase (α2ββ'); the β subunit contains the binding site for nucleoside triphosphates; the β' subunit contains the binding site for the DNA template; the σ factor is involved only in the initiation of transcription and not in chain elongation. Once transcription has begun, the σ factor is released and chain elongation is catalyzed by the tetrameric core enzyme. The function of the σ factor is therefore to recognize the transcription initiation signal and to direct RNA polymerase to bind the promoter site. The initiation signal on the DNA molecule is called the promoter. The E. coli promoter consists mainly of the -10 region and the -35 region. The -10 region is located about 10 bp upstream of the transcription start point and contains the 6-base conserved sequence TATAAT, which is the tight binding site of RNA polymerase; the -35 region, located about 35 bp upstream of the transcription start point, contains the 6-base conserved sequence TTGACA and provides the signal recognized by the RNA polymerase holoenzyme. The strength of an E. coli promoter depends mainly on the base composition of the -10 and -35 regions and on the length of the spacer between them.
Although the core enzyme alone can bind DNA, this binding is due mainly to non-specific electrostatic attraction between the basic protein and the acidic nucleic acid, and the DNA remains in duplex form; the σ subunit alters the affinity between RNA polymerase and DNA, greatly increasing the binding constant and residence time of the enzyme at the promoter. Initially, the core enzyme contacts the DNA molecule in the presence of the σ subunit to form a non-specific complex. Such a complex is very unstable, and the enzyme can slide along the DNA strand. With the help of the σ subunit, the holoenzyme rapidly finds the promoter and binds to it, forming a relatively loose closed promoter complex. At this stage the enzyme binds to the outside of the DNA, and the recognition site is approximately at the -35 position of the promoter. A conformational change in the DNA then yields the open promoter complex, in which the enzyme binds tightly to the promoter, unwinds the DNA double strand at the -10 position and recognizes the template strand. This site is rich in A-T base pairs, which facilitates DNA melting. Once the open complex is formed, the DNA continues to melt and the enzyme moves toward the start site. Another widely used bacterium is Bacillus, which is Gram-positive. Unlike E. coli, Bacillus contains multiple RNA polymerases and σ factors, each recognizing a different promoter sequence.
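As a rough illustration of the -35 and -10 promoter regions described above, the following Python sketch scans a DNA sequence for the best pair of hexamers resembling the commonly cited consensus sequences TTGACA (-35) and TATAAT (-10). The function names, the simple match-count score and the 15-19 bp spacer range are assumptions made for this example; they are not taken from the patent and do not model real promoter strength.

```python
# Consensus hexamers commonly cited for E. coli promoters.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def matches(site: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def best_promoter_hit(seq: str, spacers=range(15, 20)):
    """Return (score, start_of_minus_35, spacer_length) for the best hit.

    score is the number of matching bases out of 12. The 15-19 bp spacer
    range is an illustrative assumption, not a value from the text.
    Returns None if the sequence is too short to hold both hexamers.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 6 + 1):
        for gap in spacers:
            j = i + 6 + gap  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break
            score = matches(seq[i:i + 6], MINUS_35) + matches(seq[j:j + 6], MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, gap)
    return best


if __name__ == "__main__":
    # Hypothetical test sequence: perfect hexamers separated by 17 bp.
    demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
    print(best_promoter_hit(demo))
```

A match count is only a crude proxy; it shows where the two regions sit relative to each other and how the spacer length enters the search.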
Bacterial gene expression systems fall into two main types. The first is constitutive expression, in which the target protein is expressed autonomously and continuously without induction. The second is inducible expression, which can be divided, according to the inducer, into systems induced by small chemical molecules and systems induced by physical methods. Among the chemically induced systems, IPTG is the most commonly used inducer. IPTG, an analogue of lactose, is an extremely potent inducer that is not metabolized by bacteria and is very stable. Most of the commonly used expression vectors, including those with the T7, Lac, Tac and Grac promoters, use IPTG as the inducer. Arabinose- and tryptophan-induced expression systems are also increasingly used; these small molecules are non-toxic to cells and allow tight regulation. The discovery of sensing proteins for metal ions such as Mn2+, Fe2+ and Cu+ has drawn attention to inducing protein expression by binding metal ions to the corresponding sensing protein. Physically induced expression systems commonly use temperature changes to induce protein expression, for example temperature-sensitive LacI mutants in E. coli, which repress the promoter at 30 °C and are inactivated at 42 °C, thereby losing the ability to repress the promoter. The ultraviolet (UV)-regulated "caging" technique is another commonly used method of physically induced expression [Keyes, W.M., et al., Trends Biotechnol, 2003. 21(2): 53-55] [1].
Although the bacterial inducible expression systems described above are widely used, they still have some disadvantages: (1) the inducer itself can be highly toxic to cells and expensive (e.g., IPTG), and is not suitable for preparing recombinant proteins for medical use; (2) in metal-ion-induced expression systems, the sensing proteins do not recognize metal ions with high specificity: different metal ions of the same group or the same period can be bound by the same sensing protein and activate transcription, so the various metal ions present inside bacterial cells can interfere with transcription, and the oxidizing environment of the bacterial cell can oxidize low-valence metal ions, interfering with transcriptional activation by metal ions that are sensitive to the redox environment; (3) in temperature-induced expression systems, the rise in external temperature activates the heat-shock proteins of E. coli, some of which are proteases that can affect product stability, and many target proteins are difficult to fold correctly at high temperature; the UV-induced caging technique can cause irreversible damage to cells; (4) most importantly, chemical inducers can regulate gene expression only in time, not spatially in specific cells and tissues.
Light, however, is an inducer that is easy to manipulate in time and space, generally lacks the toxicity described above, and is readily available. In recent years, light-regulated proteins (also called light-sensitive proteins) have been found in the biological clock systems of some organisms, and light has a significant effect on their function. We conceived of modifying a naturally occurring transcription factor by molecular design to synthesize an artificial light-responsive transcription factor, and then building a light-operated gene expression system in bacterial cells. In 2005, Anselm Levskaya et al. reported a light-regulated protein expression system based on the phytochrome Cph1 and the endogenous EnvZ/OmpR two-component signaling pathway of E. coli [Levskaya, A., et al., Nature, 2005. 438(7067): p. 441-2] [2]. In the dark, the light-sensitive transcription factor autophosphorylates and then binds the OmpR-dependent ompC promoter, initiating transcription and expression of the gene of interest. Under red light, autophosphorylation of the light-sensitive transcription factor is inhibited; it cannot bind the ompC promoter and therefore cannot activate transcription and expression of the target gene. In the following years, the same research group modified this system several times, obtaining a system in which light of several colors jointly regulates expression of a protein of interest [Tabor, J.J., et al., J Mol Biol, 2011. 405(2): p. 315-24; Tabor, J.J., et al., Cell, 2009. 137(7): p. 1272-81] [3,4]. Keith Moffat's group developed another novel recombinant light-sensitive transcription factor, YF1, built from the blue-light-sensitive protein YtvA of Bacillus subtilis and the FixL protein of Bradyrhizobium japonicum [Moglich, A., et al., J Mol Biol, 2009. 385(5): p. 1433-44; Ohlendorf, R., et al., J Mol Biol, 2012. 416: p. 534-542] [5,6]. The YF1-based light-operated gene expression system represses expression of the target protein under blue light and expresses the target gene at a high level in the absence of blue light. In 2017, Christopher A. Voigt's group combined the above systems to design a genetically encoded system that allows E. coli to distinguish red, green and blue (RGB) light and respond by altering gene expression [Fernandez-Rodriguez, J., et al., Nat Chem Biol, 2017. 13(7): p. 706-708] [7]. However, these prokaryotic light-operated gene expression systems have significant limitations. The system based on the phytochrome Cph1 is very complex: in addition to the light-sensitive transcription factor and the reporter system, two genes, ho1 and pcyA, must be introduced to convert heme into the phycocyanobilin required for the system to function, which greatly increases the work of constructing the system. The system based on the light-sensitive transcription factor YF1 still shows considerable leaky expression under blue light, and its induction is only tens of fold, so the expression level of the target gene cannot be precisely regulated. These disadvantages limit the use of these two light-operated expression systems in bacteria.
In 2016, Jayaraman et al. developed EL222-based bacterial light-activated and light-repressed gene expression systems that contain only a single light-operated transcription factor, EL222. When EL222 binds upstream of the -35 region of the promoter, it activates transcription and expression of the target gene by recruiting other components of RNA polymerase, with an activation of about 5-fold; when EL222 binds the region between the -35 and -10 regions of the promoter, expression of the target gene is repressed, by less than 3-fold. The regulation fold of these systems is clearly low, and they have little practical value. In the same year, Chen et al. of Yi Yang's group developed LightOff, an E. coli light-operated gene expression system based on a single-component light-sensitive transcription factor, with an induction of nearly 10,000-fold, extremely low background leaky expression, and an activated expression level similar to that of the T7 promoter. The system has rapid induction kinetics and good reversibility, and its regulation of gene expression depends on light intensity. The reversibility and light sensitivity of the system can also be tuned by substituting VVD mutants. To avoid interference from the wild-type LexA protein in bacteria, the DNA binding domain of LexA and the SOS box it recognizes were mutated to the LexA408 mutant and the DNA sequence recognized by LexA408; the results showed that the LexA408 mutant regulates gene expression well in strains containing wild-type LexA. The LightOff system can be used in biological research for many purposes, such as light-controlled expression of toxic proteins, light-controlled bacterial lysis, light-controlled bacterial motility, and fermentation for large-scale protein production.
The LightOff system is a very stringent gene expression system, but it is a light-repressed system: blue light irradiation suppresses gene expression, and darkness activates it. The growth state of the bacteria therefore cannot be observed at any time during culture or fermentation, and operating in the dark is very inconvenient. Although introducing a cI/PλO12 switch can convert LightOff from a light-repressed into a light-activated gene expression system, the converted system has a low expression level and a slow response, so it is not an ideal bacterial light-activated gene expression system.
In conclusion, several E. coli light-operated gene expression systems have been developed since 2005, but they suffer from drawbacks such as complex components, dependence on exogenous cofactors, low induction fold, or light-repressed gene expression. In line with common operating habits, it is more desirable to switch on expression of a target gene by applying light than to repress it with light. In addition, a light-activated gene expression system allows the state of the research object to be observed at any time, avoids the inconvenience of working in the dark that a light-repressed system entails, and can be widely used for studying the life activities of bacteria and for green biomanufacturing.
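The induction fold used above to compare systems (about 5-fold, tens of fold, nearly 10,000-fold) is simply the ratio of reporter signal in the induced state to that in the uninduced state. A minimal Python sketch follows, assuming a reporter readout with an optional background to subtract; the function name and the numbers in the usage example are hypothetical and are not data from the patent.

```python
def fold_induction(induced: float, uninduced: float, background: float = 0.0) -> float:
    """Ratio of background-corrected reporter signal, induced over uninduced.

    `background` is the signal of a blank or reporter-free control
    (an assumption of this sketch; set it to 0.0 if none was measured).
    """
    on = induced - background
    off = uninduced - background
    if off <= 0:
        raise ValueError("uninduced signal must exceed background")
    return on / off


if __name__ == "__main__":
    # Hypothetical fluorescence readings in arbitrary units.
    print(fold_induction(induced=50500.0, uninduced=600.0, background=500.0))
```

Because the uninduced (leaky) signal is the denominator, a small amount of leaky expression lowers the fold sharply, which is why low background matters as much as high induced expression.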
In addition to the light-sensitive proteins Cph1 and YtvA used in the systems above, other known light-sensitive proteins include those with flavin as the chromophore (also called flavoprotein-family blue-light receptors), which fall into three classes: the first is photoreceptor proteins containing a light-oxygen-voltage (LOV) domain, such as phototropins; the second is cryptochromes, which resemble photolyases (photolyase-like cryptochromes); the third, discovered only in recent years, is proteins containing the BLUF (blue light using FAD) domain.
The largest class consists of photoreceptor proteins containing a LOV domain; the most studied include phot1, WC-1, WC-2, PYP (photoactive yellow protein), Phy3, VIVID, RsLOV, DsLOV and EL222. Phototropins are usually membrane-associated kinase proteins that autophosphorylate under blue light, changing their kinase activity and thereby regulating the related physiological processes. Most phototropins have a serine/threonine kinase domain at the C terminus and two LOV domains at the N terminus, each of which binds a flavin molecule. Under blue light, the LOV domain and the flavin molecule bind covalently to form a flavin-cysteinyl adduct, which changes the conformation of the flavin-binding pocket and in turn the activity of the C-terminal kinase domain; the process is fully reversible. To date, the most successful light-operated expression system established in eukaryotic cells is based on the VIVID light-sensitive protein. Yang's group [Wang, X., et al., Nat Methods, 2012, p. 266-269] [8] built a eukaryotic light-regulated gene expression system on the principle that the blue-light-sensitive protein VIVID of Neurospora crassa forms homodimers upon blue light irradiation.
In this system, the light-sensitive transcription factor consists of three or four polypeptides. Blue light irradiation changes the dimerization capacity of the recombinant light-sensitive transcription factor; the dimerized transcription factor binds the response element in the nucleotide sequence of the target transcription unit and, through the transcription activation/repression domain of the third polypeptide of the fusion protein and the transcription cofactors it recruits from the host cell, acts together with the promoter of the target transcription unit to regulate (initiate or repress) transcription and expression of the gene of interest. The system has simple components and offers rapid induction, high induction fold, good reversibility, and high temporal and spatial specificity; it is considered the best eukaryotic light-operated gene expression system to date. Unfortunately, it cannot be applied to bacteria because the transcription and translation mechanisms of bacteria and eukaryotic cells differ. In addition, Masayuki Yazawa et al. [Yazawa, M., et al., Nat Biotechnol, 2009. 27(10): p. 941-5] [9] built a eukaryotic light-regulated gene expression system on the principle that the Arabidopsis FKF1 (flavin-binding, kelch repeat, F-box 1) and GI (GIGANTEA) proteins interact after blue light excitation, but its low induction fold and rather complex components greatly limit its application. Laura B. Motta-Mena et al. constructed a light-activated gene expression system with rapid activation (<10 s) and inactivation (<50 s) kinetics using the LOV-domain-containing DNA binding protein EL222.
Cryptochromes from Arabidopsis thaliana were the first blue-light-sensitive proteins isolated from plants. Much work has been done on cryptochrome 1 (CRY1), cryptochrome 2 (CRY2), phytochrome A (phyA) and phytochrome B (phyB), whose main function is to regulate the growth and movement of higher plants in response to the circadian light cycle. The amino acid sequence and chromophore composition of cryptochromes are very similar to those of photolyases. Most cryptochromes are 70-80 kDa in size and comprise a relatively conserved N-terminal PHR (photolyase-related) domain, which binds flavin non-covalently, and an uncharacterized C-terminal domain that varies greatly in length. Based on the interaction of Arabidopsis CRY2 (cryptochrome 2) and CIB1 (cryptochrome-interacting basic helix-loop-helix 1) after blue light excitation, a light-operated expression system in eukaryotic cells has been constructed [Kennedy, M.J., et al., Nat Methods, 2010. 7(12): p. 973-5] [10].
Unlike photoreceptor proteins containing the LOV domain, blue-light receptor proteins containing the BLUF domain do not form a covalent adduct with the flavin chromophore upon photoactivation; instead, a conformational change of the chromophore causes a 10 nm red shift in the flavin absorption. The most studied BLUF-domain photoreceptor is AppA, an anti-repressor protein of Rhodobacter sphaeroides. In the dark, AppA binds the transcription factor PpsR in the cell to form an AppA-PpsR2 complex, which prevents PpsR from binding DNA; intense blue light dissociates AppA from the complex, releasing PpsR, which forms tetramers that bind specific DNA sequences and inhibit transcription of the relevant genes [Pandey, R.] [11].
Haifeng Ye et al. [Ye, H., et al., Science, 2011. 332(6037): p. 1565-8] [12] constructed a blue-light-excited eukaryotic light-operated expression system using melanopsin and an intracellular signaling pathway. Melanopsin is a photopigment on the surface of certain retinal cells. Under blue light, melanopsin triggers a rapid influx of calcium ions into the cell; after a signaling cascade, calmodulin activates calcineurin, a serine/threonine phosphatase, which dephosphorylates the transcription factor NFAT. Dephosphorylated NFAT enters the nucleus, binds NFAT-specific promoters, and initiates transcription and expression of the target gene. The biggest disadvantage of this system is its complexity and its reliance on intracellular signaling pathways, which makes it less stable and may disturb normal cellular activities by affecting those pathways.
Compared with eukaryotic cells, bacteria proliferate rapidly, are cheap to culture and can express foreign proteins efficiently (in some cases more than 90% of total bacterial protein), which makes them better suited as host cells for producing large amounts of target protein. However, as noted above, most gene expression systems widely used in bacteria are regulated by chemical inducers. Although these offer good induction, low background and high expression strength, many have pleiotropic effects and therefore broad side effects and potential cytotoxicity. More importantly, chemical inducers cannot regulate gene expression with spatial precision. Few systems regulate gene expression by physical means (such as temperature), and raising the temperature has many side effects. The few gene expression systems based on light-sensitive proteins are limited in application by their complexity or low induction fold.
In conclusion, we reasoned that a new and better light-operated gene expression system could be created in bacteria by combining the advantages of these systems, overcoming the shortcomings of earlier work, and that such a system could be widely applied in biomedical research. Through careful research, we have invented a novel bacterial light-operated gene expression system that has good gene expression regulation capability and can regulate gene expression in both time and space.
Therefore, the first object of the present invention is to provide a novel bacterial light-controlled gene expression system.
It is a second object of the present invention to provide a method for regulating gene expression in bacterial cells using the light-operated gene expression system.
A third object of the present invention is to provide a bacterial expression vector containing the light-operated gene expression system.
A fourth object of the present invention is to provide the use of the light-operated gene expression system in regulating the life activities of bacteria (such as bacterial movement and division).
A fifth object of the present invention is to provide a kit comprising a bacterial strain that carries the expression vector of the bacterial light-operated gene expression system, or that has the light-sensitive transcription factor expression cassette of the expression system integrated into its genome.
Summary of The Invention
The invention provides a bacterial light-operated gene expression system, which comprises: a) a gene encoding a recombinant light-sensitive transcription factor, said recombinant light-sensitive transcription factor comprising a first polypeptide as a DNA binding domain and a second polypeptide as a light-sensitive domain, wherein said second polypeptide is selected from the LOV domain RsLOV of Rhodobacter sphaeroides and the LOV domain DsLOV of Dinoroseobacter shibae, truncations thereof, or mutants thereof having 15%-99% amino acid sequence identity or 36%-99% amino acid sequence similarity; b) a target transcription unit comprising a promoter containing at least one response element recognized/bound by the first polypeptide, or a promoter-response element, or a response element-promoter, and a nucleic acid sequence to be transcribed.
In the bacterial light-operated gene expression system, the recombinant photosensitive transcription factor is a recombinant photosensitive DNA binding protein, and the binding capacity of the photosensitive DNA binding protein and a corresponding reaction element is obviously changed before and after illumination, so that the transcription and translation of a target gene are directly started or inhibited.
The first polypeptide of the recombinant light-sensitive transcription factor is a DNA binding domain that specifically recognizes the response element but, on its own, cannot bind the response element or binds it only weakly, and requires the second polypeptide to assist its binding to the response element. The first polypeptide is operably linked to the second polypeptide.
The first polypeptide may be selected from the group consisting of a helix-turn-helix DNA binding domain, a zinc finger motif or zinc cluster DNA binding domain, a leucine zipper DNA binding domain, a winged helix-turn-helix DNA binding domain, a helix-loop-helix DNA binding domain, a high mobility group DNA binding domain, and a B3 DNA binding domain. The second polypeptide is a light-sensitive domain, typically from a light-sensitive protein with flavin as the chromophore. The first polypeptide and the second polypeptide may be linked directly or operably linked via a linker peptide. The number of amino acids in the linker peptide may vary (e.g., 0-10 or more).
The first polypeptide may further be selected from the group consisting of E.coli LexA408DNA binding domain, LexA DNA binding domain, lambda phage cI repressor protein DNA binding domain, yeast Gal4 DNA binding domain, tetracycline combination protein TetR DNA binding domain, and its truncated body or/and amino acid sequence homology 80% -99% of the mutant.
The second polypeptide is selected from the group consisting of a photosensitizing domain of a photosensitizing protein that contains a flavin-based chromophore and a photosensitizing domain of a photosensitizing protein that contains an LOV domain. Further selected from the RsLOV of Rhodopseudomonas globosa and DsLOV domain of Shigella and truncations thereof or mutants having 15% -99% identical amino acid sequence or 36% -99% similar amino acid sequence.
In the bacterial light-operated gene expression system, the promoter-nucleic acid sequence to be transcribed, or the promoter-response element-nucleic acid sequence to be transcribed, or the response element-promoter and the nucleic acid sequence to be transcribed in the target transcription unit can be directly or operatively connected.
The response element is a DNA motif that is specifically recognized and bound by the first polypeptide. The response element is selected from LexA408 binding elements, LexA binding elements, cI binding elements, Gal4 binding elements and TetR binding elements.
The promoter is selected from the colE408 promoter, colE promoter, sulA promoter, recA promoter, umuDC promoter and lac minimal promoter of Escherichia coli, the T7 promoter of T7 phage, the O12 promoter of lambda phage, and the Grac promoter of Bacillus subtilis.
According to the bacterial light-operated gene expression system of the present invention, the recombinant light-sensitive transcription factor may further comprise additional polypeptides, such as a third polypeptide that can recruit other components of the RNA polymerase. The third polypeptide may be linked to the first and second polypeptides directly or via a linker peptide. The third polypeptide can be selected from the omega factor and alpha factor of Escherichia coli, or mutants thereof having a 36%-99% similar amino acid sequence.
The invention also relates to a bacterial expression vector containing the light-operated gene expression system. The expression vector can be a bacterial expression vector containing only the gene encoding the recombinant light-sensitive transcription factor, or a bacterial expression vector containing only a target transcription unit in which the promoter, promoter-response element or response element-promoter is present but the nucleic acid sequence to be transcribed is left vacant. Alternatively, it may be a bacterial expression vector containing both a gene encoding a recombinant light-sensitive transcription factor and a target transcription unit.
The recombinant light-sensitive transcription factor encoding gene in the expression vector is selected from the group consisting of sequences 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 and 46.
The invention also relates to a bacterial strain in whose genome the expression cassette of the light-sensitive transcription factor of the light-operated gene expression system is integrated.
The present invention also relates to a method for regulating gene expression in a bacterial cell using the light-operated gene expression system of the present invention, comprising the steps of:
a) constructing the bacterial light-operated gene expression system in a bacterial expression vector;
b) introducing the vector into a bacterial cell containing the gene to be regulated; and
c) inducing the bacterial cell by illumination so that the regulated nucleotide sequence is expressed in the bacterial cell.
The method for regulating gene expression in bacteria of the present invention relates to the selection of light source and the selection of irradiation method. Light sources include, without limitation, LED lamps, incandescent lamps, fluorescent lamps, lasers; the irradiation method comprises the selection of the illumination quantity, the illumination time, the illumination intensity and the illumination frequency. Spatially controlling the expression of the gene of interest by scanning, projection, optical patterning, and the like is also included in the scope of the present invention.
The invention also relates to a method for regulating and controlling the life activities of bacteria, such as movement, division and the like by using the light-operated gene expression system.
The invention also relates to a kit comprising a bacterial (e.g., Escherichia coli) expression vector containing the light-operated gene expression system, or a bacterial strain in whose genome the expression cassette of the light-sensitive transcription factor of the light-operated gene expression system is integrated, together with corresponding instructions. The nucleic acid sequence to be transcribed in the target transcription unit of the bacterial expression vector in the kit of the invention may be left vacant.
Detailed Description
The invention provides a photosensitive polypeptide-based, light-controlled bacterial gene expression system for temporally and spatially regulating the expression of a gene of interest in a bacterial cell. The light-operated gene expression system of the invention involves at least two components: the first part is a nucleotide sequence encoding a recombinant light-sensitive transcription factor fusion protein capable of being expressed in a bacterial cell, the fusion protein consisting of two polypeptides, wherein the first polypeptide is its DNA recognition/binding domain and the second polypeptide is a light-sensitive polypeptide; the second part is a target transcription unit nucleotide sequence consisting of a promoter-nucleotide sequence to be transcribed or a promoter-response element-nucleotide sequence to be transcribed or a response element-promoter-nucleotide sequence to be transcribed, wherein the response element is a DNA nucleotide sequence recognized/bound by the first polypeptide of the recombinant light-sensitive transcription factor fusion protein. The two polypeptides of the first part preferably employ truncated functionally active fragments (i.e., domains) of the protein of interest. The first part and the second part of the light-operated gene expression system of the invention can be constructed in one bacterial expression vector or in two bacterial expression vectors respectively by genetic engineering techniques. 
In use, the two components are transformed into a bacterial cell by a conventional transformation method, or the first part is integrated into the genome of the bacterial cell by a conventional gene knockout method, so that the recombinant light-sensitive transcription factor fusion protein of the present invention is expressed. Irradiation with light of an appropriate wavelength changes the dimerization capacity of the second, light-sensitive polypeptide. The dimerized fusion protein can bind to the response element in the nucleotide sequence of the target transcription unit of the second part, and either directly inhibits transcription and expression of the downstream gene of interest by blocking the binding of RNA polymerase to the promoter region, or initiates transcription and expression of the downstream gene by recruiting other components of RNA polymerase to the promoter region.
The bacterial light-operated gene expression system provided by the invention can utilize light irradiation which does not damage cells to temporally and spatially regulate the expression of target protein genes in bacterial cells.
The bacterial light-operated gene expression system can regulate the expression of target protein genes in bacterial cells by utilizing the difference of different illumination conditions such as different illumination intensities and the like.
The bacterial light-operated gene expression system can control the vital activities of bacteria, such as movement, division and the like, by inducing the expression of certain proteins through light. The light used is cheap, readily available and non-toxic to the cells.
Definitions and explanations of terms used herein
The "host cell" in this patent refers to a bacterial cell, which may be an original, unmodified bacterial cell; a commercial bacterial strain with a modified genome, such as the commonly used laboratory strains TOP10, BL21(DE3), JM109(DE3) and Mach1; or a strain obtained by further genome modification of such a commercial strain, such as the Escherichia coli strain TOP10 (ftsZ-) obtained in the present invention by knocking out the ftsZ gene in the genome. It may also be any other host cell compatible with the light-operated gene expression system.
"Target protein", which may also be referred to as "protein of interest", refers to any useful protein, such as a useful biological protein that can be used for prophylactic or therapeutic purposes or for other uses, and that needs to be expressed in E. coli cells, including naturally occurring or artificially modified or mutated useful proteins.
The "reporter protein" is a kind of target protein, and means a useful protein whose expression is easily detected. To facilitate testing of the efficacy of the light-sensitive polypeptide-based, light-inducible target protein gene expression system of the present invention, known and widely used reporter proteins can be selected, such as red fluorescent protein (mCherry) and the like. However, the light-inducible target protein gene expression system of the present invention is not limited to the expression of a reporter protein, and can be used to express any useful target protein.
"Gene", the "coding nucleotide sequence" of a protein, and the "nucleotide sequence to be transcribed" are used interchangeably in the same sense herein and refer to a natural or recombinant DNA sequence carrying codons for the amino acid sequence information of a protein.
"Gene encoding a protein of interest", "nucleotide sequence encoding a protein of interest" or simply "gene of interest" are used interchangeably herein to refer to a gene encoding a protein of interest, typically a double-stranded deoxyribonucleic acid (DNA) sequence. Such genes may be contained in the genomic DNA sequence of the host cell or in an artificially constructed expression vector, such as the target transcription unit sequence of the present invention. Similarly, a "reporter gene" refers to a gene encoding a reporter protein.
"transcription" as used herein refers specifically to the transcription of a gene of interest by RNA polymerase in a bacterial cell to produce RNA carrying the information of the gene.
"expression", "gene expression of a target protein" and "gene expression" are used interchangeably herein to mean that the DNA sequence of the target gene is transcribed to produce RNA (mRNA or antisense RNA) carrying the information of the gene, and that the information carried by the RNA is translated on ribosomes to produce the target protein; that is, both transcription to produce messenger RNA and translation to produce the target protein are called expression. Both meanings are included herein, with the term primarily referring to the production of the protein of interest.
"transcriptional regulation" refers herein exclusively to the regulation of gene transcription in bacterial cells.
"transcription factor" or "transcription factor fusion protein" are used interchangeably herein to refer to a bacterial transcription factor, typically a protein, which may be native, artificially engineered or artificially fused. It comprises a polypeptide capable of recognizing/binding a response element in the nucleotide sequence of a target transcription unit and, by itself or together with other recruited components of RNA polymerase, regulates the transcription of the gene of a protein of interest by binding to and interacting with the response element in the target transcription unit. Transcription factors are classified as "transcription activators" or "transcription repressors" depending on their composition; both are simply called "transcription factors" herein.
"target transcription unit" refers to an artificial DNA sequence (not a protein) consisting of a promoter containing a response element and a gene of interest located downstream of the promoter. The response element is located within the promoter sequence, upstream of the -35 region of the promoter, or downstream of the -10 region of the promoter. The promoter and the gene of interest may be directly linked or operably linked to each other (i.e., they may be separated by several nucleotides).
"response element" refers to one or more cis DNA motifs that are specifically recognized/bound by a transcription factor. Different transcription factors have different corresponding response elements, and a transcription factor comprises a binding domain capable of binding to such a DNA motif. When a transcription factor binds specifically to its corresponding response element, the response element confers responsiveness on the promoter: the transcription factor either represses promoter activity by itself or activates it by recruiting other components of RNA polymerase, thereby blocking or activating transcription of the downstream gene of interest into the corresponding RNA. In the present invention, the response element refers to a DNA motif that is specifically recognized/bound by the first polypeptide of the recombinant light-sensitive transcription factor; for example, the response element recognized/bound by LexA408 is a DNA motif 16 bp in length (SEQ ID NO: 6).
"promoter" refers to a DNA sequence that initiates and directs transcription of the target gene downstream of it to produce RNA, and is a regulatory sequence essential for gene expression. The promoter may be the promoter of a native gene or an artificially modified promoter. A bacterial promoter contains two separate, highly conserved nucleotide sequences that are important for mRNA synthesis. An A/T-rich region of 6-8 bases located 5-10 bp upstream of the transcription start site is called the Pribnow box, also known as the TATA box or -10 region. The base sequence of the Pribnow box varies slightly from promoter to promoter. At 35 bp upstream of the transcription start site there is a region of 10 bp, called the -35 region. During transcription, E. coli RNA polymerase recognizes and binds to the promoter. The -35 region binds to the sigma subunit of RNA polymerase and the -10 region binds to the core enzyme of RNA polymerase; the DNA is unwound near the transcription start site to form single strands, RNA polymerase forms a phosphodiester bond between the first and second nucleotides, and the polymerase then advances to form the nascent RNA strand.
"vector", "expression vector", "gene expression vector", "recombinant gene expression vector" or "plasmid" are used interchangeably herein to mean a vector capable of expressing a recombinant protein of interest in a bacterial cell.
"transformation" refers to the chemical or physical introduction of a foreign plasmid DNA molecule into a bacterial cell. Specific methods can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, second edition, Cold Spring Harbor Laboratory Press (1989)) and other relevant textbooks.
In the light-operated bacterial gene expression system, the recombinant light-sensitive transcription factor of the first part is a fusion protein formed by connecting two or three functional polypeptide fragments in series, either directly through peptide bonds or through short peptide linkers. Under dark conditions, the fusion protein forms a homodimer that binds to the response element in the nucleotide sequence of the target transcription unit of the second part and represses transcription and expression of the target protein gene in the target transcription unit. Under irradiation with light of an appropriate wavelength, the fusion protein dissociates from the dimer into monomers and is released from the response element, so that transcription and expression of the target protein gene in the target transcription unit are initiated. In either case, the fusion protein acts on the promoter region by itself or by recruiting other components of the host cell's RNA polymerase to repress or initiate transcription and expression of the target protein gene in the target transcription unit.
As used herein, the term "recombinant light-sensitive transcription factor fusion protein" is used interchangeably with the term "recombinant light-sensitive transcription factor".
The recombinant light-sensitive transcription factor of the invention comprises a first polypeptide that specifically recognizes a response element in the nucleotide sequence of the target transcription unit but, on its own, cannot bind the response element or binds it only weakly; it binds the response element only after homodimerization assisted by the second polypeptide. The first polypeptide may be derived from the DNA recognition/binding domain of any known protein, preferably a DNA recognition/binding domain with a shorter amino acid sequence, including natural and synthetic DNA recognition/binding domains or analogues thereof (e.g., binding domain mutants or truncations that retain or even enhance binding capacity). The first polypeptide useful in the present invention is selected from the group consisting of a helix-turn-helix DNA binding domain, a zinc finger motif or zinc cluster DNA binding domain, a leucine zipper DNA binding domain, a winged helix DNA binding domain, a winged helix-turn-helix DNA binding domain, a helix-loop-helix DNA binding domain, a high mobility group DNA binding domain and a B3 DNA binding domain. The first polypeptide preferably includes, but is not limited to: the DNA recognition/binding domain of the LexA408 protein (SEQ ID NO: 1), the DNA recognition/binding domain of the LexA protein (SEQ ID NO: 2), the DNA recognition/binding domain of the lambda phage cI repressor protein (SEQ ID NO: 3), the DNA recognition/binding domain of yeast Gal4 (SEQ ID NO: 4), the DNA recognition/binding domain of the tetracycline repressor protein TetR (SEQ ID NO: 5), and the like, as well as truncations thereof and/or mutants thereof having 80%-99% amino acid sequence homology. More preferred are the DNA recognition/binding domains of LexA408 and cI.
The LexA protein is a transcriptional repressor present in Escherichia coli cells that regulates the transcription of more than 20 genes. After homodimerization, the LexA dimer recognizes/binds the 16-base palindromic response element CTGT(N)8ACAG in the promoter, thereby preventing RNA polymerase from transcribing the downstream genes. LexA contains 202 amino acids, of which amino acids 1-87 are the DNA recognition/binding domain and amino acids 88-202 are the dimerization domain; only homodimerized LexA can specifically bind to the corresponding response element, while monomeric LexA cannot. LexA normally exists in E. coli cells as a dimer. When cells are stimulated by internal or external SOS signals, dimerized LexA is cleaved by certain intracellular enzymes (e.g., RecA) and dissociates from the response elements, so that the genes originally repressed by LexA are activated and E. coli initiates its repair functions in response to the SOS signal [Schnarr, M. et al., Biochimie, 1991. 73(4): p. 423-31; Little, J.W. et al., Cell, 1982. 29(1): p. 11-22] [13,14]. LexA408 is a mutant LexA protein containing the three point mutations P40A, N41S and A42S; it specifically recognizes the sequence 5'-CCGT(N)8ACGG-3', which is not recognized by the wild-type LexA protein. LexA-based two-hybrid systems have also been used to study gene expression and protein interactions. For example, the yeast two-hybrid system (MATCHMAKER LexA Two-Hybrid System) manufactured by Clontech is based on this system.
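The difference between the wild-type LexA and LexA408 recognition motifs can be illustrated with a short pattern search. The following Python sketch is illustrative only: the function name `find_operator_sites` and the example DNA string are hypothetical, and the patterns encode only the consensus strings quoted in the paragraph above, not the complete response elements of SEQ ID NO: 6 or SEQ ID NO: 7.

```python
import re

# Consensus motifs as quoted in the text; (N)8 is any eight bases.
# Written as lookaheads so that overlapping sites are also reported.
OPERATOR_PATTERNS = {
    "LexA": re.compile(r"(?=CTGT[ACGT]{8}ACAG)"),
    "LexA408": re.compile(r"(?=CCGT[ACGT]{8}ACGG)"),
}

def find_operator_sites(dna, protein):
    """Return 0-based start positions of 16-bp sites matching the consensus."""
    pattern = OPERATOR_PATTERNS[protein]
    return [match.start() for match in pattern.finditer(dna.upper())]

# Hypothetical test sequence: one wild-type site and one LexA408 site.
example = (
    "TT"
    + "CTGT" + "ATATATAT" + "ACAG"   # wild-type LexA consensus
    + "GG"
    + "CCGT" + "AAAAAAAA" + "ACGG"   # LexA408 consensus
    + "TT"
)

print(find_operator_sites(example, "LexA"))
print(find_operator_sites(example, "LexA408"))
```

In this sketch each protein label matches only its own half-sites, which mirrors the statement that the LexA408 motif is not recognized by wild-type LexA. A consensus match is of course only a necessary condition for binding, not a prediction of affinity.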
The cI protein is a transcriptional repressor encoded by the lambda phage cI gene. It prevents transcription from both the left and right lambda early promoters, so that the proteins required for replication and cell lysis are not produced. The cI protein contains 236 amino acids; the N-terminus is the DNA recognition/binding domain (amino acids 1-102) and the C-terminus is the dimerization domain (amino acids 132-236). The cI homodimer recognizes/binds the two operator sequences of PL and PR, each of which contains three cI recognition/binding sites, designated OL1, OL2 and OL3 for PL and OR1, OR2 and OR3 for PR. cI binds relatively strongly to OR1, whose conserved DNA sequence is TACCTCTGGCGGTGATA, whereas the monomeric cI protein has little binding ability [Burz, D.S. et al., Biochemistry 33(28), 8399-] [15,16].
Gal4 is a transcriptional activator of Saccharomyces cerevisiae (encoded by the GAL4 gene) that recognizes/binds the upstream activating sequence UASG, a response element upstream of gene promoters [a 17 bp stretch, 5'-CGGRNNRCYNYNYNCNCCG-3', where R represents a purine, Y represents a pyrimidine and N represents any deoxynucleotide]. It mediates galactose-induced transcription and expression of genes such as GAL1, GAL2, GAL7, GAL10 and MEL1. Gal4 has a DNA recognition/binding domain at the N-terminus and a transcription activation domain (AD) at the C-terminus. The DNA recognition/binding domain contains a zinc cluster (Zn(2)-Cys(6) binuclear cluster) and must form a homodimer to bind to the response element and perform its function [Kraulis, P.J. et al., Nature, 1992. 356(6368): p. 448-50; Marmorstein, R. et al., Nature, 1992. 356(6368): p. 408-14] [17,18]. The two-hybrid system based on GAL4/UAS is an effective tool for studying gene expression. For example, the mammalian two-hybrid system (CheckMate™ Mammalian Two-Hybrid System) manufactured by Promega uses the GAL4/UAS system.
The TetR repressor protein is a transcription factor present in many gram-negative bacteria that inhibits transcription of the associated genes by binding to specific DNA motifs. The TetR protein has a DNA recognition/binding domain at the N-terminus and a dimerization domain at the C-terminus. Monomeric TetR proteins form homodimers that recognize/bind operators containing specific DNA sequences, whereas monomeric TetR has little binding ability [Wissmann, A. et al., EMBO J 10(13), 4145-4152 (1991); Ramos, J.L. et al., Microbiol Mol Biol Rev 69(2), 326-356 (2005)] [19,20].
In a preferred embodiment of the invention, the first polypeptide is amino acids 1-87 of the LexA408 protein, i.e., its DNA recognition/binding domain (SEQ ID NO: 1), which alone is not capable of binding to the response element (SEQ ID NO: 6). In another preferred embodiment, the first polypeptide is a truncation consisting of amino acids 1-87 of the LexA protein, i.e., its DNA recognition/binding domain (SEQ ID NO: 2), which alone is not capable of binding to the response element (SEQ ID NO: 7). In another preferred embodiment, the first polypeptide is a truncation consisting of amino acids 1-102 of the cI protein, i.e., its DNA recognition/binding domain (SEQ ID NO: 3), which alone is not capable of binding to the response element (SEQ ID NO: 8). In another preferred embodiment, the first polypeptide is a truncation consisting of amino acids 1-65 of the Gal4 protein, i.e., its DNA recognition/binding domain (SEQ ID NO: 4), which alone is not capable of binding to the response element (SEQ ID NO: 9). In another preferred embodiment, the first polypeptide is a truncation consisting of amino acids 1-63 of the TetR protein, i.e., its DNA recognition/binding domain (SEQ ID NO: 5), which alone is not capable of binding to the response element (SEQ ID NO: 10).
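The residue ranges given above are 1-based and inclusive, which is easy to get wrong when a truncation is cut out of a full-length sequence in software. The following Python sketch shows one way to do the conversion; the table and the function name `extract_dbd` are hypothetical helpers, and the placeholder string in the usage line is not a real protein sequence.

```python
# Residue ranges (1-based, inclusive) of the DNA recognition/binding
# domains as stated in the text.
DBD_RANGES = {
    "LexA408": (1, 87),
    "LexA": (1, 87),
    "cI": (1, 102),
    "Gal4": (1, 65),
    "TetR": (1, 63),
}

def extract_dbd(protein_name, full_sequence):
    """Return the DNA-binding-domain truncation of a full-length sequence."""
    start, end = DBD_RANGES[protein_name]
    if len(full_sequence) < end:
        raise ValueError(
            f"{protein_name} sequence has {len(full_sequence)} residues, "
            f"but the domain ends at residue {end}"
        )
    # Convert 1-based inclusive coordinates to a 0-based Python slice.
    return full_sequence[start - 1:end]

# Placeholder only: 202 identical residues stand in for full-length LexA.
placeholder_lexa = "A" * 202
print(len(extract_dbd("LexA", placeholder_lexa)))
```

The length check is included because a silently shortened slice would otherwise yield a truncated domain without any warning.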
In the light-operated gene expression system, the second polypeptide in the recombinant light-sensitive transcription factor is a light-sensitive polypeptide selected from the RsLOV domain of Rhodobacter sphaeroides (amino acid sequence SEQ ID NO: 11) and the DsLOV domain of Dinoroseobacter shibae (amino acid sequence SEQ ID NO: 12). Upon irradiation with light of an appropriate wavelength, the dimerization capacity of the second polypeptide changes, which in turn changes the dimerization capacity of the transcription factor; the dimerized transcription factor binds to the corresponding response element, thereby directly regulating the expression level of the target gene.
The helical extensions at the N- and C-termini of the RsLOV protein of Rhodobacter sphaeroides form an unusual helical bundle at the dimerization interface, somewhat resembling the helical transducer of sensory rhodopsin II. A comparison of the light-state and dark-state crystal structures shows that the blue light-activated conformational change of RsLOV follows a signaling mechanism common to LOV domain proteins, namely the formation of a light-activated flavin-cysteinyl adduct. Adduct formation disrupts hydrogen bonding in the RsLOV active site and promotes structural changes that propagate from the LOV domain core to the N- and C-terminal extensions. Single point mutations in the active site and at the dimerization interface of the RsLOV protein alter the lifetime of the photoproduct and induce structural changes that perturb the oligomeric state. Size exclusion chromatography, multi-angle light scattering, small-angle X-ray scattering and cross-linking studies indicate that RsLOV dimerizes in the dark but dissociates into monomers upon light excitation. With this unique combination of properties, RsLOV offers a new function as a photoswitch for controlling cellular processes [Conrad, K.S. et al., Biochemistry, 2013. 52(2): p. 378-91] [21].
The DsLOV protein from the photoheterotrophic marine alpha-proteobacterium Dinoroseobacter shibae shows an average adduct lifetime of 9.6 s at 20 °C, the fastest recovery reported so far for a LOV domain protein. Mutational analysis revealed a unique role for DsLOV in controlling photopigment formation in the absence of blue light illumination. The crystal structure of DsLOV in the dark state, determined at 1.5 angstrom resolution, reveals a conserved core domain and an extended N-terminal cap. The dimer interface in the crystal structure forms a unique hydrogen bonding network involving the N-terminal residues and the beta-scaffold of the core region. The light-activated DsLOV structure shows increased flexibility of the N-cap region and a significant change in the C-alpha backbone of the beta-strands at the N- and C-terminal ends of the LOV core domain. DsLOV has unique photophysical characteristics and regulatory functions, which make it a promising new model system [Endres, S. et al., BMC Microbiol, 2015. 15: p. 30] [22].
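To give a feel for what a 9.6 s adduct lifetime means in practice, the dark recovery can be approximated as a single-exponential decay. This is an assumption made only for illustration: the text reports the lifetime but does not state the kinetic model, and the function name `fraction_adduct_remaining` is hypothetical.

```python
import math

# Adduct lifetime quoted in the text for DsLOV at 20 degrees C.
ADDUCT_LIFETIME_S = 9.6

def fraction_adduct_remaining(t_seconds, lifetime_s=ADDUCT_LIFETIME_S):
    """Fraction of light-state adduct left after t seconds in the dark.

    Assumes single-exponential decay with the quoted lifetime used as
    the time constant (an illustrative assumption, not from the text).
    """
    if t_seconds < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-t_seconds / lifetime_s)

for t in (0, 9.6, 30, 60):
    print(f"{t:>5} s in the dark: {fraction_adduct_remaining(t):.3f}")
```

Under this assumption, about 37% of the adduct remains after one lifetime, and essentially none after one minute, which is consistent with the description of DsLOV as a very fast-recovering LOV domain.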
When RsLOV or DsLOV, as the second polypeptide, is irradiated with light of an appropriate wavelength, dimerization of the recombinant light-sensitive transcription factor containing it is reduced, so that the transcription factor dissociates from the response element and transcription of the downstream gene is initiated; in the dark, the transcription factor dimerizes and binds to the response element, thereby inhibiting transcription of the downstream gene.
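The light-dependent switching described above can be summarized as an idealized two-state model. The Python sketch below is a simplification under stated assumptions: it treats dimerization and binding as all-or-nothing, whereas the text describes a reduction in dimerization, and the names `SwitchState` and `repressor_switch` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SwitchState:
    dimerized: bool
    bound_to_response_element: bool
    transcription_on: bool

def repressor_switch(light_on: bool) -> SwitchState:
    """Idealized state of an RsLOV/DsLOV-based repressor-type factor."""
    # Dark: the fusion protein dimerizes; light: it dissociates to monomers.
    dimerized = not light_on
    # Only the dimer binds the response element.
    bound = dimerized
    # A bound repressor blocks transcription of the downstream gene.
    return SwitchState(dimerized, bound, transcription_on=not bound)

print(repressor_switch(light_on=False))
print(repressor_switch(light_on=True))
```

In a real culture the response is graded and depends on light intensity, exposure time and protein levels, so this model captures only the direction of the effect.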
In the light-operated gene expression system of the present invention, the recombinant light-sensitive transcription factor may further include a third polypeptide, which can recruit other components of RNA polymerase. Third polypeptides that can be used in the present invention include, but are not limited to, the omega factor domain and the alpha factor domain from E. coli, which have been widely used in previous E. coli one-hybrid systems [Dove, S.L. et al., 1998. 12(5): p. 745-54; Dove, S.L. et al., Nature, 1997. 386(6625): p. 627-30] [23,24]. The third polypeptide may be linked to the first and second polypeptides either directly or via a linker peptide.
As described above, there are many choices for each of the two or three polypeptides contained in the recombinant light-sensitive transcription factor, and many ways of combining them into a fusion protein. Preferably, the present invention uses functional domain fragments of the polypeptides that have good activity to prepare a recombinant light-sensitive transcription factor fusion protein that is as short as possible, and selects a light-sensitive transcription factor that regulates target gene expression well in bacterial cells, i.e., one that causes a large difference in the expression level of the target gene between light and dark. However, any combination is included in the scope of the present invention as long as it yields a recombinant light-sensitive transcription factor capable of achieving the light-regulated gene expression contemplated by the present invention.
In the bacterial light-operated gene expression system, the target transcription unit of the second part consists of a promoter that is specifically recognized/bound by the transcription factor and a nucleotide sequence to be transcribed, or of a promoter, a response element and a nucleotide sequence to be transcribed, or of a response element, a promoter and a nucleotide sequence to be transcribed. The nucleotide sequence of the promoter, promoter-response element or response element-promoter varies depending on the first polypeptide of the recombinant light-sensitive transcription factor selected in different embodiments. In other words, the promoter, promoter-response element or response element-promoter sequence must be selected to correspond to the first polypeptide selected. For example, when the first polypeptide is the DNA recognition/binding domain of the LexA408 protein, the corresponding response element is the SEQ ID NO: 6 motif; when the first polypeptide is the DNA recognition/binding domain of the LexA protein, the corresponding response element is the SEQ ID NO: 7 motif; when the first polypeptide is the DNA recognition/binding domain of the cI protein, the corresponding response element is the SEQ ID NO: 8 motif; when the first polypeptide is the DNA recognition/binding domain of the Gal4 protein, the corresponding response element is the SEQ ID NO: 9 motif; and when the first polypeptide is the DNA recognition/binding domain of the TetR protein, the corresponding response element is the SEQ ID NO: 10 motif.
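Because the response element must always match the chosen first polypeptide, a design can be checked mechanically before cloning. The following Python sketch encodes only the pairings listed in the paragraph above; the function name `is_compatible_design` is a hypothetical helper and not part of the described system.

```python
# Pairings stated in the text: first polypeptide -> SEQ ID NO of the
# corresponding response element.
RESPONSE_ELEMENT_SEQ_ID = {
    "LexA408": 6,
    "LexA": 7,
    "cI": 8,
    "Gal4": 9,
    "TetR": 10,
}

def is_compatible_design(first_polypeptide, response_element_seq_id):
    """Return True if the response element matches the DNA binding domain."""
    if first_polypeptide not in RESPONSE_ELEMENT_SEQ_ID:
        raise KeyError(f"unknown first polypeptide: {first_polypeptide}")
    return RESPONSE_ELEMENT_SEQ_ID[first_polypeptide] == response_element_seq_id

print(is_compatible_design("LexA408", 6))   # matching pair
print(is_compatible_design("LexA408", 7))   # wild-type LexA element
```

The second call reflects the earlier statement that LexA408 and wild-type LexA recognize different motifs, so their response elements are not interchangeable.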
The response element corresponding to the light-sensitive transcription factor in the present invention is generally contained within the nucleotide sequence of the promoter or located downstream of the -10 region of the promoter; the light-sensitive transcription factor binds to the response element and down-regulates transcription of the downstream target gene by preventing RNA polymerase from binding to the promoter region. In particular embodiments of the invention, the promoter is the E. coli colE408 promoter, the E. coli colE promoter, the sulA promoter, the recA promoter, the umuDC promoter, the lambda phage O12 promoter, the phage T7 promoter or the Grac promoter of Bacillus subtilis (nucleic acid sequences SEQ ID NOs: 13, 14, 15, 16, 17, 18, 19 and 20, respectively). Alternatively, the response element may be located upstream of the -35 region of the promoter, in which case transcription of the downstream gene is initiated by the third polypeptide of the light-sensitive transcription factor recruiting other components of RNA polymerase. According to the relevant literature, the number of such response elements upstream of the promoter may be 1 to 5 or more.
As is known to those skilled in the art, "operably linked" means that the response element and the promoter, or multiple response elements, are not directly linked to each other but may be separated by several nucleotides, provided that they still act synergistically.
Downstream of the promoter of the target transcription unit of the present invention is the nucleotide sequence to be transcribed, which refers to the nucleotide sequence encoding the protein of interest. As described above, the protein of interest may be any useful protein now known or discovered in the future. To verify the effectiveness of the system of the invention and to facilitate detection, in the examples of the invention, an exemplary reporter protein was used: the red fluorescent protein mCherry (the amino acid sequence of which is sequence 21) is used as the target protein, but the target protein of the invention is not limited to the reporter proteins.
The first and second parts of the light-operated gene expression system of the invention can be constructed in one bacterial expression vector or separately in two bacterial expression vectors using standard recombinant DNA techniques. Such expression vectors can be introduced into various bacterial cells using standard techniques to express the desired protein of interest.
The invention provides bacterial expression vectors containing genes for recombinant light-sensitive transcription factors built from various combinations of the two or three polypeptides. In a preferred embodiment of the invention, bacterial expression vectors pL408R-L0, pL408R-L21G, pL408R-L15N, pL408R-L2K, pL408R-L26R, pL408R-L23O, pL408R-L21R, pL408R-L9F, pL408R-L17E, and pL408R-L30P are provided, containing genes for recombinant light-sensitive transcription factors LexA408(1-87)-RsLOV that differ in the linker peptide between LexA408(1-87) and RsLOV (the transcription factors are abbreviated L408R-L0, L408R-L21G, L408R-L15N, L408R-L2K, L408R-L26R, L408R-L23O, L408R-L21R, L408R-L9F, L408R-L17E, and L408R-L30P, with amino acid sequences 22, 23, 24, 25, 26, 27, 28, 29, 30, and 31, respectively). In another embodiment, a bacterial expression vector pL408D is provided containing the gene for the recombinant light-sensitive transcription factor LexA408(1-87)-DsLOV2 (abbreviated L408D, amino acid sequence 32). In another embodiment, a bacterial expression vector pLR is provided containing the gene for the recombinant light-sensitive transcription factor LexA(1-87)-RsLOV (abbreviated LR, amino acid sequence 33). In another embodiment, a bacterial expression vector pCR is provided containing the gene for the recombinant light-sensitive transcription factor cI(1-102)-RsLOV (abbreviated CR, amino acid sequence 34). In another embodiment, a bacterial expression vector pGR is provided containing the gene for the recombinant light-sensitive transcription factor Gal4(1-65)-RsLOV (abbreviated GR, amino acid sequence 35). In another embodiment, a bacterial expression vector pTR is provided containing the gene for the recombinant light-sensitive transcription factor TetR(1-63)-RsLOV (abbreviated TR, amino acid sequence 36).
In another embodiment, a bacterial expression vector pAL408Rα is provided containing the gene for the recombinant light-sensitive transcription factor LexA408(1-87)-RsLOV-α (abbreviated L408Rα, amino acid sequence 37). In another embodiment, a bacterial expression vector pAωL408R is provided containing the gene for the recombinant light-sensitive transcription factor ω-LexA408(1-87)-RsLOV (abbreviated ωL408R, amino acid sequence 38). In another embodiment, a bacterial expression vector pALRα0 is provided containing the gene for the recombinant light-sensitive transcription factor LexA(1-87)-RsLOV-α (abbreviated LRα, amino acid sequence 39). In another embodiment, a bacterial expression vector pAωLR is provided containing the gene for the recombinant light-sensitive transcription factor ω-LexA(1-87)-RsLOV (abbreviated ωLR, amino acid sequence 40). In another embodiment, a bacterial expression vector pALVα is provided containing the gene for the recombinant light-sensitive transcription factor cI(1-102)-RsLOV-α1 (abbreviated CRα, amino acid sequence 41). In another embodiment, a bacterial expression vector pAωCR is provided containing the gene for the recombinant light-sensitive transcription factor ω-cI(1-102)-RsLOV (abbreviated ωCR, amino acid sequence 42). In another embodiment, a bacterial expression vector pAGRα is provided containing the gene for the recombinant light-sensitive transcription factor Gal4(1-65)-RsLOV-α (abbreviated GRα, amino acid sequence 43). In another embodiment, a bacterial expression vector pAωGR is provided containing the gene for the recombinant light-sensitive transcription factor ω-Gal4(1-65)-RsLOV (abbreviated ωGR, amino acid sequence 44).
In another embodiment, a bacterial expression vector pTRα is provided containing the gene for the recombinant light-sensitive transcription factor TetR(1-63)-RsLOV-α (abbreviated TRα, amino acid sequence 45). In another embodiment, a bacterial expression vector pAωTR is provided containing the gene for the recombinant light-sensitive transcription factor ω-TetR(1-63)-RsLOV (abbreviated ωTR, amino acid sequence 46). In another embodiment, a bacterial expression vector is provided containing the target transcription unit colE408-mCherry of the recombinant light-sensitive transcription factor L408R (the Ter-colE408-mCherry-Ter nucleotide sequence is sequence 47). In another embodiment, bacterial expression vectors are provided containing a target transcription unit of the recombinant light-sensitive transcription factor LR, namely colE-mCherry, sulA-mCherry, recA-mCherry, or umuDC-mCherry (the Ter-colE-mCherry-Ter nucleotide sequence is sequence 48, the Ter-sulA-mCherry-Ter nucleotide sequence is sequence 49, the Ter-recA-mCherry-Ter nucleotide sequence is sequence 50, and the Ter-umuDC-mCherry-Ter nucleotide sequence is sequence 51). In another embodiment, a bacterial expression vector is provided containing the target transcription unit PλO12-mCherry of the recombinant light-sensitive transcription factor CR (the Ter-PλO12-mCherry-Ter nucleotide sequence is sequence 52). In another embodiment, a bacterial expression vector is provided containing the target transcription unit T7-galop-mCherry of the recombinant light-sensitive transcription factor GR (the Ter-T7-galop-mCherry-Ter nucleotide sequence is sequence 53). In another embodiment, a bacterial expression vector is provided containing the target transcription unit T7-tetop-mCherry of the recombinant light-sensitive transcription factor TR (the Ter-T7-tetop-mCherry-Ter nucleotide sequence is sequence 54).
The present invention also provides bacterial expression vectors containing target transcription units in which the nucleotide sequence to be transcribed is left vacant. The user can select the desired nucleotide sequence to be transcribed, for example the gene encoding the protein of interest, and insert it into such an expression vector using standard recombinant DNA techniques; expression of the inserted nucleotide sequence (gene) is then regulated by the recombinant light-sensitive transcription factor as described above. In some embodiments of the invention, the target transcription units with a vacant nucleotide sequence to be transcribed are: colE408-nucleotide sequence to be transcribed, corresponding to LexA408; colE-, sulA-, recA-, and umuDC-nucleotide sequence to be transcribed, and LexA response element-lac minimal promoter-nucleotide sequence to be transcribed, corresponding to LexA; PλO12-nucleotide sequence to be transcribed and cI response element O12-lac minimal promoter-nucleotide sequence to be transcribed, corresponding to cI; T7-Gal4 response element-nucleotide sequence to be transcribed and Gal4 response element-lac minimal promoter-nucleotide sequence to be transcribed, corresponding to Gal4; and T7-TetR response element-nucleotide sequence to be transcribed and TetR response element-lac minimal promoter-nucleotide sequence to be transcribed, corresponding to TetR.
The invention also provides bacterial strains that have been transformed with bacterial expression vectors containing the various recombinant light-sensitive transcription factor genes, or whose genomes carry integrated expression cassettes for those transcription factors, together with bacterial expression vectors that contain a promoter, promoter-response element, or response element-promoter and a vacant nucleotide sequence to be transcribed. Using standard recombinant DNA techniques, the user can insert a nucleotide sequence of their choice (the gene for the protein of interest) into such an expression vector, transform the resulting vector into one of these strains, and culture the cells either to express the desired gene or to study the regulation of its expression.
The invention also provides kits comprising the various expression vectors, or bacterial cells that have been transformed with such vectors or that carry expression cassettes for the recombinant light-sensitive transcription factors integrated into their genomes. In one embodiment, the containers of the kit each hold a bacterial expression vector containing one or more recombinant light-sensitive transcription factor genes. In another embodiment, some containers of the kit each hold a bacterial expression vector containing one or more recombinant light-sensitive transcription factor genes, and other containers each hold a bacterial expression vector containing a target transcription unit (promoter, promoter-response element, or response element-promoter) in which the nucleotide sequence to be transcribed is vacant. In yet another embodiment, some containers of the kit hold bacterial cells that have been transformed with bacterial expression vectors containing recombinant light-sensitive transcription factor genes or that carry the corresponding expression cassettes integrated into their genomes, and other containers hold bacterial expression vectors containing a target transcription unit (promoter, promoter-response element, or response element-promoter) in which the nucleotide sequence to be transcribed is vacant. The kit of the invention may also include corresponding illumination control devices, such as LED lamps and their controllers. All kits are supplied with instructions describing the individual components, their intended use, and the method of use, together with relevant literature references.
The invention also includes a method of regulating gene expression in a bacterial cell by a light-operated gene expression system, comprising the steps of:
a) constructing the bacterial light-operated gene expression system in a bacterial expression vector;
b) introducing the vector into a bacterial cell containing the gene to be regulated; and
c) inducing the bacterial cell by illumination so that the regulated nucleotide sequence is expressed in the cell.
Inducing the bacterial cells by illumination involves both the choice of a light source and the way it is used. Light sources include, without limitation, LED lamps, incandescent lamps, fluorescent lamps, and lasers. In one embodiment of the present invention, the light source is a blue LED (460-470 nm). The illumination method, including the illumination dose, intensity, duration, and frequency, as well as spatial control of target gene expression by scanning, projection, light masks, and the like, is also within the scope of the present invention. In one embodiment of the present invention, the illumination intensity ranges from 0 to 30 mW/cm2; in another embodiment, a printed transparency is used as a light mask to spatially regulate the expression level of the gene of interest in cells at different positions; in another embodiment, a neutral gray-scale film is used as a light mask to spatially regulate the expression level of the gene of interest in cells at different positions.
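Illumination intensity, duration, and frequency together determine the light dose received per unit area. The following Python sketch shows that arithmetic under stated assumptions: the function name and the duty-cycle parameter are hypothetical and are not taken from the patent, and the light is treated as uniform over the sample.

```python
def light_dose_mj_per_cm2(intensity_mw_per_cm2: float,
                          duration_s: float,
                          duty_cycle: float = 1.0) -> float:
    """Energy per unit area delivered by illumination.

    mW/cm^2 multiplied by seconds gives mJ/cm^2. duty_cycle is the fraction
    of the time the light is on (1.0 means continuous illumination); it is an
    illustrative way to represent pulsed illumination.
    """
    if intensity_mw_per_cm2 < 0 or duration_s < 0:
        raise ValueError("intensity and duration must be non-negative")
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return intensity_mw_per_cm2 * duration_s * duty_cycle


# Example values chosen for illustration: 3 mW/cm^2 applied continuously for 15 h.
dose = light_dose_mj_per_cm2(3.0, 15 * 3600)
```

Expressing pulsed light as a duty cycle keeps intensity and total exposure time as separate inputs, which makes it easier to compare continuous and intermittent illumination at the same dose.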
Brief description of the drawings
FIG. 1 plasmid map of light-controlled gene expression system.
FIG. 2 construction of bacterial expression vectors for light-sensitive transcription factors with LexA, cI, Gal4 and TetR as first polypeptides
FIG. 3 construction of bacterial expression vectors containing a light-sensitive transcription factor with DsLOV as a second polypeptide
FIG. 4 construction of bacterial expression vectors with omega and alpha factors as third polypeptides of light-sensitive transcription factors
FIG. 5 construction of bacterial expression vectors containing target transcription units of the corresponding response elements LexA, cI, Gal4 and TetR, respectively
FIG. 6 Properties of different linkers in light sensitive transcription factor
FIG. 7 Properties of different SD sequences upstream of the light-sensitive transcription factor
FIG. 8 Western blot analysis of light-sensitive transcription factors in light-control systems containing SD2, 3, 7, 17 and 37
FIG. 9 Light-activation kinetics of light-control systems containing SD2, 17 and 37
FIG. 10 Light-dependence curves of light-control systems containing SD2, 17 and 37
FIG. 11 Schematic diagram of spatial regulation of target gene expression by the light-control system
FIG. 12 Applicability of the light-control system in different strains
FIG. 13 Light-controlled lysis of E. coli
FIG. 14 Light-controlled motility of E. coli
FIG. 15 construction of Gene circuits Using light-controlled Gene expression System
Detailed Description
The invention is further illustrated by the following examples. These examples are given solely for the purpose of illustration and are not intended to limit the scope of the invention in any way. The examples mainly use conventional molecular cloning methods of genetic engineering, which are well known to those skilled in the art; see, for example, the relevant sections of Jane Roskams et al., "Handbook of Molecular Biology Laboratory References", and of J. Sambrook and D. W. Russell, translated by Huang Peitang et al., "Molecular Cloning: A Laboratory Manual" (third edition, August 2002, Science Press, Beijing). Those of ordinary skill in the art will readily appreciate that modifications and variations may be made to the invention as described in the following examples, and that such modifications and variations fall within the scope of the claims of the present application.
The pCDFDuet1 plasmid vector used in the examples was purchased from Novagen, and the pRSETb and pBAD/His A plasmid vectors were purchased from Invitrogen; pKD3, pKD4, pCP20, and pKD46 were kindly provided by the Bogen laboratory of East China University of Science and Technology. All primers used for PCR were synthesized, purified, and verified by mass spectrometry by Jerry Bioengineering Technology Co., Ltd. The expression plasmids constructed in the examples were verified by sequencing, which was performed by Huada Gene (BGI) and Jelie Sequencing. The Taq DNA polymerase used in the examples was purchased from Dongbang Bio, pfu DNA polymerase from Tiangen Biotech (Beijing) Co., Ltd., and PrimeSTAR DNA polymerase from TaKaRa; all three polymerases were supplied with the corresponding polymerase buffer and dNTPs. Restriction enzymes such as BamHI, BglII, HindIII, NdeI, XhoI, SacI, EcoRI, and SpeI, as well as T4 ligase and T4 polynucleotide kinase (T4 PNK), were purchased from Fermentas and supplied with the corresponding buffers. The CloneEZ PCR cloning kit (containing the homologous recombinase) used in the examples was purchased from Nanjing GenScript Biotech Co., Ltd. Unless otherwise stated, inorganic salts and other chemical reagents were purchased from Sinopharm Chemical Reagent Co., Ltd. (Shanghai). Kanamycin, ampicillin (Amp), streptomycin, and ONPG were purchased from Amresco; 384-well white luminescence detection plates and 384-well fluorescence detection plates were purchased from Greiner.
The DNA purification kit used in the examples was purchased from BBI (Canada), and the plasmid miniprep kit was purchased from Tiangen Biotech (Beijing) Co., Ltd. The cloning strain Mach1 was purchased from Invitrogen. The JM109(DE3) expression strain was purchased from Promega, and the BL21(DE3) strain was purchased from Novagen.
The main instruments used in the examples: biotek Synergy 2 multifunctional microplate reader (Bio-Tek, USA), X-15R high-speed refrigerated centrifuge (Beckman, USA), Microfuge22R desk-top high-speed refrigerated centrifuge (Beckman, USA), PCR amplification instrument (Biometra, Germany), living body imaging system (Kodak, USA), photometer (Japan and light company), nucleic acid electrophoresis instrument (Shenneng Bo corporation).
The abbreviations have the following meanings: "h" refers to hours, "min" refers to minutes, "s" refers to seconds, "d" refers to days, "μL" refers to microliters, "mL" refers to milliliters, "L" refers to liters, "bp" refers to base pairs, "mM" refers to millimoles per liter, and "μM" refers to micromoles per liter.
The 20 amino acids and their abbreviations
Figure BDA0002374979160000201
Figure BDA0002374979160000211
General molecular biological methods used in the examples
(I) Polymerase chain reaction (PCR):
1. Target fragment amplification PCR:
Figure BDA0002374979160000212
amplification step (bp represents the number of nucleotides of the amplified fragment):
Figure BDA0002374979160000213
2. long fragment (>2500bp) amplification PCR:
Figure BDA0002374979160000214
Figure BDA0002374979160000221
amplification step (bp represents the number of nucleotides of the amplified fragment):
Figure BDA0002374979160000222
or
Figure BDA0002374979160000223
(II) Restriction endonuclease digestion:
1. System for double digestion of the plasmid vector (n is the volume, in μL, of sterilized ultrapure water needed to bring the system to its total volume):
Figure BDA0002374979160000224
2. a system for double enzyme digestion of PCR product fragments (n is as defined above):
Figure BDA0002374979160000225
Figure BDA0002374979160000231
3. System for ligating the double-digested PCR product fragment into the double-digested plasmid vector to form a circular plasmid:
Figure BDA0002374979160000232
Note: the mass ratio of the PCR product fragment to the double-digested vector is approximately between 2:1 and 6:1.
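The mass-ratio guideline in the note can be turned into a quick calculation when setting up a ligation. The Python sketch below is only an illustration: the function names are hypothetical, the fragment lengths in the example are made-up values, and the molar-ratio helper relies on the general fact that, for double-stranded DNA, the number of molecules is proportional to mass divided by length.

```python
from typing import Tuple


def insert_mass_range_ng(vector_ng: float,
                         low_ratio: float = 2.0,
                         high_ratio: float = 6.0) -> Tuple[float, float]:
    """Insert mass range (ng) for an insert:vector mass ratio of 2:1 to 6:1."""
    if vector_ng <= 0:
        raise ValueError("vector mass must be positive")
    return vector_ng * low_ratio, vector_ng * high_ratio


def molar_ratio(insert_ng: float, insert_bp: int,
                vector_ng: float, vector_bp: int) -> float:
    """Insert:vector molar ratio implied by the masses and lengths used."""
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("all masses and lengths must be positive")
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Hypothetical example: 50 ng of a 5000 bp vector and a 1000 bp insert.
low_ng, high_ng = insert_mass_range_ng(50.0)
ratio_at_low_end = molar_ratio(low_ng, 1000, 50.0, 5000)
```

Because the guideline is stated by mass, the corresponding molar excess of insert grows as the insert gets shorter relative to the vector, which the second helper makes explicit.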
(III) Phosphorylation of the 5' end of the DNA fragment followed by self-circularization:
The ends of plasmid or genomic DNA extracted from microorganisms carry phosphate groups, whereas PCR products do not. A phosphate group must therefore be added to the 5' terminal base of the PCR product, because only DNA molecules with terminal phosphate groups can be ligated. Self-circularization refers to ligation of the 3' end of the linearized vector to its own 5' end.
Figure BDA0002374979160000233
T4 PNK is short for T4 polynucleotide kinase, which adds a phosphate group to the 5' end of DNA molecules. Reaction system for self-circularization of the 5'-phosphorylated DNA fragment:
Figure BDA0002374979160000234
(IV) overlapping PCR
Overlapping PCR is a commonly used method for joining two different or identical genes. For example, as shown in FIG. 1, to join gene AD and gene BC, two pairs of primers, A and D and C and B, are first designed to amplify AD and BC, respectively, with the 5' ends of primers D and C carrying complementary sequences of a certain length. The amplified products AD and BC from the first round of PCR are recovered and used as templates for the second round of PCR.
The second round is run for 10 cycles according to the conventional PCR procedure, with the following PCR system:
Figure BDA0002374979160000241
After the second round of PCR, primer A and primer B are added and amplification is continued for 30 cycles to obtain the sequence in which AD and BC are joined.
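The outcome of overlapping PCR, two fragments fused through the shared sequence introduced by primers D and C, can be checked in silico before ordering primers. The Python sketch below models only the resulting product sequence, not the reaction itself. The function name, the toy sequences, and the default minimum overlap of 15 nucleotides are assumptions made for illustration, and both fragments are written as top-strand sequences in the same orientation.

```python
def join_by_overlap(upstream: str, downstream: str, min_overlap: int = 15) -> str:
    """Fuse two top-strand sequences that share an end overlap.

    The 3' end of `upstream` must equal the 5' end of `downstream` over at
    least `min_overlap` nucleotides; the longest such overlap is used and is
    kept only once in the fused product.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    up, down = upstream.upper(), downstream.upper()
    for k in range(min(len(up), len(down)), min_overlap - 1, -1):
        if up[-k:] == down[:k]:
            return up + down[k:]
    raise ValueError("fragments do not share an overlap of the required length")


# Toy fragments (hypothetical sequences) sharing an 18-nt overlap.
overlap = "GGATCCGGTACCGAATTC"
fragment_ad = "ATGGCTAGCTA" + overlap
fragment_bc = overlap + "TTTAAACCCGGG"
fused = join_by_overlap(fragment_ad, fragment_bc)
```

Searching from the longest candidate overlap downward means the function returns the fusion that a full-length annealing of the two complementary ends would give.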
(V) inverse PCR
Inverse PCR is a technique used in the examples below for site-directed mutagenesis, truncation mutagenesis, and insertional mutagenesis. Its basic principle follows the experimental procedure of the TaKaRa MutanBEST kit. As shown in FIG. 2, inverse PCR primers are designed at the site to be mutated, with the 5' end of one primer containing the mutated nucleotide sequence. The amplified product is recovered and purified from a gel, subjected to 5' end phosphorylation followed by self-circularization, and transformed into competent cells.
(VI) preparation and transformation of competent cells
Preparation of competent cells:
1. A single colony (e.g., Mach1) was picked, inoculated into 5 mL LB medium, and shaken overnight at 37 °C.
2. 0.5-1 mL of the overnight culture was transferred into 50 mL LB medium and cultured at 37 °C and 220 rpm for 3-5 h until the OD600 reached 0.5.
3. The cells were pre-cooled in an ice bath for 2 h.
4. The cells were centrifuged at 4000 rpm at 4 °C for 10 min.
5. The supernatant was discarded, the cells were suspended in 5 mL of pre-cooled resuspension buffer, and after mixing, resuspension buffer was added to a final volume of 50 mL.
6. The suspension was kept on ice for 45 min.
7. The cells were centrifuged at 4000 rpm at 4 °C for 10 min and resuspended in 5 mL of ice-cold storage buffer.
8. 100 μL of the cell suspension was dispensed into each EP tube and frozen at -80 °C or in liquid nitrogen.
Resuspension buffer: CaCl2 (100 mM), MgCl2 (70 mM), NaAc (40 mM)
Storage buffer: 0.5 mL DMSO, 1.9 mL 80% glycerol, 1 mL 10× CaCl2 (1 M), 1 mL 10× MgCl2 (700 mM), 1 mL 10× NaAc (400 mM), 4.6 mL ddH2O
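The final concentrations in the storage buffer follow directly from the listed volumes and stock concentrations. The short Python sketch below works through that dilution arithmetic; the variable and function names are hypothetical, DMSO is treated as a 100% stock, and volumes are assumed to be additive.

```python
# Component -> (stock concentration, unit, volume in mL), taken from the recipe.
STORAGE_BUFFER = {
    "DMSO": (100.0, "% v/v", 0.5),
    "glycerol": (80.0, "% v/v", 1.9),
    "CaCl2": (1000.0, "mM", 1.0),
    "MgCl2": (700.0, "mM", 1.0),
    "NaAc": (400.0, "mM", 1.0),
}
WATER_ML = 4.6


def final_concentrations(recipe: dict, water_ml: float) -> dict:
    """Return {component: (final concentration, unit)} after mixing."""
    total_ml = water_ml + sum(volume for _, _, volume in recipe.values())
    return {
        name: (stock * volume / total_ml, unit)
        for name, (stock, unit, volume) in recipe.items()
    }


finals = final_concentrations(STORAGE_BUFFER, WATER_ML)
for name, (conc, unit) in finals.items():
    print(f"{name}: {conc:.1f} {unit}")
```

With a total volume of about 10 mL, the three salts come out at the same concentrations as in the resuspension buffer (100 mM CaCl2, 70 mM MgCl2, 40 mM NaAc), with DMSO and glycerol added as cryoprotectants.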
Transformation:
1. 100 μl of competent cells were thawed in an ice bath.
2. An appropriate volume of ligation product was added, mixed gently by pipetting, and kept on ice for 30 min. The volume of ligation product added is typically less than 1/10 of the volume of the competent cells.
3. The cells were heat-shocked in a 42 °C water bath for 90 s and quickly transferred to an ice bath for 5 min.
4. 500 μl of LB was added and the cells were incubated for 1 h at 37 °C on a constant-temperature shaker at 200 rpm.
5. The culture was centrifuged at 4000 rpm for 3 min, all but 200 μl of the supernatant was removed, and the cells were resuspended by pipetting, spread evenly on an agar plate containing the appropriate antibiotic, and incubated inverted overnight in a 37 °C incubator.
(VII) detection of mCherry fluorescent protein expressed by escherichia coli
Single colonies were picked from the transformed E. coli plates and inoculated into 48-well plates containing 1000 μl LB per well, with 6 replicates per sample, and cultured overnight at 37 °C. The overnight cultures were diluted 1:200 into another 48-well plate containing 1000 μl of fresh LB per well. Unless otherwise specified, cultures were grown at 37 °C with shaking at 250 rpm under a light intensity of 3 mW/cm2. After 15 h of culture, the cells were collected by centrifugation at 4000 rpm for 20 min. The supernatant was discarded, 1000 μl of PBS was added to each well, and the cells were fully resuspended on a shaker. Then 50 μl from each well was transferred to a 96-well microplate, mixed thoroughly with 50 μl of PBS, and the OD600 was read on a Bio-Tek multifunctional microplate reader. Based on these readings, the OD600 of every well was adjusted to a uniform value of 0.5. After adjustment, 100 μl from each well was transferred to a 96-well fluorescence plate, and fluorescence was read at Ex590/20 and Em645/40 on the Bio-Tek multifunctional microplate reader. Dark samples were cultured wrapped in aluminum foil; all other treatment was identical.
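Adjusting each well to an OD600 of 0.5 amounts to a simple dilution calculation. The Python sketch below shows one way it could be done; it is not the patent's own procedure. It assumes that OD600 scales linearly with cell density, that the reading was taken on the 1:1 dilution in PBS described above, and that blank subtraction has already been applied. The function names and the 200 μl final volume are hypothetical.

```python
from typing import Tuple


def undiluted_od(reading: float, dilution_factor: float = 2.0) -> float:
    """OD600 of the resuspended culture, from a reading of its dilution.

    50 ul of sample mixed with 50 ul of PBS is a 2-fold dilution.
    """
    return reading * dilution_factor


def normalization_volumes(od: float,
                          target_od: float = 0.5,
                          final_volume_ul: float = 200.0) -> Tuple[float, float]:
    """Volumes (cell suspension, PBS) in ul that give target_od in final_volume_ul."""
    if od < target_od:
        raise ValueError("culture is already below the target OD600")
    cells_ul = target_od * final_volume_ul / od
    return cells_ul, final_volume_ul - cells_ul


# Hypothetical reading of 0.6 on the diluted sample.
od = undiluted_od(0.6)
cells_ul, pbs_ul = normalization_volumes(od)
```

Normalizing cell density before the fluorescence reading lets the mCherry signal be compared across wells without a separate per-OD correction.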
(VIII) E.coli genome knockout
1. The pKD46 plasmid was transformed into the target strain to be knocked out, and the cells were cultured overnight at 30 °C on a plate containing ampicillin.
2. Single colonies were picked from the plate and cultured overnight in 5 ml of fresh LB medium containing ampicillin.
3. The overnight culture was diluted 1:100 into 50 ml of 2×YT medium and grown at 30 °C to an OD600 of about 0.2-0.3; L-arabinose was then added to a final concentration of 30 mM, and induction was continued at 30 °C for 90 min.
4. The induced culture was kept on ice for 1 h and then centrifuged at 4000 rpm for 10 min at 4 °C. The supernatant was discarded, and the cells were resuspended in 20 ml of pre-cooled ddH2O and centrifuged again at 4000 rpm for 10 min at 4 °C; this wash was repeated 4 times. After the final wash, the supernatant was discarded, the cells were resuspended in 1.5 ml of ddH2O, and 80-100 μl was dispensed into each 1.5 ml EP tube.
5. 10 μl of the linearized gene fragment used for the knockout was added to the competent cells, mixed, quickly transferred to an electroporation cuvette, and electroporated. Immediately afterwards, 500 μl of fresh LB medium was added and the cells were allowed to recover at 37 °C for 1-2 h.
6. The cells were centrifuged at 4000 rpm, spread on plates containing the corresponding antibiotic, and cultured overnight at 37 °C; clones were identified the next day.
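For the induction in step 3, the amount of L-arabinose needed for a 30 mM final concentration in 50 ml can be worked out as below. This Python sketch is an illustration only: the function name is hypothetical, the inducer is assumed to be added as a solid (or from a stock whose volume is negligible), and the molar mass used for L-arabinose (C5H10O5) is 150.13 g/mol.

```python
L_ARABINOSE_G_PER_MOL = 150.13  # molar mass of C5H10O5


def solute_mass_g(final_mm: float, volume_ml: float, molar_mass: float) -> float:
    """Mass of solute (g) giving final_mm (mmol/L) in volume_ml of culture."""
    if final_mm < 0 or volume_ml < 0 or molar_mass <= 0:
        raise ValueError("inputs must be non-negative and molar mass positive")
    moles = (final_mm / 1000.0) * (volume_ml / 1000.0)
    return moles * molar_mass


# 30 mM in a 50 ml culture, as in step 3.
arabinose_g = solute_mass_g(30.0, 50.0, L_ARABINOSE_G_PER_MOL)
print(f"{arabinose_g:.3f} g L-arabinose")
```

In practice the inducer is often added from a concentrated sterile stock, in which case the same function gives the amount of solute the added stock volume must contain.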
Example 1 construction of plasmid for light-controlled Gene expression System
The LexA408(1-87) gene was amplified from plasmid pLEVI(408)-mCherry (laboratory stock; see Chen, X., et al., Cell Research, 2016) and fused with the RsLOV gene (synthesized by Wuxi Qinglan Biotech Co., Ltd.) to construct a light-sensitive transcription factor. The promoter fragment corresponding to the LexA408 response element was obtained by overlapping PCR amplification, and the reporter gene mCherry was amplified from plasmid pLEVI(408)-mCherry (laboratory stock; see Chen, X., et al., Cell Research, 2016). The transcription factor and the reporter gene were each fused with their respective promoter and terminator sequences by overlapping PCR, and the two parts were constructed on the same vector; the plasmid map is shown in FIG. 1.
Example 2 construction of bacterial expression vectors for light-sensitive transcription factors with LexA, cI, Gal4 and TetR as first polypeptides
To construct E. coli expression vectors for light-sensitive transcription factors with LexA, cI, Gal4, and TetR as the first polypeptide, LexA was amplified from plasmid pLEVI-mCherry (laboratory stock; see Chen, X., et al., Cell Research, 2016), the cI(1-102) gene fragment was PCR-amplified from the lambda phage genome, the Gal4(1-65) DNA-binding domain gene fragment was amplified using the pBIND plasmid (Promega) as a template, and the DNA-binding domain gene fragment of TetR (amino acids 1-63) was obtained by whole-gene synthesis from Shanghai Czeri Biotechnology. LexA, cI, Gal4, and TetR were each fused with the RsLOV gene (synthesized by Wuxi Qinglan Biotech Co., Ltd.) to construct the respective light-sensitive transcription factors; the plasmid maps are shown in FIG. 2.
EXAMPLE 3 construction of bacterial expression vectors containing a light-sensitive transcription factor with DsLOV as the second polypeptide
The DsLOV gene was synthesized by Wuxi Qinglan Biotech Co., Ltd. and then fused by overlapping PCR with the LexA408, LexA, cI, Gal4, and TetR genes from Examples 1 and 2 to obtain bacterial expression vectors containing light-sensitive transcription factors with DsLOV as the second polypeptide; the plasmid maps are shown in FIG. 3.
Example 4 Construction of bacterial expression vectors with omega factor and alpha factor as third polypeptides of the light-sensitive transcription factor
The omega factor and alpha factor gene fragments were amplified from the E. coli BL21(DE3) genome and fused through a linker, by overlapping PCR, to the light-sensitive transcription factors from Examples 1 and 2 that have LexA408, LexA, cI, Gal4, or TetR as the first polypeptide, giving bacterial expression vectors in which the omega factor or alpha factor is the third polypeptide of the light-sensitive transcription factor; the plasmid maps are shown in FIG. 4.
EXAMPLE 5 construction of bacterial expression vectors containing target transcription units for the corresponding response elements LexA, cI, Gal4 and TetR, respectively
The sulA, recA, and umuDC promoter fragments were amplified from the JM109(DE3) genome and joined to the mCherry gene fragment by overlapping PCR to obtain three target transcription units containing the response element corresponding to LexA. The PλO12 promoter fragment was obtained by overlapping PCR and fused with the mCherry gene to obtain the target transcription unit containing the response element corresponding to cI. The T7 promoter-lac operator nucleic acid fragment was amplified from the pCDFDuet1 vector and used as a template for mutagenesis to obtain target transcription units containing the response elements corresponding to Gal4 and TetR, respectively. These target transcription units were constructed into plasmid expression vectors containing the corresponding light-sensitive transcription factors; the plasmid maps are shown in FIG. 5.
EXAMPLE 6 screening of transcription factors with different linker
To improve the properties of the light-sensitive transcription factor, we screened linkers between the repressor LexA408 and the light-sensitive protein RsLOV. Random sequences 0-8 amino acids in length were introduced between the two proteins through primer design, the constructed plasmids were transformed into the TOP10 strain, and the plated cells were cultured under blue light. The colonies on the plates were washed off with LB medium, collected in a conical flask, and cultured under blue light; after 15 h, cells with strong mCherry fluorescence were sorted by flow cytometry. These cells were then cultured in the dark for 15 h and sorted again by flow cytometry, this time collecting cells with weak fluorescence. This screening cycle was repeated three times. The sorted cells were finally plated, and single colonies were picked, inoculated into 48-well plates, and cultured in the dark and under illumination, after which the fluorescence of the dark and illuminated samples was measured with a microplate reader. Samples with a high ratio of fluorescence under illumination to fluorescence in the dark were selected, and their plasmids were extracted and retransformed for a repeat of the initial test to obtain samples with stable expression and a high ratio. These samples were sequenced to exclude duplicate linker amino acid sequences. In the end, 10 different linkers were obtained with a light/dark fluorescence ratio of more than 50-fold, as shown in FIG. 6. Of the 10 linkers, 26R had the best properties, with low background leakage and the highest fold induction.
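The selection criterion used in the plate-reader step, the ratio of fluorescence under illumination to fluorescence in the dark, is simple to compute for a set of clones. The Python sketch below illustrates it with made-up readings; the clone names, the values, and the function names are hypothetical, and the 50-fold cutoff is taken from the result reported above rather than from a stated screening threshold.

```python
from typing import Dict, List, Tuple


def fold_induction(light: float, dark: float) -> float:
    """Ratio of fluorescence under illumination to fluorescence in the dark."""
    if dark <= 0:
        raise ValueError("dark fluorescence must be positive")
    return light / dark


def select_clones(readings: Dict[str, Tuple[float, float]],
                  min_fold: float = 50.0) -> List[str]:
    """Names of clones whose light/dark ratio exceeds min_fold, best first."""
    folds = {name: fold_induction(light, dark)
             for name, (light, dark) in readings.items()}
    passing = [name for name, fold in folds.items() if fold > min_fold]
    return sorted(passing, key=lambda name: folds[name], reverse=True)


# Hypothetical (light, dark) fluorescence readings in arbitrary units.
example_readings = {
    "clone_a": (12000.0, 100.0),
    "clone_b": (9000.0, 300.0),
    "clone_c": (15000.0, 250.0),
}
selected = select_clones(example_readings)
```

Ranking by fold induction alone would not distinguish a clone with low background from one with a very high maximum, so background and maximum values are worth inspecting alongside the ratio.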
Example 7 Fine tuning of the control dynamics of light control systems
On the basis of the 26R linker, the expression level of the photosensitive transcription factor was adjusted by changing the SD sequence upstream of its gene, giving a series of samples with different background and maximum activation levels, as shown in FIGS. 7A and 7B. The corresponding SD sequences are shown in table 7C. Light control systems containing SD2, SD3, SD7, SD17 and SD37 were chosen for measuring the level of the light-sensitive transcription factor. The results indicate that the expression level of the light-sensitive transcription factor differs significantly among these systems. Light control systems with a higher background showed lower expression of the light-sensitive transcription factor, while those with a lower background showed higher expression, as shown in FIG. 8. It is presumed that high expression of the light-sensitive transcription factor leads to light-independent formation of homodimers, so that promoter activity is suppressed even under light. Conversely, at low expression the light-sensitive transcription factor cannot form enough homodimers to repress the promoter, producing a high background under dark conditions. Thus, the regulatory performance of the light control system can be tuned by modulating the expression level of the transcription factor.
Example 8 Properties of the light control system
First, the light activation kinetics of the system were studied. The primary culture was grown overnight in the dark, and the secondary culture was inoculated at an initial OD600 of 0.001 and grown in the dark to an OD600 of 0.1, then incubated under blue light illumination. To measure mCherry protein, equal samples were taken at the indicated time points, and 3.3 mg/mL chloramphenicol and 0.4 mg/mL tetracycline were added to stop cell growth. The cells were incubated in an ice-water bath for 10 minutes to rapidly stop gene expression and then in a 37 ℃ water bath for 1 hour, after which the expression level of the reporter gene mCherry was measured with a CytoFLEX S flow cytometer (Beckman Coulter). The light activation kinetics of the systems containing SD2, SD17 and SD37 are shown in FIG. 9; the t1/2 values (time required to reach 50% of maximum expression) were 156, 112 and 84 minutes, respectively.
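As an illustration of how a t1/2 value of this kind might be read off a time course, the sketch below finds the first time at which the signal reaches 50% of its maximum by linear interpolation between sampling points. The function name and the time-course values are hypothetical, no baseline is subtracted, and this is not necessarily the procedure used to obtain the values reported above.

```python
from typing import Sequence


def half_time(times: Sequence[float], signal: Sequence[float]) -> float:
    """Time at which the signal first reaches 50% of its maximum value.

    Linear interpolation is used between the two bracketing samples.
    """
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    half = 0.5 * max(signal)
    if signal[0] >= half:
        return float(times[0])
    for i in range(1, len(times)):
        if signal[i] >= half:
            t0, t1 = times[i - 1], times[i]
            s0, s1 = signal[i - 1], signal[i]
            # s1 > s0 is guaranteed here because s0 < half <= s1.
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("signal never reaches half of its maximum")


# Hypothetical time course: minutes after illumination vs. fluorescence (a.u.).
minutes = [0, 60, 120, 180, 240]
fluorescence = [0.0, 10.0, 40.0, 80.0, 100.0]
print(half_time(minutes, fluorescence))
```

With denser sampling around the half-maximal point the interpolation error shrinks; fitting a kinetic model to the whole curve would be an alternative design choice.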
Next, the light intensity dependence of the system was examined. The primary culture was grown overnight in the dark, and the secondary culture was grown for 15 h under blue light of different intensities before the expression level of mCherry was measured. As shown in FIG. 10, the photosensitivity k values (the light intensity at half-maximal photoactivation) of the systems containing SD2, SD17 and SD37 were 1.312 mW/cm², 0.406 mW/cm² and 0.059 mW/cm², respectively, indicating that light-activated systems with lower levels of the photosensitive transcription factor are more sensitive to light.
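To make the meaning of k concrete, the sketch below compares the three systems at a single light intensity. It assumes a simple hyperbolic saturation curve, fraction = I / (k + I), which is an illustrative assumption and not a model stated in this specification; only the three k values are taken from the text, and the function name is hypothetical.

```python
def activated_fraction(intensity: float, k: float) -> float:
    """Fraction of maximal activation at a given light intensity (mW/cm^2).

    Assumes a hyperbolic response, so the fraction is exactly 0.5 at intensity == k.
    """
    if intensity < 0 or k <= 0:
        raise ValueError("intensity must be >= 0 and k must be > 0")
    return intensity / (k + intensity)


# k values (mW/cm^2) reported above for the three systems.
K_VALUES = {"SD2": 1.312, "SD17": 0.406, "SD37": 0.059}

# Compare the systems at the k value of the SD17 system.
for name, k in K_VALUES.items():
    print(name, round(activated_fraction(0.406, k), 3))
```

Under this assumed curve, a smaller k means the system is closer to saturation at any given intensity, which is what "more sensitive to light" means here.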
To study the spatial regulation of target gene expression by the light-sensitive transcription factor, the fluorescent protein mCherry was used as the reporter gene. A solid medium containing 1% agar, 0.5% peptone, 0.25% yeast extract powder and 0.5% NaCl was prepared, autoclaved and cooled to about 45 ℃. Escherichia coli transformed with the system plasmid containing SD17 and cultured in the dark was added to the medium before it solidified, mixed evenly and poured into plates. A printed film bearing a pattern was attached to the bottom of the plate to define the illuminated area, the illumination intensity was adjusted, and the plate was cultured upside down; after about 15 h the plate was imaged with a solar imager. As shown in FIG. 11, the expression pattern of the target gene in the plate matches the pattern on the printed film: white areas in the fluorescence image indicate high reporter gene expression, and black areas indicate very low expression. Quantitative analysis of reporter gene expression in the imaging result shows that the light-controlled gene expression system can precisely regulate target gene expression in space.
To test the applicability of the light-controlled gene expression system in different strains, the system plasmid containing SD17 was transformed into strains commonly used in the laboratory. The transformation plates were cultured overnight in the dark, and single clones were picked under red light into 48-well plates and cultured overnight in the dark as primary cultures. Secondary cultures were then inoculated at an initial OD600 of 0.001, and the expression level of the reporter gene mCherry was measured after culturing for 15 hours under blue light or in the dark. The results, shown in FIG. 12, demonstrate that the system regulates target gene expression well in these commonly used laboratory strains, with low leaky expression and high induction efficiency. The light-controlled gene expression system is therefore generally applicable across different Escherichia coli strains.
Example 9 Light-controlled division of E. coli
An FtsZ-knockout TOP10 strain and a light-controlled plasmid containing SD17 with FtsZ as the reporter gene were constructed. The successfully constructed plasmid was transformed into the FtsZ-knockout TOP10 strain, the plate was cultured under blue light of suitable intensity, and a single clone was then picked and cultured in a test tube to an OD600 of about 0.2, after which the bacteria were switched to dark conditions. E. coli cells cultured in the dark for various periods of time were imaged to observe their division. As shown in FIG. 13, the E. coli cells initially divided normally, grew longer and longer as the time in the dark increased, and were completely filamentous by 3 hours. This is because FtsZ is transcribed and expressed under light, so the cells divide normally. After the switch to darkness FtsZ is no longer transcribed, but the mRNA already transcribed takes time to degrade and can still be translated into some protein; most importantly, the FtsZ protein already translated also takes time to degrade, so the cells do not stop dividing immediately after FtsZ gene expression is shut off. As the FtsZ protein concentration in the cells gradually falls until it disappears, the cells gradually lengthen into an abnormal filamentous shape.
Example 10 Light-controlled motility of Escherichia coli
To test regulation of Escherichia coli motility by the light control system, a light-controlled plasmid containing SD2 with CheZ as the reporter gene was constructed. A semi-solid medium containing 1% tryptone, 0.5% NaCl and 0.25% agar was prepared and autoclaved; when it had cooled to below 50 ℃, the corresponding antibiotic was added and mixed in, and the medium was poured into culture dishes and left at room temperature for 1 h to cool. The constructed light-controlled plasmid was transformed into the CheZ-knockout strain JM109(DE3, ΔsulA, ΔLexA, ΔcheZ) (maintained in this laboratory; related article: Chen, X. et al., Cell Research, 2016), and a primary culture was grown in the dark. The next day it was diluted 1:200 into fresh medium and culture was continued in the dark. When the OD600 reached 0.1-0.2, 2 μl of the bacterial suspension was dropped onto the poured semi-solid plate and incubated under blue light of different intensities. Dark control samples were wrapped in tin foil and incubated under the same conditions. The plates were imaged 15 h later with a KODAK in vivo imager. As shown in FIG. 14, the CheZ gene was not expressed under dark conditions, so the bacteria did not move. As the light intensity increased, the expression level of the CheZ gene gradually increased until it reached its maximum, and the range of bacterial movement likewise increased until it reached a maximum, beyond which it no longer grew with further increases in light intensity.
Example 11 construction of Gene circuits Using light-controlled Gene expression System
An N-IMPLY logic gate was constructed using blue light and arabinose as the two input signals. A light-controlled plasmid containing SD2 with cI (amino acid sequence SEQ ID NO: 55) as the reporter gene was constructed, so that the repressor protein cI is expressed under light control; expression of LasI is induced by arabinose; and in the reporter plasmid the PR promoter controls expression of LasR while the PlasI promoter controls expression of the target gene mCherry. Under illumination, cI is expressed and represses expression of LasR, so the PlasI promoter cannot drive expression of the downstream gene; the target gene mCherry is therefore not expressed whenever a blue light signal is input to the N-IMPLY logic gate. Under dark conditions the repressor cI is not expressed and the PR promoter initiates transcription of LasR. When arabinose is added, LasI protein is induced and synthesizes the signal molecule AHL (3OC12HSL), which binds and activates the LasR protein; the LasR-AHL complex binds the transcription regulatory region in the PlasI promoter and turns on expression of the reporter gene mCherry, as shown in FIG. 15.
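The logic of this circuit can be summarized as an idealized truth table. The sketch below treats every component as a Boolean on/off variable, which is a simplifying assumption (real expression levels are graded and leaky); the function and variable names are illustrative only.

```python
from itertools import product


def circuit_output(arabinose: bool, blue_light: bool) -> bool:
    """Idealized Boolean model of the circuit: mCherry = arabinose AND NOT light."""
    ci = blue_light        # cI is expressed only under blue light
    lasr = not ci          # cI represses the PR promoter that drives LasR
    ahl = arabinose        # arabinose induces LasI, which synthesizes AHL
    return lasr and ahl    # PlasI is activated only by the LasR-AHL complex


# Print the full truth table: arabinose, blue light, mCherry output.
for arabinose, blue_light in product([False, True], repeat=2):
    print(arabinose, blue_light, circuit_output(arabinose, blue_light))
```

Only the combination of arabinose present and blue light absent yields output, which is the defining row of an N-IMPLY gate.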
It is to be understood that the numerical quantities of ingredients, reaction conditions, etc., used in the examples or experimental procedures, or other parameters used in the specification are approximate (unless otherwise noted) and may be varied depending upon the desired results to be obtained. Moreover, these parameters are not intended to limit the scope of the present invention, but rather to apply the preferred data obtained under normal operating conditions. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. Any methods and materials similar or equivalent to those described herein can be used in the practice of the present invention. The preferred experimental methods and materials described herein are exemplary only. All documents mentioned in this specification are incorporated in their entirety by reference. Furthermore, it should be understood that various changes and modifications can be made by those skilled in the art after reading the above disclosure, and equivalents also fall within the scope of the invention as defined by the appended claims.
Sequence listing
<110> East China University of Science and Technology
<120> bacterial light-operated gene expression system and method for regulating and controlling gene expression
<130> 2020.1.19
<160> 55
<170> SIPOSequenceListing 1.0
<210> 1
<211> 87
<212> PRT
<213> Synthetic Sequence
<400> 1
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro
85
<210> 2
<211> 87
<212> PRT
<213> Synthetic Sequence
<400> 2
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro
85
<210> 3
<211> 102
<212> PRT
<213> Synthetic Sequence
<400> 3
Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp Ala
1 5 10 15
Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly Leu
20 25 30
Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly Val
35 40 45
Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala Ala
50 55 60
Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro Ser
65 70 75 80
Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln Pro
85 90 95
Ser Leu Arg Ser Glu Tyr
100
<210> 4
<211> 65
<212> PRT
<213> Synthetic Sequence
<400> 4
Met Lys Leu Leu Ser Ser Ile Glu Gln Ala Cys Asp Ile Cys Arg Leu
1 5 10 15
Lys Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu
20 25 30
Lys Asn Asn Trp Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser Pro
35 40 45
Leu Thr Arg Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg Leu
50 55 60
Glu
65
<210> 5
<211> 63
<212> PRT
<213> Synthetic Sequence
<400> 5
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His
50 55 60
<210> 6
<211> 16
<212> DNA
<213> Synthetic Sequence
<220>
<221> misc_feature
<222> (5)..(12)
<223> n is a, c, g, or t
<400> 6
ccgtnnnnnn nnacgg 16
<210> 7
<211> 16
<212> DNA
<213> Synthetic Sequence
<220>
<221> misc_feature
<222> (5)..(12)
<223> n is a, c, g, or t
<400> 7
ctgtnnnnnn nnacag 16
<210> 8
<211> 64
<212> DNA
<213> Synthetic Sequence
<400> 8
tatcaccgca agggataaat atctaacacc gtgcgtgttg actattttac ctctggcggt 60
gata 64
<210> 9
<211> 19
<212> DNA
<213> Synthetic Sequence
<220>
<221> misc_feature
<222> (5)..(6)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (10)..(10)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (12)..(12)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (14)..(14)
<223> n is a, c, g, or t
<220>
<221> misc_feature
<222> (16)..(16)
<223> n is a, c, g, or t
<400> 9
cggrnnrcyn ynyncnccg 19
<210> 10
<211> 19
<212> DNA
<213> Synthetic Sequence
<400> 10
tccctatcag tgatagaga 19
<210> 11
<211> 178
<212> PRT
<213> Synthetic Sequence
<400> 11
Met Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp
1 5 10 15
Arg Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln
20 25 30
Pro Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr
35 40 45
Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp
50 55 60
Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly
65 70 75 80
Arg Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro
85 90 95
Phe Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala
100 105 110
Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn
115 120 125
Ser Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu
130 135 140
Leu Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg
145 150 155 160
Arg His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg
165 170 175
Arg Gly
<210> 12
<211> 146
<212> PRT
<213> Synthetic Sequence
<400> 12
Met Arg Arg His Tyr Arg Asp Leu Ile Arg Asn Thr Pro Met Pro Asp
1 5 10 15
Thr Pro Gln Asp Ile Ala Asp Leu Arg Ala Leu Leu Asp Glu Asp Glu
20 25 30
Ala Glu Met Ser Val Val Phe Ser Asp Pro Ser Gln Pro Asp Asn Pro
35 40 45
Thr Ile Tyr Val Ser Asp Ala Phe Leu Val Gln Thr Gly Tyr Thr Leu
50 55 60
Glu Glu Val Leu Gly Arg Asn Cys Arg Phe Leu Gln Gly Pro Asp Thr
65 70 75 80
Asn Pro His Ala Val Glu Ala Ile Arg Gln Gly Leu Lys Ala Glu Thr
85 90 95
Arg Phe Thr Ile Asp Ile Leu Asn Tyr Arg Lys Asp Gly Ser Ala Phe
100 105 110
Val Asn Arg Leu Arg Ile Arg Pro Ile Tyr Asp Pro Glu Gly Asn Leu
115 120 125
Met Phe Phe Ala Gly Ala Gln Asn Pro Val Leu Glu His His His His
130 135 140
His His
145
<210> 13
<211> 152
<212> DNA
<213> Synthetic Sequence
<400> 13
tgtttttttg atcgttttca caaaaatgga agtccacagt cttgacaggg aaaatgcagc 60
ggcgtagctt ttatgccgta tataaaaacg gcgtttatat gtacggtatt tatttttaac 120
ttattgtttt aaaagtcaaa gaggatttta ta 152
<210> 14
<211> 152
<212> DNA
<213> Synthetic Sequence
<400> 14
tgtttttttg atcgttttca caaaaatgga agtccacagt cttgacaggg aaaatgcagc 60
ggcgtagctt ttatgctgta tataaaacca gtggttatat gtacagtatt tatttttaac 120
ttattgtttt aaaagtcaaa gaggatttta ta 152
<210> 15
<211> 71
<212> DNA
<213> Synthetic Sequence
<400> 15
atagggttga tctttgttgt cactggatgt actgtacatc catacagtaa ctcacagggg 60
ctggattgat t 71
<210> 16
<211> 100
<212> DNA
<213> Synthetic Sequence
<400> 16
caatttctac aaaacacttg atactgtatg agcatacagt ataattgctt caacagaaca 60
tattgactat ccggtattac ccggcatgac aggagtaaaa 100
<210> 17
<211> 100
<212> DNA
<213> Synthetic Sequence
<400> 17
gcctatgcag cgacaaatat tgatagcctg aatcagtatt gatctgctgg caagaacaga 60
ctactgtata taaaaacagt ataacttcag gcagattatt 100
<210> 18
<211> 74
<212> DNA
<213> Synthetic Sequence
<400> 18
tatctaacac cgtgcgtgtt gactatttta cctctggcgg tgataatggt tgcatgtact 60
aaggaggtac tagt 74
<210> 19
<211> 81
<212> DNA
<213> synthetic sequence
<400> 19
ttaatacgac tcactatagg gagaccacaa cggtttccct ctagaaataa ttttgtttaa 60
ctttaagaag gagatataca t 81
<210> 20
<211> 47
<212> DNA
<213> Synthetic Sequence
<400> 20
ttatcacttg aaattggaag ggagattctt tattataaga attgtgg 47
<210> 21
<211> 236
<212> PRT
<213> Synthetic Sequence
<400> 21
Met Val Ser Lys Gly Glu Glu Asp Asn Met Ala Ile Ile Lys Glu Phe
1 5 10 15
Met Arg Phe Lys Val His Met Glu Gly Ser Val Asn Gly His Glu Phe
20 25 30
Glu Ile Glu Gly Glu Gly Glu Gly Arg Pro Tyr Glu Gly Thr Gln Thr
35 40 45
Ala Lys Leu Lys Val Thr Lys Gly Gly Pro Leu Pro Phe Ala Trp Asp
50 55 60
Ile Leu Ser Pro Gln Phe Met Tyr Gly Ser Lys Ala Tyr Val Lys His
65 70 75 80
Pro Ala Asp Ile Pro Asp Tyr Leu Lys Leu Ser Phe Pro Glu Gly Phe
85 90 95
Lys Trp Glu Arg Val Met Asn Phe Glu Asp Gly Gly Val Val Thr Val
100 105 110
Thr Gln Asp Ser Ser Leu Gln Asp Gly Glu Phe Ile Tyr Lys Val Lys
115 120 125
Leu Arg Gly Thr Asn Phe Pro Ser Asp Gly Pro Val Met Gln Lys Lys
130 135 140
Thr Met Gly Trp Glu Ala Ser Ser Glu Arg Met Tyr Pro Glu Asp Gly
145 150 155 160
Ala Leu Lys Gly Glu Ile Lys Gln Arg Leu Lys Leu Lys Asp Gly Gly
165 170 175
His Tyr Asp Ala Glu Val Lys Thr Thr Tyr Lys Ala Lys Lys Pro Val
180 185 190
Gln Leu Pro Gly Ala Tyr Asn Val Asn Ile Lys Leu Asp Ile Thr Ser
195 200 205
His Asn Glu Asp Tyr Thr Ile Val Glu Gln Tyr Glu Arg Ala Glu Gly
210 215 220
Arg His Ser Thr Gly Gly Met Asp Glu Leu Tyr Lys
225 230 235
<210> 22
<211> 265
<212> PRT
<213> Synthetic Sequence
<400> 22
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe Glu
85 90 95
Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val
100 105 110
Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe
115 120 125
Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys
130 135 140
Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile
145 150 155 160
Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn
165 170 175
Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro
180 185 190
Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe
195 200 205
Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His
210 215 220
Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala
225 230 235 240
Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala
245 250 255
Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 23
<211> 267
<212> PRT
<213> Synthetic Sequence
<400> 23
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Val Phe Met Ala Met Asp Gln Lys Gln
85 90 95
Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr
100 105 110
Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro
115 120 125
Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe
130 135 140
Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala
145 150 155 160
Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu
165 170 175
Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu
180 185 190
His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser
195 200 205
Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala
210 215 220
Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val
225 230 235 240
Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala
245 250 255
Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 24
<211> 267
<212> PRT
<213> Synthetic Sequence
<400> 24
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Asn Tyr Met Ala Met Asp Gln Lys Gln
85 90 95
Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr
100 105 110
Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro
115 120 125
Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe
130 135 140
Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala
145 150 155 160
Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu
165 170 175
Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu
180 185 190
His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser
195 200 205
Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala
210 215 220
Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val
225 230 235 240
Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala
245 250 255
Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 25
<211> 267
<212> PRT
<213> Synthetic Sequence
<400> 25
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Lys Pro Met Ala Met Asp Gln Lys Gln
85 90 95
Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr
100 105 110
Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro
115 120 125
Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe
130 135 140
Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala
145 150 155 160
Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu
165 170 175
Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu
180 185 190
His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser
195 200 205
Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala
210 215 220
Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val
225 230 235 240
Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala
245 250 255
Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 26
<211> 267
<212> PRT
<213> Synthetic Sequence
<400> 26
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Lys Val Met Ala Met Asp Gln Lys Gln
85 90 95
Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr
100 105 110
Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro
115 120 125
Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe
130 135 140
Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala
145 150 155 160
Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu
165 170 175
Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu
180 185 190
His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser
195 200 205
Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala
210 215 220
Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val
225 230 235 240
Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala
245 250 255
Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 27
<211> 268
<212> PRT
<213> Synthetic Sequence
<400> 27
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Cys Val Thr Met Ala Met Asp Gln Lys
85 90 95
Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu
100 105 110
Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn
115 120 125
Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly
130 135 140
Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg
145 150 155 160
Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val
165 170 175
Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe
180 185 190
Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly
195 200 205
Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala
210 215 220
Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr
225 230 235 240
Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala
245 250 255
Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 28
<211> 269
<212> PRT
<213> Synthetic Sequence
<400> 28
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Thr Gln Val Tyr Met Ala Met Asp Gln
85 90 95
Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala
100 105 110
Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala
115 120 125
Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu
130 135 140
Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala
145 150 155 160
Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val
165 170 175
Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu
180 185 190
Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu
195 200 205
Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala
210 215 220
Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly
225 230 235 240
Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln
245 250 255
Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 29
<211> 269
<212> PRT
<213> Synthetic Sequence
<400> 29
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Leu Trp Thr Ser Met Ala Met Asp Gln
85 90 95
Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala
100 105 110
Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala
115 120 125
Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu
130 135 140
Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala
145 150 155 160
Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val
165 170 175
Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu
180 185 190
Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu
195 200 205
Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala
210 215 220
Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly
225 230 235 240
Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln
245 250 255
Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 30
<211> 269
<212> PRT
<213> Synthetic Sequence
<400> 30
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Gln Arg Tyr Ser Met Ala Met Asp Gln
85 90 95
Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala
100 105 110
Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala
115 120 125
Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu
130 135 140
Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala
145 150 155 160
Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val
165 170 175
Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu
180 185 190
Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu
195 200 205
Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala
210 215 220
Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly
225 230 235 240
Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln
245 250 255
Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 31
<211> 268
<212> PRT
<213> Synthetic Sequence
<400> 31
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Ile Val Asp Met Ala Met Asp Gln Lys
85 90 95
Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu
100 105 110
Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn
115 120 125
Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly
130 135 140
Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg
145 150 155 160
Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val
165 170 175
Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe
180 185 190
Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly
195 200 205
Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala
210 215 220
Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr
225 230 235 240
Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala
245 250 255
Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 32
<211> 233
<212> PRT
<213> Synthetic Sequence
<400> 32
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Met Arg Arg His Tyr Arg Asp Leu Ile
85 90 95
Arg Asn Thr Pro Met Pro Asp Thr Pro Gln Asp Ile Ala Asp Leu Arg
100 105 110
Ala Leu Leu Asp Glu Asp Glu Ala Glu Met Ser Val Val Phe Ser Asp
115 120 125
Pro Ser Gln Pro Asp Asn Pro Thr Ile Tyr Val Ser Asp Ala Phe Leu
130 135 140
Val Gln Thr Gly Tyr Thr Leu Glu Glu Val Leu Gly Arg Asn Cys Arg
145 150 155 160
Phe Leu Gln Gly Pro Asp Thr Asn Pro His Ala Val Glu Ala Ile Arg
165 170 175
Gln Gly Leu Lys Ala Glu Thr Arg Phe Thr Ile Asp Ile Leu Asn Tyr
180 185 190
Arg Lys Asp Gly Ser Ala Phe Val Asn Arg Leu Arg Ile Arg Pro Ile
195 200 205
Tyr Asp Pro Glu Gly Asn Leu Met Phe Phe Ala Gly Ala Gln Asn Pro
210 215 220
Val Leu Glu His His His His His His
225 230
<210> 33
<211> 265
<212> PRT
<213> Synthetic Sequence
<400> 33
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe Glu
85 90 95
Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val
100 105 110
Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe
115 120 125
Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys
130 135 140
Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile
145 150 155 160
Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn
165 170 175
Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro
180 185 190
Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe
195 200 205
Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His
210 215 220
Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala
225 230 235 240
Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala
245 250 255
Leu Val Arg Ala Trp Glu Arg Arg Gly
260 265
<210> 34
<211> 280
<212> PRT
<213> Synthetic Sequence
<400> 34
Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp Ala
1 5 10 15
Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly Leu
20 25 30
Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly Val
35 40 45
Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala Ala
50 55 60
Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro Ser
65 70 75 80
Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln Pro
85 90 95
Ser Leu Arg Ser Glu Tyr Met Ala Met Asp Gln Lys Gln Phe Glu Lys
100 105 110
Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val Asp
115 120 125
Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe Leu
130 135 140
Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg
145 150 155 160
Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg
165 170 175
Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn Tyr
180 185 190
Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro Val
195 200 205
Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu
210 215 220
Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His Ala
225 230 235 240
Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala Arg
245 250 255
Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala Leu
260 265 270
Val Arg Ala Trp Glu Arg Arg Gly
275 280
<210> 35
<211> 243
<212> PRT
<213> Synthetic Sequence
<400> 35
Met Lys Leu Leu Ser Ser Ile Glu Gln Ala Cys Asp Ile Cys Arg Leu
1 5 10 15
Lys Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu
20 25 30
Lys Asn Asn Trp Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser Pro
35 40 45
Leu Thr Arg Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg Leu
50 55 60
Glu Met Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe
65 70 75 80
Asp Arg Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu
85 90 95
Gln Pro Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr
100 105 110
Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly
115 120 125
Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu
130 135 140
Gly Arg Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu
145 150 155 160
Pro Phe Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp
165 170 175
Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly
180 185 190
Asn Ser Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly
195 200 205
Glu Leu Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser
210 215 220
Arg Arg His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu
225 230 235 240
Arg Arg Gly
<210> 36
<211> 241
<212> PRT
<213> Synthetic Sequence
<400> 36
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His Met
50 55 60
Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg
65 70 75 80
Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro
85 90 95
Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu
100 105 110
Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu
115 120 125
Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg
130 135 140
Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe
145 150 155 160
Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro
165 170 175
Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser
180 185 190
Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu
195 200 205
Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg
210 215 220
His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg
225 230 235 240
Gly
<210> 37
<211> 538
<212> PRT
<213> Synthetic Sequence
<400> 37
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe Glu
85 90 95
Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val
100 105 110
Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe
115 120 125
Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys
130 135 140
Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile
145 150 155 160
Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn
165 170 175
Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro
180 185 190
Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe
195 200 205
Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His
210 215 220
Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala
225 230 235 240
Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala
245 250 255
Leu Val Arg Ala Trp Glu Arg Arg Gly Thr Ser Ala Ala Ala Asp Tyr
260 265 270
Lys Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His
275 280 285
Gly Thr Met Gln Gly Ser Val Thr Glu Phe Leu Lys Pro Arg Leu Val
290 295 300
Asp Ile Glu Gln Val Ser Ser Thr His Ala Lys Val Thr Leu Glu Pro
305 310 315 320
Leu Glu Arg Gly Phe Gly His Thr Leu Gly Asn Ala Leu Arg Arg Ile
325 330 335
Leu Leu Ser Ser Met Pro Gly Cys Ala Val Thr Glu Val Glu Ile Asp
340 345 350
Gly Val Leu His Glu Tyr Ser Thr Lys Glu Gly Val Gln Glu Asp Ile
355 360 365
Leu Glu Ile Leu Leu Asn Leu Lys Gly Leu Ala Val Arg Val Gln Gly
370 375 380
Lys Asp Glu Val Ile Leu Thr Leu Asn Lys Ser Gly Ile Gly Pro Val
385 390 395 400
Thr Ala Ala Asp Ile Thr His Asp Gly Asp Val Glu Ile Val Lys Pro
405 410 415
Gln His Val Ile Cys His Leu Thr Asp Glu Asn Ala Ser Ile Ser Met
420 425 430
Arg Ile Lys Val Gln Arg Gly Arg Gly Tyr Val Pro Ala Ser Thr Arg
435 440 445
Ile His Ser Glu Glu Asp Glu Arg Pro Ile Gly Arg Leu Leu Val Asp
450 455 460
Ala Cys Tyr Ser Pro Val Glu Arg Ile Ala Tyr Asn Val Glu Ala Ala
465 470 475 480
Arg Val Glu Gln Arg Thr Asp Leu Asp Lys Leu Val Ile Glu Met Glu
485 490 495
Thr Asn Gly Thr Ile Asp Pro Glu Glu Ala Ile Arg Arg Ala Ala Thr
500 505 510
Ile Leu Ala Glu Gln Leu Glu Ala Phe Val Asp Leu Arg Asp Val Arg
515 520 525
Gln Pro Glu Val Lys Glu Glu Lys Pro Glu
530 535
<210> 38
<211> 378
<212> PRT
<213> Synthetic Sequence
<400> 38
Met Ala Arg Val Thr Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg
1 5 10 15
Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala Arg Gln Met Gln Val
20 25 30
Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45
Ile Ala Leu Arg Glu Ile Glu Glu Gly Leu Ile Asn Asn Gln Ile Leu
50 55 60
Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu
65 70 75 80
Gln Ala Val Thr Ala Ile Ala Glu Gly Arg Ala Ala Ala Asp Tyr Lys
85 90 95
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Thr
100 105 110
Ser Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile
115 120 125
Arg Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile
130 135 140
Ala Gln Arg Leu Gly Phe Arg Ser Ala Ser Ser Ala Glu Glu His Leu
145 150 155 160
Lys Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser
165 170 175
Arg Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val
180 185 190
Gly Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe
195 200 205
Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu
210 215 220
Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro
225 230 235 240
Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn
245 250 255
Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp
260 265 270
Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg
275 280 285
Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His
290 295 300
Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln
305 310 315 320
Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly
325 330 335
His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala
340 345 350
Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala
355 360 365
Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
370 375
<210> 39
<211> 538
<212> PRT
<213> Synthetic Sequence
<400> 39
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1 5 10 15
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
20 25 30
Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys
35 40 45
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
50 55 60
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65 70 75 80
Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe Glu
85 90 95
Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val
100 105 110
Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe
115 120 125
Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys
130 135 140
Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile
145 150 155 160
Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn
165 170 175
Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro
180 185 190
Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe
195 200 205
Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His
210 215 220
Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala
225 230 235 240
Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala
245 250 255
Leu Val Arg Ala Trp Glu Arg Arg Gly Thr Ser Ala Ala Ala Asp Tyr
260 265 270
Lys Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His
275 280 285
Gly Thr Met Gln Gly Ser Val Thr Glu Phe Leu Lys Pro Arg Leu Val
290 295 300
Asp Ile Glu Gln Val Ser Ser Thr His Ala Lys Val Thr Leu Glu Pro
305 310 315 320
Leu Glu Arg Gly Phe Gly His Thr Leu Gly Asn Ala Leu Arg Arg Ile
325 330 335
Leu Leu Ser Ser Met Pro Gly Cys Ala Val Thr Glu Val Glu Ile Asp
340 345 350
Gly Val Leu His Glu Tyr Ser Thr Lys Glu Gly Val Gln Glu Asp Ile
355 360 365
Leu Glu Ile Leu Leu Asn Leu Lys Gly Leu Ala Val Arg Val Gln Gly
370 375 380
Lys Asp Glu Val Ile Leu Thr Leu Asn Lys Ser Gly Ile Gly Pro Val
385 390 395 400
Thr Ala Ala Asp Ile Thr His Asp Gly Asp Val Glu Ile Val Lys Pro
405 410 415
Gln His Val Ile Cys His Leu Thr Asp Glu Asn Ala Ser Ile Ser Met
420 425 430
Arg Ile Lys Val Gln Arg Gly Arg Gly Tyr Val Pro Ala Ser Thr Arg
435 440 445
Ile His Ser Glu Glu Asp Glu Arg Pro Ile Gly Arg Leu Leu Val Asp
450 455 460
Ala Cys Tyr Ser Pro Val Glu Arg Ile Ala Tyr Asn Val Glu Ala Ala
465 470 475 480
Arg Val Glu Gln Arg Thr Asp Leu Asp Lys Leu Val Ile Glu Met Glu
485 490 495
Thr Asn Gly Thr Ile Asp Pro Glu Glu Ala Ile Arg Arg Ala Ala Thr
500 505 510
Ile Leu Ala Glu Gln Leu Glu Ala Phe Val Asp Leu Arg Asp Val Arg
515 520 525
Gln Pro Glu Val Lys Glu Glu Lys Pro Glu
530 535
<210> 40
<211> 378
<212> PRT
<213> Synthetic Sequence
<400> 40
Met Ala Arg Val Thr Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg
1 5 10 15
Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala Arg Gln Met Gln Val
20 25 30
Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45
Ile Ala Leu Arg Glu Ile Glu Glu Gly Leu Ile Asn Asn Gln Ile Leu
50 55 60
Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu
65 70 75 80
Gln Ala Val Thr Ala Ile Ala Glu Gly Arg Ala Ala Ala Asp Tyr Lys
85 90 95
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Thr
100 105 110
Ser Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile
115 120 125
Arg Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile
130 135 140
Ala Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu
145 150 155 160
Lys Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser
165 170 175
Arg Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val
180 185 190
Gly Arg Val Ala Ala Gly Glu Pro Met Ala Met Asp Gln Lys Gln Phe
195 200 205
Glu Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu
210 215 220
Val Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro
225 230 235 240
Phe Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn
245 250 255
Cys Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp
260 265 270
Ile Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg
275 280 285
Asn Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His
290 295 300
Pro Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln
305 310 315 320
Phe Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly
325 330 335
His Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala
340 345 350
Ala Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala
355 360 365
Ala Leu Val Arg Ala Trp Glu Arg Arg Gly
370 375
<210> 41
<211> 553
<212> PRT
<213> Synthetic Sequence
<400> 41
Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp Ala
1 5 10 15
Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly Leu
20 25 30
Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly Val
35 40 45
Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala Ala
50 55 60
Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro Ser
65 70 75 80
Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln Pro
85 90 95
Ser Leu Arg Ser Glu Tyr Met Ala Met Asp Gln Lys Gln Phe Glu Lys
100 105 110
Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val Asp
115 120 125
Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe Leu
130 135 140
Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg
145 150 155 160
Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg
165 170 175
Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn Tyr
180 185 190
Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro Val
195 200 205
Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu
210 215 220
Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His Ala
225 230 235 240
Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala Arg
245 250 255
Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala Leu
260 265 270
Val Arg Ala Trp Glu Arg Arg Gly Thr Ser Ala Ala Ala Asp Tyr Lys
275 280 285
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Gly
290 295 300
Thr Met Gln Gly Ser Val Thr Glu Phe Leu Lys Pro Arg Leu Val Asp
305 310 315 320
Ile Glu Gln Val Ser Ser Thr His Ala Lys Val Thr Leu Glu Pro Leu
325 330 335
Glu Arg Gly Phe Gly His Thr Leu Gly Asn Ala Leu Arg Arg Ile Leu
340 345 350
Leu Ser Ser Met Pro Gly Cys Ala Val Thr Glu Val Glu Ile Asp Gly
355 360 365
Val Leu His Glu Tyr Ser Thr Lys Glu Gly Val Gln Glu Asp Ile Leu
370 375 380
Glu Ile Leu Leu Asn Leu Lys Gly Leu Ala Val Arg Val Gln Gly Lys
385 390 395 400
Asp Glu Val Ile Leu Thr Leu Asn Lys Ser Gly Ile Gly Pro Val Thr
405 410 415
Ala Ala Asp Ile Thr His Asp Gly Asp Val Glu Ile Val Lys Pro Gln
420 425 430
His Val Ile Cys His Leu Thr Asp Glu Asn Ala Ser Ile Ser Met Arg
435 440 445
Ile Lys Val Gln Arg Gly Arg Gly Tyr Val Pro Ala Ser Thr Arg Ile
450 455 460
His Ser Glu Glu Asp Glu Arg Pro Ile Gly Arg Leu Leu Val Asp Ala
465 470 475 480
Cys Tyr Ser Pro Val Glu Arg Ile Ala Tyr Asn Val Glu Ala Ala Arg
485 490 495
Val Glu Gln Arg Thr Asp Leu Asp Lys Leu Val Ile Glu Met Glu Thr
500 505 510
Asn Gly Thr Ile Asp Pro Glu Glu Ala Ile Arg Arg Ala Ala Thr Ile
515 520 525
Leu Ala Glu Gln Leu Glu Ala Phe Val Asp Leu Arg Asp Val Arg Gln
530 535 540
Pro Glu Val Lys Glu Glu Lys Pro Glu
545 550
<210> 42
<211> 393
<212> PRT
<213> Synthetic Sequence
<400> 42
Met Ala Arg Val Thr Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg
1 5 10 15
Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala Arg Gln Met Gln Val
20 25 30
Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45
Ile Ala Leu Arg Glu Ile Glu Glu Gly Leu Ile Asn Asn Gln Ile Leu
50 55 60
Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu
65 70 75 80
Gln Ala Val Thr Ala Ile Ala Glu Gly Arg Ala Ala Ala Asp Tyr Lys
85 90 95
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Thr
100 105 110
Ser Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp
115 120 125
Ala Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly
130 135 140
Leu Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly
145 150 155 160
Val Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala
165 170 175
Ala Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro
180 185 190
Ser Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln
195 200 205
Pro Ser Leu Arg Ser Glu Tyr Met Ala Met Asp Gln Lys Gln Phe Glu
210 215 220
Lys Ile Arg Ala Val Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val
225 230 235 240
Asp Met Ser Leu Pro Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe
245 250 255
Leu Arg Met Thr Gly Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys
260 265 270
Arg Phe Leu Gln Arg Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile
275 280 285
Arg Asp Ala Leu Lys Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn
290 295 300
Tyr Arg Ala Asn Asp Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro
305 310 315 320
Val Gly Gly Arg Pro Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe
325 330 335
Glu Leu Gly Arg Ser Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His
340 345 350
Ala Gly Ala Leu Thr Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala
355 360 365
Arg Leu Glu Met Asp Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala
370 375 380
Leu Val Arg Ala Trp Glu Arg Arg Gly
385 390
<210> 43
<211> 516
<212> PRT
<213> Synthetic Sequence
<400> 43
Met Lys Leu Leu Ser Ser Ile Glu Gln Ala Cys Asp Ile Cys Arg Leu
1 5 10 15
Lys Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu
20 25 30
Lys Asn Asn Trp Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser Pro
35 40 45
Leu Thr Arg Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg Leu
50 55 60
Glu Met Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe
65 70 75 80
Asp Arg Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu
85 90 95
Gln Pro Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr
100 105 110
Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly
115 120 125
Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu
130 135 140
Gly Arg Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu
145 150 155 160
Pro Phe Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp
165 170 175
Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly
180 185 190
Asn Ser Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly
195 200 205
Glu Leu Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser
210 215 220
Arg Arg His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu
225 230 235 240
Arg Arg Gly Thr Ser Ala Ala Ala Asp Tyr Lys Asp Asp Asp Asp Lys
245 250 255
Phe Arg Thr Gly Ser Lys Thr Pro Pro His Gly Thr Met Gln Gly Ser
260 265 270
Val Thr Glu Phe Leu Lys Pro Arg Leu Val Asp Ile Glu Gln Val Ser
275 280 285
Ser Thr His Ala Lys Val Thr Leu Glu Pro Leu Glu Arg Gly Phe Gly
290 295 300
His Thr Leu Gly Asn Ala Leu Arg Arg Ile Leu Leu Ser Ser Met Pro
305 310 315 320
Gly Cys Ala Val Thr Glu Val Glu Ile Asp Gly Val Leu His Glu Tyr
325 330 335
Ser Thr Lys Glu Gly Val Gln Glu Asp Ile Leu Glu Ile Leu Leu Asn
340 345 350
Leu Lys Gly Leu Ala Val Arg Val Gln Gly Lys Asp Glu Val Ile Leu
355 360 365
Thr Leu Asn Lys Ser Gly Ile Gly Pro Val Thr Ala Ala Asp Ile Thr
370 375 380
His Asp Gly Asp Val Glu Ile Val Lys Pro Gln His Val Ile Cys His
385 390 395 400
Leu Thr Asp Glu Asn Ala Ser Ile Ser Met Arg Ile Lys Val Gln Arg
405 410 415
Gly Arg Gly Tyr Val Pro Ala Ser Thr Arg Ile His Ser Glu Glu Asp
420 425 430
Glu Arg Pro Ile Gly Arg Leu Leu Val Asp Ala Cys Tyr Ser Pro Val
435 440 445
Glu Arg Ile Ala Tyr Asn Val Glu Ala Ala Arg Val Glu Gln Arg Thr
450 455 460
Asp Leu Asp Lys Leu Val Ile Glu Met Glu Thr Asn Gly Thr Ile Asp
465 470 475 480
Pro Glu Glu Ala Ile Arg Arg Ala Ala Thr Ile Leu Ala Glu Gln Leu
485 490 495
Glu Ala Phe Val Asp Leu Arg Asp Val Arg Gln Pro Glu Val Lys Glu
500 505 510
Glu Lys Pro Glu
515
<210> 44
<211> 356
<212> PRT
<213> Synthetic Sequence
<400> 44
Met Ala Arg Val Thr Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg
1 5 10 15
Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala Arg Gln Met Gln Val
20 25 30
Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45
Ile Ala Leu Arg Glu Ile Glu Glu Gly Leu Ile Asn Asn Gln Ile Leu
50 55 60
Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu
65 70 75 80
Gln Ala Val Thr Ala Ile Ala Glu Gly Arg Ala Ala Ala Asp Tyr Lys
85 90 95
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Thr
100 105 110
Ser Met Lys Leu Leu Ser Ser Ile Glu Gln Ala Cys Asp Ile Cys Arg
115 120 125
Leu Lys Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys
130 135 140
Leu Lys Asn Asn Trp Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser
145 150 155 160
Pro Leu Thr Arg Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg
165 170 175
Leu Glu Met Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val
180 185 190
Phe Asp Arg Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro
195 200 205
Glu Gln Pro Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly
210 215 220
Tyr Thr Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg
225 230 235 240
Gly Asp Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys
245 250 255
Leu Gly Arg Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp
260 265 270
Glu Pro Phe Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro
275 280 285
Asp Ala Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser
290 295 300
Gly Asn Ser Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr
305 310 315 320
Gly Glu Leu Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp
325 330 335
Ser Arg Arg His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp
340 345 350
Glu Arg Arg Gly
355
<210> 45
<211> 514
<212> PRT
<213> Synthetic Sequence
<400> 45
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His Met
50 55 60
Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp Arg
65 70 75 80
Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln Pro
85 90 95
Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr Glu
100 105 110
Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp Glu
115 120 125
Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly Arg
130 135 140
Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro Phe
145 150 155 160
Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala Pro
165 170 175
Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn Ser
180 185 190
Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu Leu
195 200 205
Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg Arg
210 215 220
His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg Arg
225 230 235 240
Gly Thr Ser Ala Ala Ala Asp Tyr Lys Asp Asp Asp Asp Lys Phe Arg
245 250 255
Thr Gly Ser Lys Thr Pro Pro His Gly Thr Met Gln Gly Ser Val Thr
260 265 270
Glu Phe Leu Lys Pro Arg Leu Val Asp Ile Glu Gln Val Ser Ser Thr
275 280 285
His Ala Lys Val Thr Leu Glu Pro Leu Glu Arg Gly Phe Gly His Thr
290 295 300
Leu Gly Asn Ala Leu Arg Arg Ile Leu Leu Ser Ser Met Pro Gly Cys
305 310 315 320
Ala Val Thr Glu Val Glu Ile Asp Gly Val Leu His Glu Tyr Ser Thr
325 330 335
Lys Glu Gly Val Gln Glu Asp Ile Leu Glu Ile Leu Leu Asn Leu Lys
340 345 350
Gly Leu Ala Val Arg Val Gln Gly Lys Asp Glu Val Ile Leu Thr Leu
355 360 365
Asn Lys Ser Gly Ile Gly Pro Val Thr Ala Ala Asp Ile Thr His Asp
370 375 380
Gly Asp Val Glu Ile Val Lys Pro Gln His Val Ile Cys His Leu Thr
385 390 395 400
Asp Glu Asn Ala Ser Ile Ser Met Arg Ile Lys Val Gln Arg Gly Arg
405 410 415
Gly Tyr Val Pro Ala Ser Thr Arg Ile His Ser Glu Glu Asp Glu Arg
420 425 430
Pro Ile Gly Arg Leu Leu Val Asp Ala Cys Tyr Ser Pro Val Glu Arg
435 440 445
Ile Ala Tyr Asn Val Glu Ala Ala Arg Val Glu Gln Arg Thr Asp Leu
450 455 460
Asp Lys Leu Val Ile Glu Met Glu Thr Asn Gly Thr Ile Asp Pro Glu
465 470 475 480
Glu Ala Ile Arg Arg Ala Ala Thr Ile Leu Ala Glu Gln Leu Glu Ala
485 490 495
Phe Val Asp Leu Arg Asp Val Arg Gln Pro Glu Val Lys Glu Glu Lys
500 505 510
Pro Glu
<210> 46
<211> 354
<212> PRT
<213> Synthetic Sequence
<400> 46
Met Ala Arg Val Thr Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg
1 5 10 15
Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala Arg Gln Met Gln Val
20 25 30
Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45
Ile Ala Leu Arg Glu Ile Glu Glu Gly Leu Ile Asn Asn Gln Ile Leu
50 55 60
Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu
65 70 75 80
Gln Ala Val Thr Ala Ile Ala Glu Gly Arg Ala Ala Ala Asp Tyr Lys
85 90 95
Asp Asp Asp Asp Lys Phe Arg Thr Gly Ser Lys Thr Pro Pro His Thr
100 105 110
Ser Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu
115 120 125
Leu Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala
130 135 140
Gln Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn
145 150 155 160
Lys Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His
165 170 175
Met Ala Met Asp Gln Lys Gln Phe Glu Lys Ile Arg Ala Val Phe Asp
180 185 190
Arg Ser Gly Val Ala Leu Thr Leu Val Asp Met Ser Leu Pro Glu Gln
195 200 205
Pro Leu Val Leu Ala Asn Pro Pro Phe Leu Arg Met Thr Gly Tyr Thr
210 215 220
Glu Gly Gln Ile Leu Gly Phe Asn Cys Arg Phe Leu Gln Arg Gly Asp
225 230 235 240
Glu Asn Ala Gln Ala Arg Ala Asp Ile Arg Asp Ala Leu Lys Leu Gly
245 250 255
Arg Glu Leu Gln Val Val Leu Arg Asn Tyr Arg Ala Asn Asp Glu Pro
260 265 270
Phe Asp Asn Leu Leu Phe Leu His Pro Val Gly Gly Arg Pro Asp Ala
275 280 285
Pro Asp Tyr Phe Leu Gly Ser Gln Phe Glu Leu Gly Arg Ser Gly Asn
290 295 300
Ser Glu Glu Ala Ala Ala Ala Gly His Ala Gly Ala Leu Thr Gly Glu
305 310 315 320
Leu Ala Arg Ile Gly Thr Val Ala Ala Arg Leu Glu Met Asp Ser Arg
325 330 335
Arg His Leu Ala Gln Ala Ala Ala Ala Leu Val Arg Ala Trp Glu Arg
340 345 350
Arg Gly
<210> 47
<211> 1400
<212> DNA
<213> Synthetic Sequence
<400> 47
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc tgtttttttg atcgttttca caaaaatgga 300
agtccacagt cttgacaggg aaaatgcagc ggcgtagctt ttatgccgta tataaaaacg 360
gcgtttatat gtacggtatt tatttttaac ttattgtttt aaaagtcaaa gaggatttta 420
taatggtgag caagggcgag gaggataaca tggccatcat caaggagttc atgcgcttca 480
aggtgcacat ggagggctcc gtgaacggcc acgagttcga gatcgagggc gagggcgagg 540
gccgccccta cgagggcacc cagaccgcca agctgaaggt gaccaagggt ggccccctgc 600
ccttcgcctg ggacatcctg tcccctcagt tcatgtacgg ctccaaggcc tacgtgaagc 660
accccgccga catccccgac tacttgaagc tgtccttccc cgagggcttc aagtgggagc 720
gcgtgatgaa cttcgaggac ggcggcgtgg tgaccgtgac ccaggactcc tccctgcagg 780
acggcgagtt catctacaag gtgaagctgc gcggcaccaa cttcccctcc gacggccccg 840
taatgcagaa gaagaccatg ggctgggagg cctcctccga gcggatgtac cccgaggacg 900
gcgccctgaa gggcgagatc aagcagaggc tgaagctgaa ggacggcggc cactacgacg 960
ctgaggtcaa gaccacctac aaggccaaga agcccgtgca gctgcccggc gcctacaacg 1020
tcaacatcaa gttggacatc acctcccaca acgaggacta caccatcgtg gaacagtacg 1080
aacgcgccga gggccgccac tccaccggcg gcatggacga gctgtacaag taagaattcc 1140
ccctgttttg gcggatgaga gaagattttc agcctgatac agattaaatc agaacgcaga 1200
agcggtctga taaaacagaa tttgcctggc ggcagtagcg cggtggtccc acctgacccc 1260
atgccgaact cagaagtgaa acgccgtagc gccgatggta gtgtggggtc tccccatgcg 1320
agagtaggga actgccaggc atcaaataaa acgaaaggct cagtcgaaag actgggcctt 1380
tcgttttatc tgttgtttgc 1400
<210> 48
<211> 1400
<212> DNA
<213> Synthetic Sequence
<400> 48
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc tgtttttttg atcgttttca caaaaatgga 300
agtccacagt cttgacaggg aaaatgcagc ggcgtagctt ttatgctgta tataaaacca 360
gtggttatat gtacagtatt tatttttaac ttattgtttt aaaagtcaaa gaggatttta 420
taatggtgag caagggcgag gaggataaca tggccatcat caaggagttc atgcgcttca 480
aggtgcacat ggagggctcc gtgaacggcc acgagttcga gatcgagggc gagggcgagg 540
gccgccccta cgagggcacc cagaccgcca agctgaaggt gaccaagggt ggccccctgc 600
ccttcgcctg ggacatcctg tcccctcagt tcatgtacgg ctccaaggcc tacgtgaagc 660
accccgccga catccccgac tacttgaagc tgtccttccc cgagggcttc aagtgggagc 720
gcgtgatgaa cttcgaggac ggcggcgtgg tgaccgtgac ccaggactcc tccctgcagg 780
acggcgagtt catctacaag gtgaagctgc gcggcaccaa cttcccctcc gacggccccg 840
taatgcagaa gaagaccatg ggctgggagg cctcctccga gcggatgtac cccgaggacg 900
gcgccctgaa gggcgagatc aagcagaggc tgaagctgaa ggacggcggc cactacgacg 960
ctgaggtcaa gaccacctac aaggccaaga agcccgtgca gctgcccggc gcctacaacg 1020
tcaacatcaa gttggacatc acctcccaca acgaggacta caccatcgtg gaacagtacg 1080
aacgcgccga gggccgccac tccaccggcg gcatggacga gctgtacaag taagaattcc 1140
ccctgttttg gcggatgaga gaagattttc agcctgatac agattaaatc agaacgcaga 1200
agcggtctga taaaacagaa tttgcctggc ggcagtagcg cggtggtccc acctgacccc 1260
atgccgaact cagaagtgaa acgccgtagc gccgatggta gtgtggggtc tccccatgcg 1320
agagtaggga actgccaggc atcaaataaa acgaaaggct cagtcgaaag actgggcctt 1380
tcgttttatc tgttgtttgc 1400
<210> 49
<211> 1319
<212> DNA
<213> Synthetic Sequence
<400> 49
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc atagggttga tctttgttgt cactggatgt 300
actgtacatc catacagtaa ctcacagggg ctggattgat tatggtgagc aagggcgagg 360
aggataacat ggccatcatc aaggagttca tgcgcttcaa ggtgcacatg gagggctccg 420
tgaacggcca cgagttcgag atcgagggcg agggcgaggg ccgcccctac gagggcaccc 480
agaccgccaa gctgaaggtg accaagggtg gccccctgcc cttcgcctgg gacatcctgt 540
cccctcagtt catgtacggc tccaaggcct acgtgaagca ccccgccgac atccccgact 600
acttgaagct gtccttcccc gagggcttca agtgggagcg cgtgatgaac ttcgaggacg 660
gcggcgtggt gaccgtgacc caggactcct ccctgcagga cggcgagttc atctacaagg 720
tgaagctgcg cggcaccaac ttcccctccg acggccccgt aatgcagaag aagaccatgg 780
gctgggaggc ctcctccgag cggatgtacc ccgaggacgg cgccctgaag ggcgagatca 840
agcagaggct gaagctgaag gacggcggcc actacgacgc tgaggtcaag accacctaca 900
aggccaagaa gcccgtgcag ctgcccggcg cctacaacgt caacatcaag ttggacatca 960
cctcccacaa cgaggactac accatcgtgg aacagtacga acgcgccgag ggccgccact 1020
ccaccggcgg catggacgag ctgtacaagt aagaattccc cctgttttgg cggatgagag 1080
aagattttca gcctgataca gattaaatca gaacgcagaa gcggtctgat aaaacagaat 1140
ttgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc agaagtgaaa 1200
cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa ctgccaggca 1260
tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttt cgttttatct gttgtttgc 1319
<210> 50
<211> 1348
<212> DNA
<213> Synthetic Sequence
<400> 50
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc caatttctac aaaacacttg atactgtatg 300
agcatacagt ataattgctt caacagaaca tattgactat ccggtattac ccggcatgac 360
aggagtaaaa atggtgagca agggcgagga ggataacatg gccatcatca aggagttcat 420
gcgcttcaag gtgcacatgg agggctccgt gaacggccac gagttcgaga tcgagggcga 480
gggcgagggc cgcccctacg agggcaccca gaccgccaag ctgaaggtga ccaagggtgg 540
ccccctgccc ttcgcctggg acatcctgtc ccctcagttc atgtacggct ccaaggccta 600
cgtgaagcac cccgccgaca tccccgacta cttgaagctg tccttccccg agggcttcaa 660
gtgggagcgc gtgatgaact tcgaggacgg cggcgtggtg accgtgaccc aggactcctc 720
cctgcaggac ggcgagttca tctacaaggt gaagctgcgc ggcaccaact tcccctccga 780
cggccccgta atgcagaaga agaccatggg ctgggaggcc tcctccgagc ggatgtaccc 840
cgaggacggc gccctgaagg gcgagatcaa gcagaggctg aagctgaagg acggcggcca 900
ctacgacgct gaggtcaaga ccacctacaa ggccaagaag cccgtgcagc tgcccggcgc 960
ctacaacgtc aacatcaagt tggacatcac ctcccacaac gaggactaca ccatcgtgga 1020
acagtacgaa cgcgccgagg gccgccactc caccggcggc atggacgagc tgtacaagta 1080
agaattcccc ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag 1140
aacgcagaag cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac 1200
ctgaccccat gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc 1260
cccatgcgag agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac 1320
tgggcctttc gttttatctg ttgtttgc 1348
<210> 51
<211> 1348
<212> DNA
<213> Synthetic Sequence
<400> 51
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc gcctatgcag cgacaaatat tgatagcctg 300
aatcagtatt gatctgctgg caagaacaga ctactgtata taaaaacagt ataacttcag 360
gcagattatt atggtgagca agggcgagga ggataacatg gccatcatca aggagttcat 420
gcgcttcaag gtgcacatgg agggctccgt gaacggccac gagttcgaga tcgagggcga 480
gggcgagggc cgcccctacg agggcaccca gaccgccaag ctgaaggtga ccaagggtgg 540
ccccctgccc ttcgcctggg acatcctgtc ccctcagttc atgtacggct ccaaggccta 600
cgtgaagcac cccgccgaca tccccgacta cttgaagctg tccttccccg agggcttcaa 660
gtgggagcgc gtgatgaact tcgaggacgg cggcgtggtg accgtgaccc aggactcctc 720
cctgcaggac ggcgagttca tctacaaggt gaagctgcgc ggcaccaact tcccctccga 780
cggccccgta atgcagaaga agaccatggg ctgggaggcc tcctccgagc ggatgtaccc 840
cgaggacggc gccctgaagg gcgagatcaa gcagaggctg aagctgaagg acggcggcca 900
ctacgacgct gaggtcaaga ccacctacaa ggccaagaag cccgtgcagc tgcccggcgc 960
ctacaacgtc aacatcaagt tggacatcac ctcccacaac gaggactaca ccatcgtgga 1020
acagtacgaa cgcgccgagg gccgccactc caccggcggc atggacgagc tgtacaagta 1080
agaattcccc ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag 1140
aacgcagaag cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac 1200
ctgaccccat gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc 1260
cccatgcgag agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac 1320
tgggcctttc gttttatctg ttgtttgc 1348
<210> 52
<211> 1322
<212> DNA
<213> Synthetic Sequence
<400> 52
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc tatctaacac cgtgcgtgtt gactatttta 300
cctctggcgg tgataatggt tgcatgtact aaggaggtac tagtatggtg agcaagggcg 360
aggaggataa catggccatc atcaaggagt tcatgcgctt caaggtgcac atggagggct 420
ccgtgaacgg ccacgagttc gagatcgagg gcgagggcga gggccgcccc tacgagggca 480
cccagaccgc caagctgaag gtgaccaagg gtggccccct gcccttcgcc tgggacatcc 540
tgtcccctca gttcatgtac ggctccaagg cctacgtgaa gcaccccgcc gacatccccg 600
actacttgaa gctgtccttc cccgagggct tcaagtggga gcgcgtgatg aacttcgagg 660
acggcggcgt ggtgaccgtg acccaggact cctccctgca ggacggcgag ttcatctaca 720
aggtgaagct gcgcggcacc aacttcccct ccgacggccc cgtaatgcag aagaagacca 780
tgggctggga ggcctcctcc gagcggatgt accccgagga cggcgccctg aagggcgaga 840
tcaagcagag gctgaagctg aaggacggcg gccactacga cgctgaggtc aagaccacct 900
acaaggccaa gaagcccgtg cagctgcccg gcgcctacaa cgtcaacatc aagttggaca 960
tcacctccca caacgaggac tacaccatcg tggaacagta cgaacgcgcc gagggccgcc 1020
actccaccgg cggcatggac gagctgtaca agtaagaatt ccccctgttt tggcggatga 1080
gagaagattt tcagcctgat acagattaaa tcagaacgca gaagcggtct gataaaacag 1140
aatttgcctg gcggcagtag cgcggtggtc ccacctgacc ccatgccgaa ctcagaagtg 1200
aaacgccgta gcgccgatgg tagtgtgggg tctccccatg cgagagtagg gaactgccag 1260
gcatcaaata aaacgaaagg ctcagtcgaa agactgggcc tttcgtttta tctgttgttt 1320
gc 1322
<210> 53
<211> 1337
<212> DNA
<213> Synthetic Sequence
<400> 53
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc ggaaattaat acgactcact ataggggcgg 300
agtactgtcc tccgcccctg tagaaataat tttgtttaac tttaataagg agatatacca 360
tggtgagcaa gggcgaggag gataacatgg ccatcatcaa ggagttcatg cgcttcaagg 420
tgcacatgga gggctccgtg aacggccacg agttcgagat cgagggcgag ggcgagggcc 480
gcccctacga gggcacccag accgccaagc tgaaggtgac caagggtggc cccctgccct 540
tcgcctggga catcctgtcc cctcagttca tgtacggctc caaggcctac gtgaagcacc 600
ccgccgacat ccccgactac ttgaagctgt ccttccccga gggcttcaag tgggagcgcg 660
tgatgaactt cgaggacggc ggcgtggtga ccgtgaccca ggactcctcc ctgcaggacg 720
gcgagttcat ctacaaggtg aagctgcgcg gcaccaactt cccctccgac ggccccgtaa 780
tgcagaagaa gaccatgggc tgggaggcct cctccgagcg gatgtacccc gaggacggcg 840
ccctgaaggg cgagatcaag cagaggctga agctgaagga cggcggccac tacgacgctg 900
aggtcaagac cacctacaag gccaagaagc ccgtgcagct gcccggcgcc tacaacgtca 960
acatcaagtt ggacatcacc tcccacaacg aggactacac catcgtggaa cagtacgaac 1020
gcgccgaggg ccgccactcc accggcggca tggacgagct gtacaagtaa gaattccccc 1080
tgttttggcg gatgagagaa gattttcagc ctgatacaga ttaaatcaga acgcagaagc 1140
ggtctgataa aacagaattt gcctggcggc agtagcgcgg tggtcccacc tgaccccatg 1200
ccgaactcag aagtgaaacg ccgtagcgcc gatggtagtg tggggtctcc ccatgcgaga 1260
gtagggaact gccaggcatc aaataaaacg aaaggctcag tcgaaagact gggcctttcg 1320
ttttatctgt tgtttgc 1337
<210> 54
<211> 1339
<212> DNA
<213> Synthetic Sequence
<400> 54
ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60
cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120
gccgaactca gaagtgaaac gccgtagcgc cgatggtagt gtggggtctc cccatgcgag 180
agtagggaac tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc 240
gttttatctg ttgtttgcgc tagcggatcc ggaaattaat acgactcact ataggggtcc 300
ctatcagtga tagagacccc tgtagaaata attttgttta actttaataa ggagatatac 360
catggtgagc aagggcgagg aggataacat ggccatcatc aaggagttca tgcgcttcaa 420
ggtgcacatg gagggctccg tgaacggcca cgagttcgag atcgagggcg agggcgaggg 480
ccgcccctac gagggcaccc agaccgccaa gctgaaggtg accaagggtg gccccctgcc 540
cttcgcctgg gacatcctgt cccctcagtt catgtacggc tccaaggcct acgtgaagca 600
ccccgccgac atccccgact acttgaagct gtccttcccc gagggcttca agtgggagcg 660
cgtgatgaac ttcgaggacg gcggcgtggt gaccgtgacc caggactcct ccctgcagga 720
cggcgagttc atctacaagg tgaagctgcg cggcaccaac ttcccctccg acggccccgt 780
aatgcagaag aagaccatgg gctgggaggc ctcctccgag cggatgtacc ccgaggacgg 840
cgccctgaag ggcgagatca agcagaggct gaagctgaag gacggcggcc actacgacgc 900
tgaggtcaag accacctaca aggccaagaa gcccgtgcag ctgcccggcg cctacaacgt 960
caacatcaag ttggacatca cctcccacaa cgaggactac accatcgtgg aacagtacga 1020
acgcgccgag ggccgccact ccaccggcgg catggacgag ctgtacaagt aagaattccc 1080
cctgttttgg cggatgagag aagattttca gcctgataca gattaaatca gaacgcagaa 1140
gcggtctgat aaaacagaat ttgcctggcg gcagtagcgc ggtggtccca cctgacccca 1200
tgccgaactc agaagtgaaa cgccgtagcg ccgatggtag tgtggggtct ccccatgcga 1260
gagtagggaa ctgccaggca tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttt 1320
cgttttatct gttgtttgc 1339
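As a reading aid for the DNA entries above: each sequence row ends with a running base count, and the last row's count should equal the `<211>` length. The following is a minimal sketch of a consistency check, assuming rows are whitespace-separated base groups followed by the cumulative count; the function name `check_running_counts` is hypothetical and not part of the patent.

```python
def check_running_counts(rows):
    """Return (declared, counted) pairs for rows whose running total is off."""
    mismatches = []
    total = 0
    for row in rows:
        # Assumed row layout: base groups, then the cumulative base count.
        *groups, declared = row.split()
        total += sum(len(group) for group in groups)
        if total != int(declared):
            mismatches.append((int(declared), total))
    return mismatches


# First two rows of SEQ ID NO: 54 as printed in the listing.
rows = [
    "ctgttttggc ggatgagaga agattttcag cctgatacag attaaatcag aacgcagaag 60",
    "cggtctgata aaacagaatt tgcctggcgg cagtagcgcg gtggtcccac ctgaccccat 120",
]
print(check_running_counts(rows))  # an empty list means the counts agree
```

Feeding all rows of one entry to the function and comparing the final running count with the `<211>` value would flag rows lost or damaged during text extraction.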
<210> 55
<211> 237
<212> PRT
<213> Synthetic Sequence
<400> 55
Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp Ala
1 5 10 15
Arg Arg Leu Lys Ala Ile Tyr Glu Lys Lys Lys Asn Glu Leu Gly Leu
20 25 30
Ser Gln Glu Ser Val Ala Asp Lys Met Gly Met Gly Gln Ser Gly Val
35 40 45
Gly Ala Leu Phe Asn Gly Ile Asn Ala Leu Asn Ala Tyr Asn Ala Ala
50 55 60
Leu Leu Ala Lys Ile Leu Lys Val Ser Val Glu Glu Phe Ser Pro Ser
65 70 75 80
Ile Ala Arg Glu Ile Tyr Glu Met Tyr Glu Ala Val Ser Met Gln Pro
85 90 95
Ser Leu Arg Ser Glu Tyr Glu Tyr Pro Val Phe Ser His Val Gln Ala
100 105 110
Gly Met Phe Ser Pro Glu Leu Arg Thr Phe Thr Lys Gly Asp Ala Glu
115 120 125
Arg Trp Val Ser Thr Thr Lys Lys Ala Ser Asp Ser Ala Phe Trp Leu
130 135 140
Glu Val Glu Gly Asn Ser Met Thr Ala Pro Thr Gly Ser Lys Pro Ser
145 150 155 160
Phe Pro Asp Gly Met Leu Ile Leu Val Asp Pro Glu Gln Ala Val Glu
165 170 175
Pro Gly Asp Phe Cys Ile Ala Arg Leu Gly Gly Asp Glu Phe Thr Phe
180 185 190
Lys Lys Leu Ile Arg Asp Ser Gly Gln Val Phe Leu Gln Pro Leu Asn
195 200 205
Pro Gln Tyr Pro Met Ile Pro Cys Asn Glu Ser Cys Ser Val Val Gly
210 215 220
Lys Val Ile Ala Ser Gln Trp Pro Glu Glu Thr Phe Gly
225 230 235
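The protein entry above is printed in three-letter residue codes with position markers on alternate rows. A minimal sketch of converting such rows to a one-letter string, so that the residue count can be compared with the `<211>` value, might look as follows. The helper name `to_one_letter` is hypothetical, and the sketch assumes only the 20 standard residues occur.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(rows):
    """Join residue rows into a one-letter string, skipping position rows."""
    residues = []
    for row in rows:
        tokens = row.split()
        if tokens and all(token.isdigit() for token in tokens):
            continue  # position markers such as "1 5 10 15"
        residues.extend(THREE_TO_ONE[token] for token in tokens)
    return "".join(residues)


# First and last residue rows of SEQ ID NO: 55, with their position rows.
rows = [
    "Met Ser Thr Lys Lys Lys Pro Leu Thr Gln Glu Gln Leu Glu Asp Ala",
    "1 5 10 15",
    "Lys Val Ile Ala Ser Gln Trp Pro Glu Glu Thr Phe Gly",
    "225 230 235",
]
print(to_one_letter(rows))
```

Running the helper over every row of the entry and taking `len()` of the result gives the residue count to compare against the declared length of 237.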

Claims (17)

1. A bacterial light-operated gene expression system comprising: a) a gene encoding a recombinant light-sensitive transcription factor, said recombinant light-sensitive transcription factor comprising a first polypeptide as a DNA binding domain and a second polypeptide as a light-sensitive domain, wherein said second polypeptide is selected from the group consisting of the Rhodobacter sphaeroides LOV domain RsLOV, the Dinoroseobacter shibae LOV domain DsLOV, and truncations thereof or mutants thereof having an amino acid sequence 15%-99% identical or 36%-99% similar; b) a target transcription unit comprising a promoter, promoter-response element, or response element-promoter containing at least one response element recognized/bound by said first polypeptide, and a nucleic acid sequence to be transcribed.
2. The bacterial light-operated gene expression system of claim 1, wherein the first polypeptide and the second polypeptide are directly or operably linked, and/or the promoter or promoter-response element or response element-promoter in the target transcription unit and the nucleic acid sequence to be transcribed are directly or operably linked.
3. The bacterial light-operated gene expression system of claim 1, wherein the first polypeptide is selected from the group consisting of a helix-turn-helix DNA binding domain, a zinc finger motif or zinc cluster DNA binding domain, a leucine zipper DNA binding domain, a winged-helix-turn-helix DNA binding domain, a helix-loop-helix DNA binding domain, a high mobility group DNA binding domain, and a B3 DNA binding domain.
4. The bacterial light-operated gene expression system of claim 3, wherein the first polypeptide is selected from the group consisting of the E. coli LexA DNA binding domain, the LexA408 DNA binding domain, the bacteriophage lambda cI repressor protein DNA binding domain, the Gal4 protein DNA recognition/binding domain, the tetracycline repressor protein TetR DNA binding domain, the tryptophan repressor protein TrpR DNA recognition/binding domain, and truncations thereof and/or mutants thereof having 80%-99% amino acid sequence homology.
5. The bacterial light-operated gene expression system of claim 1, further comprising a third polypeptide that recruits additional components of the RNA polymerase, wherein the third polypeptide is linked to the first and second polypeptides, either directly or via a linker peptide.
6. The bacterial light-controlled gene expression system of claim 5, wherein the third polypeptide is selected from the group consisting of the omega factor and the alpha factor of E. coli, or mutants thereof having an amino acid sequence 36%-99% similar.
7. The bacterial light-operated gene expression system of claim 1, wherein the response element is a DNA motif specifically recognized and bound by the first polypeptide selected from the group consisting of a LexA binding element, a cI binding element, a Gal4 binding element, and a TetR binding element.
8. The bacterial light-controlled gene expression system of claim 1, wherein said promoter is selected from the group consisting of the colE promoter of E. coli, the sulA promoter, the recA promoter, the umuDC promoter, the lac minimal promoter, the T7 promoter of T7 phage, the O12 promoter of lambda phage, and the Grac promoter of Bacillus subtilis.
9. A bacterial expression vector comprising the recombinant light-sensitive transcription factor-encoding gene and/or the target transcription unit in the bacterial light-controlled gene expression system of any one of claims 1 to 8.
10. The bacterial expression vector of claim 9, wherein the nucleic acid sequence to be transcribed is absent from the target transcription unit.
11. The bacterial expression vector of claim 10, wherein said recombinant light-sensitive transcription factor amino acid sequence is selected from the group consisting of sequences 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, and 46.
12. A method of regulating gene expression in a bacterial cell using the bacterial light-operated gene expression system of any one of claims 1-8, comprising the steps of:
a) constructing the bacterial light-operated gene expression system in a bacterial expression vector;
b) introducing the vector into a bacterial cell containing the gene to be regulated; and
c) inducing the bacterial cell by illumination so that the regulated nucleotide sequence is expressed in the bacterial cell.
13. The method of regulating gene expression according to claim 12, further comprising selecting a light source from an LED lamp, a fluorescent lamp, a laser, and an incandescent lamp, and selecting an irradiation method that is either continuous irradiation or discontinuous irradiation.
14. The method of regulating gene expression according to claim 12, comprising spatially controlling the gene expression levels of cells at different locations by means of light scanning, projection, light dies, printed projection film, or neutral gray scale film.
15. A kit comprising a bacterial strain according to claim 9 having integrated on its genome the expression cassette for a light-sensitive transcription factor in the expression system of claim 1, and instructions therefor.
16. The kit of claim 15, wherein the nucleic acid sequence to be transcribed is absent from the bacterial expression vector.
17. Use of the bacterial light-operated gene expression system of claim 1 to regulate vital activities of bacteria, including bacterial movement and division.
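Claim 1 defines mutants by percent amino acid identity, but the claims do not state how that percentage is computed. The following is only an illustrative sketch under one common convention: the two sequences are already aligned to equal length, gap columns (`-`) are skipped, and identity is matches divided by compared columns. The function name `percent_identity` and the gap convention are assumptions, not taken from the patent.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumed convention: gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example: one substitution in six aligned residues.
print(round(percent_identity("MSTKKK", "MSTRKK"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different percentages, so any comparison against the claimed ranges depends on which convention is adopted.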
CN202010062612.9A 2020-01-20 2020-01-20 Bacterial light-controlled gene expression system and method for regulating and controlling gene expression thereof Active CN113136396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010062612.9A CN113136396B (en) 2020-01-20 2020-01-20 Bacterial light-controlled gene expression system and method for regulating and controlling gene expression thereof

Publications (2)

Publication Number Publication Date
CN113136396A true CN113136396A (en) 2021-07-20
CN113136396B CN113136396B (en) 2024-09-10

Family

ID=76809912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010062612.9A Active CN113136396B (en) 2020-01-20 2020-01-20 Bacterial light-controlled gene expression system and method for regulating and controlling gene expression thereof

Country Status (1)

Country Link
CN (1) CN113136396B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102643852A (en) * 2011-02-28 2012-08-22 华东理工大学 Optical controllable gene expression system
CN103031327A (en) * 2012-08-02 2013-04-10 华东理工大学 Prokaryotic bacterium photoinduced gene expression system and method for regulating and controlling gene expression by using same

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
KAREN S. CONRAD et al.: "Light-Induced Subunit Dissociation by a LOV domain Photoreceptor from Rhodobacter sphaeroides", BIOCHEMISTRY, vol. 52, no. 2, XP055513214, DOI: 10.1021/bi3015373 *
STEPHAN ENDRES et al.: "Structure and function of a short LOV protein from the marine phototrophic bacterium Dinoroseobacter shibae", BMC MICROBIOL, vol. 15 *
XIANJUN CHEN et al.: "An extraordinary stringent and sensitive light-switchable gene expression system for bacterial cells", CELL RES, vol. 26, no. 7, pages 856 *
XIE LI et al.: "A single-component light sensor system allows highly tunable and direct activation of gene expression in bacterial cells", NUCLEIC ACIDS RES, vol. 48, no. 6 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111996206A (en) * 2020-06-19 2020-11-27 清华大学 Light-operated cell-free protein synthesis method, plasmid used by method and product using method
WO2023015441A1 (en) * 2021-08-10 2023-02-16 中国科学院深圳先进技术研究院 Light-controlled lysis engineering bacterium, and construction method therefor and use thereof
CN115704006A (en) * 2021-08-10 2023-02-17 中国科学院深圳先进技术研究院 Tumor treatment engineering bacteria based on bacterial biofilm and construction method and application thereof
CN113652430A (en) * 2021-09-02 2021-11-16 华东理工大学 Light-operated RNA metabolism regulation and control system
WO2023030330A1 (en) * 2021-09-02 2023-03-09 华东理工大学 Light-controlled rna metabolism regulation system
WO2024255928A1 (en) * 2023-06-14 2024-12-19 中国科学院天津工业生物技术研究所 New light-controlled repressor protein optolacl and use method thereof

Also Published As

Publication number Publication date
CN113136396B (en) 2024-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant